Docker
Bring your own Docker container images. (This might result in higher cold-start times.)
You can provide a custom Docker image to host your model on the Inferless platform. Supported formats for custom Docker images:
- Private Image URI
- Dockerfile
Providers for Private Image URI
- Docker Hub
- Amazon Elastic Container Registry (ECR)
Providers for Dockerfile
- GitHub Repository
- GitLab Project
The Inferless platform pulls or builds the image, depending on the format, and hosts the model.
Integration for Private Image URI
Requirements for Docker Import
- The image should be built for the linux/amd64 platform.
- The following details are needed while importing the model from Docker Hub or ECR:
  - Health check API: a GET API that returns status 200 for a healthy state
  - Infer API: a POST API that returns a dict
  - Server Port: the port on which the model server is running
  - Input: sample JSON input for model inference
# Sample Input
{
"prompt": "Once upon a time ",
"max_length": 50
}
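As an illustration of these requirements, here is a minimal sketch of a model server that exposes a health check API, an infer API, and a fixed server port, using only the Python standard library. The paths `/health` and `/infer`, port 8000, the `run_model` placeholder, and the `generated_text` response key are assumptions made for this example, not names required by Inferless; use whatever values you plan to enter in the import wizard.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical values for illustration; they must match what you enter
# as Health check API, Infer API, and Server Port during import.
HEALTH_PATH = "/health"
INFER_PATH = "/infer"
SERVER_PORT = 8000


def run_model(payload: dict) -> dict:
    """Placeholder for real inference: returns the prompt cut to max_length."""
    prompt = str(payload.get("prompt", ""))
    max_length = int(payload.get("max_length", 50))
    return {"generated_text": prompt[:max_length]}


class ModelHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, body: dict) -> None:
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self) -> None:
        # Health check API: GET, status 200 when healthy.
        if self.path == HEALTH_PATH:
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self) -> None:
        # Infer API: POST with a JSON body, responds with a dict.
        if self.path != INFER_PATH:
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._send_json(400, {"error": "invalid JSON"})
            return
        if not isinstance(payload, dict):
            self._send_json(400, {"error": "expected a JSON object"})
            return
        self._send_json(200, run_model(payload))


def serve(port: int = SERVER_PORT) -> None:
    """Start the server; call this from the container's entrypoint."""
    HTTPServer(("0.0.0.0", port), ModelHandler).serve_forever()
```

Calling `serve()` from the container's entrypoint starts the server. The sample input above would then be a valid request body for the infer API in this sketch.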
Integration for Dockerfile
Requirements for Docker Import
- The following details are needed while importing the model from GitHub or GitLab:
  - Health check API: a GET API that returns status 200 for a healthy state
  - Infer API: a POST API that returns a dict
  - Server Port: the port on which the model server is hosted
  - Input: sample JSON input for model inference
  - Dockerfile: the file used to build the Docker image
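Before importing, it can help to confirm locally that a container built from your Dockerfile meets these requirements. The following is a rough sketch of such a smoke test, assuming the container is already running and reachable at a base URL such as `http://localhost:8000`; the function names, paths, and timeouts are hypothetical and not part of any Inferless tooling.

```python
import json
import urllib.error
import urllib.request


def check_health(base_url: str, health_path: str, timeout: float = 10.0) -> bool:
    """Return True if the health check API answers a GET with status 200."""
    try:
        with urllib.request.urlopen(base_url + health_path, timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.URLError:
        # Covers connection failures and non-2xx responses (HTTPError).
        return False


def check_infer(base_url: str, infer_path: str, sample_input: dict,
                timeout: float = 60.0) -> dict:
    """POST the sample input to the infer API and require a dict in response."""
    request = urllib.request.Request(
        base_url + infer_path,
        data=json.dumps(sample_input).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        result = json.loads(resp.read())
    if not isinstance(result, dict):
        raise ValueError("Infer API should return a dict")
    return result
```

For example, `check_health("http://localhost:8000", "/health")` followed by `check_infer("http://localhost:8000", "/infer", {"prompt": "Once upon a time ", "max_length": 50})` would exercise both endpoints with the sample input, where the port and paths stand in for your own values.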
Example
GitHub - inferless/inferless-docker-import-examples
UI walkthrough for Docker Hub Import
Step 1: Add Model in your workspace.
- Navigate to your desired workspace in Inferless and click the "Add a custom model" button at the top right. An import wizard will open.
Click on Add Model
Step 2: Choose the source of your model.
- Since we are using a model from Docker Hub, select Docker Hub as the method of upload from the Provider list.
- To proceed with the upload, you will need to connect your Docker Hub account. This is a mandatory step, as it allows us to download the image from your Docker Hub.
After you have selected Repo
Step 3: Enter the model details
- Provide the Image URL, Health check API, Infer API, and Server Port.
- Model Name: the name that you wish to give the model.
Enter the details as noted
- If you would like to set up Automatic rebuild for your model, enable it. You will need to set up a webhook for this method. Click here for more details.
Step 4: Configure Machine and Environment.
- Choose the type of machine, and specify the minimum and maximum number of replicas for deploying your model.
  - Min scale: the number of inference workers to keep on at all times.
  - Max scale: the maximum number of inference workers to allow at any point in time.
- Configure a Custom Runtime (if you have pip or apt packages), choose a Volume and Secrets, and set environment variables such as Inference Timeout, Container Concurrency, and Scale Down Timeout.
Set runtime and configuration
Step 5: Review your model details
- Once you click “Continue”, you will be able to review the details added for the model.
- If you would like to make any changes, you can go back and make them.
- Once you have reviewed everything, click Deploy to start the model import process.
Review all the details carefully before proceeding
Step 6: Run your model
- Once you submit, the model import process will start.
- The import may take some time to complete. During this time, you will be redirected to your workspace, where you can see the status of the import under the "In Progress/Failed" tab.
View the model under `In-Progress/ Failed`
- If you encounter any errors during the model import process, or if you want to view the build logs for any reason, click the three-dots menu and select “View build logs”. This shows a detailed log of the import process, which can help you troubleshoot any issues.
- Post-upload, the model will be available under “My Models”.
- You can then select the model and go to API -> Inference Endpoint details. Here you will find the API endpoints that can be called. Click the copy button on the right to copy an endpoint, then use it to call your model.
Under the API Tab, you can view the API endpoint details.
Extra Step: Getting API key details
- You can now call the model using this endpoint from your end. The inference result will be the output of these calls.
- In case you need help with API keys:
  - Click on Settings, available at the top, next to your workspace name.
  - Click on “Workspace API keys”.
  - You can view the details of your key or generate a new one.
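To make the call concrete, here is a hedged sketch of sending the sample input to the copied inference endpoint with a workspace API key. It assumes the endpoint accepts a JSON POST and that the key is sent as a Bearer token in the Authorization header; confirm both against the details shown under the API tab. The function names and the placeholder URL and key are hypothetical.

```python
import json
import urllib.request


def build_inference_request(endpoint_url: str, api_key: str,
                            payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the inference endpoint."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the workspace API key is passed as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def call_model(endpoint_url: str, api_key: str, payload: dict,
               timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_inference_request(endpoint_url, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())
```

A call would then look like `call_model("<your inference endpoint>", "<your API key>", {"prompt": "Once upon a time ", "max_length": 50})`, with both placeholders replaced by the values from your workspace.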
UI walkthrough for Dockerfile Import
Step 1: Add Model in your workspace.
- Navigate to your desired workspace in Inferless and click the "Add a custom model" button at the top right. An import wizard will open.
Click on Add Model
Step 2: Choose the source of your model.
- Since we are building the model from a Dockerfile, select Dockerfile as the method of upload from the Provider list.
- To proceed with the upload, you will need to connect your GitHub/GitLab account. This is a mandatory step, as it allows us to get the file from your repository.
After you have selected Repo
Step 3: Enter the model details
- Select your GitHub Repository and the branch.
- Provide the Health check API, Infer API, Server Port, and the Dockerfile Path.
- Model Name: the name that you wish to give the model.
Enter the details as noted
- If you would like to set up Automatic rebuild for your model, enable it. You will need to set up a webhook for this method. Click here for more details.
Step 4: Configure Machine and Environment.
- Choose the type of machine, and specify the minimum and maximum number of replicas for deploying your model.
  - Min scale: the number of inference workers to keep on at all times.
  - Max scale: the maximum number of inference workers to allow at any point in time.
- Configure a Custom Runtime (if you have pip or apt packages), choose a Volume and Secrets, and set environment variables such as Inference Timeout, Container Concurrency, and Scale Down Timeout.
Set runtime and configuration
Step 5: Review your model details
- Once you click “Continue”, you will be able to review the details added for the model.
- If you would like to make any changes, you can go back and make them.
- Once you have reviewed everything, click Deploy to start the model import process.
Review all the details carefully before proceeding
Step 6: Run your model
- Once you submit, the model import process will start.
- The import may take some time to complete. During this time, you will be redirected to your workspace, where you can see the status of the import under the "In Progress/Failed" tab.
View the model under `In-Progress/ Failed`
- If you encounter any errors during the model import process, or if you want to view the build logs for any reason, click the three-dots menu and select “View build logs”. This shows a detailed log of the import process, which can help you troubleshoot any issues.
- Post-upload, the model will be available under “My Models”.
- You can then select the model and go to API -> Inference Endpoint details. Here you will find the API endpoints that can be called. Click the copy button on the right to copy an endpoint, then use it to call your model.
- Under the API Tab, you can view the API endpoint details.
Extra Step: Getting API key details
- You can now call the model using this endpoint from your end. The inference result will be the output of these calls.
- In case you need help with API keys:
  - Click on Settings, available at the top, next to your workspace name.
  - Click on “Workspace API keys”.
  - You can view the details of your key or generate a new one.