Input / Output Schema
Input Schema
Define input_schema.py in your GitHub/GitLab repository. Inferless uses it to create the input parameters.
Each input takes three required fields and one optional field:
- datatype: "STRING", "BOOL", "INT8", "INT16", "INT32", "FP16", "FP32", "UINT8", "UINT16", "UINT32", "UINT64", "INT64", "FP64", "BYTES", "BF16"
- shape: the length of the input array. With shape [1] you receive a single value; with a length greater than 1 you receive an array. Use -1 if the length is variable.
- required: whether the parameter must be present in every API call
- example (optional): a sample value for calling the API
In code
def infer(self, inputs):
    prompt = inputs["prompt"]  # "There is a fine house in the forest"
    shape = inputs["shape"]  # [512, 1]
In input_schema.py
INPUT_SCHEMA = {
    "prompt": {
        "datatype": "STRING",
        "required": True,
        "shape": [1],
        "example": ["There is a fine house in the forest"]
    },
    "shape": {
        "datatype": "INT32",
        "required": False,
        "shape": [2],
        "example": [512, 1]
    },
}
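To make the field semantics concrete, the sketch below checks a set of inputs against an INPUT_SCHEMA-style dictionary. The helper name check_inputs and the "seeds" input are hypothetical and not part of Inferless. The sketch assumes that shape [1] means a single value, [n] means a list of n items, and [-1] means any length.

```python
# Hypothetical helper for illustration only; Inferless does not ship this function.
INPUT_SCHEMA = {
    "prompt": {"datatype": "STRING", "required": True, "shape": [1]},
    "shape": {"datatype": "INT32", "required": False, "shape": [2]},
    "seeds": {"datatype": "INT32", "required": False, "shape": [-1]},  # variable length
}


def check_inputs(inputs, schema):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for name, spec in schema.items():
        if name not in inputs:
            if spec.get("required", False):
                problems.append(f"missing required input: {name}")
            continue
        expected = spec["shape"][0]
        value = inputs[name]
        # Assumption: shape [1] arrives as a single value, anything else as a list.
        length = len(value) if isinstance(value, list) else 1
        if expected != -1 and length != expected:
            problems.append(f"{name}: expected {expected} item(s), got {length}")
    for name in inputs:
        if name not in schema:
            problems.append(f"unexpected input: {name}")
    return problems


print(check_inputs({"prompt": "There is a fine house in the forest"}, INPUT_SCHEMA))
```

Running a check like this locally before calling the endpoint can surface a missing required input or a wrong array length early.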
Output Schema
You can return any dictionary in the return statement of app.py. You don’t need to provide any configuration.
Returning Dicts
# Example Return Statement
return { "label_1" : 0.398 , "label_2" : 0.563, "label_3" : 0.434 }
Returning Variable Length Array
# Example Return Statement
return { "generated_images_base64" : [ img_str1 , img_str2 , img_str3 ] }
Returning Dictionary with Variable keys
# Example Return Statement (requires `import json` at the top of app.py)
scores = {"label_x": 0.4554, "label_y": 0.3232}
return {"result": json.dumps(scores)}
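Because the variable-key dictionary is returned as a JSON string, the caller has to parse it again. A minimal round-trip sketch follows; the function names infer_result and decode_result are hypothetical stand-ins, and the response is assumed to expose the returned dictionary as-is.

```python
import json


def infer_result():
    # Stand-in for the return statement of an infer function.
    scores = {"label_x": 0.4554, "label_y": 0.3232}
    return {"result": json.dumps(scores)}


def decode_result(response):
    # The value under "result" is a JSON string, so parse it back into a dict.
    return json.loads(response["result"])


decoded = decode_result(infer_result())
print(sorted(decoded))
```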
Deprecated - Input / Output JSON
Sample Input
The input JSON should contain the following fields:
- name: must match the name of the input/output that is specified in the model
- shape: the shape of the input array for the model. If the shape is variable, use -1
- datatype: one of BOOL, UINT8, UINT16, UINT32, UINT64, INT8, INT16, INT32, INT64, FP16, FP32, FP64, BYTES, BF16. For more details, see the matrix at the bottom of the page
- data: an example of the data
Note: Because an array of inputs and outputs is expected, you may need to add a leading dimension to your array for the number of requests.
For example, a model that takes attention_mask and input_ids tensor arrays of shape [10] uses shape [1,10] for 1 request:
{
  "inputs": [
    {
      "name": "attention_mask",
      "shape": [1, 10],
      "datatype": "INT64",
      "data": [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
    },
    {
      "name": "input_ids",
      "shape": [1, 10],
      "datatype": "INT64",
      "data": [[3041, 5372, 502, 416, 597, 2420, 345, 1549, 588, 13]]
    }
  ]
}
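The extra request dimension can be added programmatically. The sketch below wraps one sample of length 10 in an outer list so that its shape becomes [1, 10]. The helper as_request_input is a hypothetical name used only for illustration.

```python
import json


def as_request_input(name, values, datatype):
    """Wrap one sample so its shape gains a leading request dimension."""
    return {
        "name": name,
        "shape": [1, len(values)],  # 1 request, len(values) items
        "datatype": datatype,
        "data": [values],           # outer list = the request dimension
    }


attention_mask = [1] * 10
input_ids = [3041, 5372, 502, 416, 597, 2420, 345, 1549, 588, 13]

payload = {
    "inputs": [
        as_request_input("attention_mask", attention_mask, "INT64"),
        as_request_input("input_ids", input_ids, "INT64"),
    ]
}
print(json.dumps(payload))
```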
Example of a model that takes a prompt (as a string) as the input (Stable Diffusion):
{
  "inputs": [
    {
      "name": "prompt",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["Once upon a time"]
    }
  ]
}
Example of a model reading the parameter
# Code to read the input variable in the infer function.
def infer(self, inputs):
    prompt = inputs["prompt"]
Sample Output
This is the output field that your model returns. Providing a sample helps us validate that the name you expect in the output is the one the model generates.
{
  "outputs": [
    {
      "name": "generated_text",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["Sample Output"]
    }
  ]
}
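On the client side, an outputs list in this format can be turned into a name-to-data lookup. This is a minimal sketch that assumes the response body has exactly the structure of the sample above; outputs_by_name is a hypothetical helper, not an Inferless API.

```python
def outputs_by_name(response):
    """Map each output's name to its data list."""
    return {item["name"]: item["data"] for item in response["outputs"]}


sample_response = {
    "outputs": [
        {
            "name": "generated_text",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["Sample Output"],
        }
    ]
}

by_name = outputs_by_name(sample_response)
print(by_name["generated_text"][0])
```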
Returning Dicts
# Example Return Statement
return { "label_1" : 0.398 , "label_2" : 0.563, "label_3" : 0.434 }
Corresponding Output.json for the Python code
// Sample
{
  "outputs": [
    {
      "name": "label_1",
      "shape": [1],
      "datatype": "FP64",
      "data": [0.398]
    },
    {
      "name": "label_2",
      "shape": [1],
      "datatype": "FP64",
      "data": [0.563]
    },
    {
      "name": "label_3",
      "shape": [1],
      "datatype": "FP64",
      "data": [0.434]
    }
  ]
}
Returning Variable Length Array
# Example Return Statement
return { "generated_images_base64" : [ img_str1 , img_str2 , img_str3 ] }
Corresponding Output.json for the Python code. Setting the shape parameter to -1 allows a variable number of items.
// Sample
{
  "outputs": [
    {
      "name": "generated_images_base64",
      "shape": [-1],
      "datatype": "BYTES",
      "data": [""]
    }
  ]
}
Returning Dictionary with Variable keys
# Example Return Statement (requires `import json` at the top of app.py)
scores = {"label_x": 0.4554, "label_y": 0.3232}
return {"result": json.dumps(scores)}
Corresponding Output.json for the Python code
// Sample
{
  "outputs": [
    {
      "name": "result",
      "shape": [1],
      "datatype": "BYTES",
      "data": ["Sample"]
    }
  ]
}
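The three cases above follow one pattern, which can be sketched as a small mapping from a Python return dictionary to Output.json entries. The helper describe_outputs is hypothetical. It assumes floats map to FP64, strings map to BYTES, a single value gets shape [1], and a list gets shape [-1]. Other Python types are not handled here.

```python
import json


def describe_outputs(result):
    """Build an Output.json-style dict from a sample return dictionary."""
    outputs = []
    for name, value in result.items():
        if isinstance(value, list):
            # Variable-length array: one sample item, shape -1.
            sample, shape = (value[0] if value else ""), [-1]
        else:
            sample, shape = value, [1]
        datatype = "FP64" if isinstance(sample, float) else "BYTES"
        outputs.append(
            {"name": name, "shape": shape, "datatype": datatype, "data": [sample]}
        )
    return {"outputs": outputs}


example = {
    "label_1": 0.398,
    "generated_images_base64": ["aGVsbG8=", "d29ybGQ="],
    "result": json.dumps({"label_x": 0.4554, "label_y": 0.3232}),
}
print(json.dumps(describe_outputs(example), indent=2))
```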
Example
Below is a representation after giving the details during model import.