Demo GCS

Pre Requisite: Note the GCS Link from GCP

You would need to note down the GCS Link of the Model file from GCP. A sample of how the S3 link is like -> gs://pp-samplebucket/gpt2-medium-1.5gb.zip

Steps to import model file from Google Cloud Storage

Step 1: Add Model in your workspace.

Navigate to your desired workspace in Inferless and Click on "Add a custom model" button that you see on the top right. An import wizard will open up.

Click on Add Model

Step 2: Choose the source of your model.

Since we are using a model from GCS in this example, select Google Cloud Storage as the method of upload.

After you have selected Repo

Step 3: Enter the model details

Provide the Cloud URL, which is a GCS Link of the Model file from GCP.
Model Name: The desired name of the model that you wish to give.
Sample Input: Enter an example of how the input should be formatted for the model in JSON format.
Sample Output: Enter an example of how the output would be formatted for the model in JSON format.
Enter the details as noted
- In case you would like to set up Automatic rebuild for your model, enable it
  - You would need to set up a webhook for this method. Click here for more details.

Step 4: Configure Machine and Environment.

Choose the type of machine, and specify the minimum and maximum number of replicas for deploying your model.
- Min scale -
```
The number of inference workers to keep on at all times.  
```
- Max scale -
```
The maximum number of inference workers to allow at any point of time  
```
- Configure Custom Runtime ( If you have pip or apt packages), choose Volume, Secrets and set Environment variables like Inference Timeout / Container Concurrency / Scale Down Timeout

Set runtime and configuration

Step 5: Review your model details

Once you click “Continue,” you will be able to review the details added for the model.
If you would like to make any changes, you can go back and make the changes.
Once you have reviewed everything, click Deploy to start the model import process.

Review all the details carefully before proceeding

Step 6 : Run your model

Once you click submit, the model import process would start.
It may take some time to complete the import process, and during this time, you will be redirected to your workspace and can see the status of the import under "In Progress/Failed" tab.

View the model under `In-Progress/ Failed`
If you encounter any errors during the model import process or if you want to view the build logs for any reason, you can click on the three dots menu and select “View build logs”. This will show you a detailed log of the import process, which can help you troubleshoot any issues you may encounter.
Post-upload, the model will be available under “My Models”
You can then select the model and go to -> API -> Inference Endpoint details. Here you would find the API endpoints that can be called. You can click on the copy button on the right and can call your model.

Under the API Tab, you can view the API endpoint details.

Extra Step: Getting API key details

You can now call using this from your end. The inference result would be the output for these calls.
In case you need help with API Keys:
- Click on settings, available on the top, next to your Workspace Name
- Click on “Workspace API keys”
- You can view the details of your key or generate a new one
  
  Sample for now

Using CLI

Make sure you follow this below folder structure in the zip file

.
|--config.pbtxt (optional)
|--input.json
|--output.json
|--1/
|--|--model.xxx ( pt/onnx/tf )

Connect your Google cloud storage account using this below command and enter the name, access-key and secret-key
```
inferless integration add GCS --name <integrationname> --gcp-json-path <jsoncredentialspath> 
```

Once your integration is done. You can initialise the model using this command

inferless init file --name <name> --provider gcs --url <url> --framework <pytorch/onnx/tensorflow> --autobuild

Now that your model is initialised.

To do default deployment use this command

inferless deploy --gpu t4

To do customised deployment use this command

inferless deploy --gpu t4 --region <regionname> --runtime <runtimename> --volume <volumename> --fractional

Getting Started

Concepts

Integrations

API Reference

Model Import

Pre Requisite: Note the GCS Link from GCP

Steps to import model file from Google Cloud Storage

Step 1: Add Model in your workspace.

Step 2: Choose the source of your model.

Step 3: Enter the model details

Step 4: Configure Machine and Environment.

Step 5: Review your model details

Step 6 : Run your model

Extra Step: Getting API key details

Using CLI

Getting Started

Concepts

Integrations

API Reference

Model Import

​Pre Requisite: Note the GCS Link from GCP

​Steps to import model file from Google Cloud Storage

​Step 1: Add Model in your workspace.

​Step 2: Choose the source of your model.

​Step 3: Enter the model details

​Step 4: Configure Machine and Environment.

​Step 5: Review your model details

​Step 6 : Run your model

​Extra Step: Getting API key details

​Using CLI

Pre Requisite: Note the GCS Link from GCP

Steps to import model file from Google Cloud Storage

Step 1: Add Model in your workspace.

Step 2: Choose the source of your model.

Step 3: Enter the model details

Step 4: Configure Machine and Environment.

Step 5: Review your model details

Step 6 : Run your model

Extra Step: Getting API key details

Using CLI