To test a model quickly, you can use the inferless run command. This command runs the model locally so you can test it, and displays the endpoint.

Usage

inferless run

You will need to have app.py and input_schema.py in your current directory.

Options:

  • -r, --runtime TEXT: Custom runtime name or file location. If not provided, the default Inferless runtime will be used.
  • -t, --runtime-type TEXT: Type of runtime to deploy [fastapi, triton]. [default: triton]
  • -n, --name TEXT: Name of the model to deploy on inferless [default: inferless-model]
  • -f, --env-file TEXT: Path to an env file containing environment variables (one per line in KEY=VALUE format)
  • -e, --env TEXT: Environment variables to set for the runtime (e.g. 'KEY=VALUE'). If the env variable contains special characters, please escape them.
  • -u, --docker-base-url TEXT: Docker base URL. Defaults to the system default, fetched from the environment.
  • --volume TEXT: Volume name.
  • -f, --framework TEXT: Framework type. (PYTORCH, ONNX, TENSORFLOW) [default: PYTORCH]
  • -i, --input-schema TEXT: Input schema path. [default: input_schema.py]
  • -i, --input TEXT: Input json path
  • -o, --output TEXT: Output json path
  • --runtimeversion TEXT: Runtime version (default: latest).

Examples:

inferless run
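
You can also combine the options listed above. For instance, a hypothetical invocation for a model named my-model, with environment variables read from a .env file and a volume called my-volume (all three names are placeholders, not defaults), might look like this:

inferless run --name my-model --env-file .env --volume my-volume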

On Windows

Make sure you have the Docker daemon exposed on the port:

inferless run --docker-base-url tcp://localhost:2375
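
To confirm the daemon is reachable on that port, you can point the Docker CLI at it directly (the address below matches the example above):

docker -H tcp://localhost:2375 info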

Advanced Usage

Executing Into the Container

To access the container’s shell, you can use the following command:

docker exec -it <CONTAINER_ID> /bin/bash

Replace <CONTAINER_ID> with the actual container ID of your running Docker container.
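
If you are not sure of the container ID, you can list the running containers first:

docker ps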

Accessing and Modifying Model Files

Inside the container, the app.py and model files are located in the model directory. You can access them by navigating to the following path:

cd /models/<model_name>/1

Replace <model_name> with the name of your model. This directory contains all the files related to your model, including app.py.

You can modify the app.py file and the model files as needed. However, for the changes to take effect, you need to unload and then reload the model using the following steps.
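
For example, one way to apply changes is to edit app.py on your host machine and copy it into the running container with docker cp (as above, <CONTAINER_ID> and <model_name> are placeholders for your container ID and model name):

docker cp ./app.py <CONTAINER_ID>:/models/<model_name>/1/app.py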

Unloading the Model

To unload the model, execute the following curl command:

curl --location 'http://localhost:8000/v2/repository/models/<model_name>/unload' \
--header 'Content-Type: application/json' \
--request POST

Replace <model_name> with the name of your model. This command unloads the model from memory, allowing you to make changes to the model files.

Loading the Model

After modifying the files, reload the model using this curl command:

curl --location 'http://localhost:8000/v2/repository/models/<model_name>/load' \
--header 'Content-Type: application/json' \
--request POST

This command loads the updated model back into memory, making your changes effective.
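
If you are running the default Triton runtime, you can also query Triton's model repository index to confirm the model's state after unloading or loading:

curl --location 'http://localhost:8000/v2/repository/index' \
--header 'Content-Type: application/json' \
--request POST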

NOTE: A .inferless-logs directory is created to store the logs. You can add this directory to your .gitignore file to exclude it from version control.
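
For example, you can append the entry from the shell:

echo ".inferless-logs/" >> .gitignore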