@inferless.method(gpu="T4") is used to specify the function that you want to run for inference. You can also specify the GPU to be used here.
The gpu parameter specifies the GPU that you want to use on the remote server. Currently, the supported GPUs are T4, A10, and A100.
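To make the role of the gpu argument concrete, here is a minimal sketch of a parameterized decorator that checks the requested GPU against the list above and records it on the function. The `method` decorator below is a local stand-in written for illustration only, not the inferless implementation, and `run_inference` is a hypothetical function name.

```python
import functools

# GPU names listed as supported in the text above.
SUPPORTED_GPUS = ("T4", "A10", "A100")


def method(gpu="T4"):
    """Local stand-in for a decorator shaped like @inferless.method (NOT the real one)."""
    if gpu not in SUPPORTED_GPUS:
        raise ValueError(f"Unsupported GPU {gpu!r}; expected one of {SUPPORTED_GPUS}")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumption: the real library would run this on a remote server.
            # This stand-in simply calls the function locally.
            return func(*args, **kwargs)

        wrapper.gpu = gpu  # record the requested GPU so it can be inspected
        return wrapper

    return decorator


@method(gpu="T4")
def run_inference(prompt):
    # Placeholder body: a real function would load a model and generate text.
    return f"echo: {prompt}"
```

With the real package, the decorator line would presumably read @inferless.method(gpu="T4") after an `import inferless`, and the stand-in would not be needed.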
Create a new app object with inferless.Cls
@app.load is used to specify the function that loads the model before inference.
@app.infer is used to specify the function that you want to run for inference.
The @inferless.local_entry_point annotation lets you mark a module-level function as the local entry point for a remote run.
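The pieces described above (an app object, a load function, an infer function, and a local entry point) might fit together roughly as follows. This is only a structural sketch: `_StubCls` and the local `local_entry_point` function are pass-through stand-ins so the snippet runs without the inferless package, and `EchoModel`, `initialize`, and `main` are hypothetical names. How the real library invokes the load function is not stated in this document, so the sketch calls it explicitly.

```python
class _StubCls:
    """Local stand-in for inferless.Cls so this sketch runs without the package."""

    def load(self, func):
        # Pass-through: the real decorator presumably registers the loader.
        return func

    def infer(self, func):
        # Pass-through: the real decorator presumably registers the inference function.
        return func


def local_entry_point(func):
    """Stand-in for @inferless.local_entry_point (pass-through)."""
    return func


# With the real package, this would be an object created with inferless.Cls.
app = _StubCls()


class EchoModel:  # hypothetical class name
    @app.load
    def initialize(self):
        # Load the model once, before any inference call.
        self.prefix = "echo: "

    @app.infer
    def infer(self, prompt):
        # Run inference using the state prepared by initialize().
        return self.prefix + prompt


@local_entry_point
def main(prompt):
    # Module-level function marked as the entry point for the run.
    model = EchoModel()
    model.initialize()  # called explicitly here (assumption, see lead-in)
    return model.infer(prompt)
```

Calling main("hello") in this stand-in version runs everything locally and returns the prefixed prompt.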
apt-get), Python packages (packages installed using pip), and run commands (shell commands) that you want to configure on the remote server.
runtime.yaml
If a .gitignore file is present in the working directory, the files mentioned in the .gitignore file will not be copied to the server.
You can also specify a custom ignore file using the --exclude or -e option.
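As a rough illustration of how ignore patterns decide which files get copied, the sketch below filters a list of paths with Python's fnmatch. It is a simplification under stated assumptions: fnmatch does not implement full .gitignore semantics, the helper `files_to_copy` is hypothetical, and this is not how the inferless CLI is known to implement the filtering.

```python
from fnmatch import fnmatch


def files_to_copy(paths, ignore_patterns):
    """Return the paths that match none of the ignore patterns."""
    kept = []
    for path in paths:
        # Keep a path only if no ignore pattern matches it.
        if not any(fnmatch(path, pattern) for pattern in ignore_patterns):
            kept.append(path)
    return kept


# Example patterns such as might appear in a .gitignore or custom ignore file.
patterns = ["*.pyc", "venv/*", "*.log"]
paths = ["app.py", "runtime.yaml", "cache.pyc", "venv/bin/python", "debug.log"]
kept = files_to_copy(paths, patterns)
```

Here only app.py and runtime.yaml survive the filter; the compiled file, the virtual environment contents, and the log file are skipped.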
inferless remote-run app.py -c runtime.yaml -e custom_ignore_file.txt --gpu A10 --prompt "Hello, Write a story about a dragon"
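If the command is launched from a script, it can help to assemble it as an argument list rather than a single string, so that the prompt text is quoted correctly. The helper below is a hypothetical sketch that only mirrors the shape of the command shown above; it uses the flags that appear there (-c, -e, --gpu, --prompt) and makes no claim about other options the CLI may accept.

```python
import shlex


def build_remote_run_command(script, config, ignore_file=None, extra_args=None):
    """Assemble an argument list shaped like the remote-run command above."""
    command = ["inferless", "remote-run", script, "-c", config]
    if ignore_file is not None:
        command += ["-e", ignore_file]
    # Append trailing "--name value" pairs in the order given.
    for name, value in (extra_args or {}).items():
        command += [f"--{name}", str(value)]
    return command


command = build_remote_run_command(
    "app.py",
    "runtime.yaml",
    ignore_file="custom_ignore_file.txt",
    extra_args={"gpu": "A10", "prompt": "Hello, Write a story about a dragon"},
)

# shlex.join produces a shell-safe string for display or logging.
printable = shlex.join(command)
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting problems with the multi-word prompt.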