CLI
inferless remote run
Use the `inferless remote-run` command to run model inference on a remote GPU from your local machine.
This command will execute a particular function or class in the cloud environment.
Getting Started
Let's assume you have an `app.py` that defines a model class with two methods, `initialize` and `infer`. You only need to add four lines of code to `app.py` to make it work with remote run: import the inferless library, initialize the `Cls` with the GPU you want, and add the `load` and `infer` annotations.
```python
from inferless import Cls  # Add the inferless library

model_id = "meta-llama/Llama-2-7b-chat-hf"
hf_token = "<YOUR_HF_ACCESS_TOKEN>"  # Placeholder: this gated model needs a Hugging Face token

InferlessCls = Cls(gpu="A10")  # Init the class with the type of GPU you want to run with

class InferlessPythonModel:

    @InferlessCls.load  # Add the annotation
    def initialize(self):
        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        self.model = AutoModelForCausalLM.from_pretrained(
            model_id,
            torch_dtype=torch.float16,
            device_map="auto",
            token=hf_token,
        )
        self.tokenizer = AutoTokenizer.from_pretrained(model_id, token=hf_token)

    @InferlessCls.infer  # Add the annotation
    def infer(self, inputs):
        message = inputs["message"]
        chat_history = inputs.get("chat_history", [])
        system_prompt = inputs.get("system_prompt", "")
        result = self.run_function(
            message=message,
            chat_history=chat_history,
            system_prompt=system_prompt,
        )
        return {"generated_text": result}

    def run_function(self, message, chat_history, system_prompt):
        # Minimal generation helper: builds the chat prompt and decodes the reply
        conversation = []
        if system_prompt:
            conversation.append({"role": "system", "content": system_prompt})
        for user_msg, assistant_msg in chat_history:
            conversation.append({"role": "user", "content": user_msg})
            conversation.append({"role": "assistant", "content": assistant_msg})
        conversation.append({"role": "user", "content": message})
        input_ids = self.tokenizer.apply_chat_template(
            conversation, add_generation_prompt=True, return_tensors="pt"
        ).to(self.model.device)
        output_ids = self.model.generate(input_ids, max_new_tokens=256)
        return self.tokenizer.decode(
            output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
        )

model = InferlessPythonModel()  # Instantiate the class to test it
print(model.infer({"message": "Hello"}))
```
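Because `infer` falls back to defaults for missing keys, only `message` is required in the input dictionary. A minimal sketch of a request payload (the values are hypothetical) showing how the fallbacks behave:

```python
# Hypothetical request payload for InferlessPythonModel.infer.
# Only "message" is required; the other keys are optional.
payload = {
    "message": "What is the capital of France?",
    "system_prompt": "You are a concise assistant.",
    # "chat_history" omitted on purpose
}

# The same fallbacks used inside infer:
chat_history = payload.get("chat_history", [])    # missing key -> empty history
system_prompt = payload.get("system_prompt", "")  # present -> value passed through
```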
Usage
```shell
inferless remote-run <filename>
```
Params:
- `--config`, `-c`: Path to the runtime configuration file.
- `--exclude`, `-e`: Path to the ignore file, which lists the files you want to exclude from the remote run, similar to a `.gitignore` file.
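For example, an ignore file uses the same pattern syntax as `.gitignore`: one file or directory pattern per line (the entries below are illustrative):

```
# Illustrative .ignore file: keep local artifacts out of the remote run
__pycache__/
*.pt
.env
data/
```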
Examples:
```shell
inferless remote-run app.py -c runtime.yaml
inferless remote-run app.py -c runtime.yaml -e .ignore
```
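The runtime configuration file passed with `-c` describes the environment the remote GPU builds before running your code. The snippet below is only a sketch of a typical runtime file; the package list is an assumption for the Llama-2 example above, and the exact schema should be taken from the Remote Run documentation:

```yaml
# Hypothetical runtime.yaml: packages assumed for the Llama-2 example
build:
  python_packages:
    - torch
    - transformers
    - accelerate
```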
For more details and examples, refer to the Remote Run documentation.