> ## Documentation Index
> Fetch the complete documentation index at: https://docs.inferless.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Cli import

### Getting started

1. Login to [Inferless.com](https://inferless.com/) console and copy the CLI keys from [Keys section](https://console.inferless.com/user/settings?current-tab=keys) here

2. Now install Inferless CLI package using this command

   ```bash theme={null}
    pip install inferless-cli
   ```

3. Login to Inferless CLI using this command and paste the CLI keys here, you’ll be logged in

   ```bash theme={null}
   inferless login
   ```

### Sample model deployment

1. To deploy a model with Inferless you would ideally require 3 files
   * `app.py` is a Python file that plays a crucial role in setting up and running models on the Inferless platform. It typically contains a class with three main functions:
   * `input_schema.py` is a python file specifies the input parameters for your model's API calls.
   * `config.yaml(Runtime)` refers to the software and dependencies you can add to your runtime environment to support your model's specific needs.

2. To get started with your first deployment we will provide you `app.py` and `input_schema.py`. Run this command to download the files

   ```bash theme={null}
   inferless scaffold --demo 
   ```

3. Now that the files are downloaded. Initialise the model using this command

   ```bash theme={null}
   inferless init --name <modelname>
   ```

4. Once the model is initialised. Deploy it using this command

   ```bash theme={null}
    inferless deploy --gpu T4 
   ```

5. Hurray! You’ve successfully deployed your first model with us

### Runtime

Runtime in Inferless refers to the environment and configuration in which your model runs. It includes:

1. System packages
2. Python packages
3. Custom shell commands

### Create and deploy with runtime

1. Create a new file called `inferless_runtime_config.yaml` with below data

   ```bash theme={null}
   build:
   cuda_version: "12.1.1"
   python_packages:
       - "accelerate==0.33.0"
       - "torch==2.4.0"
       - "transformers==4.44.0"
       - "diffusers==0.30.0"
   ```

2. Run the below command to create the runtime

   ```bash theme={null}
   inferless runtime create --name <runtime_name> --path ./inferless_runtime_config.yaml
   ```

3. Deploy with runtime

   ```bash theme={null}
   inferless deploy --gpu t4 --region <region_name> --runtime <runtime_name> 
   ```

### Volumes

Volumes in Inferless are NFS-like writable storage spaces that can be connected to multiple replicas simultaneously. They serve several key purposes:

1. Storing model parameters
2. Archiving datasets (similar to centralized storage)
3. Setting up shared caches for collaborative tasks

### Creating and uploading weights

1. To create a new volume. Run this command

   ```bash theme={null}
   inferless volume create --name <volume_name>
   ```

2. New volume will be created and you will be shown the `infer_path` where you can store the weights. Make sure you keep this handy

3. Once the volume is created. Upload the weights using this command and paste the `infer_path` in the destination

   ```bash theme={null}
   inferless volume cp --source <source_path> --destination <remote_path (Infer path)>
   ```

4. Your files will be copied to the server and your volume is ready to be used.

5. To use these weights, in `app.py` you can specify the mount path from where the weights can be accessed (Eg: `/var/nfs-mount/<volume_name>`)

   ```json theme={null}
   from diffusers import StableDiffusionPipeline
   import torch
   from io import BytesIO
   import base64

   MODEL_WEIGHTS_DIR =  "/var/nfs-mount/my_volume"

   class InferlessPythonModel:    
       def initialize(self):
           self.pipe = StableDiffusionPipeline.from_pretrained(
               "/var/nfs-mount/my_volume",
               use_safetensors=True,
               torch_dtype=torch.float16,
               device_map='auto'
               local_files_only=True
           )
   ```

6. Run the command to create a new model using the above model weights

   ```bash theme={null}
   inferless init --name <modelname>
   ```

7. Once the model is initialized. Deploy it using this command (Make sure the path defined inside `app.py` and here in the command should be same)

   ```bash theme={null}
   inferless deploy --gpu t4 --region <region_name> --volume <volume_name> --volume-mount-path <volume_mount_path> 
   ```
