list
: List all models in the current workspace.

delete
: Delete a model from the system.

rebuild
: Deploy the new code and runtime for the model.

info
: Get model details (min replicas, max replicas, current replicas, status).

activate
: Activate a model. This restores the min and max replicas to their original values.

deactivate
: Deactivate a model. This scales the min and max replicas to 0.

patch
: Patch the model configuration.

--model-id
: Model ID.

--runtime-path (optional)
: Path to a runtime file, which will be created as a new version of your current runtime.

--runtime-version
: New runtime version.

--model-id <id>
: Model ID.

--help
: Show this message and exit.

--model-id TEXT
: Model ID.

--gpu TEXT
: Machine type (A10/A100/T4). [required]

--fractional
: Use a fractional machine type (default: dedicated).

--volume TEXT
: Volume name.

--mount-path TEXT
: Mount path for the volume.

--env TEXT
: Key=value pairs for model environment variables.

--inference-timeout INTEGER
: Inference timeout in seconds. [default: 180]

--scale-down-timeout INTEGER
: Scale-down timeout in seconds. [default: 600]

--container-concurrency INTEGER
: Container concurrency level. [default: 1]

--secret TEXT
: Secret names to attach to the deployment (--secret secret-name).

--runtimeversion TEXT
: Runtime version (default: latest).

--max-replica INTEGER
: Maximum number of replicas. [default: 1]

--min-replica INTEGER
: Minimum number of replicas. [default: 0]

--help
: Show this message and exit.
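
To illustrate how the `patch` options above combine into one invocation, here is a minimal Python sketch that assembles an argument list from a dictionary of options. It is an illustration under stated assumptions, not part of the tool: the executable name `mycli`, the helper `build_patch_command`, and the `MODEL_ID` value are hypothetical placeholders, because this section does not name the executable or any parent command. The only rules taken from the list above are the option names and the fact that `--gpu` is marked `[required]`.

```python
import shlex


def build_patch_command(prefix, options):
    """Compose an argument list for the `patch` command.

    prefix  -- executable plus any parent command (hypothetical; not named in this section)
    options -- option name without leading dashes -> value;
               True emits a bare flag, False or None skips the option
    """
    # The option list marks --gpu as [required], so fail early without it.
    if not options.get("gpu"):
        raise ValueError("--gpu is required (A10/A100/T4)")

    args = list(prefix) + ["patch"]
    for name, value in options.items():
        if value is None or value is False:
            continue  # option not set
        args.append(f"--{name}")
        if value is not True:
            args.append(str(value))  # valued option, e.g. --min-replica 0
    return args


cmd = build_patch_command(
    ["mycli"],  # hypothetical executable name
    {
        "model-id": "MODEL_ID",  # placeholder, substitute a real model ID
        "gpu": "T4",
        "fractional": True,      # bare flag
        "min-replica": 0,
        "max-replica": 2,
        "inference-timeout": 180,
    },
)
print(shlex.join(cmd))
# mycli patch --model-id MODEL_ID --gpu T4 --fractional --min-replica 0 --max-replica 2 --inference-timeout 180
```

Building the command as a list rather than a single string avoids shell-quoting problems if a value such as an `--env` pair contains spaces; the list can be passed directly to `subprocess.run`.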