This method lets you bring your own code to load the model and write custom pre-processing and post-processing functions. It is best suited for running a pipeline of models.

Supported language: Python

Supported Providers

  • GitHub

  • GitLab

Creating the Interface Code

We have created a template repository that you can use as a base for your own code. A sample that uses the GPT Neo model is available here:

GitHub repo: https://github.com/infer-less/template-method
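
At a minimum, the repository needs two files: the interface code and the input schema. A sketch of the expected layout, assuming the interface file is named app.py as in the template repository (input_schema.py is covered below):

template-method/
├── app.py             # interface class with initialize, infer, and finalize
└── input_schema.py    # INPUT_SCHEMA describing the model inputs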

from transformers import pipeline

# The class name below follows the template repository; the platform looks for
# a model class that exposes initialize, infer, and finalize.
class InferlessPythonModel:

    # Implement the load logic for the model here
    def initialize(self):
        self.generator = pipeline(
            "text-generation", model="EleutherAI/gpt-neo-125M", device=0
        )

    # Function to perform inference
    def infer(self, inputs):
        # inputs is a dictionary where the keys are input names and the values
        # are the actual input data, e.g. below the input name is "prompt"
        prompt = inputs["prompt"]
        pipeline_output = self.generator(prompt, do_sample=True, min_length=20)
        generated_txt = pipeline_output[0]["generated_text"]
        # The infer function must return a dictionary where the keys are output
        # names and the values are the actual output data, e.g. below the
        # output name is "generated_text"
        return {"generated_text": generated_txt}

    # Perform any cleanup activity here
    def finalize(self, args):
        self.generator = None
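
Before deploying, you can exercise the class locally. A minimal sketch, assuming the code above lives in app.py and a GPU is available for device=0:

# Local smoke test (hypothetical helper script, not part of the template)
from app import InferlessPythonModel

model = InferlessPythonModel()
model.initialize()
result = model.infer({"prompt": "There is a fine house in the forest"})
print(result["generated_text"])
model.finalize(None)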

Next, describe the inputs that the infer function receives in input_schema.py:

INPUT_SCHEMA = {
    "prompt": {
        "datatype": "STRING",
        "required": True,
        "shape": [1],
        "example": ["There is a fine house in the forest"]
    }
}
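
Each entry in INPUT_SCHEMA maps an input name to its datatype, whether it is required, its shape, and an example value. To illustrate how these fields fit together, here is a hypothetical validator for local testing; the helper name and checks are assumptions, and the platform performs its own validation:

# Hypothetical helper for local testing only
def validate_inputs(inputs, schema):
    for name, spec in schema.items():
        if spec.get("required", False) and name not in inputs:
            raise ValueError(f"missing required input: {name!r}")
        if name in inputs and spec.get("datatype") == "STRING":
            value = inputs[name]
            # shape [1] means a single value; accept a string or a one-element list
            values = value if isinstance(value, list) else [value]
            if not all(isinstance(v, str) for v in values):
                raise TypeError(f"input {name!r} must contain strings")

validate_inputs({"prompt": "There is a fine house in the forest"}, INPUT_SCHEMA)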