Latest Enhancements & Features:

  1. HF Model Import Improvements: We’ve enhanced the Hugging Face Model Import functionality by adding support for Transformers 4.44+ models, including LLaMA 3.1 and many others that previously threw errors. You can also now pass more input parameters (such as output length and temperature) during model import, so that your model runs efficiently.
  2. Model Build Progress Visibility: You can now track your Model Build progress more effectively through detailed logs at different stages: Queue, Runtime Build, Worker (preparing the model), and Inference Validation. This gives you greater visibility into the status of your model builds and helps you identify any issues early in the process.
  3. Improved Docker Build Logs: Docker build logs are now streamed in real time, so you can follow the progress of custom runtime builds as they happen rather than waiting for all logs at the end.
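
To illustrate the difference between streamed logs and logs delivered at the end, here is a minimal, generic Python sketch that reads a subprocess's output line by line as it is produced. It is not Inferless's implementation. The `stream_lines` helper and the demo command are hypothetical names chosen for illustration, and the child process stands in for a long-running build such as `docker build`.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_lines(cmd: List[str]) -> Iterator[str]:
    """Yield each output line of `cmd` as soon as it can be read.

    Hypothetical helper for illustration only; not part of any Inferless API.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so errors appear in order
        text=True,
        bufsize=1,  # line-buffered reads on the parent side
    )
    assert proc.stdout is not None
    with proc.stdout:
        # Iterating the pipe returns lines as they arrive,
        # instead of collecting everything after the process exits.
        for line in proc.stdout:
            yield line.rstrip("\n")
    returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, cmd)


if __name__ == "__main__":
    # Stand-in for a build command; "-u" keeps the child's output unbuffered.
    demo = [sys.executable, "-u", "-c", "for i in range(3): print('step', i)"]
    for line in stream_lines(demo):
        print(line)
```

Because the helper is a generator, the caller can display or forward each line immediately. A non-zero exit code is raised only after all output has been consumed, so the failing lines are still visible.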

These updates provide more efficient model management and enhanced visibility for smoother workflows in Inferless. Stay tuned for more improvements!