Latest Enhancements & Features:

  1. New Runtime UI: The Runtime UI has been redesigned for better usability. You can now easily track runtime changes, identify models using specific runtimes, and clean up unused runtimes effortlessly.
  2. Input/Output Tracking in Serverless V2: We’ve added Input/Output tracking in Serverless V2, giving you deeper insight into how data flows through your deployments. This is available only to beta users with access to the latest serverless capabilities.
  3. Enhanced Metrics: We’ve added percentile-based request latency metrics to provide better visibility into inference performance. The cold start tracker now shows percentiles instead of individual container starts, which makes trends easier to read.
  4. Enhanced Autoscaling: We’ve introduced RPS-based (requests per second) scaling to keep cold start times low even at extremely high scale, resulting in significantly better p95 latency.
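
For readers less familiar with the terms in items 3 and 4, the sketch below illustrates the general ideas behind percentile latency and RPS-based scaling. It is not the platform's implementation: the function names `latency_percentiles` and `replicas_for_rps`, the per-replica RPS target, and the replica bounds are all hypothetical and chosen only for illustration.

```python
import math
from statistics import quantiles


def latency_percentiles(latencies_ms):
    """Summarize request latencies as p50/p95/p99 (hypothetical helper).

    Needs at least two samples. With n=100, quantiles() returns 99 cut
    points, so index 49 is the 50th percentile, index 94 the 95th, and
    index 98 the 99th.
    """
    cuts = quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


def replicas_for_rps(current_rps, target_rps_per_replica,
                     min_replicas=1, max_replicas=100):
    """Pick a replica count from observed requests per second.

    Assumption: each replica can comfortably serve
    `target_rps_per_replica` requests per second. The result is clamped
    to the [min_replicas, max_replicas] range.
    """
    if target_rps_per_replica <= 0:
        raise ValueError("target_rps_per_replica must be positive")
    desired = math.ceil(current_rps / target_rps_per_replica)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    samples = [12.0, 15.5, 14.2, 230.0, 13.1, 16.8, 18.4, 12.9]
    print(latency_percentiles(samples))
    print(replicas_for_rps(current_rps=250, target_rps_per_replica=100))
```

A percentile view surfaces slow outliers (such as the 230 ms sample above) that an average would hide, and scaling on request rate lets capacity be added as traffic rises rather than after latency has already degraded.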