High Performance LLM Inference in Production

Wednesday, February 11

5:00 PM - 6:00 PM

The era of actually open AI is here. We’ve spent the past year helping leading organizations deploy open models and inference engines in production at scale.

Hosted by Charles Frye, this live session will walk you through:

  1. The three types of LLM workloads: offline, online, and semi-online.
  2. The challenges engineers face and our recommended solutions for controlling cost, latency, and throughput.
  3. How you can implement those solutions on our cloud platform.

Speaker

Charles Frye

GPU Enjoyer @ Modal

Hosted by Modal