public-talks

Slides, videos, and supporting files for my public talks

Engineering Cloud Native AI Platform

Abstract:

In recent years, ML/AI has made tremendous progress. Yet designing large-scale data science and ML applications remains challenging. The variety of ML frameworks, hardware accelerators, and cloud vendors, together with the complexity of data science workflows, brings new challenges to MLOps. For example, it’s non-trivial to build an inference system that suits models of different sizes, especially LLMs and other large models.

This talk presents best practices and challenges in building large, efficient, scalable, and reliable AI/ML platforms with cloud-native technologies such as Kubernetes, Kubeflow, and KServe. We’ll take a deep dive into a reference platform dedicated to modern cloud-native AI infrastructure.