
AI Platform Engineering Program Overview
The AI Platform Engineering program equips participants with the skills to manage and optimize infrastructure, SRE, and DevOps workflows for modern Large Language Model (LLM) applications.
Topics Covered
- Introduction to platform engineering and DevOps
- Changing landscape of DevOps due to AI
- Developer-first mindset for long-term success
Hands-on Labs
- Deploy a Simple Web Application
- Simulate Node Failure
- Update a Deployment
- Explore the Control Plane
- Break and Fix
Who Should Attend
- DevOps Engineers looking to specialize in AI infrastructure and LLM applications.
- Machine Learning Engineers who want to integrate DevOps practices for efficient model deployment and scaling.
- Site Reliability Engineers (SREs) interested in ensuring the reliability and performance of AI systems using GPU infrastructure.
- Cloud Engineers eager to learn how to manage and optimize cloud environments for AI workloads on platforms like NVIDIA.