Monitor GPU Metrics in Oracle Cloud Infrastructure (OCI) with DCGM, Grafana and Prometheus

About This Workshop

This workshop provides hands-on experience in monitoring GPU metrics in Oracle Cloud Infrastructure (OCI) using NVIDIA DCGM, Grafana, and Prometheus. Discover how to automate the installation and configuration of DCGM for a repeatable setup. Learn to set up Prometheus to collect GPU metrics and to visualize real-time GPU statistics in Grafana, making GPU monitoring in OCI more efficient and scalable.

Workshop Info

Duration: 2 hours

Outline
  • Lab 1 - Deploy VM infrastructure
  • Lab 2 - Install and configure Prometheus and Grafana
  • Lab 3 - Deploy DCGM on Oracle Linux and Ubuntu
  • Lab 4 - Simulate activity on the GPU VM and review metrics in Grafana

Prerequisites
  • Administrative access to an OCI tenancy
  • Ability to spin up GPU instances in OCI
  • Ability to create resources with public IP addresses
  • Some understanding of cloud and database terms is helpful
  • Familiarity with Oracle Cloud Infrastructure (OCI) is helpful
