Deploy an AI chat-bot app on an Ampere A1 instance using Minikube

About This Workshop

Generative AI inference on Arm-based CPUs has proven to be very effective; however, more proof points are needed to support this claim. We therefore conducted extensive research, testing popular open-source LLMs such as Llama 2, Mistral, and Orca on Ampere Altra Arm-based CPUs on Oracle Cloud Infrastructure (OCI).

Ampere A1 compute offers flexible VM shapes and bare metal options across numerous regions at competitive pricing, with the flexibility to choose CPU core counts and memory independently. This allowed us to run open-source LLMs of various sizes and draw conclusions about our hypothesis.

This workshop provides a thorough, step-by-step process for creating, provisioning, and deploying the resources needed to run llama.cpp, or an application of your choosing, on an Ampere A1 instance using Minikube.

Workshop Info

Estimated time: 1 hour, 30 minutes
  • Lab 1 - Setting up the VCN and networking
  • Lab 2 - Creating the compute instance
  • Lab 3 - Setting up the compute instance and installing dependencies
  • Lab 4 - Pulling the chat-bot image
  • Lab 5 - Deploying the application
  • Lab 6 - Interacting with the application
Prerequisites

  • Familiarity with Kubernetes and cloud-native concepts such as deployment and containerization.
  • Some understanding of Linux shell commands.
  • Familiarity with Oracle Cloud Infrastructure (OCI) components such as OCI Compute, networking, and OCI Registry (OCIR).
  • Basic familiarity with open-source tools like Git and GitHub.
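To give a sense of what the deployment labs involve, a minimal Kubernetes manifest for a chat-bot workload might look like the sketch below. All names, labels, the image reference, and the port are hypothetical placeholders, not the workshop's actual values; the labs supply the real image path (e.g. one pulled from OCIR) and configuration.

```yaml
# Hypothetical Deployment for a llama.cpp-based chat-bot on Minikube.
# The image path, labels, and port are illustrative only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatbot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatbot
  template:
    metadata:
      labels:
        app: chatbot
    spec:
      containers:
      - name: chatbot
        # Placeholder OCIR image path; replace with the workshop's actual image.
        image: <region>.ocir.io/<tenancy-namespace>/chatbot:latest
        ports:
        - containerPort: 8080
---
# Expose the Deployment inside the Minikube cluster.
apiVersion: v1
kind: Service
metadata:
  name: chatbot
spec:
  selector:
    app: chatbot
  ports:
  - port: 8080
    targetPort: 8080
```

With Minikube running, a manifest like this would be applied with `kubectl apply -f chatbot.yaml`, and the chat interface could then be reached locally via `minikube service chatbot` or `kubectl port-forward`.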
