Run vLLM on OKE with Oracle DB 23ai Vector Search

About This Workshop
Running AI workloads on Oracle Kubernetes Engine (OKE) is more straightforward than you might expect. Oracle Cloud Infrastructure (OCI) provides a variety of GPU-enabled nodes, available in both virtual machine and bare metal shapes, that can serve as worker nodes in your Kubernetes clusters. These nodes come pre-configured with NVIDIA drivers and the NVIDIA device plugin DaemonSet, making setup quick and easy.
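
As a quick sanity check, here is a minimal sketch (not one of the workshop steps) that lists each worker node's allocatable GPUs with the official Kubernetes Python client. It assumes your kubeconfig already points at the OKE cluster and that the NVIDIA device plugin is running:

    from kubernetes import client, config

    # Use the kubeconfig for your OKE cluster (current context).
    config.load_kube_config()
    v1 = client.CoreV1Api()

    for node in v1.list_node().items:
        # The NVIDIA device plugin DaemonSet advertises GPUs on each
        # worker node as the extended resource "nvidia.com/gpu".
        gpus = node.status.allocatable.get("nvidia.com/gpu", "0")
        print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")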

Oracle DB 23ai adds significant value through its built-in vector storage capabilities, which are critical for managing high-dimensional data such as the embeddings used in AI applications. Using Oracle DB 23ai as a vector store lets AI systems efficiently store and retrieve embeddings for similarity search. With OCI, you can securely run Large Language Models (LLMs) in your own environment, enabling them to draw insights from your enterprise's specific data. A Retrieval Augmented Generation (RAG) pipeline extends this further: the model grounds its answers in documents retrieved at query time, improving accuracy on new data while cutting down on training and fine-tuning costs.
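
To make that flow concrete, the sketch below retrieves the closest document chunks from a 23ai VECTOR column and passes them as context to a vLLM server through its OpenAI-compatible API. It is an illustration under stated assumptions, not code from this workshop: the connection details, the docs table and its content/embedding columns, the service URL, and the model name are all hypothetical placeholders.

    import array

    import oracledb
    from openai import OpenAI

    # Hypothetical connection details; substitute your own.
    conn = oracledb.connect(user="vector_user", password="<password>",
                            dsn="dbhost.example.com/FREEPDB1")

    def retrieve(query_vector, k=3):
        """Return the k chunks closest to query_vector (cosine distance)."""
        with conn.cursor() as cur:
            cur.execute(
                """SELECT content
                     FROM docs
                    ORDER BY VECTOR_DISTANCE(embedding, :qv, COSINE)
                    FETCH FIRST :k ROWS ONLY""",
                qv=array.array("f", query_vector), k=k)
            return [row[0] for row in cur]

    # vLLM exposes an OpenAI-compatible endpoint; "EMPTY" is the usual
    # placeholder API key for a local deployment.
    llm = OpenAI(base_url="http://vllm-service:8000/v1", api_key="EMPTY")

    def answer(question, query_vector):
        # query_vector must come from the same embedding model that was
        # used to populate the docs table.
        context = "\n\n".join(retrieve(query_vector))
        resp = llm.chat.completions.create(
            model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model name
            messages=[
                {"role": "system",
                 "content": f"Answer using only this context:\n{context}"},
                {"role": "user", "content": question},
            ],
        )
        return resp.choices[0].message.content

Labs 2 and 3 of the workshop walk through the notebook-based equivalent of these steps.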

Workshop Info

Duration: 3 hours

Labs:
  • Lab 1 - Provision infrastructure to run a JupyterHub Notebook
  • Lab 2 - Run a JupyterHub Notebook to chat with the LLM
  • Lab 3 - Build a Retrieval Augmented Generation (RAG) application

Prerequisites:
  • Administrative access to an OCI tenancy.
  • Ability to spin up A10 GPU instances in OCI.
  • Ability to create resources with public IP addresses (Load Balancer, instances, OKE API endpoint).
  • Access to HuggingFace.
  • Acceptance of the selected HuggingFace model's license agreement.
