Run vLLM on OKE with Oracle DB 23ai Vector Search

About This Workshop
Running AI workloads on Oracle Kubernetes Engine (OKE) is more straightforward than you might expect. Oracle Cloud Infrastructure (OCI) provides a variety of GPU-enabled nodes, available in both virtual machine and bare metal shapes, that can serve as worker nodes in your Kubernetes clusters. These nodes come pre-configured with NVIDIA drivers and the NVIDIA device plugin DaemonSet, making setup quick and easy.
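
As a quick sanity check, here is a minimal sketch (not one of the workshop steps) that lists each worker node's allocatable GPUs with the official Kubernetes Python client. It assumes your kubeconfig already points at the OKE cluster and that the NVIDIA device plugin is running:

    from kubernetes import client, config

    # Use the kubeconfig for your OKE cluster (current context).
    config.load_kube_config()
    v1 = client.CoreV1Api()

    for node in v1.list_node().items:
        # The NVIDIA device plugin DaemonSet advertises GPUs on each
        # worker node as the extended resource "nvidia.com/gpu".
        gpus = node.status.allocatable.get("nvidia.com/gpu", "0")
        print(f"{node.metadata.name}: {gpus} allocatable GPU(s)")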

Oracle DB 23ai adds significant value through its built-in vector storage capabilities, which are critical for managing high-dimensional data such as the embeddings used in AI applications. Using Oracle DB 23ai as a vector store lets AI systems efficiently store and retrieve embeddings for similarity search. With OCI, you can securely run Large Language Models (LLMs) in your own environment, enabling them to draw insights from your enterprise's specific data. A Retrieval Augmented Generation (RAG) pipeline extends this further: the model grounds its answers in documents retrieved at query time, improving accuracy on new data while cutting down on training and fine-tuning costs.
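
To make that flow concrete, the sketch below retrieves the closest document chunks from a 23ai VECTOR column and passes them as context to a vLLM server through its OpenAI-compatible API. It is an illustration under stated assumptions, not code from this workshop: the connection details, the docs table and its content/embedding columns, the service URL, and the model name are all hypothetical placeholders.

    import array

    import oracledb
    from openai import OpenAI

    # Hypothetical connection details; substitute your own.
    conn = oracledb.connect(user="vector_user", password="<password>",
                            dsn="dbhost.example.com/FREEPDB1")

    def retrieve(query_vector, k=3):
        """Return the k chunks closest to query_vector (cosine distance)."""
        with conn.cursor() as cur:
            cur.execute(
                """SELECT content
                     FROM docs
                    ORDER BY VECTOR_DISTANCE(embedding, :qv, COSINE)
                    FETCH FIRST :k ROWS ONLY""",
                qv=array.array("f", query_vector), k=k)
            return [row[0] for row in cur]

    # vLLM exposes an OpenAI-compatible endpoint; "EMPTY" is the usual
    # placeholder API key for a local deployment.
    llm = OpenAI(base_url="http://vllm-service:8000/v1", api_key="EMPTY")

    def answer(question, query_vector):
        # query_vector must come from the same embedding model that was
        # used to populate the docs table.
        context = "\n\n".join(retrieve(query_vector))
        resp = llm.chat.completions.create(
            model="meta-llama/Llama-3.1-8B-Instruct",  # assumed model name
            messages=[
                {"role": "system",
                 "content": f"Answer using only this context:\n{context}"},
                {"role": "user", "content": question},
            ],
        )
        return resp.choices[0].message.content

Labs 2 and 3 of the workshop walk through the notebook-based equivalent of these steps.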

Workshop Info

Duration: 3 hours

Labs:
  • Lab 1 - Provision infrastructure to run a JupyterHub Notebook
  • Lab 2 - Run a JupyterHub Notebook to chat with the LLM
  • Lab 3 - Build a Retrieval Augmented Generation (RAG) application

Prerequisites:
  • Administrative access to an OCI tenancy.
  • Ability to spin up A10 GPU instances in OCI.
  • Ability to create resources with public IP addresses (Load Balancer, instances, OKE API endpoint).
  • Access to HuggingFace.
  • Acceptance of the selected HuggingFace model's license agreement.
