Retrieval Augmented Generation
LinkedIn Learning: LLMOps in Practice: A Deep Dive
Christopher A. Murphy
Winter Term, 2025
Course Overview
As LLM-based apps proliferate, many function as thin wrappers around more robust foundation models such as GPT. In this course, instructor Laurence Moroney leads students through the hands-on development of an LLM-powered chat application interface that implements retrieval augmented generation (RAG) with reinforcement learning from human feedback (RLHF). By the end of this course, you'll be prepared to create an ops ecosystem whose core functionality is based on an LLM (Moroney 2024).
Course Objectives
Create a web chat application that wraps an LLM API.
Use Retrieval-Augmented Generation (RAG) to generate content informed by proprietary context.
Integrate a vector database to enable semantic search and extend chat context.
Methods
Built a real-time web chat (HTML/CSS/Node.js) that calls OpenAI via its API, managing dialogue with role-based messages (system/user/assistant); a sketch of the core chat loop follows this section.
Added RAG with ChromaDB (embeddings plus top-k context injection) and RLHF-style feedback (good/bad buttons with automatic regeneration) to improve responses.
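The following is a minimal sketch of that role-based chat loop, assuming the official openai npm package (v4+), an OPENAI_API_KEY environment variable, and a Node.js ES-module context for top-level await; the model name and prompt strings are illustrative placeholders, not the course's exact choices.

    import OpenAI from "openai";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Dialogue state is a list of role-tagged messages; the system message
    // sets the assistant's overall behavior.
    const history = [
      { role: "system", content: "You are a helpful course assistant." },
    ];

    async function chat(userText) {
      history.push({ role: "user", content: userText });
      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini", // placeholder model name
        messages: history,
      });
      const reply = completion.choices[0].message.content;
      // Append the reply so later turns see the full conversation so far.
      history.push({ role: "assistant", content: reply });
      return reply;
    }

    console.log(await chat("What does this course cover?"));

Keeping the entire history in the messages array is what gives the stateless chat-completions endpoint the appearance of a continuous conversation.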
Key Learnings
Deploying and populating a vector database with proprietary content.
Programming against the OpenAI API.
Programmatically interacting with the system prompt, agent prompt, user prompt, context, and conversation history.
Programmatically retrieving content from the vector database via semantic search and populating the prompt context (a sketch follows this list).
Developing with Node.js.
Implementing Reinforcement Learning from Human Feedback (RLHF) to improve assistant responses.
Implementing lightweight logging to monitor and evaluate LLM response performance.
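As a sketch of the retrieval step referenced above, assuming the chromadb JS client pointed at a local Chroma server and the openai package, top-k semantic search results are injected into the system prompt before the LLM call; the collection name, model names, and top-k value are illustrative placeholders.

    import OpenAI from "openai";
    import { ChromaClient } from "chromadb";

    const openai = new OpenAI();
    const chroma = new ChromaClient(); // assumes a Chroma server on localhost
    const collection = await chroma.getOrCreateCollection({ name: "ebook-chapters" });

    async function answerWithRag(question, history = []) {
      // Embed the question, then retrieve the top-k most similar chunks.
      const { data } = await openai.embeddings.create({
        model: "text-embedding-3-small", // placeholder embedding model
        input: question,
      });
      const results = await collection.query({
        queryEmbeddings: [data[0].embedding],
        nResults: 3, // illustrative top-k
      });
      const context = results.documents[0].join("\n---\n");

      // Inject the retrieved chunks into the system prompt as grounding context.
      const messages = [
        { role: "system", content: "Answer using only this context:\n" + context },
        ...history,
        { role: "user", content: question },
      ];
      const completion = await openai.chat.completions.create({
        model: "gpt-4o-mini", // placeholder chat model
        messages,
      });
      return completion.choices[0].message.content;
    }

    console.log(await answerWithRag("What does chapter 1 cover?"));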
Visualizations
Competencies Employed
Reinforcement Learning from Human Feedback
A training method that treats human preference judgments as the reward signal to guide and align a model’s behavior.
Conversational AI
Designing and building interactive chatbots or assistants powered by LLMs.
Vector Embeddings
Creating and using vector representations of text to support semantic search and similarity matching.
LLM Integration
Connecting large language models (LLMs) to applications via APIs (e.g., OpenAI, Anthropic, Gemini).
Retrieval-Augmented Generation (RAG)
Enhancing LLM responses by integrating external knowledge via vector search and embedding-based retrieval.
Embedding-Based Search
Implementing semantic search using vector databases.
Prompt Engineering
Crafting and optimizing inputs to guide LLM behavior effectively and reliably.
LLM Evaluation & Feedback
Capturing human feedback and analyzing LLM outputs to improve quality, relevance, and safety.
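As one way to realize the feedback-capture and logging competencies above, the sketch below records good/bad ratings to a JSONL log and signals the client to regenerate on a "bad" rating; the Express route, field names, and file path are hypothetical, not the course's exact design.

    import express from "express";
    import fs from "node:fs";

    const app = express();
    app.use(express.json());

    // Hypothetical endpoint hit by the chat UI's good/bad buttons.
    app.post("/feedback", (req, res) => {
      const { messageId, rating, prompt, response } = req.body; // rating: "good" | "bad"
      const record = { ts: new Date().toISOString(), messageId, rating, prompt, response };
      // Lightweight logging: one JSON object per line for later analysis.
      fs.appendFileSync("feedback.jsonl", JSON.stringify(record) + "\n");
      // Tell the client to auto-regenerate the answer when rated "bad".
      res.json({ regenerate: rating === "bad" });
    });

    app.listen(3000);

Full RLHF would feed such preference records into reward-model training; here they serve as a lightweight preference signal and a monitoring log for response quality.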
Additional Technical Information
Data
Students learned to embed course-provided e-book chapter text into the ChromaDB vector database as overlapping chunks.
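A sketch of that ingestion step, assuming the chromadb JS client and the openai package; the chunk size, overlap, file name, and collection name are illustrative, not the course's exact values.

    import fs from "node:fs";
    import OpenAI from "openai";
    import { ChromaClient } from "chromadb";

    const CHUNK_SIZE = 1000; // characters per chunk (illustrative)
    const OVERLAP = 200;     // characters shared between neighboring chunks

    // Split text into overlapping chunks so context survives chunk boundaries.
    function chunkText(text) {
      const chunks = [];
      for (let start = 0; start < text.length; start += CHUNK_SIZE - OVERLAP) {
        chunks.push(text.slice(start, start + CHUNK_SIZE));
      }
      return chunks;
    }

    const openai = new OpenAI();
    const chroma = new ChromaClient();
    const collection = await chroma.getOrCreateCollection({ name: "ebook-chapters" });

    const text = fs.readFileSync("chapter1.txt", "utf8"); // hypothetical chapter file
    const chunks = chunkText(text);

    // Embed all chunks in one batch call, then store them under stable ids.
    const { data } = await openai.embeddings.create({
      model: "text-embedding-3-small", // placeholder embedding model
      input: chunks,
    });
    await collection.add({
      ids: chunks.map((_, i) => "chapter1-" + i),
      embeddings: data.map((d) => d.embedding),
      documents: chunks,
    });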
Results Summary
Demonstrated that RAG is a cost-effective alternative to custom model training for leveraging proprietary data.
Demonstrated the ease of using embedding-based semantic search over a vector database to retrieve proprietary context for RAG.