Introduction to Simple RAG Consumption with Databricks Mosaic AI recipe

The Simple Retrieval-Augmented Generation (RAG) Consumption with Databricks Mosaic AI recipe is built on REST APIs. Use the recipe to receive a user query, search a vector database for relevant context, send the query and context to a Databricks Mosaic AI Large Language Model (LLM), and return a comprehensive response.
The process begins by receiving a user input prompt. It then converts the query into a vector representation using a pre-trained embedding model, translating the text into a numerical format that captures its semantic meaning.
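The embedding step can be sketched as a call to a model serving endpoint. This is a minimal illustration, not the recipe's actual implementation: the endpoint URL, the `DATABRICKS_TOKEN` environment variable, and the OpenAI-style response shape are all assumptions, and the HTTP function is injectable so the sketch can be exercised without a live workspace.

```python
import os
import requests

# Hypothetical serving endpoint; substitute your own workspace URL and
# embedding model deployment.
EMBEDDING_ENDPOINT = (
    "https://<workspace-url>/serving-endpoints/my-embedding-model/invocations"
)

def embed_query(text, post=requests.post):
    """Convert a user query into a vector via a model serving endpoint.

    `post` defaults to a real HTTP request but can be replaced with a stub
    for offline testing.
    """
    response = post(
        EMBEDDING_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ.get('DATABRICKS_TOKEN', '')}"},
        json={"input": text},
    )
    response.raise_for_status()
    # Assumes an OpenAI-style response: {"data": [{"embedding": [...]}]}
    return response.json()["data"][0]["embedding"]
```

Injecting the HTTP function keeps the retrieval pipeline testable end to end before it is pointed at a real endpoint.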
The process then uses the vectorized query to search for similar vectors in a vector database that contains representations of various contexts. Based on similarity scores, it retrieves the top K closest matches from the vector database, that is, the contexts most similar to the user's query.
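The top-K search can be sketched in plain Python with cosine similarity. A real vector database performs this search server-side over an index; the in-memory list of (text, vector) pairs here is a stand-in for that index.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_matches(query_vector, indexed_contexts, k=3):
    """Return the k (text, score) pairs most similar to the query vector.

    `indexed_contexts` is a list of (text, vector) pairs standing in for
    the vector database.
    """
    scored = [
        (text, cosine_similarity(query_vector, vec))
        for text, vec in indexed_contexts
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Returning the scores alongside the text matters: the next step filters on them.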
The process then filters the retrieved contexts, keeping only those whose similarity score exceeds a specified cutoff parameter so that only relevant contexts are considered. It combines these filtered contexts into the final context, which supplies additional information to the model.
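The cutoff filter and the context assembly reduce to a few lines. The cutoff value of 0.7 and the blank-line separator are illustrative defaults, not values prescribed by the recipe.

```python
def build_context(scored_matches, cutoff=0.7, separator="\n\n"):
    """Keep only matches above the similarity cutoff and join their text.

    `scored_matches` is a list of (text, score) pairs, e.g. the top-K
    results of the vector search; the joined string becomes the final
    context passed to the LLM.
    """
    filtered = [text for text, score in scored_matches if score > cutoff]
    return separator.join(filtered)
```

If no match clears the cutoff, the context is empty, which a caller may treat as a signal to answer without retrieval or to decline.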
Subsequently, the process invokes the LLM with the original user query and the processed context. The LLM uses this information to generate a comprehensive response, which is returned to the user through the same channel.
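The final invocation can be sketched as a chat-style REST call that carries the retrieved context in a system message and the original query as the user message. As with the embedding sketch, the endpoint URL, token variable, and response shape are assumptions, and the HTTP function is injectable for offline testing.

```python
import os
import requests

# Hypothetical chat-model serving endpoint; substitute your own deployment.
LLM_ENDPOINT = "https://<workspace-url>/serving-endpoints/my-llm/invocations"

def answer_with_context(query, context, post=requests.post):
    """Send the user query plus retrieved context to the LLM endpoint."""
    payload = {
        "messages": [
            {
                "role": "system",
                "content": f"Answer using only this context:\n{context}",
            },
            {"role": "user", "content": query},
        ]
    }
    response = post(
        LLM_ENDPOINT,
        headers={"Authorization": f"Bearer {os.environ.get('DATABRICKS_TOKEN', '')}"},
        json=payload,
    )
    response.raise_for_status()
    # Assumes an OpenAI-style chat response shape.
    return response.json()["choices"][0]["message"]["content"]
```

Keeping the context in the system message and the query in the user message separates the grounding material from the question, which makes the prompt easier to audit.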
This approach ensures that the LLM's response is relevant and enriched by the most contextually appropriate information from the vector database.