Vector Databases for AI Chatbots: The Backbone of Modern AI Applications

Syed Raza • October 18, 2024 • 8 min read

As an AI chatbot builder, you'll be using vector databases to store and retrieve data efficiently, so you might as well learn about them! Vector databases store vectors (lists of numbers). A single vector doesn't tell you much on its own; it's how vectors sit relative to one another that captures the relationships between pieces of data [Key Point]. The core idea is that we can figure out which stored data is semantically most similar to our question (the query) and retrieve it. But why would we do this? Because LLMs have limited context (you can't paste an entire encyclopedia into ChatGPT, now can you?), so we need to be efficient and fetch only the most relevant snippets of data. So how do we do this? We use a vector database to store our data in a way that lets us query it efficiently. With that said, let's get started!

Why Vector Databases
How to Create an Embedding
What is Querying a Vector Database
Bringing it All Together

Why Vector Databases

Vectors are mathematical representations of data that capture semantic relationships. In the context of AI and machine learning, vectors allow us to represent complex data like text, images, and audio using numbers so that models can understand the relationships between different data points. Objects that are similar end up closer together in vector space, while unrelated objects are farther apart. That means if we can measure the distance between two vectors, we can determine which data is most similar to a given query. This is called similarity search, and the measure is typically the dot product or cosine similarity.
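To make the two similarity measures concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration (real embeddings usually have hundreds or thousands of dimensions), and the names `cat`, `kitten`, and `car` are just labels, not output from any real model.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot(a, b) / (norm_a * norm_b)

# Made-up "embeddings" for illustration only.
cat = [0.9, 0.8, 0.1]
kitten = [0.85, 0.75, 0.2]
car = [0.1, 0.2, 0.95]

print("cat vs kitten:", cosine_similarity(cat, kitten))  # close to 1
print("cat vs car:   ", cosine_similarity(cat, car))     # noticeably lower
```

Cosine similarity ignores vector length and only compares direction, while the dot product also grows with magnitude. When vectors are normalized to length 1, the two give the same ranking.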

Several providers offer vector database solutions to manage and query these high-dimensional vectors efficiently. Some of the leading providers include:

Pinecone

Qdrant

Milvus

Weaviate

Chroma

How to Create an Embedding

Creating and using an embedding involves a three-step process:

Turn content into vectors with an AI Embedding Model: This step involves converting your data (text, images, etc.) into vector representations using embedding models. Some models are better suited for specific tasks than others. You can explore top embedding models on platforms like Hugging Face.
Query the database for relevant vectors: Once your data is stored as vectors in a vector database, you can query the database to find vectors that are similar to a given input vector.
Use the vectors as "ids" for the content to retrieve back: Each retrieved vector is stored with an identifier that points to your original content, allowing you to fetch and present the relevant data.
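The three steps above can be sketched in a few lines of Python. Big assumption: `toy_embed` is a hypothetical stand-in that just counts words from a tiny fixed vocabulary, so the example runs without any model or database. In a real chatbot you would call an actual embedding model and store the vectors in one of the vector databases listed earlier. The documents and IDs are also made up.

```python
import math
import re

# Hypothetical vocabulary for the toy embedding below.
VOCAB = ["refund", "return", "shipping", "delivery", "password", "login"]

def toy_embed(text):
    """Stand-in for a real embedding model: one word count per vocabulary term."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]

def cosine_similarity(a, b):
    numerator = sum(x * y for x, y in zip(a, b))
    denominator = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return numerator / denominator if denominator else 0.0

# Step 1: turn content into vectors, keyed by an ID.
documents = {
    "doc-1": "Refund policy: return items within 30 days for a full refund.",
    "doc-2": "Shipping times: standard delivery takes 3 to 5 business days.",
    "doc-3": "Reset your password from the login page.",
}
vectors = {doc_id: toy_embed(text) for doc_id, text in documents.items()}

# Step 2: embed the query and find the most similar stored vector.
query_vector = toy_embed("How do I get a refund?")
best_id = max(vectors, key=lambda doc_id: cosine_similarity(query_vector, vectors[doc_id]))

# Step 3: use the ID of the match to fetch the original content.
print(best_id, "->", documents[best_id])
```

Notice that the vectors and the original text live in separate lookups tied together by the ID. That is the sense in which the matches act as "ids" for the content.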

Source: https://github.com/GoogleCloudPlatform/generative-ai/blob/main/embeddings/intro-textemb-vectorsearch.ipynb/

What is Querying a Vector Database

After your vector database is set up, querying it starts with converting the user's input into a vector using the *SAME* embedding model that produced the stored vectors. Otherwise it won't work, because vectors from different models don't live in the same vector space!

Once we have the query vector, we calculate its distance to the stored vectors using the dot product or cosine similarity to find the vectors nearest to it.

The last step is to use the nearest vectors as "ids" to retrieve the corresponding content.

This process helps fetch relevant content from massive datasets and presents it to the Large Language Model (LLM). It essentially allows us to interface with and infer from large datasets through LLMs, making the search for content more efficient and accurate.
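One way to protect yourself against the mixed-model mistake mentioned above is to remember which model built the index and reject anything embedded with a different one. The sketch below is purely illustrative: `Embedding`, `TinyIndex`, and the model names "model-a" and "model-b" are hypothetical and not part of any real vector database's API. Ranking here uses the dot product.

```python
from dataclasses import dataclass, field

@dataclass
class Embedding:
    model_name: str  # which embedding model produced this vector
    values: list     # the vector itself

@dataclass
class TinyIndex:
    model_name: str                            # model the whole index was built with
    items: dict = field(default_factory=dict)  # id -> Embedding

    def _check(self, embedding):
        if embedding.model_name != self.model_name:
            raise ValueError(
                f"index uses {self.model_name!r}, got vector from {embedding.model_name!r}"
            )

    def add(self, item_id, embedding):
        self._check(embedding)
        self.items[item_id] = embedding

    def search(self, query, top_k=1):
        """Return the IDs of the top_k stored vectors by dot product with the query."""
        self._check(query)
        scored = sorted(
            self.items.items(),
            key=lambda pair: sum(q * v for q, v in zip(query.values, pair[1].values)),
            reverse=True,
        )
        return [item_id for item_id, _ in scored[:top_k]]

index = TinyIndex(model_name="model-a")
index.add("doc-1", Embedding("model-a", [1.0, 0.0]))
index.add("doc-2", Embedding("model-a", [0.0, 1.0]))

print(index.search(Embedding("model-a", [0.9, 0.1])))  # same model: query is accepted

try:
    index.search(Embedding("model-b", [0.9, 0.1]))      # different model: rejected
except ValueError as error:
    print("rejected:", error)
```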

Source: https://qdrant.tech/articles/what-is-rag-in-ai/

What is Retrieval Augmented Generation (RAG)?

Now let's bring it full circle and talk about Retrieval Augmented Generation (RAG)! Now that we know how to create a vector database and query it, we can use it to build a chatbot that can understand and respond to user queries.

RAG is a technique that combines the retrieval power of vector databases with the generation capabilities of LLMs, so a chatbot can answer using information pulled from your own data. Let's say we have a vector database that stores information about different products, and we want to build a chatbot that recommends products to users based on their preferences. We can store the product information in the vector database, then query it to find products similar to what the user is looking for.

The last step is to give the relevant products to the LLM and ask it to recommend the best few based on the user's preferences, highlighting benefits and features and weighing pros and cons for the user's situation! Wow, that was easy, right? Splutter AI supports hot-swappable vector databases, so you can easily swap out the vector database you want to use for your chatbot! Our built-in RAG allows you to easily build a chatbot that can understand and respond to user queries using vector databases.
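Here is a rough sketch of that last step: packing the retrieved products into a prompt for the LLM. Everything in it is an assumption for illustration, including the `build_rag_prompt` helper, the prompt wording, the product data, and the `max_chars` budget (a crude stand-in for the model's context limit). This is not how Splutter AI's built-in RAG is implemented.

```python
def build_rag_prompt(user_request, retrieved_products, max_chars=1000):
    """Assemble a prompt from retrieved products, staying under a size budget.

    retrieved_products is assumed to be sorted from most to least similar.
    """
    context_lines = []
    used = 0
    for product in retrieved_products:
        line = f"- {product['name']}: {product['description']}"
        if used + len(line) > max_chars:
            break  # stop before the context grows past the budget
        context_lines.append(line)
        used += len(line)
    context = "\n".join(context_lines)
    return (
        "You are a product recommendation assistant.\n"
        "Use only the products listed below.\n\n"
        f"Products:\n{context}\n\n"
        f"Customer request: {user_request}\n"
        "Recommend the best options and weigh their pros and cons."
    )

# Made-up products, as if they had just come back from a similarity search.
products = [
    {"name": "TrailRunner 2", "description": "Lightweight running shoe with extra grip."},
    {"name": "CityWalk", "description": "Cushioned everyday sneaker."},
]

prompt = build_rag_prompt("I need shoes for muddy trails", products)
print(prompt)
```

The resulting string is what you would send to whichever LLM you use. Because the products arrive sorted by similarity, the budget check drops the least relevant ones first.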

Retrieval Augmented Generation (RAG) Process Graphic

Source: https://qdrant.tech/articles/what-is-rag-in-ai/

Bringing it All Together

Building an AI chatbot with vector databases involves several detailed steps:

Data Collection: Gather the content you want your chatbot to understand and interact with.
Embedding Generation: Use an AI embedding model to convert your content into vectors.
Vector Storage: Store these vectors in a vector database like Pinecone, Qdrant, or Milvus.
User Input Processing: When a user inputs a query, convert it into a vector using the same embedding model.
Similarity Search: Query the vector database to find vectors that are similar to the input vector.
Content Retrieval: Use the retrieved vectors to fetch the corresponding original content.
Response Generation: Present the relevant content to the LLM to generate a coherent and context-aware response.
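To see all seven steps in one place, here is a compact end-to-end sketch. Everything in it is a placeholder: `embed` is a toy word-count embedding, `index` is a plain dictionary instead of a real vector database, `fake_llm` stands in for an actual LLM call, and the FAQ content is invented.

```python
import math
import re
from collections import Counter

def embed(text):
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    numerator = sum(a[word] * b[word] for word in set(a) & set(b))
    denominator = (math.sqrt(sum(c * c for c in a.values()))
                   * math.sqrt(sum(c * c for c in b.values())))
    return numerator / denominator if denominator else 0.0

def fake_llm(prompt):
    """Placeholder for a real LLM call; only reports how many snippets it received."""
    return f"(LLM answer based on {prompt.count('SNIPPET')} snippet(s))"

# 1. Data collection (made-up content).
content = {
    "faq-1": "The store is open from nine to five on weekdays.",
    "faq-2": "Orders ship within two business days.",
    "faq-3": "Gift cards never expire.",
}

# 2 and 3. Embedding generation and vector storage (a dict plays the database).
index = {item_id: embed(text) for item_id, text in content.items()}

def answer(question, top_k=1):
    query_vector = embed(question)                                   # 4. same embedding function
    ranked = sorted(index, key=lambda item_id: cosine(query_vector, index[item_id]),
                    reverse=True)                                    # 5. similarity search
    snippets = [content[item_id] for item_id in ranked[:top_k]]      # 6. content retrieval
    prompt = "\n".join(f"SNIPPET: {s}" for s in snippets) + f"\nQUESTION: {question}"
    return ranked[:top_k], fake_llm(prompt)                          # 7. response generation

ids, reply = answer("When do orders ship?")
print(ids, reply)
```

Swapping in a real embedding model, vector database, and LLM changes the implementation of each step, but the order of operations stays the same.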

Congratulations! 🎉 You made it to the end! You can now understand and use RAG wherever you like! If you liked this blog, or want to know more about AI chatbots, feel free to sign up for Splutter AI to create your first AI chatbot! 🤖 Good luck!
