...
The original raw query is transformed into a clean query. For the default knowledge base configuration, this means taking an ambiguous phrase, like "location of food", and turning it into the format of a question, such as "Where is the food located?", so that the text matches the format of the questions that were generated by the ingestion engine.
We look up the embedding vector for the transformed query.
We use the embedding vector to perform a query on the knowledge base and find the top K knowledge chunks whose matching text (a.k.a. the generated question) is the closest to the query text.
We use a reranking algorithm to rerank the matched knowledge chunks against the query.
We discard all except the top 5 matching knowledge chunks.
The matching knowledge chunks are returned. For your standard AI Agents, these matching knowledge chunks are then fed into the conversation history for the agent as a tool-result, which the agent can then use to formulate and write its response.
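The query steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's implementation: `KnowledgeChunk`, `transform_query`, `embed`, `rerank`, and `query_knowledge_base` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, the template-based query transformer stands in for an LLM chain, and the default of `top_k=20` is a guess since the value of K is not stated here. Only the final cut to 5 chunks comes from the description above.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class KnowledgeChunk:
    matching_text: str  # e.g. a question generated at ingestion time
    content: str        # the knowledge handed back to the agent


def transform_query(raw_query: str) -> str:
    """Hypothetical stand-in for the query transformer (a real one would use an LLM)."""
    return f"What is the {raw_query.strip().rstrip('?')}?"


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumption in place of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rerank(query: str, chunks: list) -> list:
    """Hypothetical reranker: scores each chunk's content against the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.content)), reverse=True)


def query_knowledge_base(raw_query: str, chunks: list, top_k: int = 20, keep: int = 5) -> list:
    clean = transform_query(raw_query)   # raw query -> clean query
    q_vec = embed(clean)                 # embedding vector for the clean query
    nearest = sorted(                    # top K chunks by matching-text similarity
        chunks,
        key=lambda c: cosine(q_vec, embed(c.matching_text)),
        reverse=True,
    )[:top_k]
    return rerank(clean, nearest)[:keep]  # rerank, then discard all but the top 5


chunks = [
    KnowledgeChunk("Where is the food located?",
                   "Food is served in the cafeteria on the second floor."),
    KnowledgeChunk("What time does the office open?",
                   "The office opens at 9 am."),
]
results = query_knowledge_base("location of food", chunks)
```

Note that the nearest-neighbour search compares the query with each chunk's matching text, while the reranker in this sketch looks at the chunk content. That split is a design assumption made for the example.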
...
Configuration
The knowledge base can be connected to many different sources of data, combining them into a homogeneous knowledge system. Sources of data include documents and web pages that you import through the user interface, but may also include custom data schemas that you have created yourself, containing information that the bot has extracted from conversations (this allows the agent to have a ‘memory’ between sessions). Eventually we may even allow third-party API services to become sources of data for the knowledge base.
...
Note |
---|
IT IS RECOMMENDED TO NEVER MODIFY THE IDENTITY BINDINGS. PARTS OF THE SYSTEM DEPEND UPON THE FACT THAT THE IDENTITY BINDINGS DO AS TOLD. |
Customizing for Specific Use Cases
The key to making the knowledge base work effectively in a wide variety of situations ultimately comes down to the way that you customize different smart chains.
The most important part of that customization comes in the form of the Matching Text and the Query Transformer. Let’s look at an example where we might need to customize these.
E-Commerce Product Search
The default knowledge base is designed for doing Q&A based on the documents and data imported into it. But what if you aren’t so much asking a question as searching for something specific?
This is the situation when performing search for an e-commerce retailer. When I am looking for a product, I’m not really asking a question. Instead, I have a product that I’m imagining in my head, and I’m just trying to get the right words and phrases to find that product in the retailer’s database. The basis of my hunt is that imagined product, not a question. So the default question format produced by the Matching Text Chain and Query Transformer Chain would not be appropriate.
Instead, a better choice for the matching text might be a one or two sentence description of the product itself. It is recommended to keep the matching text short: no more than two sentences.
To support this use case, I would apply the following customizations:
Chunker - Our goal should be that the knowledge base looks up the complete information available about a product, so the right choice depends somewhat on the format of the data we are uploading. If we are importing documents or pages where there is a strict one page == one product relationship, then we can use the identity chunker. If instead we are uploading, say, product information sheets in the form of PDF documents, where each document contains several products, then we might want to create a custom chunker that breaks the document apart into sections, each containing the information for one product.
Matching Text - We want to search through the knowledge base based on characteristics of the product. So the matching text we use should be a one or two sentence description containing the characteristics of the product. It might be most effective to generate multiple different candidate one-sentence descriptions and embed all of them, similar to how the default Q&A matching text chain generates multiple questions.
Query Transformer - Our query transformer now needs to take the ambiguous query provided by the user and turn it into a hypothesized one or two sentence description of a product. In the same way that the original query transformer took ambiguous text and cleaned it up into a proper question, our new Product Search query transformer takes ambiguous text and cleans it up into a standardized product description, matching a specific length and format.
Reranker - The default reranker is prompted to look at how well the searched knowledge chunk matches the query provided by the user. We would now need a new reranker that is prompted to determine how well the product we found matches the user’s description of what they wanted.
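To make the four customizations above more concrete, here is a rough Python sketch. Everything in it is hypothetical: the function names, the assumption that each product section in a multi-product sheet begins with a line starting `Product:`, and the prompt wording are all invented for illustration, and none of it reflects how chunkers or smart chains are actually defined in the platform.

```python
import re


def identity_chunker(document: str) -> list:
    """One page == one product: keep the whole document as a single chunk."""
    return [document]


def product_section_chunker(document: str) -> list:
    """Split a multi-product sheet into one chunk per product.

    Assumes (hypothetically) that each section starts with a 'Product:' line.
    """
    parts = re.split(r"(?m)^(?=Product:)", document)
    return [part.strip() for part in parts if part.strip()]


# Hypothetical prompt templates for the three customized chains.
MATCHING_TEXT_PROMPT = (
    "Write {n} different one-sentence descriptions of the product below, "
    "each covering its key characteristics.\n\n{chunk}"
)
QUERY_TRANSFORMER_PROMPT = (
    "Rewrite the shopper's search text as a one or two sentence description "
    "of the product they are imagining.\n\nSearch text: {query}"
)
RERANKER_PROMPT = (
    "Rate from 0 to 10 how well this product matches what the shopper "
    "described.\n\nShopper description: {query}\n\nProduct: {chunk}"
)

sheet = (
    "Product: Trail Runner 2\nLightweight running shoe with a grippy sole.\n"
    "Product: Summit Boot\nWaterproof leather hiking boot.\n"
)
product_chunks = product_section_chunker(sheet)
matching_prompt = MATCHING_TEXT_PROMPT.format(n=3, chunk=product_chunks[0])
query_prompt = QUERY_TRANSFORMER_PROMPT.format(query="waterproof boots hiking")
```

Asking for several one-sentence descriptions per product (`n=3` here, an arbitrary choice) mirrors the idea of embedding multiple candidate matching texts for each chunk.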