...

  1. Convert into raw text

  2. Break the text into small sections based on the semantics and headers of the document

  3. For each section, we generate a list of questions that the section answers. Much like Jeopardy, we work backwards from the answer (contained in the document) to the questions

  4. We fetch the embedding vector for each of the questions

  5. Create a summary of the entire document, referred to as the “Qualifying Text” internally because it helps qualify whether a particular knowledge chunk is relevant

  6. Write the knowledge chunks out to the database
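
The ingestion steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ingestion engine: every name here (KnowledgeChunk, split_into_sections, generate_questions, embed, summarize, ingest) is hypothetical, the "database" is a plain Python list, and the question generator, embedding, and summary are toy stand-ins for what would presumably be language-model and embedding-model calls. It assumes step 1 (conversion to raw text) has already happened.

```python
import hashlib
import math
import re
from dataclasses import dataclass

EMBED_DIM = 64  # arbitrary size for the toy embedding (assumption)


@dataclass
class KnowledgeChunk:
    question: str            # matching text: a question the section answers
    embedding: list[float]   # embedding vector of the question
    content: str             # the section of text that answers the question
    qualifying_text: str     # summary of the entire document


def split_into_sections(raw_text: str) -> list[str]:
    """Step 2 stand-in: split on blank lines rather than semantics and headers."""
    return [s.strip() for s in re.split(r"\n\s*\n", raw_text) if s.strip()]


def generate_questions(section: str) -> list[str]:
    """Step 3 stand-in: a real system would likely ask a language model."""
    first_sentence = section.split(".")[0].strip()
    return [f"What does the document say about: {first_sentence}?"]


def embed(text: str) -> list[float]:
    """Step 4 stand-in: hashed bag-of-words vector, scaled to unit length."""
    vec = [0.0] * EMBED_DIM
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % EMBED_DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def summarize(raw_text: str) -> str:
    """Step 5 stand-in: use the first line of the document as its summary."""
    return raw_text.strip().splitlines()[0]


def ingest(raw_text: str, database: list[KnowledgeChunk]) -> None:
    # Steps 2-4: section the text, generate questions, embed each question.
    pending = []
    for section in split_into_sections(raw_text):
        for question in generate_questions(section):
            pending.append((question, embed(question), section))
    # Step 5: one summary ("Qualifying Text") shared by every chunk.
    qualifying_text = summarize(raw_text)
    # Step 6: write the knowledge chunks out to the (in-memory) database.
    for question, vector, section in pending:
        database.append(KnowledgeChunk(question, vector, section, qualifying_text))
```

Note that each chunk stores the generated question as its matching text alongside the section it came from, so a later lookup can match on the question and return the section.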

Querying

...

When you query the knowledge base, the system goes through the following steps:

  1. The original raw query is transformed into a clean query. For the default knowledge base configuration, this means taking an ambiguous phrase, like "location of food", and turning it into a question, such as "Where is the food located?", so that the text matches the format of the questions generated by the ingestion engine

  2. We look up the embedding vector for the transformed query

  3. We use the embedding vector to perform a query on the knowledge base and find the top K knowledge chunks whose matching text (a.k.a. the generated question) is closest to the query text

  4. We use a reranking algorithm to rerank the matched knowledge chunks against the query

  5. We discard all except the top 5 matching knowledge chunks, as determined by the reranker
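
As a rough sketch of the query steps above, under several assumptions: the names (clean_query, embed, rerank_score, query_knowledge_base) are hypothetical, chunks are plain dictionaries with "question", "embedding", and "content" keys, the value of K is made up (the text only says "top K"), the query rewrite is a simple template instead of a model call, and the reranker is a token-overlap score standing in for whatever reranking algorithm the system actually uses.

```python
import hashlib
import math
import re

EMBED_DIM = 64  # arbitrary size for the toy embedding (assumption)
TOP_K = 20      # hypothetical value; the text only says "top K"
FINAL_N = 5     # from the text: keep the top 5 after reranking


def tokens(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> list[float]:
    """Toy embedding: hashed bag-of-words vector, scaled to unit length."""
    vec = [0.0] * EMBED_DIM
    for token in tokens(text):
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % EMBED_DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def clean_query(raw_query: str) -> str:
    """Step 1 stand-in: phrase the query as a question using a fixed template."""
    query = raw_query.strip()
    return query if query.endswith("?") else f"What is the {query}?"


def similarity(a: list[float], b: list[float]) -> float:
    """Dot product, which equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rerank_score(query: str, chunk: dict) -> float:
    """Step 4 stand-in: Jaccard overlap between query and chunk tokens."""
    q = set(tokens(query))
    c = set(tokens(chunk["question"] + " " + chunk["content"]))
    return len(q & c) / len(q | c) if q | c else 0.0


def query_knowledge_base(raw_query: str, chunks: list[dict]) -> list[dict]:
    query = clean_query(raw_query)                      # step 1
    query_vec = embed(query)                            # step 2
    candidates = sorted(                                # step 3: top K by vector similarity
        chunks,
        key=lambda ch: similarity(query_vec, ch["embedding"]),
        reverse=True,
    )[:TOP_K]
    reranked = sorted(                                  # step 4: rerank against the query
        candidates,
        key=lambda ch: rerank_score(query, ch),
        reverse=True,
    )
    return reranked[:FINAL_N]                           # step 5: keep the top 5
```

The two-stage design is the point of the sketch: the vector lookup is a cheap, broad filter over the generated questions, and the reranker then makes a finer-grained comparison on the smaller candidate set.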

Loading In Knowledge

There are many ways to load knowledge into your agent, and the right one depends largely on your agent's use case.

...