
  • Chunker - Our goal is for the knowledge base to return the complete information available about a product, so the right chunker depends on the format of the data we are uploading. If we are importing documents or pages with a strict one page == one product relationship, we can use the identity chunker. If instead we are uploading product information sheets as PDF documents, where each document contains several products, we might want to create a custom chunker that breaks the document apart into sections, each containing the information for one product.

  • Matching Text - We want to search the knowledge base by the characteristics of the product, so the matching text should be a one- or two-sentence description of those characteristics. It might be most effective to generate multiple candidate one-sentence descriptions and embed all of them, similar to how the default Q&A matching text chain generates multiple questions.

  • Query Transformer - Our query transformer now needs to take the ambiguous query provided by the user and turn it into a hypothesized one- or two-sentence description of a product. In the same way that the original query transformer took ambiguous text and cleaned it up into a proper question, our new Product Search query transformer takes ambiguous text and cleans it up into a standardized product description matching a specific length and format.

  • Reranker - The default reranker is prompted to look at how well the retrieved knowledge chunk matches the query provided by the user. We now need a new reranker that is prompted to determine how well the product we found matches the user’s description of what they wanted.
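As a rough illustration of the custom chunker idea above, here is a minimal sketch in Python. It assumes, purely for illustration, that the text extracted from a product sheet marks each product with a line starting with `Product:`. That convention, the function name `split_by_product`, and the shape of the returned dictionaries are all hypothetical and are not part of any existing chunker interface.

```python
import re

def split_by_product(document: str, heading_pattern: str = r"^Product:\s*(.+)$") -> list[dict]:
    """Split a multi-product document into one chunk per product.

    Assumption: each product section starts with a heading line that
    matches heading_pattern. Text before the first heading is dropped.
    """
    chunks = []
    current = None
    for line in document.splitlines():
        match = re.match(heading_pattern, line)
        if match:
            # A new heading closes the previous product section, if any.
            if current is not None:
                chunks.append(current)
            current = {"product": match.group(1).strip(), "text": ""}
        elif current is not None:
            current["text"] += line + "\n"
    if current is not None:
        chunks.append(current)
    # Trim surrounding whitespace from each section body.
    for chunk in chunks:
        chunk["text"] = chunk["text"].strip()
    return chunks

sheet = "Product: Widget A\nBlue, 10cm\nProduct: Widget B\nRed, 20cm\n"
for chunk in split_by_product(sheet):
    print(chunk["product"], "|", chunk["text"])
```

In a real product sheet the section boundaries would likely be harder to detect than a single heading line, so the splitting rule would need to be adapted to the actual document layout.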

Example 2 - Best Practices Engine for Pitch Deck

Many agents are based on the premise that they can check a large body of rules and regulations against some input provided by the user, e.g. analyzing a pitch deck that the user provided against a knowledge base containing general best practices for pitch decks. Since our standard Q&A style knowledge base has only been built for answering questions, we need to customize it to suit this use case.

Let’s look at the case of a pitch-deck best-practice knowledge base. The key part of customizing our knowledge base is deciding what information is relevant for matching queries to knowledge, and what information can be discarded. This is a matter of design and there is no one right answer. Including a given piece of information might make results better on some queries but worse on others. The best practice is to measure your results statistically. But we can still get great results without ground-truth data by applying a few design principles to our matching text.

For our pitch deck example, let’s say that we want to group our best practices by the type of slide, the stage of the company, and the industry or vertical. We therefore want to construct a matching text that contains each of these three elements. For example, our matching text might look like this:

Code Block
A go-to-market slide for a pre-seed startup in fintech.

And you could imagine different rules having different texts but in a similar format:

Code Block
A problem slide for a series b startup in health care.
A solution slide for an angel-phase pre-seed startup in e-commerce tech.

If we make both the Matching Text Smart Chain and the Query Transformer Smart Chain produce outputs that look like the above bits of text, then we will be able to match the specific sections of our pitch deck with specific best practices that have been loaded into the knowledge base.
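To make the shared format concrete, here is a minimal sketch in Python. In practice both outputs would come from LLM-based Smart Chains. The function below is only a deterministic stand-in showing the target format, and the names `build_matching_text` and `_article` are hypothetical.

```python
def _article(word: str) -> str:
    """Pick "a" or "an" using a simple first-letter rule (a rough heuristic)."""
    return "an" if word[:1].lower() in {"a", "e", "i", "o", "u"} else "a"

def build_matching_text(slide_type: str, stage: str, industry: str) -> str:
    """Render the three matching elements in one standardized sentence."""
    first = _article(slide_type).capitalize()
    return f"{first} {slide_type} slide for {_article(stage)} {stage} startup in {industry}."

# Matching text side: the description stored alongside a best-practice rule.
rule_text = build_matching_text("go-to-market", "pre-seed", "fintech")

# Query side: the description derived from one slide of the user's deck.
query_text = build_matching_text("go-to-market", "pre-seed", "fintech")

print(rule_text)
print(rule_text == query_text)
```

Because both sides are rendered in the same format, a slide and a rule that share the same slide type, stage, and industry produce nearly identical text, which is what lets the similarity search pair them.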

Specifically, the customizations look like so.