Retrieval for RAG

TellusR is specifically designed to provide context within a Retrieval-Augmented Generation (RAG) system, enabling seamless integration of its advanced retrieval capabilities with cutting-edge generative models.

TellusR provides a predefined search pipeline, rag, designed to supply relevant context to an LLM. Instead of returning the top-ranking documents, it delivers the most fitting text excerpts for the user's query. Retrieval parameters such as the number of relevant text chunks per document and the size of each chunk can be customized, allowing for tailored LLM-context retrieval.

Example

Given a SERVER_URL and a PROJECT, the following request returns the top chunks for the given parameters.

curl -X GET "$SERVER_URL/tellusr/api/v1/$PROJECT/compute/rag?q=mathematics&topN=4&highlightWindow=3&rag.simplify=true"
  • q: the query
  • topN: how many top documents to consider
  • highlightWindow: for each search hit on a chunk, expand the result by also returning the surrounding chunks.
  • rag.simplify: if set to true, connected chunks are concatenated and grouped as continuous text in the response. If set to false (the default), each chunk carries detailed metadata about its origin, such as page number, which may be needed to provide references as links.
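The same request can be issued from application code. The following is a minimal Python sketch using only the standard library; the helper name build_rag_url is hypothetical, and the response structure is not specified here, so the sketch only builds the URL and fetches the raw JSON.

```python
import json
import urllib.parse
import urllib.request

def build_rag_url(server_url, project, query,
                  top_n=4, highlight_window=3, simplify=True):
    # Assemble the rag pipeline endpoint with the parameters
    # described above (q, topN, highlightWindow, rag.simplify).
    params = urllib.parse.urlencode({
        "q": query,
        "topN": top_n,
        "highlightWindow": highlight_window,
        "rag.simplify": str(simplify).lower(),
    })
    return f"{server_url}/tellusr/api/v1/{project}/compute/rag?{params}"

def fetch_rag_context(server_url, project, query, **kwargs):
    # Perform the GET request and parse the JSON response.
    # The shape of the response is not shown in this document,
    # so callers should inspect it before use.
    url = build_rag_url(server_url, project, query, **kwargs)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

Encoding the parameters with urllib.parse.urlencode avoids the shell-quoting issue that arises when the query string contains `&`.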