The ZeroEntropy API is designed to give you full control over your search index and query granularity.

Data

  1. Collections: Collections act as separate, independent datastores for your documents. If you want to index multiple distinct datasets, or need to maintain a multi-tenant architecture, create one collection per dataset or tenant so that each has its own search index.
  2. Documents: The foundational units for indexing. You can upload and delete documents, and query them using various filters. Metadata can be applied per-document, in order to allow for document-level filtering.
  3. Pages: Many documents, such as PDFs, .docx files, and PowerPoint files, are fundamentally segmented into pages. Text files can also optionally be segmented into pages for more fine-grained control.
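
To make the hierarchy above concrete, here is a minimal in-memory sketch of how collections, documents, and pages relate. It is not the ZeroEntropy SDK: the class names, method names, and sample files are all hypothetical and exist only to illustrate isolation between collections and document-level metadata filtering.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document is identified by its path and is segmented into pages."""
    path: str
    pages: list[str]  # one string per page
    metadata: dict[str, str] = field(default_factory=dict)


@dataclass
class Collection:
    """An independent datastore; documents in one collection are invisible to others."""
    name: str
    documents: dict[str, Document] = field(default_factory=dict)

    def add_document(self, doc: Document) -> None:
        self.documents[doc.path] = doc

    def delete_document(self, path: str) -> None:
        del self.documents[path]

    def find_by_metadata(self, **wanted: str) -> list[Document]:
        # Document-level filtering: keep documents whose metadata matches every key.
        return [
            doc for doc in self.documents.values()
            if all(doc.metadata.get(key) == value for key, value in wanted.items())
        ]


# One collection per tenant keeps the datasets fully separate.
tenant_a = Collection("tenant_a")
tenant_b = Collection("tenant_b")
tenant_a.add_document(Document("report.pdf", ["page one", "page two"], {"year": "2024"}))
tenant_a.add_document(Document("notes.txt", ["only page"], {"year": "2023"}))
tenant_b.add_document(Document("memo.docx", ["memo page"], {"year": "2024"}))

matches_2024 = [doc.path for doc in tenant_a.find_by_metadata(year="2024")]
```

In this sketch, filtering `tenant_a` by `year="2024"` never sees `memo.docx`, because that document lives in a different collection.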

Queries

  1. Top k Documents Retrieval: Specify a value for k to retrieve the k documents most relevant to your query.
  2. Top k Pages Retrieval: Specify a value for k to retrieve the k pages most relevant to your query.
  3. Top k Snippets Retrieval: Specify a value for k to retrieve the k snippets most relevant to your query. You can choose between coarse snippets (~2000 characters on average) and precise snippets (~200 characters on average).
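
The three query types differ only in the unit being ranked. The sketch below illustrates that idea with a toy word-overlap score and fixed-size character windows for snippets. Every name here is hypothetical, the scoring is a stand-in for the real relevance model, and the fixed 200 and 2000 character windows are a simplifying assumption (the figures above are averages, not fixed sizes). See the API Reference for the actual endpoints and parameters.

```python
import re


def score(query: str, text: str) -> int:
    """Toy relevance: how many distinct query words occur in the text."""
    text_words = set(re.findall(r"\w+", text.lower()))
    query_words = set(re.findall(r"\w+", query.lower()))
    return len(query_words & text_words)


def top_k(query: str, candidates: list, k: int) -> list:
    """candidates is a list of (label, text) pairs; returns the k best labels."""
    ranked = sorted(candidates, key=lambda c: score(query, c[1]), reverse=True)
    return [label for label, _ in ranked[:k]]


def top_documents(query: str, corpus: dict, k: int) -> list:
    # Rank whole documents: all pages joined together.
    return top_k(query, [(path, " ".join(pages)) for path, pages in corpus.items()], k)


def top_pages(query: str, corpus: dict, k: int) -> list:
    # Rank individual pages, labeled by (path, zero-based page index).
    candidates = [((path, i), page)
                  for path, pages in corpus.items()
                  for i, page in enumerate(pages)]
    return top_k(query, candidates, k)


def top_snippets(query: str, corpus: dict, k: int, precise: bool = False) -> list:
    # Rank character windows within each page: ~200 chars if precise, else ~2000.
    size = 200 if precise else 2000
    candidates = []
    for path, pages in corpus.items():
        for i, page in enumerate(pages):
            for start in range(0, len(page), size):
                snippet = page[start:start + size]
                candidates.append(((path, i, snippet), snippet))
    return top_k(query, candidates, k)


# corpus maps a document path to its list of pages.
corpus = {
    "cats.txt": ["cats sleep most of the day", "cats chase mice at night"],
    "dogs.txt": ["dogs fetch sticks", "dogs sleep at night"],
}

best_docs = top_documents("cats chase mice", corpus, k=1)
best_pages = top_pages("cats chase mice", corpus, k=2)
```

Coarser units return more surrounding context per result, while precise snippets pinpoint the matching passage. Which one fits depends on how much text the downstream consumer can use.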

By understanding these core concepts, you’ll be well-equipped to harness the full power of the ZeroEntropy API. Proceed to the API Reference for detailed information on how to implement these features.