Retrieves information about a specific page. The request parameters define what information you would like to receive.

A `404 Not Found` will be returned if either the collection name does not exist, or the document path does not exist within the provided collection.
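
As a rough sketch, a request body for this endpoint might be assembled as shown below. Only `include_content` is named in the text; the field names `collection_name`, `path`, and `page_index` are assumptions inferred from the field labels on this page, and both helper functions are hypothetical.

```python
from typing import Any, Dict


def build_get_page_info_request(
    collection_name: str,
    path: str,
    page_index: int,
    include_content: bool = False,
) -> Dict[str, Any]:
    """Build a JSON body for a page-info request (field names are assumed)."""
    if page_index < 0:
        raise ValueError("page_index is 0-indexed and cannot be negative")
    return {
        "collection_name": collection_name,
        "path": path,
        "page_index": page_index,
        "include_content": include_content,
    }


def describe_failure(status_code: int) -> str:
    """Explain a non-success status code using the rule described above."""
    if status_code == 404:
        return "collection or document path not found"
    return f"unexpected status {status_code}"
```

Keeping the body construction separate from the HTTP call makes it easy to check the payload without sending a request.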

Get Page Info

Collection Name

The content of the page. This field will only be provided if `include_content` was set to `true`, and the document has finished parsing. Otherwise, this field will be set to `null`.

Content

A URL to an image of the page. This field will only be provided if the document has finished parsing, and if it is a filetype that is capable of producing images (e.g. PDF, DOCX, PPT, etc.). In all other cases, this field will be `null`.

NOTE: If a `/documents/update-document` call returned a new document ID, then this URL will be invalidated and must be retrieved again.
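
One way a client could respect this invalidation rule is to cache each image URL together with the document ID it was fetched for, and discard it when the ID changes. This is a minimal sketch under that assumption; the class `PageImageCache` and its methods are hypothetical and not part of the API.

```python
from typing import Dict, Optional, Tuple


class PageImageCache:
    """Caches page image URLs and drops them when the document ID changes."""

    def __init__(self) -> None:
        # (path, page_index) -> (document_id, image_url)
        self._entries: Dict[Tuple[str, int], Tuple[str, str]] = {}

    def put(
        self, path: str, page_index: int, document_id: str, image_url: Optional[str]
    ) -> None:
        if image_url is None:
            # Not parsed yet, or the filetype cannot produce images.
            return
        self._entries[(path, page_index)] = (document_id, image_url)

    def get(
        self, path: str, page_index: int, current_document_id: str
    ) -> Optional[str]:
        entry = self._entries.get((path, page_index))
        if entry is None:
            return None
        document_id, image_url = entry
        if document_id != current_document_id:
            # The document was given a new ID, so the old URL is stale.
            del self._entries[(path, page_index)]
            return None
        return image_url
```

A `None` result from `get` signals that the page info should be requested again.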

Image Url

The index of this page within the document. Pages are 0-indexed, so the first page of a PDF has page index 0.
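
Because human-facing page numbers usually start at 1, a small conversion helper can prevent off-by-one mistakes. The function name below is hypothetical.

```python
def page_number_to_index(page_number: int) -> int:
    """Convert a 1-based page number to the 0-based page index."""
    if page_number < 1:
        raise ValueError("page numbers start at 1")
    return page_number - 1
```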

Page Index

The filepath of the document associated with this page.

Path



Gets the current indexing status across all documents.

If a collection name is passed in, it will get the indexing status of only the documents within that collection. Otherwise, it will show the cumulative status across all of your collections.

A `404 Not Found` status code will be returned if a collection name was provided but does not exist.

Get Status

Adds a collection.

If the collection already exists, a `409 Conflict` status code will be returned.

Add Collection

Gets a complete list of all of your collections.

Get Collection List

Deletes a collection.

A `404 Not Found` status code will be returned if the provided collection name does not exist.

Delete Collection

Adds a document to a given collection.

A status code of `201 Created` will be returned if a document was successfully added. A status code of `409 Conflict` will be returned if the given collection already has a document with the same path.

If `overwrite` is set to `true`, then a status code of `200 OK` will be returned when an existing document is overwritten (rather than a status code of `409 Conflict`).
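
The status codes described above could be interpreted on the client side roughly as follows. This is a sketch only: the function name and the returned labels are hypothetical, and any code not mentioned in the text is reported as unexpected.

```python
def interpret_add_document_status(status_code: int, overwrite: bool = False) -> str:
    """Classify the outcome of an add-document call from its status code."""
    if status_code == 201:
        return "created"
    if status_code == 200 and overwrite:
        return "overwritten"
    if status_code == 409:
        return "conflict"
    return "unexpected"
```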

When a document is inserted, it can take time to appear in the index. Check the `/status/get-status` endpoint to see progress.

Add Document

Updates a document. This endpoint is atomic.

Currently both `metadata` and `index_status` are supported.

- When updating with a non-null `metadata`, the document must have `index_status` of `indexed`. After this call, the document will have an `index_status` of `not_indexed`, since the document will need to reindex with the new metadata.
- When updating with a non-null `index_status`, setting it to `not_parsed` or `not_indexed` requires that the document have an `index_status` of `parsing_failed` or `indexing_failed`, respectively.

A `404 Not Found` status code will be returned if the provided collection name or document path does not exist.
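
The two preconditions above can be checked on the client before sending an update. The sketch below encodes only the rules stated in this list; the function name is hypothetical, and it does not cover how the server treats other `index_status` values or a request that sets both fields at once.

```python
from typing import Optional

# Target index_status -> status the document must currently have.
REQUIRED_CURRENT_STATUS = {
    "not_parsed": "parsing_failed",
    "not_indexed": "indexing_failed",
}


def check_update_allowed(
    current_status: str,
    metadata: Optional[dict] = None,
    index_status: Optional[str] = None,
) -> Optional[str]:
    """Return None if the update looks allowed, otherwise a reason string."""
    if metadata is not None and current_status != "indexed":
        return "metadata updates require an index_status of 'indexed'"
    if index_status is not None:
        required = REQUIRED_CURRENT_STATUS.get(index_status)
        if required is not None and current_status != required:
            return f"setting '{index_status}' requires current status '{required}'"
    return None
```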

Update Document

Retrieves information about a specific document. The request parameters define what information you would like to receive.

A `404 Not Found` will be returned if either the collection name does not exist, or the document path does not exist within the provided collection.

Get Document Info

Retrieves a list of document metadata information that matches the provided filters.

The documents returned will be sorted by path in lexicographically ascending order. `path_gt` can be used for pagination, and should be set to the path of the last document returned in the previous call.
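
The `path_gt` pagination described above can be sketched as a loop that feeds the last returned path into the next call. Here `fetch_page` is a hypothetical callable standing in for the actual HTTP request, and the `path` key on each returned document is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Optional

# A callable that takes path_gt (or None for the first call)
# and returns one batch of documents sorted by path.
FetchPage = Callable[[Optional[str]], List[Dict[str, str]]]


def iter_documents(fetch_page: FetchPage) -> Iterator[Dict[str, str]]:
    """Yield every document by repeatedly advancing path_gt."""
    path_gt: Optional[str] = None
    while True:
        batch = fetch_page(path_gt)
        if not batch:
            return
        yield from batch
        # Results are sorted by path, so the last one is the cursor.
        path_gt = batch[-1]["path"]
```

The loop stops at the first empty batch, which costs one extra call but avoids having to know the page size.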

A `404 Not Found` will be returned if either the collection name does not exist, or the document path does not exist within the provided collection.

Get Document Info List

Deletes a document.

A `404 Not Found` status code will be returned if the provided collection name or document path does not exist.

Delete Document

Get the top K documents that match the given query.

Top Pages

Get the top K snippets that match the given query.

You may choose between coarse and precise snippets. Precise snippets will average ~200 characters, while coarse snippets will average ~2000 characters. The default is coarse snippets. Use the `precise_responses` parameter to adjust.
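
A request body that switches between the two snippet sizes might look like the sketch below. Only `precise_responses` is named in the text; the fields `collection_name`, `query`, and `k`, and the helper itself, are assumptions made for illustration.

```python
from typing import Any, Dict


def build_top_snippets_request(
    collection_name: str, query: str, k: int, precise: bool = False
) -> Dict[str, Any]:
    """Build a snippet query body; coarse snippets are the default."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {
        "collection_name": collection_name,
        "query": query,
        "k": k,
        # False -> coarse (~2000 chars), True -> precise (~200 chars).
        "precise_responses": precise,
    }
```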

Top Snippets

Reranks the provided documents, according to the provided query.

The results will be sorted in descending order of relevance. For each document, the index and the score will be returned. The index is relative to the documents array that was passed in. The score is the query-document relevance determined by the reranker model.
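
Since each result refers back to the input array by index, the original documents can be recovered with a simple lookup. This sketch assumes each result is a mapping with `index` and `score` keys; the exact key names and the helper are assumptions.

```python
from typing import Any, Dict, List, Tuple


def pair_results_with_documents(
    documents: List[str], results: List[Dict[str, Any]]
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs in the order the results were given."""
    return [(documents[r["index"]], r["score"]) for r in results]
```

The order of `results` is preserved, so the output stays sorted by descending relevance if the input was.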
