POST /models/embed
Embed
curl --request POST \
  --url https://api.zeroentropy.dev/v1/models/embed \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input_type": "query",
  "input": "<string>",
  "output_dimensions": 123,
  "output_format": "float",
  "latency": "fast"
}
'
{
  "results": [
    {
      "embedding": [
        123
      ]
    }
  ],
  "usage": {
    "total_bytes": 123,
    "total_tokens": 123
  }
}

Authorizations

Authorization
string
header
required

The Authorization header must be provided in the format Bearer <your-api-key>.

You can get your API key from the Dashboard.
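
As a minimal sketch of the header format above, the helper below builds the request headers in Python. The function name `build_headers` and the environment variable `ZEROENTROPY_API_KEY` are assumptions made for illustration; neither is defined by the API.

```python
import os


def build_headers(api_key=None):
    """Return request headers using the Bearer format described above."""
    # ZEROENTROPY_API_KEY is a hypothetical variable name chosen for this sketch.
    key = api_key or os.environ.get("ZEROENTROPY_API_KEY")
    if not key:
        raise ValueError("No API key provided")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.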

Body

application/json
model
string
required

The ID of the model to use for embedding. Available options: zembed-1

input_type
enum<string>
required

The input type. For retrieval tasks, either query or document.

Available options:
query,
document
input
string | string[]
required

The string, or list of strings, to embed
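
The three required fields can be assembled and checked before sending. The sketch below is one way to do that in Python; `build_body` and `VALID_INPUT_TYPES` are hypothetical names, and the validation only mirrors the constraints stated above.

```python
import json

VALID_INPUT_TYPES = {"query", "document"}


def build_body(model, input_type, texts):
    """Build the JSON request body from the three required fields."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {sorted(VALID_INPUT_TYPES)}")
    is_string = isinstance(texts, str)
    is_string_list = isinstance(texts, list) and all(isinstance(t, str) for t in texts)
    if not (is_string or is_string_list):
        raise TypeError("input must be a string or a list of strings")
    return json.dumps({"model": model, "input_type": input_type, "input": texts})
```

For example, `build_body("zembed-1", "document", ["first passage", "second passage"])` produces a body that embeds two documents in a single request.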

output_dimensions
integer | null

The number of dimensions in each output embedding.

output_format
enum<string>
default:float

The output format of the embedding. base64 is significantly more efficient than float. The default is float.

Available options:
float,
base64
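
When `output_format` is base64, the client has to decode each embedding itself. This page does not specify the binary layout, so the sketch below assumes a packed array of little-endian 32-bit floats; treat that as an assumption to confirm, not as documented behavior. The name `decode_embedding` is also hypothetical.

```python
import base64
import struct


def decode_embedding(encoded):
    """Decode a base64 string into a list of floats.

    ASSUMPTION: the payload is a packed array of little-endian float32 values.
    The layout is not stated in the source; verify it before relying on this.
    """
    raw = base64.b64decode(encoded)
    if len(raw) % 4 != 0:
        raise ValueError("byte length is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))
```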
latency
enum<string> | null

Whether the call is served "fast" or "slow". Rate limits for slow API calls are orders of magnitude higher, but you can expect latency above 10 seconds. Fast calls are guaranteed subsecond latency, but their rate limits are lower. If not specified, a "fast" call is attempted first; if you have exceeded your fast rate limit, a slow call is executed instead. If explicitly set to "fast", a 429 is returned when the call cannot be executed fast.

Available options:
fast,
slow
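
A client that wants to observe when the fast rate limit is hit can request "fast" explicitly and fall back to "slow" on a 429. The sketch below shows that pattern in Python. `embed_with_fallback` and the `send` callable are hypothetical: `send` stands in for whatever HTTP client you use and is assumed to take a body dict and return `(status_code, parsed_json)`. Treating any 2xx status as success is also an assumption.

```python
def embed_with_fallback(send, body):
    """Try a "fast" call first; on HTTP 429, retry once with latency "slow".

    `send` is a hypothetical callable: send(body_dict) -> (status_code, parsed_json).
    """
    status, data = send({**body, "latency": "fast"})
    if status == 429:
        # Fast rate limit exceeded: retry on the higher-limit slow path.
        status, data = send({**body, "latency": "slow"})
    if not 200 <= status < 300:
        raise RuntimeError(f"embed request failed with status {status}")
    return data
```

Omitting `latency` gives the same fallback on the server side; doing it in the client is only useful if you want to log or count the fallbacks.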

Response

Successful Response

results
EmbedResult · object[]
required

The list of embedding results.

usage
EmbedUsage · object
required

Statistics regarding the tokens used by the request.
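
To illustrate the response shape, the sketch below pulls the embeddings and usage counters out of a response body in Python. It relies only on the fields shown in the example response (`results[].embedding`, `usage.total_tokens`, `usage.total_bytes`); the function name `parse_embed_response` is hypothetical.

```python
import json


def parse_embed_response(text):
    """Return (embeddings, total_tokens, total_bytes) from a response body."""
    data = json.loads(text)
    # Each entry in "results" carries one embedding.
    embeddings = [item["embedding"] for item in data["results"]]
    usage = data["usage"]
    return embeddings, usage["total_tokens"], usage["total_bytes"]
```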