Embeddings

Generate 1536-dimensional vector embeddings from text using Axon Embed 1. Use embeddings to build semantic search, power RAG pipelines, cluster documents, and calculate similarity scores — all within your data residency region.

Create embeddings

bash

POST /v1/axon/embeddings

Name	Type	Required	Description
input	string \| array	Yes	A string or array of strings to embed. Maximum 100 inputs per request, each under 32,768 characters.
model	string	No	Must be "axon-embed-1". This is the default.
data_residency	string	No	au \| us \| eu \| uk \| sg. Defaults to "au".

bash

curl -X POST https://api.hldgroup.org/v1/axon/embeddings \
  -H "x-internal-secret: <key>" \
  -H "x-tenant-id: ten_01hxyz" \
  -H "x-user-id: usr_01hxyz" \
  -H "x-platform-role: tenant-standard-user" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "axon-embed-1",
    "data_residency": "au",
    "input": [
      "Ransomware was detected on endpoint WIN-DEV-042",
      "User account [email protected] showed impossible travel"
    ]
  }'

json

{
  "data": {
    "object": "list",
    "model": "axon-embed-1",
    "sovereign": true,
    "data_residency": "au",
    "data": [
      { "object": "embedding", "index": 0, "embedding": [0.0012, -0.0834, ...] },
      { "object": "embedding", "index": 1, "embedding": [0.0230, -0.0421, ...] }
    ],
    "usage": { "prompt_tokens": 28, "total_tokens": 28 }
  }
}

Building a semantic search pipeline

typescript

// 1. Embed all documents at index time
const docTexts = documents.map(d => d.content)
const { data: { data: embeddings } } = await fetch('/v1/axon/embeddings', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json', 'x-internal-secret': key, ... },
  body: JSON.stringify({ input: docTexts }),
}).then(r => r.json())

// Store embeddings in your vector DB (pgvector, Pinecone, etc.)
await vectorDb.upsert(documents.map((doc, i) => ({
  id: doc.id,
  vector: embeddings[i].embedding,
  metadata: { title: doc.title },
})))

// 2. At query time, embed the query and search
const { data: { data: [queryEmbed] } } = await fetch('/v1/axon/embeddings', {
  method: 'POST',
  body: JSON.stringify({ input: userQuery }),
}).then(r => r.json())

const results = await vectorDb.query({ vector: queryEmbed.embedding, topK: 5 })

Note:For most RAG use cases you don't need to call the embeddings endpoint directly — use POST /v1/axon/knowledge-bases/:id/query which handles embedding, retrieval, and generation in a single request.

Knowledge bases RAG queries