Embeddings
Generate 1536-dimensional vector embeddings from text using Axon Embed 1. Use embeddings to build semantic search, power RAG pipelines, cluster documents, and calculate similarity scores — all within your data residency region.
Create embeddings
bash
POST /v1/axon/embeddings| Name | Type | Required | Description |
|---|---|---|---|
| input | string | array | Yes | A string or array of strings to embed. Maximum 100 inputs per request, each under 32,768 characters. |
| model | string | No | Must be "axon-embed-1". This is the default. |
| data_residency | string | No | au | us | eu | uk | sg. Defaults to "au". |
bash
curl -X POST https://api.hldgroup.org/v1/axon/embeddings \
-H "x-internal-secret: <key>" \
-H "x-tenant-id: ten_01hxyz" \
-H "x-user-id: usr_01hxyz" \
-H "x-platform-role: tenant-standard-user" \
-H "Content-Type: application/json" \
-d '{
"model": "axon-embed-1",
"data_residency": "au",
"input": [
"Ransomware was detected on endpoint WIN-DEV-042",
"User account [email protected] showed impossible travel"
]
}'json
{
"data": {
"object": "list",
"model": "axon-embed-1",
"sovereign": true,
"data_residency": "au",
"data": [
{ "object": "embedding", "index": 0, "embedding": [0.0012, -0.0834, ...] },
{ "object": "embedding", "index": 1, "embedding": [0.0230, -0.0421, ...] }
],
"usage": { "prompt_tokens": 28, "total_tokens": 28 }
}
}Building a semantic search pipeline
typescript
// 1. Embed all documents at index time
const docTexts = documents.map(d => d.content)
const { data: { data: embeddings } } = await fetch('/v1/axon/embeddings', {
method: 'POST',
headers: { 'Content-Type': 'application/json', 'x-internal-secret': key, ... },
body: JSON.stringify({ input: docTexts }),
}).then(r => r.json())
// Store embeddings in your vector DB (pgvector, Pinecone, etc.)
await vectorDb.upsert(documents.map((doc, i) => ({
id: doc.id,
vector: embeddings[i].embedding,
metadata: { title: doc.title },
})))
// 2. At query time, embed the query and search
const { data: { data: [queryEmbed] } } = await fetch('/v1/axon/embeddings', {
method: 'POST',
body: JSON.stringify({ input: userQuery }),
}).then(r => r.json())
const results = await vectorDb.query({ vector: queryEmbed.embedding, topK: 5 })Note:For most RAG use cases you don't need to call the embeddings endpoint directly — use
POST /v1/axon/knowledge-bases/:id/query which handles embedding, retrieval, and generation in a single request.