VectoriaDB uses embedding vectors to enable semantic search. Each document’s text is converted to a vector representation that captures its meaning.
How Indexing Works
- Text Input: You provide a document with text and metadata
- Embedding Generation: VectoriaDB generates a vector embedding from the text
- Storage: The embedding is stored in memory (and optionally persisted)
- Searchable: The document becomes searchable via semantic queries
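The four steps above can be sketched with a toy in-memory index. This is an illustration of the general technique only, not VectoriaDB's implementation or API: ToyIndex and toyEmbed are hypothetical names, and the hashed bag-of-words "embedding" is a crude stand-in for a real embedding model.

```typescript
// Toy illustration only: NOT VectoriaDB's implementation or API.
type Vector = number[];

interface IndexedDoc<M> {
  id: string;
  text: string;
  metadata: M;
  embedding: Vector;
}

// Hypothetical stand-in for an embedding model: hashes each word into a
// small fixed-size bag-of-words vector, then L2-normalizes the result.
function toyEmbed(text: string, dims: number = 16): Vector {
  const vec: number[] = new Array(dims).fill(0);
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  for (const word of words) {
    let h = 0;
    for (let i = 0; i < word.length; i++) {
      h = (h * 31 + word.charCodeAt(i)) % dims;
    }
    vec[h] += 1;
  }
  const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0)) || 1;
  return vec.map((x) => x / norm);
}

class ToyIndex<M> {
  private docs = new Map<string, IndexedDoc<M>>();

  // Steps 1-3: accept text and metadata, embed the text, store in memory.
  add(id: string, text: string, metadata: M): void {
    if (this.docs.has(id)) throw new Error(`Duplicate id: ${id}`);
    this.docs.set(id, { id, text, metadata, embedding: toyEmbed(text) });
  }

  // Step 4: rank stored documents by cosine similarity to the query.
  search(query: string, topK: number = 3): Array<{ id: string; score: number }> {
    const q = toyEmbed(query);
    return Array.from(this.docs.values())
      .map((d) => ({
        id: d.id,
        // Both vectors are unit length, so the dot product is the cosine.
        score: d.embedding.reduce((s, x, i) => s + x * q[i], 0),
      }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}

const index = new ToyIndex<{ id: string }>();
index.add("create-user", "Create a new user account", { id: "create-user" });
index.add("send-email", "Send an email message", { id: "send-email" });
const hits = index.search("create a user account", 1);
```

A real embedding model captures meaning rather than word overlap, but the flow is the same: embed on insert, embed the query, compare vectors.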
Document Structure
Each document requires three pieces: an ID, text, and metadata.
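As a rough sketch of that shape, the types below model the three pieces and the rule that the document ID and the metadata ID agree. DocumentInput and assertConsistentId are hypothetical names introduced for illustration, and the local DocumentMetadata is an assumed stand-in for the library's interface (only its required id field is taken from this page).

```typescript
// Assumed stand-in for the library's DocumentMetadata interface:
// this page only states that an `id` field is required.
interface DocumentMetadata {
  id: string;
}

// Hypothetical shape holding the three pieces of a document.
interface DocumentInput<M extends DocumentMetadata> {
  id: string;   // unique within the database
  text: string; // natural-language text that gets embedded
  metadata: M;  // carries the same id, plus any custom fields
}

// Illustrative check for the "IDs must match" rule.
function assertConsistentId<M extends DocumentMetadata>(doc: DocumentInput<M>): void {
  if (doc.id.trim() === "") {
    throw new Error("Document id must not be empty");
  }
  if (doc.metadata.id !== doc.id) {
    throw new Error(`metadata.id "${doc.metadata.id}" does not match document id "${doc.id}"`);
  }
}

const doc: DocumentInput<DocumentMetadata & { category: string }> = {
  id: "create-user",
  text: "Create a new user account with an email address and password",
  metadata: { id: "create-user", category: "users" },
};

assertConsistentId(doc); // passes: both ids are "create-user"
```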
ID Requirements
- Must be unique within the database
- Used to retrieve, update, or remove documents
- Should match metadata.id for consistency
Text Guidelines
- Descriptive, natural language text works best
- Include relevant keywords and context
- Maximum size controlled by the maxDocumentSize config
Metadata
- Must extend the DocumentMetadata interface
- The id field is required and must match the document ID
- Add any custom fields for filtering and display
Type-Safe Metadata
Define your own metadata interface so that custom fields are checked at compile time.
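A minimal sketch of what such an interface might look like follows. ToolMetadata, its fields, and byCategory are hypothetical examples, and DocumentMetadata is redeclared locally as an assumed stand-in so the snippet is self-contained; in real code it would come from the library.

```typescript
// Assumed stand-in for the library's DocumentMetadata interface.
interface DocumentMetadata {
  id: string;
}

// Hypothetical custom metadata: the required id plus fields for
// filtering and display.
interface ToolMetadata extends DocumentMetadata {
  category: "users" | "billing" | "email";
  tags: string[];
  deprecated?: boolean;
}

// A typed predicate: misspelling a field or passing an unknown category
// is a compile-time error rather than a silent runtime miss.
function byCategory(category: ToolMetadata["category"]) {
  return (m: ToolMetadata): boolean => m.category === category && !m.deprecated;
}

const tools: ToolMetadata[] = [
  { id: "create-user", category: "users", tags: ["account"] },
  { id: "delete-user", category: "users", tags: ["account"], deprecated: true },
  { id: "send-invoice", category: "billing", tags: ["pdf"] },
];

// Keeps only non-deprecated tools in the "users" category.
const activeUserTools = tools.filter(byCategory("users")).map((m) => m.id);
```

With the union type on category, a call such as byCategory("user") fails to compile, which is the kind of mistake a plain string field would let through.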
Embedding Generation
Embeddings are generated automatically when you add or update documents. The process:
- Text is tokenized using the configured model
- Embeddings are generated (~100-200 documents/second)
- Embeddings are stored in memory (and optionally persisted)
Use addMany with an appropriate maxBatchSize to avoid memory spikes.
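To illustrate why a batch size bounds memory use, the sketch below splits a large document list into fixed-size batches so that only one batch is in flight at a time. The chunk helper is a hypothetical, generic utility; how addMany applies maxBatchSize internally is not shown on this page, so treat this as an illustration of the idea rather than the library's behavior.

```typescript
// Generic batching helper (hypothetical, not part of VectoriaDB).
function chunk<T>(items: readonly T[], maxBatchSize: number): T[][] {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += maxBatchSize) {
    // slice clamps at the array end, so the last batch may be smaller.
    batches.push(items.slice(i, i + maxBatchSize));
  }
  return batches;
}

// 2,500 synthetic documents split into batches of at most 1,000.
const docs = Array.from({ length: 2500 }, (_, i) => ({
  id: `doc-${i}`,
  text: `Document number ${i}`,
}));

const batches = chunk(docs, 1000);
const batchSizes = batches.map((b) => b.length); // [1000, 1000, 500]
```

At the quoted throughput of roughly 100-200 documents/second, embedding these 2,500 documents would take on the order of 12.5 to 25 seconds.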
Document Limits
VectoriaDB enforces limits, such as the maxDocumentSize config, to prevent denial-of-service (DoS) attacks.
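The kind of check such limits imply can be sketched as follows. Only the option names maxDocumentSize and maxBatchSize come from this page; the numeric values, the unit (characters), the LimitsConfig type, and the checkLimits function are assumptions made for illustration, not the library's defaults or API.

```typescript
// Hypothetical limits configuration. Values are illustrative only,
// and measuring size in characters is an assumption.
interface LimitsConfig {
  maxDocumentSize: number; // maximum text length per document
  maxBatchSize: number;    // maximum documents per batch
}

const limits: LimitsConfig = { maxDocumentSize: 10000, maxBatchSize: 500 };

// Returns a list of violations instead of throwing, so a caller can
// report every problem in a batch at once.
function checkLimits(
  batch: ReadonlyArray<{ id: string; text: string }>,
  cfg: LimitsConfig
): string[] {
  const errors: string[] = [];
  if (batch.length > cfg.maxBatchSize) {
    errors.push(`Batch of ${batch.length} exceeds maxBatchSize ${cfg.maxBatchSize}`);
  }
  for (const doc of batch) {
    if (doc.text.length > cfg.maxDocumentSize) {
      errors.push(`Document "${doc.id}" exceeds maxDocumentSize ${cfg.maxDocumentSize}`);
    }
  }
  return errors;
}

const okBatch = [{ id: "a", text: "short text" }];
const oversized = [{ id: "b", text: "x".repeat(10001) }];

const okErrors = checkLimits(okBatch, limits);
const oversizedErrors = checkLimits(oversized, limits);
```

Rejecting oversized input before embedding is what makes the limit useful against abuse: the expensive step never runs on input that breaks the rules.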
Related
- Adding Documents: Add single and batch documents
- Updating Documents: Update metadata and text
- Search: Query the index