Learn how to configure similarity thresholds for optimal search results.

Understanding Thresholds

The threshold sets the minimum similarity score a result must reach to be returned:
src/thresholds.ts
// Strict matching - only highly relevant results
const strict = await toolIndex.search(query, { threshold: 0.7 });

// Moderate matching - good balance
const moderate = await toolIndex.search(query, { threshold: 0.5 });

// Loose matching - include tangentially related
const loose = await toolIndex.search(query, { threshold: 0.3 });
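
To make the cutoff concrete, here is a minimal sketch of what a threshold does to a list of already-scored hits. The `ScoredHit` type, the `applyThreshold` helper, and the sample scores are all hypothetical and only illustrate the idea; the sketch also assumes a score equal to the threshold passes, which the library may or may not do.

```typescript
// Hypothetical shape of a scored hit (an assumption, not the library's type).
interface ScoredHit {
  id: string;
  score: number;
}

// Keep hits scoring at or above the threshold, best first.
// Assumption: the comparison is inclusive.
function applyThreshold(hits: ScoredHit[], threshold: number): ScoredHit[] {
  return hits
    .filter((hit) => hit.score >= threshold)
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration only.
const sampleHits: ScoredHit[] = [
  { id: 'create-user', score: 0.82 },
  { id: 'update-user', score: 0.55 },
  { id: 'list-invoices', score: 0.34 },
];

const strictHits = applyThreshold(sampleHits, 0.7); // 1 hit
const moderateHits = applyThreshold(sampleHits, 0.5); // 2 hits
const looseHits = applyThreshold(sampleHits, 0.3); // 3 hits
```

Raising the threshold never adds results; it only removes the lowest-scoring ones.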

Threshold Guidelines

| Threshold | Use Case | Results |
| --- | --- | --- |
| 0.7+ | High precision, exact matches | Few, highly relevant |
| 0.5-0.7 | Balanced precision/recall | Moderate, mostly relevant |
| 0.3-0.5 | High recall, broader matches | Many, some tangential |
| < 0.3 | Exploratory, catch-all | Most documents |
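
The bands above can be encoded as a small lookup, for example to label a configured threshold in logs. This is a sketch only: `classifyThreshold` and the band names are hypothetical, and because the guideline ranges share their endpoints, the sketch assigns each boundary value (0.7, 0.5, 0.3) to the stricter band.

```typescript
// Hypothetical labels for the guideline bands.
type ThresholdBand = 'high-precision' | 'balanced' | 'high-recall' | 'exploratory';

// Map a threshold to its guideline band.
// Assumption: boundary values belong to the stricter band.
function classifyThreshold(threshold: number): ThresholdBand {
  if (threshold >= 0.7) return 'high-precision';
  if (threshold >= 0.5) return 'balanced';
  if (threshold >= 0.3) return 'high-recall';
  return 'exploratory';
}
```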

Default Threshold

Set a default threshold in configuration:
src/default-threshold.ts
const db = new VectoriaDB({
  defaultSimilarityThreshold: 0.4,
});

// Uses default threshold (0.4)
const results = await db.search(query);

// Override for specific query
const strict = await db.search(query, { threshold: 0.7 });
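
The override behaviour described above can be modelled as a one-line resolution rule. How the library resolves the value internally is not shown here, so treat this as a sketch: `SearchOptions` and `resolveThreshold` are hypothetical names.

```typescript
// Hypothetical per-query options; only the threshold matters here.
interface SearchOptions {
  threshold?: number;
}

// Use the per-query threshold when one is given, otherwise the default.
// `??` is used instead of `||` so that an explicit threshold of 0 is honoured.
function resolveThreshold(
  defaultThreshold: number,
  options: SearchOptions = {},
): number {
  return options.threshold ?? defaultThreshold;
}
```

The choice of `??` matters for debugging: with `||`, a query that passes `threshold: 0` to see every score would silently fall back to the default.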

Choosing the Right Threshold

For Tool Discovery

src/tool-discovery-threshold.ts
// Start conservative - only show high-confidence matches
let results = await db.search(userQuery, {
  threshold: 0.5,
  topK: 5,
});

if (results.length === 0) {
  // Fallback: broaden the search and keep the broader results
  results = await db.search(userQuery, {
    threshold: 0.3,
    topK: 10,
  });
}
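
The same strict-then-broad idea can be generalised to any number of tiers. The sketch below is one possible variation rather than the library's behaviour: it assumes you already hold a list of scored candidates (for example from a single search with a low threshold) and picks the first tier that yields anything. `ScoredHit`, `Tier`, and `selectWithFallback` are hypothetical names.

```typescript
// Hypothetical shapes, introduced for illustration.
interface ScoredHit {
  id: string;
  score: number;
}

interface Tier {
  threshold: number;
  topK: number;
}

// Return the hits of the first tier that produces at least one match.
// Tiers are tried in the order given, so list the strictest first.
function selectWithFallback(hits: ScoredHit[], tiers: Tier[]): ScoredHit[] {
  const ranked = hits.slice().sort((a, b) => b.score - a.score);
  for (const tier of tiers) {
    const matches = ranked
      .filter((hit) => hit.score >= tier.threshold)
      .slice(0, tier.topK);
    if (matches.length > 0) return matches;
  }
  return [];
}

const tiers: Tier[] = [
  { threshold: 0.5, topK: 5 },
  { threshold: 0.3, topK: 10 },
];

// Made-up scores: nothing clears 0.5, so the 0.3 tier is used.
const weakHits: ScoredHit[] = [
  { id: 'export-report', score: 0.42 },
  { id: 'send-email', score: 0.21 },
];
const selected = selectWithFallback(weakHits, tiers);
```

Selecting locally trades one extra-wide query for fewer round trips; whether that is worthwhile depends on how expensive each search is.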

For Document Search

src/document-search-threshold.ts
// Lower threshold for document search - users expect more results
const results = await db.search(searchQuery, {
  threshold: 0.3,
  topK: 20,
});

For Recommendations

src/recommendations-threshold.ts
// Get similar documents
const similar = await db.search(currentDocument.text, {
  threshold: 0.6, // Higher - we want truly similar items
  topK: 5,
  filter: (m) => m.id !== currentDocument.id, // Exclude current
});

Adaptive Thresholds

Adjust thresholds based on query characteristics:
src/adaptive-threshold.ts
function getThreshold(query: string): number {
  // Count words, ignoring leading/trailing and repeated whitespace
  const wordCount = query.trim().split(/\s+/).length;

  // Short queries often need lower thresholds
  if (wordCount <= 2) {
    return 0.3;
  }

  // Longer, more specific queries can use higher thresholds
  if (wordCount >= 5) {
    return 0.5;
  }

  return 0.4; // Default
}

const results = await db.search(query, {
  threshold: getThreshold(query),
});

Debugging Threshold Issues

Too Few Results

src/debug-few-results.ts
// Check what scores documents actually have
const debug = await db.search(query, {
  threshold: 0.0, // Show all
  topK: 20,
});

console.log('Score distribution:');
for (const r of debug) {
  console.log(`  ${r.id}: ${r.score.toFixed(3)}`);
}

// Adjust threshold based on observed scores
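
One way to turn an observed score distribution into a threshold is to look for the largest gap between consecutive scores and cut in the middle of it. This is only a heuristic sketch, not something the library provides: `suggestThreshold` is a hypothetical helper and the scores below are made up.

```typescript
// Suggest a cutoff at the midpoint of the largest gap between
// consecutive scores (sorted descending). Returns undefined when
// there are fewer than two scores to compare.
function suggestThreshold(scores: number[]): number | undefined {
  if (scores.length < 2) return undefined;

  const sorted = scores.slice().sort((a, b) => b - a);
  let largestGap = -1;
  let cutoff: number | undefined;

  for (let i = 0; i < sorted.length - 1; i++) {
    const gap = sorted[i] - sorted[i + 1];
    if (gap > largestGap) {
      largestGap = gap;
      cutoff = (sorted[i] + sorted[i + 1]) / 2;
    }
  }
  return cutoff;
}

// Made-up scores: a relevant cluster near 0.8 and a weaker one near 0.4.
const observed = [0.81, 0.78, 0.74, 0.41, 0.38, 0.12];
const suggested = suggestThreshold(observed); // midpoint between 0.74 and 0.41
```

The heuristic works best when scores fall into visible clusters; with evenly spread scores the largest gap is arbitrary, and a manual choice is safer.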

Too Many Irrelevant Results

src/debug-many-results.ts
// Gradually increase threshold
for (const threshold of [0.3, 0.4, 0.5, 0.6, 0.7]) {
  const results = await db.search(query, { threshold, topK: 10 });
  console.log(`Threshold ${threshold}: ${results.length} results`);

  if (results.length > 0) {
    console.log(`  Top score: ${results[0].score.toFixed(3)}`);
    console.log(`  Low score: ${results[results.length - 1].score.toFixed(3)}`);
  }
}
Start with a lower threshold (0.3-0.4) and increase it if you’re getting too many irrelevant results. It’s easier to filter down than to miss relevant documents.
