
Embedding Search Tutorial

Learn how to implement semantic search using Soul Kernel’s embedding and storage systems.

Overview

This tutorial demonstrates how to:
  • Generate embeddings for text content
  • Store memories with vector embeddings
  • Perform semantic similarity search
  • Use caching for efficiency
  • Switch between embedding providers

Prerequisites

  • Rust 1.79 or later
  • Basic understanding of vector embeddings
  • (Optional) OpenAI API key for production embeddings

Setup

1. Add Dependencies

Add the required crates to your Cargo.toml:
[dependencies]
embeddings = { path = "../embeddings" }
storage = { path = "../storage" }
tokio = { version = "1.42", features = ["full"] }
anyhow = "1.0"
tracing-subscriber = "0.3"
dotenv = "0.15"

[features]
openai = ["embeddings/openai"]

2. Environment Configuration

For OpenAI embeddings, create an openai.env file:
OPENAI_API_KEY=sk-proj-your-api-key-here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_CACHE_SIZE=1000
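
The cache in Step 1 below is created with a fixed size. To honor EMBEDDING_CACHE_SIZE instead, a minimal sketch (assuming the variable parses as a usize, falling back to a default otherwise):
// Sketch: size the cache from EMBEDDING_CACHE_SIZE (defined in
// openai.env above), falling back to 1000 if the variable is
// missing or does not parse.
let cache_size: usize = std::env::var("EMBEDDING_CACHE_SIZE")
    .ok()
    .and_then(|v| v.parse().ok())
    .unwrap_or(1000);
let embedding_cache = EmbeddingCache::new(cache_size);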

Basic Implementation

Step 1: Initialize Services

use embeddings::{create_embedding_service, EmbeddingConfig, EmbeddingCache};
use storage::{HybridMemoryStore, MemoryStore, SqliteMemoryStore, QdrantMemoryStore};
use std::sync::Arc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Initialize logging
    tracing_subscriber::fmt::init();
    
    // Load environment variables
    if dotenv::from_filename("openai.env").is_ok() {
        println!("Loaded configuration from openai.env");
    }
    
    // Create embedding service
    let embedding_config = if cfg!(feature = "openai") && std::env::var("OPENAI_API_KEY").is_ok() {
        println!("Using OpenAI API for embeddings");
        EmbeddingConfig::openai(std::env::var("OPENAI_API_KEY").unwrap())
    } else {
        println!("Using mock embeddings");
        EmbeddingConfig::default()
    };
    
    let embedding_service = create_embedding_service(&embedding_config).await?;
    let embedding_cache = EmbeddingCache::new(100);
    
    // Create storage
    let sqlite = Arc::new(SqliteMemoryStore::in_memory()?);
    let vector = Arc::new(QdrantMemoryStore::new("mock://")?);
    let store = HybridMemoryStore::new(sqlite, vector);
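    // Steps 2 and 3 below continue here, inside main, before the final Ok(()).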
    
    Ok(())
}

Step 2: Store Memories with Embeddings

use storage::{MemoryEvent, MemoryMetadata};

// Sample memories to store
let memories = vec![
    ("I love pizza with extra cheese", vec!["food", "preference"]),
    ("The weather is sunny today in San Francisco", vec!["weather", "location"]),
    ("Remember to buy milk and eggs", vec!["todo", "shopping"]),
    ("My favorite programming language is Rust", vec!["tech", "preference"]),
    ("The meeting is scheduled for 3 PM tomorrow", vec!["schedule", "work"]),
];

println!("Storing memories with embeddings...");

for (content, tags) in memories {
    // Check cache first
    let embedding = if let Some(cached) = embedding_cache.get(content) {
        println!("  Using cached embedding for: {}", content);
        cached
    } else {
        // Generate embedding
        let embedding = embedding_service.generate_embedding(content).await?;
        embedding_cache.put(content, embedding.clone());
        println!("  Generated embedding for: {}", content);
        embedding
    };
    
    // Create memory event
    let event = MemoryEvent::new(
        content.to_string(),
        embedding,
        MemoryMetadata {
            source: "example".to_string(),
            tags: tags.iter().map(|s| s.to_string()).collect(),
            confidence: 0.9,
            ..Default::default()
        },
    );
    
    // Store in database
    store.insert(&event).await?;
}

Step 3: Perform Semantic Search

// Queries to test
let queries = vec![
    "What food do I like?",
    "Tell me about the weather",
    "What should I remember to do?",
    "Programming languages",
];

for query in queries {
    println!("\nQuery: '{}'", query);
    
    // Generate query embedding
    let query_embedding = embedding_service.generate_embedding(query).await?;
    
    // Search for similar memories
    let results = store.search_by_embedding(&query_embedding, 3).await?;
    
    println!("Top {} results:", results.len());
    for (i, result) in results.iter().enumerate() {
        println!("  {}. [Score: {:.3}] {}", 
            i + 1, 
            result.score, 
            result.event.content
        );
        if !result.event.metadata.tags.is_empty() {
            println!("     Tags: {}", result.event.metadata.tags.join(", "));
        }
    }
}

Advanced Features

Batch Processing

Process multiple texts efficiently:
let texts = vec![
    "First document about AI".to_string(),
    "Second document about ML".to_string(),
    "Third document about data science".to_string(),
];

// Generate embeddings in batch
let embeddings = embedding_service.generate_embeddings(&texts).await?;

// Store all at once
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
    let event = MemoryEvent::new(
        text.clone(),
        embedding.clone(),
        MemoryMetadata::default(),
    );
    store.insert(&event).await?;
}

Filtered Search

Search with specific criteria:
use storage::SearchFilter;

let filter = SearchFilter {
    tags: Some(vec!["food".to_string(), "preference".to_string()]),
    sources: Some(vec!["conversation".to_string()]),
    min_confidence: Some(0.8),
    time_range: None,
};

let results = store.search_by_embedding_filtered(
    &query_embedding, 
    5,     // top_k
    filter
).await?;

Similarity Threshold

Only return results above a certain similarity:
let results = store.search_by_embedding(&query_embedding, 10).await?;

// Filter by similarity score
let relevant_results: Vec<_> = results
    .into_iter()
    .filter(|r| r.score > 0.7)  // Only high similarity
    .collect();

Performance Tips

1. Use Caching Effectively

// Create a larger cache for production
let cache = EmbeddingCache::new(10000);

// Pre-warm cache with common queries
let common_queries = vec!["weather", "schedule", "tasks", "preferences"];
for query in common_queries {
    let embedding = embedding_service.generate_embedding(query).await?;
    cache.put(query, embedding);
}

2. Batch Operations

// Instead of this:
for text in texts {
    let embedding = service.generate_embedding(&text).await?;
    // Process...
}

// Do this:
let embeddings = service.generate_embeddings(&texts).await?;
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
    // Process...
}

3. Optimize Storage

// Use in-memory for testing
let store = SqliteMemoryStore::in_memory()?;

// Use file-based for production
let store = SqliteMemoryStore::new("memories.db")?;

// Enable WAL mode (done automatically)
// Provides better concurrent performance

Switching Providers

Development (Mock)

cargo run --bin embedding_search
Results:
  • Fast generation (~50μs)
  • Low similarity scores
  • Good for testing logic

Production (OpenAI)

cargo run --bin embedding_search --features openai
Results:
  • Slower generation (~310ms)
  • High-quality semantic matching
  • Meaningful similarity scores

Complete Example

See the full working example at:
kernel/examples/embedding_search.rs
Run it with:
cd kernel/examples
cargo run --bin embedding_search --features openai

Troubleshooting

OpenAI API Issues

  1. API Key Not Found
    Error: OPENAI_API_KEY not set
    
    Solution: Create openai.env file with your API key
  2. Rate Limiting
    Error: Rate limit exceeded
    
    Solution: Implement exponential backoff (see the sketch after this list) or reduce batch size
  3. Model Not Found
    Error: Model not found: gpt-4
    
    Solution: Use a valid embedding model such as text-embedding-3-small (chat models cannot generate embeddings)
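
For rate limits, wrapping calls in a small retry helper works well. This is a library-agnostic sketch, not an API from the embeddings crate: it retries on any error with doubling delays, since the crate's concrete error type isn't shown in this tutorial:
use std::time::Duration;

// Sketch: retry an async operation with exponential backoff.
// Delays double on each attempt: 200ms, 400ms, 800ms, ...
// This retries on *any* error; narrowing it to rate-limit errors
// depends on the error type your embedding service returns.
async fn with_backoff<T, E, F, Fut>(mut op: F, max_retries: u32) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(200);
    let mut attempts = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(_) if attempts < max_retries => {
                attempts += 1;
                tokio::time::sleep(delay).await;
                delay *= 2;
            }
            Err(err) => return Err(err),
        }
    }
}

// Usage with the tutorial's embedding service:
// let embedding = with_backoff(|| embedding_service.generate_embedding(query), 3).await?;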

Storage Issues

  1. Database Locked
    Error: database is locked
    
    Solution: Ensure only one writer at a time, or use WAL mode
  2. Embedding Dimension Mismatch
    Error: Expected 384 dimensions, got 1536
    
    Solution: Ensure consistent embedding dimensions across providers (a defensive check is sketched after this list)
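
A defensive length check catches dimension mismatches early. This sketch assumes embeddings are plain Vec<f32> vectors; the expected dimension and the anyhow-based error handling mirror the tutorial's setup:
// Validate embedding length before storing. EXPECTED_DIM depends on the
// provider: OpenAI's text-embedding-3-small returns 1536-dimensional
// vectors, while a mock provider may use a smaller size (e.g. 384).
const EXPECTED_DIM: usize = 1536;

let embedding = embedding_service.generate_embedding(content).await?;
anyhow::ensure!(
    embedding.len() == EXPECTED_DIM,
    "embedding dimension mismatch: expected {}, got {}",
    EXPECTED_DIM,
    embedding.len()
);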

Change Log

  • 2025-06-13: Initial tutorial created
  • 2025-06-13: Added complete working examples
  • 2025-06-13: Added troubleshooting section