Embedding Search Tutorial

Learn how to implement semantic search using Soul Kernel’s embedding and storage systems.

Overview

This tutorial demonstrates how to:
  • Generate embeddings for text content
  • Store memories with vector embeddings
  • Perform semantic similarity search
  • Use caching for efficiency
  • Switch between embedding providers

Prerequisites

  • Rust 1.79 or later
  • Basic understanding of vector embeddings
  • (Optional) OpenAI API key for production embeddings

Setup

1. Add Dependencies

Add the required crates to your Cargo.toml:
[dependencies]
embeddings = { path = "../embeddings" }
storage = { path = "../storage" }
tokio = { version = "1.42", features = ["full"] }
anyhow = "1.0"
tracing-subscriber = "0.3"
dotenv = "0.15"

[features]
openai = ["embeddings/openai"]

2. Environment Configuration

For OpenAI embeddings, create an openai.env file:
OPENAI_API_KEY=sk-proj-your-api-key-here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_CACHE_SIZE=1000
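
If you prefer the cache size to come from EMBEDDING_CACHE_SIZE rather than a hard-coded literal (the tutorial code below uses 100), a minimal sketch follows; it assumes EmbeddingCache::new accepts the capacity as an integer, as the later examples suggest, and belongs after the services from the next section are in scope:
// Read EMBEDDING_CACHE_SIZE, falling back to 1000 if unset or invalid.
let cache_size: usize = std::env::var("EMBEDDING_CACHE_SIZE")
    .ok()
    .and_then(|v| v.parse().ok())
    .unwrap_or(1000);
let embedding_cache = EmbeddingCache::new(cache_size);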

Basic Implementation

Step 1: Initialize Services

use embeddings::{create_embedding_service, EmbeddingConfig, EmbeddingCache};
use storage::{HybridMemoryStore, MemoryStore, SqliteMemoryStore, QdrantMemoryStore};
use std::sync::Arc;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Initialize logging
    tracing_subscriber::fmt::init();
    
    // Load environment variables
    if dotenv::from_filename("openai.env").is_ok() {
        println!("Loaded configuration from openai.env");
    }
    
    // Create embedding service
    let embedding_config = if cfg!(feature = "openai") && std::env::var("OPENAI_API_KEY").is_ok() {
        println!("Using OpenAI API for embeddings");
        EmbeddingConfig::openai(std::env::var("OPENAI_API_KEY").unwrap())
    } else {
        println!("Using mock embeddings");
        EmbeddingConfig::default()
    };
    
    let embedding_service = create_embedding_service(&embedding_config).await?;
    let embedding_cache = EmbeddingCache::new(100);
    
    // Create storage
    let sqlite = Arc::new(SqliteMemoryStore::in_memory()?);
    let vector = Arc::new(QdrantMemoryStore::new("mock://")?);
    let store = HybridMemoryStore::new(sqlite, vector);
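    // The later steps of this tutorial run here, reusing embedding_service,
    // embedding_cache, and store.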
    
    Ok(())
}

Step 2: Store Memories with Embeddings

use storage::{MemoryEvent, MemoryMetadata};

// Sample memories to store
let memories = vec![
    ("I love pizza with extra cheese", vec!["food", "preference"]),
    ("The weather is sunny today in San Francisco", vec!["weather", "location"]),
    ("Remember to buy milk and eggs", vec!["todo", "shopping"]),
    ("My favorite programming language is Rust", vec!["tech", "preference"]),
    ("The meeting is scheduled for 3 PM tomorrow", vec!["schedule", "work"]),
];

println!("Storing memories with embeddings...");

for (content, tags) in memories {
    // Check cache first
    let embedding = if let Some(cached) = embedding_cache.get(content) {
        println!("  Using cached embedding for: {}", content);
        cached
    } else {
        // Generate embedding
        let embedding = embedding_service.generate_embedding(content).await?;
        embedding_cache.put(content, embedding.clone());
        println!("  Generated embedding for: {}", content);
        embedding
    };
    
    // Create memory event
    let event = MemoryEvent::new(
        content.to_string(),
        embedding,
        MemoryMetadata {
            source: "example".to_string(),
            tags: tags.iter().map(|s| s.to_string()).collect(),
            confidence: 0.9,
            ..Default::default()
        },
    );
    
    // Store in database
    store.insert(&event).await?;
}

Step 3: Perform Semantic Search

// Queries to test
let queries = vec![
    "What food do I like?",
    "Tell me about the weather",
    "What should I remember to do?",
    "Programming languages",
];

for query in queries {
    println!("\nQuery: '{}'", query);
    
    // Generate query embedding
    let query_embedding = embedding_service.generate_embedding(query).await?;
    
    // Search for similar memories
    let results = store.search_by_embedding(&query_embedding, 3).await?;
    
    println!("Top {} results:", results.len());
    for (i, result) in results.iter().enumerate() {
        println!("  {}. [Score: {:.3}] {}", 
            i + 1, 
            result.score, 
            result.event.content
        );
        if !result.event.metadata.tags.is_empty() {
            println!("     Tags: {}", result.event.metadata.tags.join(", "));
        }
    }
}

Advanced Features

Batch Processing

Process multiple texts efficiently:
let texts = vec![
    "First document about AI".to_string(),
    "Second document about ML".to_string(),
    "Third document about data science".to_string(),
];

// Generate embeddings in batch
let embeddings = embedding_service.generate_embeddings(&texts).await?;

// Store all at once
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
    let event = MemoryEvent::new(
        text.clone(),
        embedding.clone(),
        MemoryMetadata::default(),
    );
    store.insert(&event).await?;
}

Filtered Search

Search with specific criteria:
use storage::SearchFilter;

let filter = SearchFilter {
    tags: Some(vec!["food".to_string(), "preference".to_string()]),
    sources: Some(vec!["conversation".to_string()]),
    min_confidence: Some(0.8),
    time_range: None,
};

let results = store.search_by_embedding_filtered(
    &query_embedding, 
    5,     // top_k
    filter
).await?;

Similarity Threshold

Only return results above a certain similarity:
let results = store.search_by_embedding(&query_embedding, 10).await?;

// Filter by similarity score
let relevant_results: Vec<_> = results
    .into_iter()
    .filter(|r| r.score > 0.7)  // Only high similarity
    .collect();

Performance Tips

1. Use Caching Effectively

// Create a larger cache for production
let cache = EmbeddingCache::new(10000);

// Pre-warm cache with common queries
let common_queries = vec!["weather", "schedule", "tasks", "preferences"];
for query in common_queries {
    let embedding = embedding_service.generate_embedding(query).await?;
    cache.put(query, embedding);
}

2. Batch Operations

// Instead of this:
for text in texts {
    let embedding = service.generate_embedding(&text).await?;
    // Process...
}

// Do this:
let embeddings = service.generate_embeddings(&texts).await?;
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
    // Process...
}

3. Optimize Storage

// Use in-memory for testing
let store = SqliteMemoryStore::in_memory()?;

// Use file-based for production
let store = SqliteMemoryStore::new("memories.db")?;

// Enable WAL mode (done automatically)
// Provides better concurrent performance

Switching Providers

Development (Mock)

cargo run --bin embedding_search
Results:
  • Fast generation (~50μs)
  • Low similarity scores
  • Good for testing logic

Production (OpenAI)

cargo run --bin embedding_search --features openai
Results:
  • Slower generation (~310ms)
  • High-quality semantic matching
  • Meaningful similarity scores

Complete Example

See the full working example at:
kernel/examples/embedding_search.rs
Run it with:
cd kernel/examples
cargo run --bin embedding_search --features openai

Troubleshooting

OpenAI API Issues

  1. API Key Not Found
    Error: OPENAI_API_KEY not set
    
    Solution: Create openai.env file with your API key
  2. Rate Limiting
    Error: Rate limit exceeded
    
    Solution: Implement exponential backoff (a retry sketch follows this list) or reduce the batch size
  3. Model Not Found
    Error: Model not found: gpt-4
    
    Solution: Use a valid embedding model (not chat models)
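
Rate limits are usually transient, so a simple mitigation is to retry the call with exponential backoff. The sketch below wraps the generate_embedding call from the tutorial inside the async main shown earlier; the retry limits, and the assumption that any error may be transient, are illustrative choices rather than part of the embeddings API:
use std::time::Duration;

// Retry with exponential backoff: wait 1s, 2s, 4s, 8s between attempts.
// Treats every failure as potentially transient; tighten this if your
// error type lets you distinguish rate limits from hard failures.
let mut delay = Duration::from_secs(1);
let mut attempts = 0;
let embedding = loop {
    match embedding_service.generate_embedding(query).await {
        Ok(embedding) => break embedding,
        Err(err) if attempts < 4 => {
            attempts += 1;
            println!("  Embedding failed ({err}), retrying in {delay:?}");
            tokio::time::sleep(delay).await;
            delay *= 2;
        }
        // Give up after the final attempt and propagate the error.
        Err(err) => return Err(err.into()),
    }
};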

Storage Issues

  1. Database Locked
    Error: database is locked
    
    Solution: Ensure only one writer at a time, or use WAL mode
  2. Embedding Dimension Mismatch
    Error: Expected 384 dimensions, got 1536
    
    Solution: Ensure consistent embedding dimensions across providers (a quick length check is sketched below)
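
Assuming embeddings are plain Vec<f32> values, as the tutorial code suggests, a quick guard before inserting catches a provider switch early. The 1536 below is an illustrative value matching text-embedding-3-small; use whatever dimension your store was created with:
// Expected vector length for the provider this store was built for.
const EXPECTED_DIM: usize = 1536;

let embedding = embedding_service.generate_embedding(content).await?;
anyhow::ensure!(
    embedding.len() == EXPECTED_DIM,
    "embedding dimension mismatch: expected {}, got {}",
    EXPECTED_DIM,
    embedding.len()
);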

Next Steps

Change Log

  • 2025-06-13: Initial tutorial created
  • 2025-06-13: Added complete working examples
  • 2025-06-13: Added troubleshooting section