Embedding Search Tutorial
Learn how to implement semantic search using Soul Kernel’s embedding and storage systems.
Overview
This tutorial demonstrates how to:
- Generate embeddings for text content
- Store memories with vector embeddings
- Perform semantic similarity search
- Use caching for efficiency
- Switch between embedding providers
Prerequisites
- Rust 1.79 or later
- Basic understanding of vector embeddings
- (Optional) OpenAI API key for production embeddings
Setup
1. Add Dependencies
Add the required crates to your Cargo.toml:
[dependencies]
embeddings = { path = "../embeddings" }
storage = { path = "../storage" }
tokio = { version = "1.42", features = ["full"] }
anyhow = "1.0"
tracing-subscriber = "0.3"
dotenv = "0.15"
[features]
openai = ["embeddings/openai"]
2. Environment Configuration
For OpenAI embeddings, create an openai.env file:
OPENAI_API_KEY=sk-proj-your-api-key-here
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_CACHE_SIZE=1000
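EMBEDDING_CACHE_SIZE is only a configuration value; the Step 1 code below hard-codes its cache size. If you want the cache to honor this variable, a minimal sketch (assuming EmbeddingCache::new takes a capacity, as it does in Step 1) looks like this:
// Read the cache capacity from the environment, falling back to a default.
let cache_size: usize = std::env::var("EMBEDDING_CACHE_SIZE")
    .ok()
    .and_then(|v| v.parse().ok())
    .unwrap_or(1000);
let embedding_cache = EmbeddingCache::new(cache_size);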
Basic Implementation
Step 1: Initialize Services
use embeddings::{create_embedding_service, EmbeddingConfig, EmbeddingCache};
use storage::{HybridMemoryStore, MemoryStore, SqliteMemoryStore, QdrantMemoryStore};
use std::sync::Arc;
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Initialize logging
tracing_subscriber::fmt::init();
// Load environment variables
if dotenv::from_filename("openai.env").is_ok() {
println!("Loaded configuration from openai.env");
}
// Create embedding service
let embedding_config = if cfg!(feature = "openai") && std::env::var("OPENAI_API_KEY").is_ok() {
println!("Using OpenAI API for embeddings");
EmbeddingConfig::openai(std::env::var("OPENAI_API_KEY").unwrap())
} else {
println!("Using mock embeddings");
EmbeddingConfig::default()
};
let embedding_service = create_embedding_service(&embedding_config).await?;
let embedding_cache = EmbeddingCache::new(100);
// Create storage
let sqlite = Arc::new(SqliteMemoryStore::in_memory()?);
let vector = Arc::new(QdrantMemoryStore::new("mock://")?);
let store = HybridMemoryStore::new(sqlite, vector);
Ok(())
}
Step 2: Store Memories with Embeddings
use storage::{MemoryEvent, MemoryMetadata};
// Sample memories to store
let memories = vec![
("I love pizza with extra cheese", vec!["food", "preference"]),
("The weather is sunny today in San Francisco", vec!["weather", "location"]),
("Remember to buy milk and eggs", vec!["todo", "shopping"]),
("My favorite programming language is Rust", vec!["tech", "preference"]),
("The meeting is scheduled for 3 PM tomorrow", vec!["schedule", "work"]),
];
println!("Storing memories with embeddings...");
for (content, tags) in memories {
// Check cache first
let embedding = if let Some(cached) = embedding_cache.get(content) {
println!(" Using cached embedding for: {}", content);
cached
} else {
// Generate embedding
let embedding = embedding_service.generate_embedding(content).await?;
embedding_cache.put(content, embedding.clone());
println!(" Generated embedding for: {}", content);
embedding
};
// Create memory event
let event = MemoryEvent::new(
content.to_string(),
embedding,
MemoryMetadata {
source: "example".to_string(),
tags: tags.iter().map(|s| s.to_string()).collect(),
confidence: 0.9,
..Default::default()
},
);
// Store in database
store.insert(&event).await?;
}
Step 3: Perform Semantic Search
// Queries to test
let queries = vec![
"What food do I like?",
"Tell me about the weather",
"What should I remember to do?",
"Programming languages",
];
for query in queries {
println!("\nQuery: '{}'", query);
// Generate query embedding
let query_embedding = embedding_service.generate_embedding(query).await?;
// Search for similar memories
let results = store.search_by_embedding(&query_embedding, 3).await?;
println!("Top {} results:", results.len());
for (i, result) in results.iter().enumerate() {
println!(" {}. [Score: {:.3}] {}",
i + 1,
result.score,
result.event.content
);
if !result.event.metadata.tags.is_empty() {
println!(" Tags: {}", result.event.metadata.tags.join(", "));
}
}
}
Advanced Features
Batch Processing
Process multiple texts efficiently:
let texts = vec![
"First document about AI".to_string(),
"Second document about ML".to_string(),
"Third document about data science".to_string(),
];
// Generate embeddings in batch
let embeddings = embedding_service.generate_embeddings(&texts).await?;
// Store all at once
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
let event = MemoryEvent::new(
text.clone(),
embedding.clone(),
MemoryMetadata::default(),
);
store.insert(&event).await?;
}
Filtered Search
Search with specific criteria:
use storage::SearchFilter;
let filter = SearchFilter {
tags: Some(vec!["food".to_string(), "preference".to_string()]),
sources: Some(vec!["conversation".to_string()]),
min_confidence: Some(0.8),
time_range: None,
};
let results = store.search_by_embedding_filtered(
&query_embedding,
5, // top_k
filter
).await?;
Similarity Threshold
Only return results above a certain similarity:
let results = store.search_by_embedding(&query_embedding, 10).await?;
// Filter by similarity score
let relevant_results: Vec<_> = results
.into_iter()
.filter(|r| r.score > 0.7) // Only high similarity
.collect();
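What a score means depends on the distance metric the vector store uses; cosine similarity is a common choice. Purely for intuition (this helper is not part of the storage API), cosine similarity between two embedding vectors can be computed like this:
/// Cosine similarity between two equal-length embedding vectors.
/// 1.0 means identical direction, 0.0 means unrelated, -1.0 means opposite.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}
A cutoff around 0.7 is a reasonable starting point for OpenAI embeddings; mock embeddings score much lower, so tune the threshold per provider.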
Best Practices
1. Use Caching Effectively
// Create a larger cache for production
let cache = EmbeddingCache::new(10000);
// Pre-warm cache with common queries
let common_queries = vec!["weather", "schedule", "tasks", "preferences"];
for query in common_queries {
let embedding = embedding_service.generate_embedding(query).await?;
cache.put(query, embedding);
}
2. Batch Operations
// Instead of this:
for text in texts {
let embedding = service.generate_embedding(&text).await?;
// Process...
}
// Do this:
let embeddings = service.generate_embeddings(&texts).await?;
for (text, embedding) in texts.iter().zip(embeddings.iter()) {
// Process...
}
3. Optimize Storage
// Use in-memory for testing
let store = SqliteMemoryStore::in_memory()?;
// Use file-based for production
let store = SqliteMemoryStore::new("memories.db")?;
// Enable WAL mode (done automatically)
// Provides better concurrent performance
Switching Providers
Development (Mock)
cargo run --bin embedding_search
Results:
- Fast generation (~50μs)
- Low similarity scores
- Good for testing logic
Production (OpenAI)
cargo run --bin embedding_search --features openai
Results:
- Slower generation (~310ms)
- High-quality semantic matching
- Meaningful similarity scores
Complete Example
See the full working example at:
kernel/examples/embedding_search.rs
Run it with:
cd kernel/examples
cargo run --bin embedding_search --features openai
Troubleshooting
OpenAI API Issues
- API Key Not Found
  Error: OPENAI_API_KEY not set
  Solution: Create an openai.env file containing your API key
- Rate Limiting
  Error: Rate limit exceeded
  Solution: Implement exponential backoff or reduce the batch size (see the sketch after this list)
- Model Not Found
  Error: Model not found: gpt-4
  Solution: Use a valid embedding model such as text-embedding-3-small, not a chat model
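A simple way to handle rate limits is a retry loop with exponential backoff. This is a generic sketch, not part of the embeddings crate; it wraps the same generate_embedding call used throughout the tutorial and assumes the service returns anyhow-compatible errors:
use std::time::Duration;

// Retry with exponential backoff: wait 1s, 2s, 4s, ... between attempts.
let mut delay = Duration::from_secs(1);
let embedding = loop {
    match embedding_service.generate_embedding(query).await {
        Ok(embedding) => break embedding,
        Err(e) if delay < Duration::from_secs(30) => {
            eprintln!("embedding request failed ({e}), retrying in {delay:?}");
            tokio::time::sleep(delay).await;
            delay *= 2;
        }
        Err(e) => return Err(e), // give up once the backoff window is exhausted
    }
};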
Storage Issues
- Database Locked
  Error: database is locked
  Solution: Ensure only one writer at a time, or use WAL mode
- Embedding Dimension Mismatch
  Error: Expected 384 dimensions, got 1536
  Solution: Ensure consistent embedding dimensions across providers (see the sketch after this list)
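Dimension mismatches usually come from mixing providers against the same collection, for example a 384-dimension mock or local model versus OpenAI's 1536-dimension text-embedding-3-small. A cheap guard is to check the length before inserting; this sketch assumes embeddings are plain Vec<f32>, and expected_dim is a value you choose per collection, not an existing API:
// Reject embeddings whose size does not match the collection's dimension.
let expected_dim = 1536; // whatever dimension the collection was created with
let embedding = embedding_service.generate_embedding(content).await?;
anyhow::ensure!(
    embedding.len() == expected_dim,
    "embedding dimension mismatch: expected {expected_dim}, got {}",
    embedding.len()
);
// Safe to build the MemoryEvent and insert, as in Step 2.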
Next Steps
Change Log
- 2025-06-13: Initial tutorial created
- 2025-06-13: Added complete working examples
- 2025-06-13: Added troubleshooting section