Embeddings System
The Embeddings system provides vector generation capabilities for Soul Kernel’s semantic search and memory indexing.
Overview
The embeddings system enables:
- Semantic understanding of text content
- Vector similarity search in memory graph
- Multiple embedding providers (Mock, OpenAI, Local)
- Efficient caching to reduce API costs
- Rust 1.79 compatibility
Architecture
Provider Architecture
The embeddings system uses a trait-based design implemented in the embeddings crate:
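As a hedged sketch of what the trait-based design might look like (the actual trait and type names in the embeddings crate are assumptions here), a provider exposes its dimensionality and a text-to-vector method, and the mock provider derives deterministic output from a hash of the input:

```rust
// Hypothetical sketch; real names in the embeddings crate may differ.
pub trait EmbeddingProvider {
    /// Dimensionality of the vectors this provider produces.
    fn dimensions(&self) -> usize;
    /// Generate an embedding for a single piece of text.
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Mock provider: deterministic output derived from a hash of the input,
/// matching the behaviour described under "Supported Providers".
pub struct MockProvider;

impl EmbeddingProvider for MockProvider {
    fn dimensions(&self) -> usize {
        384
    }

    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        use std::collections::hash_map::DefaultHasher;
        use std::hash::{Hash, Hasher};

        let mut hasher = DefaultHasher::new();
        text.hash(&mut hasher);
        let mut state = hasher.finish() | 1; // avoid the all-zero state
        Ok((0..self.dimensions())
            .map(|_| {
                // xorshift64: cheap, deterministic pseudo-random expansion
                state ^= state << 13;
                state ^= state >> 7;
                state ^= state << 17;
                (state as f32 / u64::MAX as f32) * 2.0 - 1.0
            })
            .collect())
    }
}
```

The trait boundary is what makes the OpenAI and planned Local providers interchangeable with the mock at call sites.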
Supported Providers
- Mock Provider - For testing and development
  - 384-dimensional vectors
  - Deterministic output based on text hash
  - ~50μs generation time
- OpenAI Provider - Production-quality embeddings
  - Supports all OpenAI embedding models
  - 1536-dim (text-embedding-3-small) or 3072-dim (text-embedding-3-large)
  - Batch processing optimization
  - ~310ms single, ~40ms per embedding in batch
- Local Provider - Planned for offline use
  - Candle-based local models
  - Privacy-preserving
  - No API costs
Implementation
Crate Structure
Configuration
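A possible shape for the configuration types is sketched below; the variant names, fields, and defaults are all assumptions, not the crate's actual API:

```rust
// Hypothetical configuration sketch; field and variant names are assumptions.
pub enum ProviderKind {
    Mock,
    OpenAI { model: String },
    Local,
}

pub struct EmbeddingConfig {
    pub provider: ProviderKind,
    /// Number of entries the LRU cache may hold (illustrative default).
    pub cache_capacity: usize,
}

impl Default for EmbeddingConfig {
    fn default() -> Self {
        EmbeddingConfig {
            provider: ProviderKind::Mock, // safe default: no API key needed
            cache_capacity: 1024,
        }
    }
}
```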
Caching Layer
LRU cache implementation for cost optimization:
Key Features
1. Provider Abstraction
Easy switching between providers:
2. Batch Processing
Efficient batch operations with OpenAI:
3. Error Handling
Comprehensive error types:
4. Conditional Compilation
Feature flags for optional dependencies:
Usage Examples
Basic Embedding Generation
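A minimal sketch of generating an embedding and comparing two vectors with cosine similarity (the metric behind the similarity figures quoted later). The function names here are illustrative, not the crate's real API:

```rust
// Deterministic stand-in for a provider call; real code would go through
// an EmbeddingProvider. Names are assumptions.
fn mock_embed(text: &str, dims: usize) -> Vec<f32> {
    let bytes = text.as_bytes();
    (0..dims)
        .map(|i| {
            // Derive each component from the input bytes so the same text
            // always yields the same vector.
            let b = if bytes.is_empty() { 0 } else { bytes[i % bytes.len()] };
            (b as f32 / 255.0) * 2.0 - 1.0
        })
        .collect()
}

/// Cosine similarity: 1.0 for identical directions, ~0.0 for unrelated ones.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}
```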
With Caching
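The caching layer can be sketched as a capacity-bounded memo cache in front of the embed call; this simplified version evicts FIFO rather than true LRU, and all names are assumptions:

```rust
use std::collections::{HashMap, VecDeque};

// Simplified cache sketch; the real implementation is LRU. Names assumed.
struct EmbeddingCache {
    capacity: usize,
    map: HashMap<String, Vec<f32>>,
    order: VecDeque<String>,
}

impl EmbeddingCache {
    fn new(capacity: usize) -> Self {
        EmbeddingCache { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Return a cached embedding, or compute and store one, so repeated
    /// texts never trigger a second (billable) provider call.
    fn get_or_embed<F>(&mut self, text: &str, embed: F) -> Vec<f32>
    where
        F: FnOnce(&str) -> Vec<f32>,
    {
        if let Some(v) = self.map.get(text) {
            return v.clone();
        }
        let v = embed(text);
        if self.map.len() >= self.capacity {
            // Evict the oldest entry (FIFO; true LRU would also refresh
            // entries on access).
            if let Some(old) = self.order.pop_front() {
                self.map.remove(&old);
            }
        }
        self.order.push_back(text.to_string());
        self.map.insert(text.to_string(), v.clone());
        v
    }
}
```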
Integration with Storage
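One way the integration might look, with embeddings attached to memory-graph nodes and queried by cosine similarity; the node type and function names below are hypothetical, not Soul Kernel's actual storage API:

```rust
// Hypothetical memory-graph node carrying its embedding. Names assumed.
struct MemoryNode {
    id: u64,
    text: String,
    embedding: Vec<f32>,
}

/// Return the stored node whose embedding is most similar to the query.
fn most_similar<'a>(nodes: &'a [MemoryNode], query: &[f32]) -> Option<&'a MemoryNode> {
    fn cos(a: &[f32], b: &[f32]) -> f32 {
        let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
        if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
    }
    nodes.iter().max_by(|x, y| {
        cos(&x.embedding, query)
            .partial_cmp(&cos(&y.embedding, query))
            .unwrap()
    })
}
```

A real deployment would replace the linear scan with an index once the graph grows large.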
OpenAI Setup
Environment Configuration
- Create an openai.env file:
- Add your API key:
- Run with feature flag:
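The steps above might look like the following; the environment variable name and the `openai` cargo feature name are assumptions:

```shell
# Create openai.env and add your API key (the key value is a placeholder)
echo 'OPENAI_API_KEY=sk-your-key-here' > openai.env

# Then run with the OpenAI feature enabled (assumed feature name):
#   cargo run --features openai
```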
Supported Models
- text-embedding-3-small - 1536 dimensions, fastest
- text-embedding-3-large - 3072 dimensions, highest quality
- text-embedding-ada-002 - 1536 dimensions, legacy
Performance Metrics
Generation Speed
- Mock: ~50μs per embedding
- OpenAI Single: ~310ms per embedding
- OpenAI Batch: ~40ms per embedding (in batches)
Semantic Quality
- Mock similarity for related texts: ~0.05
- OpenAI similarity for related texts: ~0.55
Memory Usage
- 384-dim mock: ~1.5KB per embedding
- 1536-dim OpenAI: ~6KB per embedding
- 3072-dim OpenAI: ~12KB per embedding
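The figures above follow from 4 bytes per f32 component (e.g. 384 × 4 = 1536 bytes ≈ 1.5KB), which a one-line helper makes explicit:

```rust
/// Raw size of an embedding's vector data: one f32 (4 bytes) per dimension.
fn embedding_bytes(dims: usize) -> usize {
    dims * std::mem::size_of::<f32>()
}
```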
Best Practices
- Use Caching - Avoid redundant API calls
- Batch When Possible - Up to 100 texts per request
- Handle Rate Limits - Implement exponential backoff
- Choose Right Model - Balance quality vs cost
- Normalize Inputs - Clean text before embedding
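The rate-limit practice above can be sketched as a generic retry helper with exponential backoff; the retry count and 100ms base delay are illustrative choices, not values from the crate:

```rust
use std::time::Duration;

// Exponential-backoff sketch for rate-limited provider calls.
fn retry_with_backoff<T, E, F>(mut op: F, max_retries: u32) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut attempt = 0;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if attempt >= max_retries => return Err(e),
            Err(_) => {
                // Double the delay on each failure: 100ms, 200ms, 400ms, ...
                let delay = Duration::from_millis(100 * 2u64.pow(attempt));
                std::thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

Production code would usually add jitter and honour any Retry-After hint from the API.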
Security Considerations
- API keys stored in environment variables
- Never commit openai.env to version control
- Use mock provider for public demos
- Consider local models for sensitive data
Future Enhancements
- Local model support with Candle
- Embedding versioning and migration
- Query expansion techniques
- Multi-modal embeddings
- Custom fine-tuned models
Next Steps
- See Memory Graph for storage integration
- Read API Reference for detailed docs
- Try Embedding Search Tutorial
- Explore examples/ for working code
Change Log
- 2025-06-13: Initial embeddings system documentation
- 2025-06-13: Added OpenAI provider implementation details
- 2025-06-13: Documented caching and performance metrics