The vCon MCP server provides four search tools with different capabilities, from simple filtering to advanced semantic search.
1. search_vcons - Basic Filter Search
Best for: Finding vCons by metadata (subject, parties, dates)
Searches:
Subject
Party names, emails, phone numbers
Date ranges
Does NOT search:
Conversation content (dialog and analysis bodies) or attachments
Example:
{
  "subject": "customer support",
  "party_name": "John Doe",
  "start_date": "2024-01-01T00:00:00Z",
  "limit": 10
}
Returns: Complete vCon objects matching the filters
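A minimal sketch of assembling these filter arguments programmatically. The helper name build_search_vcons_args is hypothetical; only the four parameter names shown in the example above come from this guide, and the validation rules are assumptions.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_search_vcons_args(
    subject: Optional[str] = None,
    party_name: Optional[str] = None,
    start_date: Optional[datetime] = None,
    limit: int = 10,
) -> dict[str, Any]:
    """Assemble search_vcons arguments, omitting filters that are not set."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    args: dict[str, Any] = {"limit": limit}
    if subject:
        args["subject"] = subject
    if party_name:
        args["party_name"] = party_name
    if start_date is not None:
        if start_date.tzinfo is None:
            raise ValueError("start_date must be timezone-aware")
        # Normalize to UTC and emit the trailing "Z" form used in the example.
        utc = start_date.astimezone(timezone.utc)
        args["start_date"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return args
```

Leaving unset filters out of the dictionary keeps the request small and avoids sending empty strings that a server might treat as real filter values.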
2. search_vcons_content - Keyword Search
Best for: Finding specific words or phrases in conversation content
Searches:
✅ Dialog bodies (conversations, transcripts)
✅ Analysis bodies (summaries, sentiment, etc.)
✅ Party information (names, emails, phones)
❌ Attachments (not indexed for full-text search)
Features:
Full-text search with ranking
Typo tolerance via trigram indexing
Highlighted snippets in results
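To illustrate why trigram indexing tolerates typos, here is a small sketch of trigram similarity in the style of PostgreSQL's pg_trgm. Whether the server uses pg_trgm or these exact padding rules is an assumption, and the function names are hypothetical. Real implementations also split on punctuation; this sketch splits on whitespace only.

```python
def trigrams(text: str) -> set[str]:
    """Return the set of 3-character windows over each padded, lowercased word."""
    grams: set[str] = set()
    for word in text.lower().split():
        # Two leading spaces and one trailing space (pg_trgm-style padding).
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def trigram_similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams, in the range [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

A misspelling such as "refnud" still shares its leading trigrams with "refund", so it scores above an unrelated word even though the two strings are not equal.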
Example:
Returns: Ranked results with snippets showing where matches were found
Result format:
3. search_vcons_semantic - AI-Powered Semantic Search
Best for: Finding conversations by meaning, not just keywords
Searches:
✅ Dialog bodies (embedded)
✅ Analysis bodies with encoding='none' or NULL (embedded)
❌ Analysis with encoding='base64url' or encoding='json' (not embedded)
❌ Attachments (not embedded)
Features:
Finds conceptually similar content
Works across paraphrases and synonyms
AI embeddings using 384-dimensional vectors
Similarity threshold control
Requirements:
Embeddings must be generated first (see embedding documentation)
Currently requires a pre-computed embedding vector (384 dimensions) as input
Example:
Note: Automatic embedding generation from query text is not yet implemented. Use search_vcons_content for keyword-based search without embeddings.
Returns: Similar conversations ranked by semantic similarity
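The ranking idea can be sketched as follows. This assumes cosine similarity as the metric, which this guide does not state, and the function names and the 0.7 default threshold are hypothetical. Only the 384-dimension requirement comes from the text above.

```python
import math
from typing import Sequence

EMBEDDING_DIM = 384  # dimension stated in this guide


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError(f"Embedding must be {EMBEDDING_DIM} dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_similarity(
    query: Sequence[float],
    candidates: dict[str, Sequence[float]],
    threshold: float = 0.7,  # hypothetical default
    limit: int = 10,
) -> list[tuple[str, float]]:
    """Keep candidates at or above the threshold, most similar first."""
    scored = [(vcon_id, cosine_similarity(query, vec)) for vcon_id, vec in candidates.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:limit]
```

Raising the threshold trades recall for precision: fewer conversations come back, but each is closer in meaning to the query vector.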
4. search_vcons_hybrid - Combined Keyword + Semantic Search
Best for: Comprehensive search combining exact matches and conceptual similarity
Searches:
Everything from keyword search (subject, dialog, analysis, parties)
Everything from semantic search (embedded content)
Features:
Combines full-text and semantic search
Adjustable weighting between keyword and semantic results
Best of both worlds: exact matches + conceptual matches
Example:
Parameters:
semantic_weight: 0-1 (default 0.6)
0.0 = 100% keyword search
1.0 = 100% semantic search
0.6 = 60% semantic, 40% keyword (recommended)
Returns: Combined results with both keyword and semantic scores
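The weighting described above is consistent with a simple linear blend, sketched below. That the server combines scores exactly this way is an assumption, as is the premise that both scores are normalized to the range 0 to 1. The function name hybrid_score is hypothetical.

```python
def hybrid_score(
    keyword_score: float,
    semantic_score: float,
    semantic_weight: float = 0.6,
) -> float:
    """Blend two normalized scores; semantic_weight is the semantic share."""
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must be between 0 and 1")
    return semantic_weight * semantic_score + (1.0 - semantic_weight) * keyword_score
```

With the default weight, a result with keyword score 0.5 and semantic score 1.0 blends to 0.6 * 1.0 + 0.4 * 0.5 = 0.8.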
What About Attachments?
Attachments are NOT indexed for search in the current implementation.
Why?
Binary content: Many attachments contain binary data (PDFs, images, audio) that isn't suitable for text-based search
Encoding: Attachments with encoding='base64url' contain encoded data, not searchable text
Structured data: Attachments with encoding='json' contain structured data that produces poor quality embeddings
Exception: attachments of type 'tags' with encoding='json' ARE used for filtering, but not for content search.
Example tags attachment:
These tags can be used with the tags parameter in any search tool:
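A minimal sketch of how tag filtering could work. It assumes the attachment body is a JSON array of "key:value" strings, which is an assumption about the layout rather than something this guide confirms. The function names and the sample tags in the usage note are hypothetical.

```python
import json
from typing import Any


def parse_tags(attachment: dict[str, Any]) -> dict[str, str]:
    """Turn a tags attachment into a key/value dictionary (assumed layout)."""
    if attachment.get("type") != "tags" or attachment.get("encoding") != "json":
        return {}
    body = attachment.get("body")
    # The body may arrive as a JSON string or as an already-decoded list.
    items = json.loads(body) if isinstance(body, str) else (body or [])
    tags: dict[str, str] = {}
    for item in items:
        key, sep, value = str(item).partition(":")
        if sep:  # skip entries that are not in key:value form
            tags[key] = value
    return tags


def matches_tags(attachment: dict[str, Any], wanted: dict[str, str]) -> bool:
    """True when every requested tag is present with the requested value."""
    tags = parse_tags(attachment)
    return all(tags.get(key) == value for key, value in wanted.items())
```

For example, an attachment whose body decodes to ["department:sales", "priority:high"] would match a filter of {"priority": "high"} but not {"priority": "low"}.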
Future Enhancements
Potential future support for attachment content search:
Text extraction: Extract text from PDFs, Word docs, etc.
Audio transcription: Transcribe audio attachments to searchable text
OCR: Extract text from images
Selective indexing: Index only attachments with text content
If you need to search attachment content, consider:
Extracting text and adding it as an analysis element
Adding a summary of attachment content as an analysis
Using attachment metadata in tags
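The first workaround can be sketched as below: wrap already-extracted text in an analysis element with encoding 'none' so that it qualifies for both keyword and semantic search. The function name, the default type "attachment_text", and the vendor value are hypothetical placeholders, and the text extraction itself (PDF parsing, OCR, transcription) is outside this sketch.

```python
from typing import Any, Optional


def text_to_analysis(
    text: str,
    analysis_type: str = "attachment_text",  # hypothetical type label
    vendor: str = "example-extractor",       # hypothetical vendor name
    dialog_index: Optional[int] = None,
) -> dict[str, Any]:
    """Build a plain-text analysis element from extracted attachment text."""
    body = text.strip()
    if not body:
        raise ValueError("extracted text is empty")
    analysis: dict[str, Any] = {
        "type": analysis_type,
        "vendor": vendor,
        "body": body,
        "encoding": "none",  # plain text, so it is eligible for embedding
    }
    if dialog_index is not None:
        analysis["dialog"] = dialog_index
    return analysis
```

Storing the text with encoding 'none' rather than 'json' matters here, because only plain-text analysis is embedded for semantic search.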
Analysis Encoding and Search
Analysis Elements ARE Searchable
Analysis elements are included in search, with filtering based on encoding:
Encoding | Keyword Search | Semantic Search | Notes
none (or NULL) | ✅ | ✅ | Plain text content, ideal for search
json | ✅ | ❌ | Included in keyword search only
base64url | ✅ | ❌ | Included in keyword search only
Why Filter Semantic Search by Encoding?
Analysis with encoding='none' contains human-readable text like:
Sentiment analysis results
Natural language insights
These are ideal for semantic search because they contain meaningful natural language.
Analysis with encoding='json' or encoding='base64url' typically contains:
Structured data (poor quality embeddings)
Binary content (not suitable for embeddings)
Encoded data (not searchable as text)
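The encoding rule above amounts to a simple predicate, sketched here. The function names are hypothetical, and the sketch models only the rule as stated in this guide (encoding 'none' or NULL is embedded), not the server's actual indexing code.

```python
from typing import Any, Iterable


def is_embedded_for_semantic_search(analysis: dict[str, Any]) -> bool:
    """True for analysis with encoding 'none' or no encoding at all (NULL)."""
    encoding = analysis.get("encoding")
    return encoding is None or encoding == "none"


def embeddable_bodies(analyses: Iterable[dict[str, Any]]) -> list[str]:
    """Collect the text bodies that would be eligible for embedding."""
    return [
        str(item["body"])
        for item in analyses
        if is_embedded_for_semantic_search(item) and item.get("body")
    ]
```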
Search Comparison
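Feature | search_vcons | search_vcons_content | search_vcons_semantic | search_vcons_hybrid
Match type | Metadata filters | Keywords (full-text) | Meaning (embeddings) | Keywords + meaning
Dialog bodies | ❌ | ✅ | ✅ | ✅
Analysis bodies | ❌ | ✅ | ✅ (encoding='none' or NULL only) | ✅
Attachments | ❌ | ❌ | ❌ | ❌
Requires embeddings | No | No | Yes | Yes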
search_vcons: Quick metadata lookups
"Show me vCons from last week"
"List vCons with subject containing 'urgent'"
search_vcons_content: Keyword-based content search
"Find conversations mentioning 'refund'"
"Search for 'technical support' in dialog"
"Find analysis containing 'positive sentiment'"
search_vcons_semantic: Concept-based search
"Find conversations where customer was unhappy"
"Show me calls about payment issues"
"Find similar conversations to this one"
search_vcons_hybrid: Comprehensive search
"Find all billing-related conversations" (gets both exact matches and related topics)
"Search for customer complaints" (finds variations and synonyms)
Best when you want both precision and recall
Use filters: Date ranges and tags can dramatically reduce search scope
Set appropriate limits: Start with smaller limits (10-20) for faster results
Choose the right tool: Don't use semantic search if keyword search is sufficient
Pre-generate embeddings: Semantic search requires embeddings to be generated beforehand
Generating Embeddings
For semantic and hybrid search to work effectively, you need to generate embeddings for your vCons.
See the following guides:
Quick start:
Troubleshooting
"No results found" for content search
Check that the content exists in dialog or analysis
Try a simpler query (fewer words)
Use wildcards or partial words
"Embedding generation not yet implemented"
Semantic search currently requires pre-computed embeddings
Use search_vcons_content for keyword search instead
Generate embeddings using the scripts in /scripts/
"Embedding must be 384 dimensions"
The system uses 384-dimensional embeddings
If you're providing embeddings, ensure they match this dimension
Use text-embedding-3-small with dimensions=384 (OpenAI)
Or use sentence-transformers/all-MiniLM-L6-v2 (Hugging Face)
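If you supply your own vectors, a client-side check along these lines can catch dimension mismatches before a request is sent. The function name validate_embedding is hypothetical, and the finite-value check is an extra assumption. Only the 384-dimension requirement comes from this guide.

```python
import math
from typing import Sequence

EXPECTED_DIM = 384  # dimension required by the system per this guide


def validate_embedding(vector: Sequence[float]) -> list[float]:
    """Return the vector as a list of floats, or raise if it is unusable."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(
            f"Embedding must be {EXPECTED_DIM} dimensions, got {len(vector)}"
        )
    values = [float(x) for x in vector]
    if not all(math.isfinite(x) for x in values):
        raise ValueError("Embedding contains NaN or infinite values")
    return values
```

A common cause of this error is a model whose default output size is larger than 384, which is why the dimensions setting above matters for text-embedding-3-small.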
Poor search results
For keyword search: Try simpler, more specific terms
For semantic search: Ensure embeddings are up to date
For hybrid search: Adjust semantic_weight parameter
Consider using tags to filter results
Find customer complaints in dialog
Find high-priority sales conversations
Hybrid search with keyword emphasis
Find conversations similar to a specific vCon
1. Get the vCon's embedding from the database.
2. Pass that embedding to search_vcons_semantic as the query vector.