Overview
Community detection groups related entities into clusters based on their relationship patterns. Communities help answer global questions that require understanding broad themes rather than specific facts. For example:
- “What are the main topics in this knowledge base?”
- “Tell me about the AI research ecosystem”
- “Summarize the key companies in fintech”
How Community Detection Works
After the entity graph is built, communities are detected over it.
lib/arcana/graph/community_detector/leiden.ex:48
Leiden Algorithm
What is Leiden?
The Leiden algorithm is an improvement over the popular Louvain algorithm for community detection. It:
- Optimizes modularity - Maximizes connections within communities and minimizes connections between them
- Guarantees connectivity - All entities in a community are reachable from each other
- Produces hierarchies - Multiple levels from fine-grained to coarse clusters
- Scales efficiently - Handles large graphs (10,000+ nodes) quickly
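To make "optimizes modularity" concrete, the sketch below computes the classic modularity score Q for a fixed assignment of nodes to communities. It is illustrative only: the module name and the `{node_a, node_b, weight}` edge shape are assumptions, not part of Arcana or leidenfold, and the other objectives (such as `:cpm`) use different quality functions.

```elixir
defmodule ModularitySketch do
  @moduledoc "Illustrative only; not part of Arcana or leidenfold."

  # edges: list of {node_a, node_b, weight} for an undirected graph.
  # membership: map of node => community id.
  # Assumes at least one edge with positive weight.
  def modularity(edges, membership) do
    total = edges |> Enum.map(fn {_a, _b, w} -> w end) |> Enum.sum()

    # Weighted degree of each node.
    degree =
      Enum.reduce(edges, %{}, fn {a, b, w}, acc ->
        acc
        |> Map.update(a, w, &(&1 + w))
        |> Map.update(b, w, &(&1 + w))
      end)

    # Total weight of edges whose endpoints share a community.
    internal =
      Enum.reduce(edges, %{}, fn {a, b, w}, acc ->
        if membership[a] == membership[b] do
          Map.update(acc, membership[a], w, &(&1 + w))
        else
          acc
        end
      end)

    # Sum of node degrees per community.
    community_degree =
      Enum.reduce(degree, %{}, fn {node, d}, acc ->
        Map.update(acc, membership[node], d, &(&1 + d))
      end)

    # Q = sum over communities of internal/m - (degree / 2m)^2
    Enum.reduce(community_degree, 0.0, fn {c, d}, q ->
      q + Map.get(internal, c, 0) / total - :math.pow(d / (2 * total), 2)
    end)
  end
end
```

For two triangles joined by a single bridge edge, placing each triangle in its own community scores higher than placing all six nodes in one community, which is the behaviour a modularity-maximizing detector exploits.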
Installation
Community detection requires the leidenfold library (Rust NIF):
- macOS (Apple Silicon)
- Linux (x86_64, ARM64)
lib/arcana/graph/community_detector/leiden.ex:11
Configuration
Options
:resolution (float, default: 1.0)
- Controls community granularity
- Higher values → smaller, more focused communities
- Lower values → larger, broader communities
- Typical range: 0.5 - 2.0
:objective (atom, default: :cpm)
- Quality function to optimize
- Options:
  - :cpm - Constant Potts Model (recommended)
  - :modularity - Classic modularity measure
  - :rber - Reichardt-Bornholdt with Erdős-Rényi null model
  - :rbc - Reichardt-Bornholdt with configuration null model
  - :significance - Statistical significance
  - :surprise - Surprise measure
:iterations (integer, default: 2)
- Number of optimization passes
- More iterations = better quality, longer runtime
- Typical range: 1-5
:min_size (integer, default: 1)
- Minimum entities per community
- Set to 2 to exclude singleton communities
- Set to 3+ for more substantial clusters
:max_level (integer, default: 1)
- Maximum hierarchy levels
- Level 0 = finest granularity
- Higher levels = coarser aggregations
- Typical range: 1-5
:seed (integer, default: 0)
- Random seed for reproducibility
- 0 = random seed each run
- Set specific value for deterministic results
lib/arcana/graph/community_detector/leiden.ex:30
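The options above can be gathered into a keyword list with the listed defaults. The following is a minimal sketch of merging caller options over those defaults and rejecting unknown keys or objectives. The module `DetectorOptionsSketch` and its `resolve/1` function are hypothetical helpers for illustration; how Arcana itself reads these options should be checked against the source file referenced above.

```elixir
defmodule DetectorOptionsSketch do
  @moduledoc "Hypothetical helper; not part of Arcana."

  # Defaults as listed in the options above.
  @defaults [
    resolution: 1.0,
    objective: :cpm,
    iterations: 2,
    min_size: 1,
    max_level: 1,
    seed: 0
  ]

  @objectives [:cpm, :modularity, :rber, :rbc, :significance, :surprise]

  # Merge caller options over the defaults.
  # Raises ArgumentError on an unknown key or an unknown objective.
  def resolve(opts) do
    opts = Keyword.validate!(opts, @defaults)
    objective = Keyword.fetch!(opts, :objective)

    unless objective in @objectives do
      raise ArgumentError, "unknown objective: #{inspect(objective)}"
    end

    opts
  end
end
```

For example, `DetectorOptionsSketch.resolve(resolution: 1.5, min_size: 2)` keeps the remaining defaults, asking for smaller communities while excluding singletons.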
Community Summarization
LLM Summarizer (Default)
Generates natural-language summaries of communities using an LLM. Configuration:
lib/arcana/graph/community_summarizer.ex:84
Summary Format
Good summaries (2-5 sentences) should:
- Identify the theme - What domain/topic does this community represent?
- Name key entities - Who/what are the most important members?
- Describe relationships - How are entities connected?
- Provide context - Why is this community significant?
Summary Regeneration
Communities are marked “dirty” when they are modified, which signals that they need re-summarization.
lib/arcana/graph/community_summarizer.ex:140
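The dirty-flag idea can be sketched as follows. This is not the code at the referenced line: the `:summary` and `:dirty` map keys, the module name, and the `refresh/2` helper are assumptions for illustration, and the real check may apply additional criteria.

```elixir
defmodule DirtySummarySketch do
  @moduledoc "Illustrative only; field names are assumptions."

  # A community needs a new summary if it has none yet
  # or was modified since the last summary was written.
  def needs_regeneration?(%{summary: nil}), do: true
  def needs_regeneration?(%{dirty: true}), do: true
  def needs_regeneration?(_community), do: false

  # Re-summarize only the communities that need it.
  # `summarize_fun` stands in for the (expensive) LLM call.
  def refresh(communities, summarize_fun) do
    Enum.map(communities, fn community ->
      if needs_regeneration?(community) do
        Map.merge(community, %{summary: summarize_fun.(community), dirty: false})
      else
        community
      end
    end)
  end
end
```

Skipping clean communities keeps the number of LLM calls proportional to what changed rather than to the size of the graph.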
Custom Detectors
Implement the Arcana.Graph.CommunityDetector behaviour:
lib/arcana/graph/community_detector.ex:76
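As a rough sketch of what a custom detector could look like, the example below groups entities by connected components. To stay self-contained it defines a stand-in behaviour, `DetectorBehaviourSketch`; the callback name `detect/3`, its arguments, the `:source_id`/`:target_id` keys, and the returned `%{level: ..., entity_ids: ...}` shape are all assumptions, so check the actual callbacks in the file referenced above before adapting it.

```elixir
defmodule DetectorBehaviourSketch do
  # Stand-in for the real behaviour; callback name and types are assumptions.
  @callback detect(entities :: [map()], relationships :: [map()], opts :: keyword()) ::
              {:ok, [map()]} | {:error, term()}
end

defmodule ConnectedComponentsDetector do
  @behaviour DetectorBehaviourSketch

  @impl true
  def detect(entities, relationships, opts) do
    min_size = Keyword.get(opts, :min_size, 1)

    # Undirected adjacency: entity id => set of neighbour ids.
    adjacency =
      Enum.reduce(relationships, Map.new(entities, &{&1.id, MapSet.new()}), fn rel, acc ->
        acc
        |> Map.update(rel.source_id, MapSet.new([rel.target_id]), &MapSet.put(&1, rel.target_id))
        |> Map.update(rel.target_id, MapSet.new([rel.source_id]), &MapSet.put(&1, rel.source_id))
      end)

    # Start a walk from every id not already assigned to a component.
    {components, _seen} =
      Enum.reduce(Map.keys(adjacency), {[], MapSet.new()}, fn id, {found, seen} ->
        if MapSet.member?(seen, id) do
          {found, seen}
        else
          component = reach(adjacency, [id], MapSet.new())
          {[component | found], MapSet.union(seen, component)}
        end
      end)

    communities =
      components
      |> Enum.filter(fn component -> MapSet.size(component) >= min_size end)
      |> Enum.map(fn component ->
        %{level: 0, entity_ids: component |> MapSet.to_list() |> Enum.sort()}
      end)

    {:ok, communities}
  end

  # Depth-first walk collecting every id reachable from the frontier.
  defp reach(_adjacency, [], visited), do: visited

  defp reach(adjacency, [id | rest], visited) do
    if MapSet.member?(visited, id) do
      reach(adjacency, rest, visited)
    else
      neighbours = adjacency |> Map.get(id, MapSet.new()) |> MapSet.to_list()
      reach(adjacency, neighbours ++ rest, MapSet.put(visited, id))
    end
  end
end
```

Connected components are a much coarser grouping than Leiden produces, but they make a convenient baseline for testing a pipeline without the native dependency.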
Custom Summarizers
Implement the Arcana.Graph.CommunitySummarizer behaviour:
lib/arcana/graph/community_summarizer.ex:74
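A custom summarizer does not have to call an LLM. The sketch below builds a summary from a fixed template, which can be useful in tests. As with any stand-in, `SummarizerBehaviourSketch`, the `summarize/3` callback, the `:name` key, and the `:max_names` option are assumptions made for illustration; the real callback signature is defined in the file referenced above.

```elixir
defmodule SummarizerBehaviourSketch do
  # Stand-in for the real behaviour; callback name and types are assumptions.
  @callback summarize(entities :: [map()], relationships :: [map()], opts :: keyword()) ::
              {:ok, String.t()} | {:error, term()}
end

defmodule TemplateSummarizer do
  @behaviour SummarizerBehaviourSketch

  @impl true
  def summarize([], _relationships, _opts), do: {:error, :empty_community}

  def summarize(entities, relationships, opts) do
    # Name only the first few entities to keep the summary short.
    limit = Keyword.get(opts, :max_names, 3)
    names = entities |> Enum.map(& &1.name) |> Enum.take(limit) |> Enum.join(", ")

    summary =
      "Community of #{length(entities)} entities, including #{names}, " <>
        "linked by #{length(relationships)} relationships."

    {:ok, summary}
  end
end
```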
Real Examples from Source
Example 1: Leiden Detection
From lib/arcana/graph/community_detector/leiden.ex:50:
Example 2: Edge Conversion
From lib/arcana/graph/community_detector/leiden.ex:123:
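The source listing is not reproduced here. As an independent sketch of the general idea (not the code at that location), graph libraries typically expect integer node indices, so entity ids are mapped to indices and each relationship becomes a `{source, target, weight}` tuple. The `:source_id`, `:target_id`, and `:strength` keys and the module name are assumptions.

```elixir
defmodule EdgeConversionSketch do
  @moduledoc "Illustrative only; not the Arcana implementation."

  # Map each entity id to a zero-based integer index, then express every
  # relationship as a {source_index, target_index, weight} tuple.
  def to_edges(entities, relationships) do
    index = entities |> Enum.map(& &1.id) |> Enum.with_index() |> Map.new()

    edges =
      for rel <- relationships,
          # Relationships that point at an unknown entity are skipped.
          {:ok, source} <- [Map.fetch(index, rel.source_id)],
          {:ok, target} <- [Map.fetch(index, rel.target_id)] do
        # Weight defaults to 1.0 when no strength is recorded.
        {source, target, Map.get(rel, :strength, 1) * 1.0}
      end

    {index, edges}
  end
end
```

Returning the index map alongside the edges lets the caller translate the detector's integer results back into entity ids.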
Example 3: Hierarchy Formatting
From lib/arcana/graph/community_detector/leiden.ex:138:
Example 4: Needs Regeneration Check
From lib/arcana/graph/community_summarizer.ex:142:
Using Communities in Search
Communities enable global queries that need broad context.
lib/arcana/graph/graph_query.ex:166
Performance Considerations
Leiden Detection:
- Small graphs (< 100 nodes): ~10-50ms
- Medium graphs (100-1000 nodes): ~50-500ms
- Large graphs (1000-10000 nodes): ~500-5000ms
- Scales approximately O(n log n)
Memory Usage:
- ~1-5MB per 1000 nodes
- Edge-weighted graphs use more memory
Optimization Tips:
- Run detection asynchronously during ingest
- Cache community assignments
- Regenerate summaries only when needs_regeneration? is true
- Use a higher min_size to reduce the number of communities
- Limit max_level to reduce hierarchy depth
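Running detection asynchronously and caching the assignments can be sketched with the standard library alone. In this minimal example `detect_fun` stands in for whatever detection call the application makes, and the module and function names are hypothetical; a production setup would more likely run the task under a supervisor.

```elixir
defmodule AsyncDetectionSketch do
  @moduledoc "Illustrative only; names are hypothetical."

  # In-memory cache of community assignments, keyed by graph id.
  def start_cache, do: Agent.start_link(fn -> %{} end)

  # Run detection in a separate process so ingest is not blocked,
  # and store the result in the cache when it finishes.
  def detect_async(cache, graph_id, detect_fun) do
    Task.async(fn ->
      communities = detect_fun.()
      Agent.update(cache, &Map.put(&1, graph_id, communities))
      communities
    end)
  end

  # Returns the cached communities, or nil if detection has not completed.
  def cached(cache, graph_id), do: Agent.get(cache, &Map.get(&1, graph_id))
end
```

The caller can continue ingesting and later use `Task.await/1` on the returned task, or simply read from the cache once it has been populated.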
Next Steps
- Search - Use communities in graph search
- Relationships - Communities are built from relationships
- GraphRAG Overview - Understand the full pipeline