YouLab is built as a layered architecture where each component has a single responsibility:

┌─────────────────────────────────────────────────────────────┐
│                          OpenWebUI                          │
│                      (Chat Interface)                       │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                       Pipeline (Pipe)                       │
│  • Extract user_id, chat_id from OpenWebUI                  │
│  • Ensure agent exists for user                             │
│  • Stream SSE responses back to UI                          │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                HTTP Service (FastAPI :8100)                 │
│  ┌──────────────────┐  ┌──────────────────┐                 │
│  │   AgentManager   │  │ StrategyManager  │                 │
│  │    (per-user)    │  │ (singleton RAG)  │                 │
│  └────────┬─────────┘  └────────┬─────────┘                 │
│           │                     │                           │
│           └──────────┬──────────┘                           │
│                      │                                      │
│           ┌──────────┴──────────┐                           │
│           │    HonchoClient     │                           │
│           │  (message persist)  │                           │
│           └──────────┬──────────┘                           │
└──────────────────────┼──────────────────────────────────────┘
          ┌────────────┴────────────┐
          │                         │
          ▼                         ▼
┌─────────────────────┐   ┌─────────────────────┐
│    Letta Server     │   │   Honcho Service    │
│       (:8283)       │   │     (ToM Layer)     │
│ • Agent lifecycle   │   │ • Message store     │
│ • Core memory       │   │ • Session mgmt      │
│ • Archival memory   │   │ • Peer tracking     │
│ • Tool execution    │   │                     │
└─────────┬───────────┘   └─────────────────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────────┐
│                         Claude API                          │
│                 (via OpenAI compatibility)                  │
└─────────────────────────────────────────────────────────────┘

OpenWebUI is the chat frontend that users interact with. It provides:

  • Chat interface with message history
  • User authentication and sessions
  • Pipe extension system for custom backends
  • Chat title management

The Pipeline (Pipe) is the bridge between OpenWebUI and the HTTP service:

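| Responsibility          | Implementation                                     |
|-------------------------|----------------------------------------------------|
| User context extraction | __user__["id"], __user__["name"]                   |
| Chat context            | __metadata__["chat_id"], Chats.get_chat_by_id()    |
| Agent provisioning      | POST /agents on first message                      |
| Response streaming      | SSE via httpx-sse                                  |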

Location: src/youlab_server/pipelines/letta_pipe.py
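
The context-extraction rows above can be sketched as a small helper. This is a minimal illustration rather than the code in letta_pipe.py: `ChatContext` and `extract_context` are hypothetical names, and the sketch assumes only what the table states, namely that OpenWebUI hands the pipe a `__user__` dict with `id` and `name` keys and a `__metadata__` dict with a `chat_id` key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChatContext:
    """Identifiers the pipe forwards to the HTTP service (hypothetical type)."""

    user_id: str
    user_name: Optional[str]
    chat_id: Optional[str]


def extract_context(user: dict, metadata: Optional[dict] = None) -> ChatContext:
    """Pull user and chat identifiers out of the dicts passed to the pipe.

    `user` stands in for OpenWebUI's __user__ argument and `metadata`
    for __metadata__; both are treated as plain dictionaries here.
    """
    if not user.get("id"):
        # Without a user ID there is no per-user agent to route to.
        raise ValueError("__user__ is missing an 'id'")
    return ChatContext(
        user_id=user["id"],
        user_name=user.get("name"),
        chat_id=(metadata or {}).get("chat_id"),
    )
```

Failing fast on a missing user ID keeps the later provisioning step (POST /agents) from creating an agent that belongs to nobody; a missing chat ID is tolerated because it is only needed for chat-level context.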

The HTTP service is a FastAPI application providing RESTful endpoints:

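| Domain     | Endpoints             | Manager               |
|------------|-----------------------|-----------------------|
| Agent CRUD | /agents, /agents/{id} | AgentManager          |
| Chat       | /chat, /chat/stream   | AgentManager          |
| Strategy   | /strategy/*           | StrategyManager       |
| Background | /background/*         | BackgroundAgentRunner |
| Health     | /health               | -                     |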

Location: src/youlab_server/server/

AgentManager manages per-user Letta agents:

# Agent naming convention
user_id, agent_type = "user123", "tutor"  # example values
agent_name = f"youlab_{user_id}_{agent_type}"
# Example: youlab_user123_tutor

# Cache structure: (user_id, agent_type) -> agent_id
cache: dict[tuple[str, str], str] = {}

Key Features:

  • Lazy agent creation from templates
  • Agent caching for fast lookups
  • Cache rebuild on service startup
  • Streaming with Letta metadata stripping
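
The first three features above (lazy creation, caching, and cache rebuild on startup) can be sketched together. This is a simplified illustration under stated assumptions, not the actual AgentManager: `AgentCache`, `get_or_create`, `rebuild`, and the injected `create_agent` callable are hypothetical names, and the real Letta client calls are replaced by that callable so the example runs on its own. It also assumes agent types contain no underscore, so a name can be split from the right.

```python
from typing import Callable

PREFIX = "youlab_"


def agent_name(user_id: str, agent_type: str) -> str:
    """Apply the naming convention youlab_{user_id}_{agent_type}."""
    return f"{PREFIX}{user_id}_{agent_type}"


class AgentCache:
    """Maps (user_id, agent_type) to an agent ID, creating agents lazily."""

    def __init__(self, create_agent: Callable[[str], str]) -> None:
        # create_agent takes an agent name and returns the new agent's ID;
        # it stands in for whatever call provisions an agent from a template.
        self._create_agent = create_agent
        self._cache: dict[tuple[str, str], str] = {}

    def get_or_create(self, user_id: str, agent_type: str = "tutor") -> str:
        key = (user_id, agent_type)
        if key not in self._cache:
            # Lazy creation: only the first lookup for a user pays the cost.
            self._cache[key] = self._create_agent(agent_name(user_id, agent_type))
        return self._cache[key]

    def rebuild(self, existing: dict[str, str]) -> None:
        """Repopulate the cache from a name -> agent_id listing at startup."""
        self._cache.clear()
        for name, agent_id in existing.items():
            if not name.startswith(PREFIX):
                continue  # not a per-user agent, e.g. a shared one
            # Split from the right so user IDs may contain underscores.
            user_id, sep, agent_type = name[len(PREFIX):].rpartition("_")
            if sep and user_id:
                self._cache[(user_id, agent_type)] = agent_id
```

Because the cache key is recoverable from the agent name, nothing besides the agent list has to be persisted: after a restart the service can list existing agents once and call `rebuild`.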

StrategyManager maintains a singleton RAG agent for project knowledge:

# Single shared agent
AGENT_NAME = "YouLab-Support"

# Persona instructs archival search
"CRITICAL: Before answering ANY question about YouLab:
1. Use archival_memory_search to find relevant documentation"

Use Cases:

  • Upload project documentation
  • Query project knowledge
  • Search archival memory

Letta Server is the underlying agent framework:

| Feature         | Purpose                           |
|-----------------|-----------------------------------|
| Core Memory     | Persona + Human blocks in context |
| Archival Memory | Vector-indexed long-term storage  |
| Tool System     | Function calling for agents       |
| Streaming       | Real-time response generation     |

A single chat message flows through the stack as follows:

User types "Help me brainstorm essay topics"
         │
         ▼
┌─────────────────┐
│    OpenWebUI    │  1. Captures message
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Pipe.pipe()   │  2. Extracts user_id="user123"
│                 │  3. Calls _ensure_agent_exists()
│                 │  4. POSTs to /chat/stream
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  HTTP Service   │  5. Validates agent exists
│                 │  6. Calls stream_message()
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Letta Server   │  7. Loads agent with memory
│                 │  8. Generates response via Claude
│                 │  9. Streams chunks back
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    Pipeline     │  10. Transforms to OpenWebUI events
│                 │  11. Emits via __event_emitter__
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    OpenWebUI    │  12. Displays streaming response
└─────────────────┘
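
Steps 10 and 11 of the flow above can be sketched as a relay that turns SSE lines into UI events. This is an illustration under assumptions, not the pipe's real code: the chunk shape `{"type": "message", "content": ...}` on the wire is hypothetical, `parse_sse_data` and `relay` are made-up names, and the emitted `{"type": "message", "data": {"content": ...}}` shape is assumed to be what OpenWebUI's `__event_emitter__` accepts. The sketch reads from any iterable of lines so it runs without a server or httpx-sse.

```python
import asyncio
import json
from typing import Awaitable, Callable, Iterable, Optional

Emitter = Callable[[dict], Awaitable[None]]


def parse_sse_data(line: str) -> Optional[dict]:
    """Decode one SSE line; return None for anything that is not JSON data."""
    if not line.startswith("data:"):
        return None  # blank separators, comments, event/id fields
    payload = line[len("data:"):].strip()
    if not payload or payload == "[DONE]":
        return None  # assumed end-of-stream sentinel
    return json.loads(payload)


async def relay(lines: Iterable[str], emit: Emitter) -> int:
    """Forward assistant text chunks to the UI and drop everything else."""
    sent = 0
    for line in lines:
        chunk = parse_sse_data(line)
        if chunk is None or chunk.get("type") != "message":
            continue  # metadata chunks are stripped rather than displayed
        await emit({"type": "message", "data": {"content": chunk["content"]}})
        sent += 1
    return sent


async def _demo() -> list:
    seen = []

    async def emit(event: dict) -> None:
        seen.append(event)

    await relay(['data: {"type": "message", "content": "Hi"}', "data: [DONE]"], emit)
    return seen


if __name__ == "__main__":
    print(asyncio.run(_demo()))
```

Filtering by chunk type in one place is what lets the UI show only assistant text while the service remains free to send extra bookkeeping chunks on the same stream.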

Each agent's memory is split between core memory, which stays in context, and archival memory, which receives content rotated out of the core blocks:

┌─────────────────────────────────────────┐
│               Core Memory               │
│  ┌─────────────┐  ┌─────────────┐       │
│  │   Persona   │  │    Human    │       │
│  │    Block    │  │    Block    │       │
│  │             │  │             │       │
│  │ [IDENTITY]  │  │ [USER]      │       │
│  │ [CAPS]      │  │ [TASK]      │       │
│  │ [STYLE]     │  │ [CONTEXT]   │       │
│  └─────────────┘  └──────┬──────┘       │
│                          │              │
│             Rotation when >80% capacity │
│                          │              │
└──────────────────────────┼──────────────┘
                           │
                           ▼
┌─────────────────────────────────────────┐
│             Archival Memory             │
│                                         │
│  [ARCHIVED 2025-12-31T10:00:00]         │
│  Previous context notes and tasks       │
│                                         │
│  [TASK COMPLETED 2025-12-30T15:00:00]   │
│  Brainstormed 5 essay topics            │
│                                         │
└─────────────────────────────────────────┘
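
The rotation rule in the diagram (move content to archival memory once a block passes 80% of its capacity) can be sketched as follows. This is a simplified illustration, not YouLab's rotation strategy: `rotate_if_needed` is a hypothetical name, capacity is assumed to be measured in characters, the whole context is assumed to rotate at once, and archival memory is modeled as a plain list instead of Letta's vector store.

```python
from datetime import datetime, timezone
from typing import Optional


def rotate_if_needed(
    context: str,
    archival: list,
    limit: int,
    threshold: float = 0.8,
    now: Optional[datetime] = None,
) -> str:
    """Return the context to keep in core memory, archiving it when too full.

    context   -- current text of the block's context section
    archival  -- list standing in for archival memory; entries are appended
    limit     -- block capacity in characters (assumed unit)
    threshold -- fraction of the limit above which rotation happens
    now       -- timestamp override, mainly for deterministic tests
    """
    if len(context) / limit <= threshold:
        return context  # still within budget, nothing to do
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%S")
    # Tag the entry in the style shown in the archival memory box.
    archival.append(f"[ARCHIVED {stamp}]\n{context}")
    return ""  # core memory starts over with an empty context
```

Rotating before the block is completely full leaves headroom for the next update, so a write never has to be rejected for lack of space.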

src/youlab_server/
├── agents/  # Agent creation and management
│   ├── base.py  # BaseAgent class (deprecated)
│   ├── default.py  # Factory functions (deprecated)
│   └── templates.py  # AgentTemplate (deprecated)
├── background/  # Background agent system
│   └── runner.py  # BackgroundAgentRunner execution engine
├── config/  # Configuration
│   └── settings.py  # Settings, ServiceSettings
├── curriculum/  # Curriculum system
│   ├── schema.py  # Full Pydantic schemas (CourseConfig, etc.)
│   ├── loader.py  # TOML loading and caching
│   └── blocks.py  # Dynamic memory block generation
├── honcho/  # Message persistence + dialectic
│   ├── __init__.py  # Exports HonchoClient
│   └── client.py  # HonchoClient, query_dialectic
├── memory/  # Memory system
│   ├── blocks.py  # PersonaBlock, HumanBlock (deprecated)
│   ├── manager.py  # MemoryManager (deprecated)
│   ├── strategies.py  # Rotation strategies (deprecated)
│   └── enricher.py  # MemoryEnricher for external updates
├── observability/  # Logging and tracing
│   ├── logging.py  # Structured logging
│   ├── metrics.py  # LLMMetrics
│   └── tracing.py  # Tracer context manager
├── pipelines/  # OpenWebUI integration
│   └── letta_pipe.py  # Pipe class
├── server/  # HTTP service
│   ├── main.py  # FastAPI app
│   ├── agents.py  # AgentManager
│   ├── background.py  # Background agent endpoints
│   ├── curriculum.py  # Curriculum endpoints
│   ├── schemas.py  # Request/response models
│   ├── tracing.py  # Langfuse integration
│   └── strategy/  # Strategy agent subsystem
│       ├── manager.py  # StrategyManager
│       ├── router.py  # FastAPI router
│       └── schemas.py  # Strategy schemas
├── tools/  # Agent tools
│   ├── dialectic.py  # query_honcho tool
│   └── memory.py  # edit_memory_block tool
└── main.py  # CLI entry point

config/
└── courses/  # TOML course configurations
    ├── default/  # Default agent configuration
    │   └── course.toml
    └── college-essay/  # College essay course
        ├── course.toml
        └── modules/
            ├── 01-self-discovery.toml
            ├── 02-topic-development.toml
            └── 03-drafting.toml

Letta provides:

  • Persistent memory - Core and archival memory that survives sessions
  • Structured memory blocks - Type-safe memory with validation
  • Tool system - Agents can call functions
  • Streaming - Real-time response generation

Placing an HTTP service between the pipeline and Letta provides:

  • Decoupling - Pipeline doesn’t directly depend on Letta SDK
  • Testability - HTTP endpoints are easy to test
  • Flexibility - Multiple clients can use the service
  • Observability - Centralized tracing and logging

Each student gets their own agent with:

  • Personal context (name, preferences, facts)
  • Session history
  • Progress tracking
  • Isolated memory

The StrategyManager's single shared RAG agent handles:

  • Project documentation
  • FAQ responses
  • Developer queries
  • Knowledge that doesn’t belong to a user