Live Agentic Framework Architecture
Stateful Orchestration for Real-Time Meeting Intelligence
End-to-End Architecture Comparison
Two competing approaches to live meeting intelligence: the Stateless Observer (current) vs. the Stateful Orchestrator (proposed).
The "Live Agentic Framework" (Stateful): The "Moat"
Architecture Flow
ASR → Flink (Meeting Memory) → Orchestrator (LLM) → Agent Registry → Webex Client
The Orchestrator is the single consumer; adding a new agent just requires registering it in the registry. The data flow doesn't change.
Key Architecture Properties
- "Compute-to-Data" Model: The Orchestrator (LLM) sits directly next to the Meeting Memory (Flink). It reasons over the entire context in-memory without making network calls to a database.
- Single Source of Truth: The "Agentic Meeting Memory" handles ASR corrections natively. If the ASR changes "Site-level" to "Org-level," the memory updates instantly before the Orchestrator sees it.
- O(1) Complexity: Adding 50 new agents adds zero additional load to the database or the ASR stream. The Orchestrator filters intents once and dispatches only when necessary.
- Real-Time Latency: < 2 seconds end-to-end, designed for real-time intervention during live meetings.
The Agentic Meeting Memory + Orchestrator operate as a unified in-memory reasoning engine.
Live Agent Registry allows adding new agents with zero infrastructure changes.
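The registry pattern can be sketched in a few lines of Python. This is an illustrative minimal version, not the PoC's actual API: `register_agent`, `dispatch`, and the intent/agent names are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical sketch of the Live Agent Registry: the Orchestrator
# filters intents once, then dispatches only to registered handlers.
AGENT_REGISTRY: Dict[str, Callable[[dict], str]] = {}

def register_agent(intent: str):
    """Register a handler for an intent; nothing else in the data flow changes."""
    def decorator(fn: Callable[[dict], str]):
        AGENT_REGISTRY[intent] = fn
        return fn
    return decorator

@register_agent("poll_suggestion")
def poll_agent(context: dict) -> str:
    # Illustrative agent: turns a detected debate into a poll suggestion.
    return f"Suggest poll: {context['topic']}"

def dispatch(detected_intents: List[str], context: dict) -> List[str]:
    # Constant work per detected intent: adding 50 more agents adds
    # no load to the ASR stream or any database.
    return [AGENT_REGISTRY[i](context) for i in detected_intents if i in AGENT_REGISTRY]
```

Registering a new agent is one decorator; the Orchestrator's dispatch loop is untouched.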
The "Observer" Architecture (Stateless): The "Trap"
Architecture Flow
ASR → AI Bridge → DB → Intent Detection → Polling Agent → DB (Read) → DB (Write) → Client
Every new agent requires its own detection logic, its own database read to get context, and its own database write.
Critical Problems
- "Thundering Herd" Problem: Every agent (Polling, Jira, Scheduling) must independently query the "Transcript DB" for context. If you have 10 agents, you have 10x the database load for every sentence spoken.
- Race Conditions: The "Polling Agent" reads from the DB while the "AI Bridge" is writing to it. If an ASR correction happens (e.g., "Wait, don't launch the poll"), the Agent might read the old, wrong text and launch it anyway.
- High Latency: The multi-hop chain (Bridge → API → Agent → DB) introduces 10-15 seconds of lag, making "real-time" intervention impossible.
- Siloed Logic: The "Polling Intent detection" is hardcoded. Adding a "Jira Agent" requires building a whole new detection pipeline, duplicating effort.
Multiple round-trips to Media Backend/DB on every agent decision.
Polling Agent cannot see ASR corrections, so it may act on stale/wrong data.
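The load asymmetry is easy to quantify with a toy simulation (illustrative Python, not PoC code): in the observer model, transcript-DB reads grow multiplicatively with agent count, because each agent fetches its own context for every sentence.

```python
class CountingDB:
    """Toy stand-in for the Transcript DB that counts read traffic."""
    def __init__(self):
        self.reads = 0

    def fetch_context(self, meeting_id: str) -> str:
        self.reads += 1
        return ""  # context payload omitted in this sketch

def stateless_pipeline(db: CountingDB, num_agents: int, num_sentences: int) -> None:
    # Observer model: every agent independently re-reads the DB per sentence.
    for _ in range(num_sentences):
        for _ in range(num_agents):
            db.fetch_context("m1")

db = CountingDB()
stateless_pipeline(db, num_agents=10, num_sentences=100)
print(db.reads)  # 10 agents x 100 sentences = 1000 reads
```

The stateful orchestrator performs zero such reads: context lives in Flink keyed state, so agent count never touches the database.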
Architecture Comparison
This PoC: Live Implementation
This demo implements the stateful architecture end-to-end:
┌──────────────┐     ┌───────────────┐     ┌─────────────────────────────────────┐
│   Browser    │────▶│ Google Cloud  │────▶│            Apache Flink             │
│  Transcript  │     │    Pub/Sub    │     │  ┌───────────────────────────────┐  │
│  Generator   │     │ (Message Bus) │     │  │     StatefulNoteProcessor     │  │
└──────────────┘     └───────────────┘     │  │  • keyed by meeting_id        │  │
                                           │  │  • stores last N chunks       │  │
                                           │  │  • maintains topic/intent     │  │
                                           │  │  • manages MoM state          │  │
                                           │  └───────────────┬───────────────┘  │
                                           │                  │                  │
                                           │          ┌───────▼───────┐          │
                                           │          │   LLM Proxy   │          │
                                           │          │  (GPT-4/etc)  │          │
                                           │          └───────┬───────┘          │
                                           └──────────────────┼──────────────────┘
                                                              │
                                                      ┌───────▼───────┐
                                                      │  Centrifugo   │
                                                      │  (WebSocket)  │
                                                      └───────┬───────┘
                                                              │
                                                      ┌───────▼───────┐
                                                      │    Browser    │
                                                      │   (This UI)   │
                                                      └───────────────┘
Transcript Generator
Browser + Node.js
- Simulates live ASR stream
- Uploads VTT/Webex transcripts
- Configurable playback speed
Pub/Sub
Google Cloud Pub/Sub
- Fully managed message bus
- At-least-once delivery
- Decouples producers from consumers
Flink Orchestrator
Apache Flink (Java)
- Keyed state per meeting_id
- In-memory context window
- Manages MoM, intents, topics
- Calls LLM for analysis
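The keyed-state pattern can be sketched in plain Python. This is a simplified stand-in for the PoC's Java Flink operator: the class name mirrors the diagram, but the fields and window size here are illustrative.

```python
from collections import defaultdict, deque

class StatefulNoteProcessor:
    """Plain-Python analogue of the Flink operator: all context is keyed
    by meeting_id and held in memory, so no DB round-trip per chunk.
    (Sketch only; the PoC implements this as a keyed Flink operator in Java.)"""

    def __init__(self, window_size: int = 5):
        # Keyed state: one bounded context window per meeting_id.
        self.windows = defaultdict(lambda: deque(maxlen=window_size))

    def process(self, meeting_id: str, chunk: str) -> dict:
        window = self.windows[meeting_id]
        window.append(chunk)  # stores the last N transcript chunks
        # In the PoC, this context is handed to the LLM proxy for analysis.
        return {"meeting_id": meeting_id, "context_window": list(window)}
```

Each meeting's state is isolated by key, which is what lets Flink parallelize meetings while keeping every lookup in-memory.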
LLM Proxy
Python Flask + GPT-4
- Configurable model selection
- Prompt management
- Token tracking & cost calc
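Token tracking and cost calculation can be as simple as the sketch below. The per-1K prices and the ~4-characters-per-token heuristic are illustrative placeholders, not real GPT-4 rates; in practice the proxy would read exact counts from the API's usage field.

```python
# Hypothetical pricing table; values are placeholders, not real rates.
PRICE_PER_1K = {"gpt-4": {"prompt": 0.03, "completion": 0.06}}

def estimate_cost(model: str, prompt: str, completion: str) -> float:
    """Rough cost estimate using a ~4 chars/token heuristic."""
    rates = PRICE_PER_1K[model]
    prompt_tokens = len(prompt) // 4
    completion_tokens = len(completion) // 4
    return (prompt_tokens * rates["prompt"]
            + completion_tokens * rates["completion"]) / 1000
```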
Centrifugo
WebSocket Server
- Per-meeting channels
- Real-time push to clients
- Sub-100ms delivery
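Pushing a result to a per-meeting channel is one call to Centrifugo's HTTP server API. The sketch below assumes a v5-style `/api/publish` endpoint with an `X-API-Key` header (older Centrifugo versions use a different endpoint shape), and the `meeting:<id>` channel naming is this sketch's convention, not necessarily the PoC's.

```python
import json
import urllib.request

def build_publish_request(base_url: str, api_key: str,
                          meeting_id: str, payload: dict) -> urllib.request.Request:
    """Build a Centrifugo server-API publish call for a per-meeting channel.
    Assumes the v5-style /api/publish endpoint; adjust for your version."""
    body = json.dumps({"channel": f"meeting:{meeting_id}", "data": payload}).encode()
    headers = {"Content-Type": "application/json", "X-API-Key": api_key}
    return urllib.request.Request(f"{base_url}/api/publish",
                                  data=body, headers=headers, method="POST")
```

Sending the request (`urllib.request.urlopen(req)`) delivers the payload to every browser subscribed to that meeting's channel.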
Observed Latency (This PoC)
LLM inference dominates. With faster models (SLMs), the total could drop to <500ms.
Intent Detection Capabilities
The orchestrator detects intents in real-time via a configurable Intent Registry:
Poll Suggestion
Detects debates/votes and suggests creating a poll
Scheduling
Identifies follow-up meeting requests
Knowledge Fetch
Detects questions about past decisions
Action Items
Captures commitments with owners/deadlines
Decisions
Logs key decisions made during meeting
Open Questions
Tracks unresolved questions for follow-up
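A configurable registry of this kind can be a plain data structure rendered into the LLM prompt. The field names and renderer below are an illustrative sketch, not the PoC's actual schema:

```python
# Hypothetical Intent Registry entries mirroring the six intents above;
# field names are illustrative, not the PoC's actual schema.
INTENT_REGISTRY = [
    {"intent": "poll_suggestion", "description": "Debate or vote detected; suggest creating a poll"},
    {"intent": "scheduling", "description": "Follow-up meeting requested"},
    {"intent": "knowledge_fetch", "description": "Question about a past decision"},
    {"intent": "action_item", "description": "Commitment with an owner/deadline"},
    {"intent": "decision", "description": "Key decision made during the meeting"},
    {"intent": "open_question", "description": "Unresolved question to follow up"},
]

def registry_prompt_section() -> str:
    """Render the registry as a bullet list for the LLM system prompt."""
    return "\n".join(f"- {e['intent']}: {e['description']}" for e in INTENT_REGISTRY)
```

Adding a new detectable intent is then a one-entry edit to the registry rather than a new detection pipeline.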
Reference Documents
Tech Stack
Prompt Editor & Tester
Edit prompts and test them with sample data to see the LLM response
Global LLM Model
Applies to ALL requests: this model is used for live meeting processing and all API calls. Changes are saved to the server.
Prompt Architecture
Using a single prompt with all 5 steps. Best for prompt refinement.
System Prompt
User Prompt Template
{previous_summary}, {previous_topics}, {previousIntents},
{context_window}, {current_chunk}
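Those placeholders are Python-style format fields, so filling the template is a plain `str.format` call. The surrounding label text in this sketch is illustrative (the PoC's actual template wording is not shown here); note the template mixes snake_case and camelCase names (`{previousIntents}`), which is reproduced as-is.

```python
# Illustrative user-prompt template; labels are this sketch's wording,
# but the placeholder names match the ones listed above.
TEMPLATE = (
    "Previous summary: {previous_summary}\n"
    "Previous topics: {previous_topics}\n"
    "Previous intents: {previousIntents}\n"
    "Context window: {context_window}\n"
    "Current chunk: {current_chunk}"
)

def render_user_prompt(**fields) -> str:
    """Fill the template; raises KeyError if any placeholder is missing."""
    return TEMPLATE.format(**fields)
```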
Test Sample Data
Edit sample data matching the prompt placeholders, then click "Test Prompt"
Click "Preview Request" or "Test Prompt" to see the JSON request body
LLM Response
Click "Test Prompt" to see the LLM response