Speech-to-Text
Convert player voice to text in real time using Whisper.cpp. Supports multiple languages, voice activity detection, and push-to-talk.
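As an illustration of what voice activity detection involves, here is a minimal energy-based sketch. It is not the plugin's API: the class name EnergyVad, the RMS threshold, and the hangover frame count are all assumptions made for this example, and a production detector may use a different technique entirely.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical energy-based voice activity detector (illustration only,
// not the plugin's actual VAD). A frame counts as speech when its RMS
// level reaches the threshold; a short "hangover" keeps the detector
// active across brief pauses between words.
class EnergyVad {
public:
    EnergyVad(float rmsThreshold, int hangoverFrames)
        : threshold_(rmsThreshold), hangover_(hangoverFrames) {
        assert(rmsThreshold > 0.0f && hangoverFrames >= 0);
    }

    // Feed one frame of mono float samples in [-1, 1].
    // Returns true while speech is considered active.
    bool ProcessFrame(const std::vector<float>& frame) {
        if (Rms(frame) >= threshold_) {
            remaining_ = hangover_;  // loud frame: (re)arm the hangover
            active_ = true;
        } else if (active_) {
            if (remaining_ > 0) {
                --remaining_;        // quiet frame inside the hangover window
            } else {
                active_ = false;     // hangover used up: speech has ended
            }
        }
        return active_;
    }

    // Root-mean-square level of a frame; 0 for an empty frame.
    static float Rms(const std::vector<float>& frame) {
        if (frame.empty()) return 0.0f;
        double sum = 0.0;
        for (float s : frame) sum += static_cast<double>(s) * s;
        return static_cast<float>(std::sqrt(sum / frame.size()));
    }

private:
    float threshold_;
    int hangover_;
    int remaining_ = 0;
    bool active_ = false;
};
```

In a sketch like this, captured audio would be forwarded to the transcriber only while ProcessFrame returns true, so silence is never transcribed.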
Large Language Models
Power NPC dialogue with Phi-3-mini via llama.cpp. Dynamic conversations, streaming responses, and context management.
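Context management usually means keeping a conversation inside the model's context window. The sketch below shows one common approach: keep the system prompt and drop the oldest turns once a token budget is exceeded. Everything here is hypothetical (ConversationContext, ChatTurn, ApproxTokens); the word-count tokenizer is a crude stand-in for a real tokenizer, and the prompt layout is a generic one, not the Phi-3 chat template.

```cpp
#include <cstddef>
#include <deque>
#include <sstream>
#include <string>
#include <utility>

// One line of dialogue. The role is free-form, e.g. "player" or "npc".
struct ChatTurn {
    std::string role;
    std::string text;
};

// Crude stand-in for a tokenizer: counts whitespace-separated words.
// A real integration would ask the model's tokenizer instead.
inline std::size_t ApproxTokens(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    std::size_t count = 0;
    while (in >> word) ++count;
    return count;
}

// Hypothetical sliding-window context: the system prompt is always kept,
// and the oldest turns are dropped once the budget is exceeded.
class ConversationContext {
public:
    ConversationContext(std::string systemPrompt, std::size_t tokenBudget)
        : system_(std::move(systemPrompt)),
          systemTokens_(ApproxTokens(system_)),
          budget_(tokenBudget) {}

    void Add(const std::string& role, const std::string& text) {
        turns_.push_back({role, text});
        used_ += ApproxTokens(text);
        Trim();
    }

    // Flattens the retained context into a single prompt string.
    std::string BuildPrompt() const {
        std::string prompt = "system: " + system_ + "\n";
        for (const ChatTurn& turn : turns_) {
            prompt += turn.role + ": " + turn.text + "\n";
        }
        return prompt;
    }

    std::size_t TurnCount() const { return turns_.size(); }

private:
    // Drops the oldest turns while over budget, always keeping the newest.
    void Trim() {
        while (turns_.size() > 1 && systemTokens_ + used_ > budget_) {
            used_ -= ApproxTokens(turns_.front().text);
            turns_.pop_front();
        }
    }

    std::string system_;
    std::size_t systemTokens_;
    std::size_t budget_;
    std::size_t used_ = 0;
    std::deque<ChatTurn> turns_;
};
```

Dropping whole turns from the front is the simplest policy; summarizing old turns instead is a common alternative when long-term memory matters to the NPC.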
Audio Capture
Robust microphone capture built on a consumer pattern. Includes push-to-talk, voice activity detection, and efficient ring buffers.
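To make the ring buffer and consumer pattern concrete, here is a minimal sketch in which one producer writes samples and each registered consumer reads at its own pace. AudioRingBuffer and its methods are hypothetical names, not the plugin's types, and the sketch is single-threaded; a real capture path would need atomics or a lock between the audio thread and its consumers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical ring buffer with independent read cursors per consumer
// (illustration only, not thread-safe).
class AudioRingBuffer {
public:
    explicit AudioRingBuffer(std::size_t capacity) : data_(capacity) {
        assert(capacity > 0);
    }

    // Registers a consumer that starts at the current write position.
    // Returns the consumer's id.
    std::size_t AddConsumer() {
        readPos_.push_back(written_);
        return readPos_.size() - 1;
    }

    // Appends samples, overwriting the oldest data once the buffer is full.
    void Write(const float* samples, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) {
            data_[written_ % data_.size()] = samples[i];
            ++written_;
        }
    }

    // Copies up to maxCount unread samples for the given consumer.
    // Returns the number of samples copied.
    std::size_t Read(std::size_t consumer, float* out, std::size_t maxCount) {
        std::uint64_t& pos = readPos_.at(consumer);
        // A consumer that fell more than one buffer behind has lost data:
        // skip ahead to the oldest sample that is still available.
        if (written_ - pos > data_.size()) {
            pos = written_ - data_.size();
        }
        std::size_t copied = 0;
        while (pos < written_ && copied < maxCount) {
            out[copied++] = data_[pos % data_.size()];
            ++pos;
        }
        return copied;
    }

private:
    std::vector<float> data_;
    std::vector<std::uint64_t> readPos_;  // absolute read index per consumer
    std::uint64_t written_ = 0;           // absolute count of samples written
};
```

Because every consumer keeps its own cursor, a slow consumer loses its own oldest samples but never blocks the producer or the other consumers.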
100% Offline
No cloud dependencies, no API keys, no subscriptions. All processing happens locally on the player's machine.
Multiplayer-Safe
Client-side processing with text-only replication. Audio is never sent over the network, only the transcribed text.
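The sketch below illustrates why text-only replication is cheap: a transcript message is a few dozen bytes, while the audio it came from is orders of magnitude larger. TranscriptMessage, Serialize, Deserialize, and the wire layout are assumptions for this example; in an Unreal project the text would more likely travel through the engine's own replication or RPC system. The audio estimate assumes 16 kHz, 16-bit mono, the input format Whisper models expect.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical network payload: who spoke and what was transcribed.
struct TranscriptMessage {
    std::uint32_t speakerId;
    std::string text;  // UTF-8
};

inline void PutU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
    }
}

inline std::uint32_t GetU32(const std::vector<std::uint8_t>& in, std::size_t at) {
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(in[at + i]) << (8 * i);
    }
    return v;
}

// Assumed layout: speaker id (4 bytes, little-endian),
// text length (4 bytes, little-endian), then the text bytes.
inline std::vector<std::uint8_t> Serialize(const TranscriptMessage& msg) {
    std::vector<std::uint8_t> out;
    PutU32(out, msg.speakerId);
    PutU32(out, static_cast<std::uint32_t>(msg.text.size()));
    out.insert(out.end(), msg.text.begin(), msg.text.end());
    return out;
}

// Returns std::nullopt when the buffer is truncated or its length field is wrong.
inline std::optional<TranscriptMessage> Deserialize(const std::vector<std::uint8_t>& in) {
    if (in.size() < 8) return std::nullopt;
    const std::uint32_t length = GetU32(in, 4);
    if (in.size() - 8 != length) return std::nullopt;
    return TranscriptMessage{GetU32(in, 0), std::string(in.begin() + 8, in.end())};
}

// Size of uncompressed PCM audio, for comparison with the text payload.
constexpr std::size_t RawAudioBytes(double seconds, int sampleRate = 16000,
                                    int bytesPerSample = 2) {
    return static_cast<std::size_t>(seconds * sampleRate * bytesPerSample);
}
```

Under these assumptions, a three-second utterance is 96,000 bytes as raw audio, while the serialized message for "Open the gate" is 21 bytes.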
Blueprint-Friendly
Clean Blueprint API with events and async nodes. Full C++ access for advanced users who need it.
🏗️ Modular Architecture
Four focused modules with clean dependencies
ElysGenAICore
Foundation with interfaces, types, and settings. The base that all other modules depend on.
Interfaces • Types • Settings
ElysGenAIAudio
Microphone capture with ring buffers and a consumer pattern for efficient audio distribution.
Capture • PTT • Ring Buffer
ElysGenAISTT
Speech-to-text with Whisper.cpp backend. VAD, streaming, and multi-language support.
Whisper • VAD • Streaming
ElysGenAILLM
Language model inference with llama.cpp. Conversation context and streaming generation.
llama.cpp • Phi-3 • Streaming