Multiplayer Guide

Designing multiplayer-safe GenAI systems with ElysGenAI.

Core Principle

Audio processing is CLIENT-SIDE ONLY

Never replicate raw audio over the network. Only replicate text results and metadata.

Architecture Rules

What Runs on Client

Audio capture (microphone)
Speech-to-text processing
LLM generation (for player assistance)
Voice activity detection
Audio buffering

What Runs on Server

Text processing (from STT results)
Intent parsing
Command validation
Game state changes
NPC dialogue generation (LLM)

Never Replicate

Raw audio data (TArray<float>)
Audio buffers
Model inference results (internal)
Backend instances

Always Replicate

Transcribed text (FString)
Parsed intents
Command confirmations
NPC responses
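
As a rough illustration of the split above, the following sketch uses plain standard C++ rather than Unreal types. `LocalSttResult`, `VoiceCommandPayload`, and `ToNetworkPayload` are hypothetical names, not part of ElysGenAI. The only point is that the audio samples live in a client-local struct and never reach the struct that crosses the network.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical client-local result: it holds raw samples, so it must never be sent.
struct LocalSttResult {
    std::vector<float> Samples;   // raw audio, stays on the client
    std::string TranscribedText;  // safe to send
    float Confidence = 0.0f;      // metadata, safe to send
};

// Hypothetical network payload: text and metadata only.
struct VoiceCommandPayload {
    std::string Text;
    float Confidence = 0.0f;
};

// Copies only the replicable fields; the audio buffer is deliberately left behind.
VoiceCommandPayload ToNetworkPayload(const LocalSttResult& Result) {
    VoiceCommandPayload Payload;
    Payload.Text = Result.TranscribedText;
    Payload.Confidence = Result.Confidence;
    return Payload;
}
```

Keeping the two types separate makes it harder to send audio by accident, because the payload type has no field that could carry it.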

Network Mode Check

Check network mode before processing audio:

void UERP_STTComponent::BeginPlay()
{
    Super::BeginPlay();
    
    // Only process audio on owning client
    if (GetNetMode() == NM_DedicatedServer)
    {
        UE_LOG(LogElysSTT, Warning, 
            TEXT("STT on dedicated server - disabled"));
        return;
    }
    
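    // GetOwner() returns AActor*, so resolve local control through the pawn or controller
    const APawn* OwnerPawn = Cast<APawn>(GetOwner());
    const AController* OwnerController = Cast<AController>(GetOwner());
    if ((OwnerPawn && OwnerPawn->IsLocallyControlled()) ||
        (OwnerController && OwnerController->IsLocalController()))
    {
        StartListening();
    }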
}

Network Modes:

NM_Standalone: Single player - audio capture enabled
NM_ListenServer: Host + client - audio capture on locally controlled only
NM_DedicatedServer: Server only - NO audio capture
NM_Client: Connected client - audio capture on locally controlled only
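
The table above reduces to one rule: never capture on a dedicated server, and otherwise capture only for the locally controlled actor. A minimal sketch of that decision in standard C++ follows. The `NetMode` enum is a hypothetical stand-in for the engine's net modes so the example compiles on its own, and `ShouldCaptureAudio` is an illustrative name.

```cpp
#include <cassert>

// Hypothetical mirror of the engine's net modes, so the sketch compiles standalone.
enum class NetMode { Standalone, DedicatedServer, ListenServer, Client };

// Returns true when this machine should capture audio for the given actor.
// bLocallyControlled: whether the actor is driven by a player on this machine.
bool ShouldCaptureAudio(NetMode Mode, bool bLocallyControlled) {
    if (Mode == NetMode::DedicatedServer) {
        return false;  // no local player and no microphone
    }
    // Standalone, listen server, and client all follow the same rule.
    return bLocallyControlled;
}
```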

Voice Command Pattern

Complete example showing client-side processing with server validation:

// PlayerController.h
UCLASS()
class AMyPlayerController : public APlayerController
{
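    GENERATED_BODY()

protected:
    virtual void BeginPlay() override;

    UPROPERTY(VisibleAnywhere)
    UERP_STTComponent* STTComponent;

    UFUNCTION(Server, Reliable, WithValidation)
    void ServerExecuteVoiceCommand(const FString& Command, float Confidence);

    UFUNCTION()
    void OnTranscriptionReceived(const FERP_STTResult& Result);

    // Server-side time of the last accepted voice command (for rate limiting)
    float LastCommandTime = 0.0f;
};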

// PlayerController.cpp
void AMyPlayerController::BeginPlay()
{
    Super::BeginPlay();
    
    // Only on owning client
    if (IsLocalController())
    {
        STTComponent->OnTranscriptionComplete.AddDynamic(
            this, &AMyPlayerController::OnTranscriptionReceived);
        STTComponent->StartListening();
    }
}

void AMyPlayerController::OnTranscriptionReceived(const FERP_STTResult& Result)
{
    // Client-side: transcribe speech
    if (Result.Confidence > 0.7f)
    {
        // Send text to server
        ServerExecuteVoiceCommand(Result.TranscribedText, Result.Confidence);
    }
}

// Server RPC
void AMyPlayerController::ServerExecuteVoiceCommand_Implementation(
    const FString& Command, float Confidence)
{
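    // Server-side: validate and execute.
    // AMyCharacter is a placeholder for the pawn class that implements these actions.
    AMyCharacter* MyCharacter = Cast<AMyCharacter>(GetPawn());
    if (!MyCharacter)
    {
        return; // No possessed pawn of the expected type
    }

    if (Command.Contains(TEXT("attack")))
    {
        MyCharacter->PerformAttack();
    }
    else if (Command.Contains(TEXT("reload")))
    {
        MyCharacter->ReloadWeapon();
    }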
}

bool AMyPlayerController::ServerExecuteVoiceCommand_Validate(
    const FString& Command, float Confidence)
{
    // Prevent abuse: check confidence and length
    return Confidence > 0.5f && Command.Len() < 256;
}

Flow: Microphone → Audio Capture (client) → STT (client) → RPC Text → Server Validation → Execute
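
The substring checks in the server RPC above can be factored into a small parsing step that turns a transcript into a known command before anything executes. The sketch below is standard C++ under stated assumptions: `VoiceCommand`, `ToLowerAscii`, and `ParseVoiceCommand` are hypothetical names, and the two keywords are the ones used in the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical set of commands the server is willing to execute.
enum class VoiceCommand { None, Attack, Reload };

// Lower-cases ASCII so matching ignores capitalization in the transcript.
std::string ToLowerAscii(std::string Text) {
    std::transform(Text.begin(), Text.end(), Text.begin(),
                   [](unsigned char C) { return static_cast<char>(std::tolower(C)); });
    return Text;
}

// Maps a transcript to a known command; anything unrecognized becomes None.
VoiceCommand ParseVoiceCommand(const std::string& Transcript) {
    const std::string Lower = ToLowerAscii(Transcript);
    if (Lower.find("attack") != std::string::npos) return VoiceCommand::Attack;
    if (Lower.find("reload") != std::string::npos) return VoiceCommand::Reload;
    return VoiceCommand::None;
}
```

Parsing into an enum first means the execution code only ever sees commands the server already knows about, and unrecognized text is dropped in one place.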

Bandwidth Efficiency

Text vs Audio:

Raw audio: ~1.9 MB/minute (16 kHz mono 16-bit, uncompressed)
Transcribed text: ~2 KB/minute (typical speech)
Reduction: roughly 1,000x less data
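
The raw-audio rate can be checked directly from the stream parameters: sample rate × bytes per sample × channels × 60 seconds. The sketch below does that arithmetic at compile time for 16 kHz, 16-bit mono, which works out to about 1.9 MB per minute. The 2 KB/minute text figure is taken from the list above as an assumption, and the function and constant names are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Uncompressed PCM bytes per minute: sample rate x bytes per sample x channels x 60 s.
constexpr std::int64_t PcmBytesPerMinute(std::int64_t SampleRateHz,
                                         std::int64_t BytesPerSample,
                                         std::int64_t Channels) {
    return SampleRateHz * BytesPerSample * Channels * 60;
}

// 16 kHz, 16-bit (2 bytes per sample), mono.
constexpr std::int64_t AudioBytesPerMinute = PcmBytesPerMinute(16000, 2, 1);
static_assert(AudioBytesPerMinute == 1920000, "about 1.9 MB per minute");

// Assumed text volume: roughly 2 KB of transcript per minute of speech.
constexpr std::int64_t TextBytesPerMinute = 2000;

// Integer ratio of audio bytes to text bytes.
constexpr std::int64_t AudioToTextRatio = AudioBytesPerMinute / TextBytesPerMinute;
static_assert(AudioToTextRatio == 960, "roughly three orders of magnitude");
```

If the capture buffer stores 32-bit float samples instead of 16-bit integers, the same formula gives twice the audio rate.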

Best Practices:

Always send text, never audio
Compress text with FArchive if sending frequently
Batch multiple commands if possible

Security Considerations

Command Validation

Always validate commands server-side:

bool AMyPlayerController::ServerExecuteVoiceCommand_Validate(
    const FString& Command, float Confidence)
{
    // Note: in Unreal, a false return from _Validate disconnects the calling client.
    // If that is too harsh for soft limits such as rate limiting, apply them in
    // _Implementation and ignore the command instead.

    // Check confidence threshold
    if (Confidence < 0.5f) return false;
    
    // Check length (prevent spam)
    if (Command.Len() > 256) return false;
    
    // Check rate limit
    float TimeSinceLastCommand = GetWorld()->GetTimeSeconds() - LastCommandTime;
    if (TimeSinceLastCommand < 0.5f) return false;
    
    LastCommandTime = GetWorld()->GetTimeSeconds();
    return true;
}

Anti-Cheat

Never trust client-provided confidence scores for critical actions
Validate all commands match expected patterns
Rate-limit voice commands
Log suspicious activity (e.g., impossible confidence scores, rapid-fire commands)
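
Two of the points above, rate limiting and matching against expected patterns, can be sketched in standard C++ as follows. This is an illustration under stated assumptions, not part of ElysGenAI: `CommandRateLimiter` and `IsAllowedCommand` are hypothetical names, the allow-list contains only the two commands used earlier in this guide, and the caller is assumed to pass in the server's own clock.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical per-player limiter: accepts at most one command per MinInterval seconds.
class CommandRateLimiter {
public:
    explicit CommandRateLimiter(double MinIntervalSeconds)
        : MinInterval(MinIntervalSeconds) {}

    // NowSeconds comes from the server's clock; returns true if the command may run.
    // A rejected attempt does not reset the timer.
    bool TryAccept(double NowSeconds) {
        if (bHasAccepted && NowSeconds - LastAccepted < MinInterval) {
            return false;  // too soon: drop it and let the caller log the attempt
        }
        bHasAccepted = true;
        LastAccepted = NowSeconds;
        return true;
    }

private:
    double MinInterval;
    double LastAccepted = 0.0;
    bool bHasAccepted = false;
};

// Accepts only exact matches from a fixed allow-list of command words.
bool IsAllowedCommand(const std::string& Command) {
    static const std::unordered_set<std::string> Allowed = {"attack", "reload"};
    return Allowed.count(Command) > 0;
}
```

Because the limiter takes the time as an argument, it never depends on a timestamp supplied by the client and is easy to exercise in a unit test.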

NPC Dialogue

Server-side LLM for NPC responses:

// NPC.h (replicated actor)
UCLASS()
class ANPC : public ACharacter
{
    GENERATED_BODY()
    
    UPROPERTY(VisibleAnywhere)
    UERP_LLMComponent* LLMComponent;
    
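    // Note: a Server RPC only reaches the server when called on an actor owned by
    // the calling client's connection. NPCs are normally server-owned, so in practice
    // route this call through the player's controller or pawn.
    UFUNCTION(Server, Reliable, WithValidation)
    void ServerSendDialogue(const FString& PlayerMessage);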
    
    UFUNCTION(NetMulticast, Reliable)
    void MulticastDisplayResponse(const FString& Response);
    
    UFUNCTION()
    void OnLLMResponse(const FERP_LLMResult& Result);
};

// NPC.cpp
void ANPC::BeginPlay()
{
    Super::BeginPlay();
    
    // Only generate dialogue on server
    if (HasAuthority())
    {
        LLMComponent->SetSystemPrompt(TEXT("You are a friendly merchant."));
        LLMComponent->OnGenerationComplete.AddDynamic(
            this, &ANPC::OnLLMResponse);
    }
}

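void ANPC::ServerSendDialogue_Implementation(const FString& PlayerMessage)
{
    // Generate response on server
    LLMComponent->SendMessage(PlayerMessage);
}

// Required because the RPC is declared WithValidation
bool ANPC::ServerSendDialogue_Validate(const FString& PlayerMessage)
{
    // Reject empty or oversized messages (limit mirrors the voice-command check)
    return !PlayerMessage.IsEmpty() && PlayerMessage.Len() < 256;
}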

void ANPC::OnLLMResponse(const FERP_LLMResult& Result)
{
    // Broadcast response to all clients
    MulticastDisplayResponse(Result.GeneratedText);
}

void ANPC::MulticastDisplayResponse_Implementation(const FString& Response)
{
    // Display dialogue on all clients
    ShowDialogueBubble(Response);
}

Common Patterns

Pattern 1: Client STT → Server Processing

Use for: Voice commands, multiplayer actions
Flow: Client captures → STT → Send text → Server validates → Execute

Pattern 2: Server LLM → Client Display

Use for: NPC dialogue, quest text
Flow: Server generates → Multicast text → All clients display

Pattern 3: Client STT + Client LLM

Use for: Single-player helpers, local UI
Flow: Client captures → STT → LLM → Display (no network)

Testing Multiplayer

Test in Editor

Enable multiplayer PIE: Editor Preferences → Play → Number of Players = 2
Set Net Mode to "Play as Listen Server"
Test with both client and server windows

Verify Behavior

Check logs in both client and server windows:

// Client log (expected)
LogElysSTT: Started listening
LogElysSTT: Transcription: "attack" (0.85)

// Server log (expected)
LogGame: Received voice command: "attack" (0.85)
LogGame: Executing attack for Player 1

// Dedicated server log (expected - NO audio)
LogElysSTT: Warning: STT on dedicated server - disabled

Next Steps

Examples - See multiplayer examples
Troubleshooting - Network-specific issues
Core Concepts - Understand the architecture
