InfiniService - Guest Agent

Introduction

InfiniService is a Rust guest agent for VM monitoring and remote command execution over a VirtIO serial channel. It supports Windows and Linux and uses a circuit breaker, exponential backoff, and keep-alive heartbeats for fault tolerance.

Architecture Overview

Technology Stack

  • Language: Rust (stable)
  • Async Runtime: Tokio
  • Serialization: Serde (JSON)
  • Logging: log + env_logger
  • Error Handling: anyhow + thiserror
  • Windows: windows-rs, wmi, windows-service
  • System Metrics: sysinfo, netstat2

Module Organization

Six core modules:

  • config: Configuration with environment and CLI overrides
  • collector: System metrics (CPU, memory, disk, network, processes, ports, services)
  • communication: VirtIO serial with circuit breaker and keep-alive
  • service: Service orchestration and lifecycle
  • commands: Command execution (safe/unsafe); the supported types are listed under Command Modules
  • os_detection: Platform detection

Design Patterns

  • Circuit Breaker: 15 failures → 60s open → auto recovery
  • Exponential Backoff: 5s to 300s with jitter
  • Strategy Pattern: Platform-specific implementations
  • Observer Pattern: Device monitoring with auto-reconnect

Cross-Platform Strategy

  • Conditional compilation: #[cfg(target_os = "windows")] and #[cfg(target_os = "linux")]
  • Platform modules: windows_com.rs for Windows COM API
  • Shared abstractions: Common interfaces, platform backends

Component diagram (Mermaid):

graph TD
    A[main.rs] --> B[InfiniService]
    B --> C[DataCollector]
    B --> D[VirtioSerial]
    B --> E[CommandExecutor]

    C --> C1[sysinfo]
    C --> C2[netstat2]
    C --> C3[WMI - Windows]
    C --> C4["/proc - Linux"]

    D --> D1[CircuitBreaker]
    D --> D2[Keep-Alive]
    D --> D3[Device Detection]

    E --> E1[SafeCommandExecutor]
    E --> E2[UnsafeCommandExecutor]

    E1 --> F1[ServiceControl]
    E1 --> F2[PackageManagement]
    E1 --> F3[ProcessControl]
    E1 --> F4[WindowsUpdates]
    E1 --> F5[Autochecks]

    D --> G[VirtIO Serial]
    G --> H[Backend Host]

    H --> |Commands| G
    G --> |Responses| H
    C --> |Metrics| D
    D --> |Metrics| H

    style A fill:#0ea5e9,stroke:#0284c7,stroke-width:2px,color:#fff
    style B fill:#ef4444,stroke:#dc2626,stroke-width:2px,color:#fff
    style C fill:#22c55e,stroke:#16a34a,stroke-width:2px,color:#fff
    style D fill:#f59e0b,stroke:#d97706,stroke-width:2px,color:#000
    style E fill:#a855f7,stroke:#9333ea,stroke-width:2px,color:#fff
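
The conditional-compilation approach listed under Cross-Platform Strategy can be illustrated with a minimal sketch. The function name `default_device_hint` and its return strings are hypothetical and are not taken from the InfiniService source; only the `cfg` attributes themselves are standard Rust.

```rust
// Hypothetical helper: one function name, three compile-time variants.
// Only the variant matching the build target is compiled in.

#[cfg(target_os = "windows")]
fn default_device_hint() -> &'static str {
    // On Windows the device is discovered by enumeration, not by a fixed path.
    "auto-detect via SetupAPI"
}

#[cfg(target_os = "linux")]
fn default_device_hint() -> &'static str {
    // Production channel path named in this document.
    "/dev/virtio-ports/org.infinibay.agent"
}

#[cfg(not(any(target_os = "windows", target_os = "linux")))]
fn default_device_hint() -> &'static str {
    // Fallback so the sketch still compiles on other targets.
    "unsupported platform"
}
```

Callers use `default_device_hint()` without any platform checks of their own, which is the point of keeping the shared interface separate from the platform backends.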

Service Lifecycle

Entry Point

main.rs handles CLI parsing, Windows service integration, and signals.

Key Flags:

  • --console: Force console mode (skip Windows service detection)
  • --device <path>: Override device path
  • --debug: Enable debug logging
  • --diagnose: VirtIO troubleshooting mode
  • --aggressive-retry: Development mode (faster retries)
  • --require-virtio: Fail if VirtIO unavailable
  • --no-virtio: Disable VirtIO entirely

See --help for all flags and environment variables.

Windows Service Mode:

When started with no CLI arguments, the agent assumes a service context and attempts service_dispatcher::start(). It falls back to console mode if service dispatch fails.

Diagnostic Mode:

--diagnose helps troubleshoot VirtIO issues:

  • Windows: Enumerates VirtIO devices via SetupAPI, shows paths and DEV_1043 status
  • Linux: Shows expected paths (/dev/virtio-ports/org.infinibay.agent, /dev/vport0p1)

Production channel: org.infinibay.agent

Initialization

service.rs creates components:

  1. Load config (defaults + env + CLI)
  2. Create DataCollector
  3. Create VirtioSerial (auto-detect or --device)
  4. Create CommandExecutor
  5. Attempt virtio.connect_persistent()
    • Success: start main loop
    • Failure: log error with guidance, continue if require_virtio=false
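
The outcome of step 5 can be sketched as a small decision function. This is an illustration under assumptions: the `Startup` enum, the `after_connect` name, and the use of `String` as the error type are hypothetical, and the real service.rs may structure this differently.

```rust
/// Hypothetical result of the initial connection attempt.
#[derive(Debug, PartialEq)]
enum Startup {
    /// VirtIO connected; the main loop runs normally.
    Connected,
    /// VirtIO unavailable but not required; the agent keeps running.
    Degraded,
}

/// Maps the connection result and the `require_virtio` setting to a startup outcome.
fn after_connect(result: Result<(), String>, require_virtio: bool) -> Result<Startup, String> {
    match result {
        Ok(()) => Ok(Startup::Connected),
        // A required channel that cannot be opened is fatal.
        Err(e) if require_virtio => Err(format!("VirtIO required but unavailable: {e}")),
        Err(e) => {
            // Log with guidance, then continue without the channel.
            eprintln!("VirtIO connection failed ({e}); run with --diagnose for details");
            Ok(Startup::Degraded)
        }
    }
}
```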

Main Loop

Continuous async loop in service.rs:

  • Metrics Collection: Every 30s (configurable)
    • Fast initial retry: 5 attempts @ 2s to get IP
    • Degraded mode: 120s interval when connection poor
  • Command Reception: Non-blocking read, 500ms timeout
  • Health Checks: Every 60s, assess error rate and latency
  • Keep-Alive: Send heartbeat every 30s, expect response within 60s
  • Device Monitoring: inotify (Linux) or WMI events (Windows)
  • Shutdown: Graceful on SIGTERM/SIGINT/Ctrl+C

Connection Quality:

  • Excellent: <5% errors, <100ms latency
  • Good: 5-10% errors, <200ms
  • Fair: 10-20% errors, <500ms
  • Poor: 20-40% errors, <1000ms
  • Critical: >40% errors or >1000ms
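
The tiers above can be expressed as a classifier. The sketch below makes two assumptions the document does not state: the overall quality is the worse of the error-rate tier and the latency tier, and boundary values (for example exactly 10% errors) fall into the worse tier except for 40% and 1000ms, which stay in Poor because Critical is defined with ">". All names are hypothetical.

```rust
/// Connection quality tiers, ordered from best to worst so `max` picks the worse one.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Quality {
    Excellent,
    Good,
    Fair,
    Poor,
    Critical,
}

/// Tier implied by the error rate alone (percentage, 0.0 to 100.0).
fn by_error_rate(pct: f64) -> Quality {
    if pct < 5.0 {
        Quality::Excellent
    } else if pct < 10.0 {
        Quality::Good
    } else if pct < 20.0 {
        Quality::Fair
    } else if pct <= 40.0 {
        Quality::Poor
    } else {
        Quality::Critical
    }
}

/// Tier implied by latency alone (milliseconds).
fn by_latency(ms: u64) -> Quality {
    match ms {
        0..=99 => Quality::Excellent,
        100..=199 => Quality::Good,
        200..=499 => Quality::Fair,
        500..=1000 => Quality::Poor,
        _ => Quality::Critical,
    }
}

/// Overall quality: the worse of the two tiers (assumption).
fn classify(error_rate_pct: f64, latency_ms: u64) -> Quality {
    by_error_rate(error_rate_pct).max(by_latency(latency_ms))
}
```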

Reconnection:

Exponential backoff: 5s → 10s → 20s → 40s → 80s → 160s → 300s (max). Circuit breaker blocks requests when open, tests connection in half-open state.
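
The backoff schedule can be reproduced with a short function. This is a sketch, not the agent's actual code: `backoff_delay` and `with_jitter` are hypothetical names, and the jitter formula (shortening the delay by up to 25%) is an assumption, since the document only says that jitter is applied.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): min * 2^attempt, capped at `max`.
fn backoff_delay(attempt: u32, min: Duration, max: Duration) -> Duration {
    // Saturate the multiplier so very large attempt counts cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    min.checked_mul(factor).map_or(max, |d| d.min(max))
}

/// Applies jitter using a caller-supplied random value `unit` in [0, 1].
/// Assumption: the result lies between 75% and 100% of `delay`, so the cap is never exceeded.
fn with_jitter(delay: Duration, unit: f64) -> Duration {
    let unit = unit.clamp(0.0, 1.0);
    delay.mul_f64(0.75 + 0.25 * unit)
}
```

With `min` set to 5s and `max` to 300s, attempts 0 through 6 yield 5s, 10s, 20s, 40s, 80s, 160s, and 300s, matching the sequence above. The random value is passed in rather than generated so the sketch needs no external crate.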

Shutdown

Graceful shutdown on signal:

  1. Disconnect VirtIO
  2. Stop device monitoring
  3. Log summary (uptime, collections, commands, quality stats)

Windows Service: The Service Control Manager sends Stop/Shutdown; the service updates its status to StopPending, then Stopped.

VirtIO Serial Communication

Device Detection

Linux:

  1. /dev/virtio-ports/org.infinibay.agent
  2. /dev/virtio-ports/org.qemu.guest_agent.*
  3. /dev/vport*p1
  4. Scan /dev for virtio devices
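
The Linux search order can be sketched as a pure function over a list of candidate paths. The function names are hypothetical, and step 4 is approximated as "any /dev entry whose path contains virtio", which is an assumption about what the scan looks for. A real implementation would obtain the candidates by listing /dev and /dev/virtio-ports.

```rust
/// Returns the first path in `paths` that satisfies `pred`.
fn first_match<'a>(paths: &[&'a str], pred: impl Fn(&str) -> bool) -> Option<&'a str> {
    paths.iter().copied().find(|p| pred(*p))
}

/// Picks a device path following the documented priority order.
fn select_device<'a>(available: &[&'a str]) -> Option<&'a str> {
    const EXACT: &str = "/dev/virtio-ports/org.infinibay.agent";
    const QGA_PREFIX: &str = "/dev/virtio-ports/org.qemu.guest_agent.";

    // 1. The production channel by name.
    first_match(available, |p| p == EXACT)
        // 2. A QEMU guest agent channel.
        .or_else(|| first_match(available, |p| p.starts_with(QGA_PREFIX)))
        // 3. Raw port nodes matching /dev/vport*p1.
        .or_else(|| first_match(available, |p| p.starts_with("/dev/vport") && p.ends_with("p1")))
        // 4. Anything else under /dev that looks like a virtio device (assumption).
        .or_else(|| first_match(available, |p| p.starts_with("/dev/") && p.contains("virtio")))
}
```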

Windows:

  1. Enumerate COM ports via SetupAPI
  2. Filter by VirtIO vendor (VEN_1AF4)
  3. Check device IDs (commonly DEV_1003, DEV_1043, DEV_1044 - may vary by driver version)
  4. Test port accessibility

Connection Management

Opens device handle, sets up async I/O (Windows: overlapped I/O, Linux: standard file), initializes circuit breaker and keep-alive.

Message Protocol

Line-based JSON, newline-delimited. Auto-reconnect on errors, buffered I/O.
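
Newline-delimited framing means a read may return half a message or several messages at once. The sketch below shows one way to buffer bytes and emit only complete lines. `LineFramer` is a hypothetical type, not the agent's actual reader, and JSON parsing (done with Serde in the real agent) is left out.

```rust
/// Accumulates raw bytes and yields complete newline-terminated lines.
#[derive(Default)]
struct LineFramer {
    buf: Vec<u8>,
}

impl LineFramer {
    /// Appends `chunk` and returns every complete line received so far.
    /// A trailing partial line stays buffered until its newline arrives.
    fn push(&mut self, chunk: &[u8]) -> Vec<String> {
        self.buf.extend_from_slice(chunk);
        let mut lines = Vec::new();
        while let Some(pos) = self.buf.iter().position(|&b| b == b'\n') {
            // Remove the line, including its newline, from the buffer.
            let line: Vec<u8> = self.buf.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line[..pos]);
            // Tolerate CRLF line endings and skip blank lines.
            let text = text.trim_end_matches('\r').trim();
            if !text.is_empty() {
                lines.push(text.to_string());
            }
        }
        lines
    }
}
```

Each returned string is one JSON document that can then be handed to the parser.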

Incoming Message Types (from host to guest):

  • Metrics: Request metrics collection
  • SafeCommand: Execute validated command (with action field)
  • UnsafeCommand: Execute raw shell command (disabled by default)
  • KeepAliveResponse: Response to keep-alive heartbeat

Outgoing Message Types (from guest to host):

  • system_metrics: Periodic metrics data
  • keep_alive: Heartbeat with sequence_number and timestamp
  • Command responses: JSON with id, success, stdout, stderr, etc.

Circuit Breaker

Tracks failures, opens after 15 failures, half-open for testing after 60s, closes on success.

States:

  • Closed: Normal operation
  • Open: Requests blocked, wait 60s
  • HalfOpen: Test with limited calls
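
The three states can be modelled as a small state machine. This sketch is illustrative only: the type and method names are hypothetical, the failure count is assumed to be consecutive (reset on success), and a failed probe in HalfOpen is assumed to reopen the breaker immediately. Time is passed in explicitly to keep the logic testable.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    Closed,
    Open { since: Instant },
    HalfOpen,
}

struct CircuitBreaker {
    state: State,
    failures: u32,
    failure_threshold: u32,
    open_duration: Duration,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, open_duration: Duration) -> Self {
        Self { state: State::Closed, failures: 0, failure_threshold, open_duration }
    }

    /// Returns true if a request may be attempted at `now`.
    /// An Open breaker moves to HalfOpen once `open_duration` has elapsed.
    fn allow(&mut self, now: Instant) -> bool {
        if let State::Open { since } = self.state {
            if now.duration_since(since) >= self.open_duration {
                self.state = State::HalfOpen;
            } else {
                return false;
            }
        }
        true
    }

    /// Any success closes the breaker and clears the failure count.
    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    /// Opens the breaker at the threshold, or on any failure while HalfOpen.
    fn record_failure(&mut self, now: Instant) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.failure_threshold {
            self.state = State::Open { since: now };
        }
    }
}
```

The documented defaults correspond to `CircuitBreaker::new(15, Duration::from_secs(60))`.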

Keep-Alive System

Sends heartbeat every 30s, expects response within 60s. Triggers reconnection on timeout.

Keep-Alive Message Fields:

  • type: Always "keep_alive"
  • sequence_number: Incrementing counter for tracking
  • timestamp: ISO 8601 timestamp of when the heartbeat was sent

Example message:

{
  "type": "keep_alive",
  "sequence_number": 42,
  "timestamp": "2025-10-20T12:34:56.789Z"
}
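
The heartbeat bookkeeping can be sketched as follows. The `KeepAlive` type and its methods are hypothetical. Two assumptions are made: the timeout is measured from the last response (or from startup), and the caller supplies an already formatted ISO 8601 timestamp, because the Rust standard library has no date formatter.

```rust
use std::time::{Duration, Instant};

struct KeepAlive {
    interval: Duration,
    timeout: Duration,
    next_sequence: u64,
    last_sent: Option<Instant>,
    last_response: Instant,
}

impl KeepAlive {
    fn new(now: Instant, interval: Duration, timeout: Duration) -> Self {
        Self { interval, timeout, next_sequence: 1, last_sent: None, last_response: now }
    }

    /// True when no heartbeat has been sent yet or the interval has elapsed.
    fn should_send(&self, now: Instant) -> bool {
        self.last_sent.map_or(true, |sent| now.duration_since(sent) >= self.interval)
    }

    /// Records a send and returns the JSON line for the next heartbeat.
    /// `timestamp` must be an ISO 8601 string; it is inserted without escaping.
    fn next_message(&mut self, now: Instant, timestamp: &str) -> String {
        let seq = self.next_sequence;
        self.next_sequence += 1;
        self.last_sent = Some(now);
        format!(
            r#"{{"type":"keep_alive","sequence_number":{},"timestamp":"{}"}}"#,
            seq, timestamp
        )
    }

    /// Call when a KeepAliveResponse arrives from the host.
    fn on_response(&mut self, now: Instant) {
        self.last_response = now;
    }

    /// True when the host has been silent longer than the timeout,
    /// which is the point at which a reconnection would be triggered.
    fn timed_out(&self, now: Instant) -> bool {
        now.duration_since(self.last_response) > self.timeout
    }
}
```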

Command Execution Framework

Command Flow

Incoming Message → Parser → Type Detection
    ├─ Safe Command → Validation → Safe Executor
    └─ Unsafe Command → Permission Check → Unsafe Executor
    ↓
Command Response → JSON → VirtIO

SafeCommandExecutor

Pre-validated commands with structured parameters:

Command Types:

  • ServiceList, ServiceControl
  • PackageList, PackageInstall, PackageUpdate, PackageRemove
  • ProcessList, ProcessKill
  • WindowsUpdateCheck, WindowsUpdateInstall
  • WindowsDefenderStatus, WindowsDefenderScan
  • ApplicationInventory
  • Autochecks (health diagnostics)
  • DiskCleanup

No arbitrary shell execution; all operations are validated.

UnsafeCommandExecutor

Executes raw shell commands. Disabled by default. Security risk - only enable for debugging/diagnostics. Never expose to untrusted users.
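
The routing between the two executors can be sketched with an enum and a `match`. The types below are hypothetical simplifications: the real message structs carry more fields and are deserialized with Serde, and the `unsafe_enabled` flag stands in for whatever permission check the agent performs.

```rust
/// Simplified view of a parsed incoming message (hypothetical).
#[derive(Debug, PartialEq)]
enum Incoming {
    Metrics,
    SafeCommand { id: String, action: String },
    UnsafeCommand { id: String, raw: String },
    KeepAliveResponse,
}

/// Where a message is sent after type detection.
#[derive(Debug, PartialEq)]
enum Route {
    CollectMetrics,
    SafeExecutor,
    UnsafeExecutor,
    RecordHeartbeat,
    Rejected(String),
}

/// Safe command types listed in this document.
const KNOWN_ACTIONS: &[&str] = &[
    "ServiceList", "ServiceControl", "PackageList", "PackageInstall", "PackageUpdate",
    "PackageRemove", "ProcessList", "ProcessKill", "WindowsUpdateCheck",
    "WindowsUpdateInstall", "WindowsDefenderStatus", "WindowsDefenderScan",
    "ApplicationInventory", "Autochecks", "DiskCleanup",
];

fn route(msg: &Incoming, unsafe_enabled: bool) -> Route {
    match msg {
        Incoming::Metrics => Route::CollectMetrics,
        Incoming::KeepAliveResponse => Route::RecordHeartbeat,
        // Safe commands must name a known action before reaching the executor.
        Incoming::SafeCommand { action, .. } if KNOWN_ACTIONS.contains(&action.as_str()) => {
            Route::SafeExecutor
        }
        Incoming::SafeCommand { action, .. } => Route::Rejected(format!("unknown action: {action}")),
        // Unsafe commands pass only when explicitly enabled.
        Incoming::UnsafeCommand { .. } if unsafe_enabled => Route::UnsafeExecutor,
        Incoming::UnsafeCommand { .. } => Route::Rejected("unsafe commands are disabled".to_string()),
    }
}
```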

Command Modules

The table below summarizes the supported command types:

Command Type          | Description | Platforms
----------------------|-------------|----------
Metrics               | Collect system metrics: CPU, memory, disk, network, processes, ports, services | Windows, Linux
ServiceList           | List services (Windows services / systemd units) | Windows, Linux
ServiceControl        | Start/stop/restart/enable/disable services | Windows, Linux
PackageList           | List installed packages (winget / apt/dnf) | Windows, Linux
PackageInstall        | Install package | Windows, Linux
PackageUpdate         | Update package | Windows, Linux
PackageRemove         | Remove package | Windows, Linux
ProcessList           | List running processes | Windows, Linux
ProcessKill           | Kill process by PID | Windows, Linux
WindowsUpdateCheck    | Check for Windows updates | Windows
WindowsUpdateInstall  | Install Windows updates | Windows
WindowsDefenderStatus | Get Windows Defender status | Windows
WindowsDefenderScan   | Run Windows Defender scan | Windows
ApplicationInventory  | List installed applications (registry / dpkg) | Windows, Linux
Autochecks            | Health diagnostics with remediation suggestions | Windows, Linux
DiskCleanup           | Clean temp files, logs, caches | Windows, Linux

Example: Metrics Command

Request:

{
  "type": "Metrics"
}

Response: SystemMetrics with CPU, memory, disk, network, processes, ports, services. See collector.rs for full structure.

Platform Differences:

  • Windows: WMI metrics, Windows services
  • Linux: /proc metrics, systemd units

Example: ServiceControl

Request:

{
  "type": "SafeCommand",
  "id": "cmd-002",
  "command_type": {
    "action": "ServiceControl",
    "service": "wuauserv",
    "operation": "start"
  },
  "timeout": 60
}

Parameters:

  • service: Service name (e.g., "wuauserv" on Windows, "ssh.service" on Linux)
  • operation: start, stop, restart, enable, disable, status

Example: PackageInstall

Request:

{
  "type": "SafeCommand",
  "id": "cmd-010",
  "command_type": {
    "action": "PackageInstall",
    "package": "vim"
  },
  "timeout": 300
}

Uses winget (Windows) or apt/dnf (Linux). Validates package names, no shell injection.
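
Package-name validation might look like the sketch below. The exact rules InfiniService applies are not given in this document, so the allowed character set, the length limit, and the function name are all assumptions chosen to reject shell metacharacters and option-like names.

```rust
/// Hypothetical allow-list check for a package name.
/// Accepts ASCII letters, digits, '.', '-', '_' and '+', up to 128 bytes,
/// and requires the first character to be alphanumeric so that a name
/// can never be mistaken for a command-line option such as "-y".
fn is_valid_package_name(name: &str) -> bool {
    let length_ok = !name.is_empty() && name.len() <= 128;
    let first_ok = name.chars().next().map_or(false, |c| c.is_ascii_alphanumeric());
    let chars_ok = name
        .chars()
        .all(|c| c.is_ascii_alphanumeric() || matches!(c, '.' | '-' | '_' | '+'));
    length_ok && first_ok && chars_ok
}
```

Passing the validated name as a separate argument to `std::process::Command`, rather than interpolating it into a shell string, means no shell ever interprets it.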

Example: Autochecks

Request:

{
  "type": "SafeCommand",
  "id": "cmd-020",
  "command_type": {
    "action": "Autochecks"
  },
  "timeout": 120
}

Response:

Health diagnostics: disk space, services, updates, security, performance. Includes remediation suggestions (e.g., "Low disk space on C: - Run disk cleanup").

Platform-specific checks:

  • Windows: Windows Update, Defender, Event Log errors
  • Linux: systemd failed units, package updates, security patches

Configuration System

Config Struct

config.rs defines configuration with defaults, environment variables, and CLI overrides.

Key Fields:

  • device_path: Optional device override
  • require_virtio: Fail if VirtIO unavailable
  • min_backoff_secs: Min retry delay (default: 5s)
  • max_backoff_secs: Max retry delay (default: 300s)
  • connection_timeout_secs: Connection timeout (default: 10s)
  • read_timeout_ms: Read timeout (default: 500ms)
  • keep_alive_interval_secs: Heartbeat interval (default: 30s)
  • keep_alive_timeout_secs: Heartbeat timeout (default: 60s)
  • aggressive_retry: Development mode (default: false)
  • disable_device_monitoring: Disable device monitoring (default: false)
  • validate_connection: Periodic ping tests (default: false)

Loading Priority:

  1. Defaults
  2. Environment variables (INFINISERVICE_*)
  3. CLI flags (highest priority)
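
The layering can be sketched as three passes over one struct. This is a reduced illustration: only a few of the fields above are included, the `CliOverrides` type is hypothetical, and the environment variable names are assumptions, since the document gives only the INFINISERVICE_ prefix. The environment is passed in as a map to keep the function testable.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
struct Config {
    device_path: Option<String>,
    min_backoff_secs: u64,
    max_backoff_secs: u64,
    keep_alive_interval_secs: u64,
}

impl Default for Config {
    fn default() -> Self {
        // Defaults as documented above.
        Self { device_path: None, min_backoff_secs: 5, max_backoff_secs: 300, keep_alive_interval_secs: 30 }
    }
}

/// Values parsed from the command line; `None` means "flag not given".
#[derive(Default)]
struct CliOverrides {
    device_path: Option<String>,
    min_backoff_secs: Option<u64>,
}

fn load(env: &HashMap<String, String>, cli: CliOverrides) -> Config {
    // 1. Defaults.
    let mut cfg = Config::default();

    // 2. Environment variables (names are assumed, not confirmed).
    if let Some(v) = env.get("INFINISERVICE_DEVICE_PATH") {
        cfg.device_path = Some(v.clone());
    }
    if let Some(v) = env.get("INFINISERVICE_MIN_BACKOFF_SECS").and_then(|v| v.parse::<u64>().ok()) {
        cfg.min_backoff_secs = v;
    }

    // 3. CLI flags win over everything else.
    if cli.device_path.is_some() {
        cfg.device_path = cli.device_path;
    }
    if let Some(v) = cli.min_backoff_secs {
        cfg.min_backoff_secs = v;
    }
    cfg
}
```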

Development Mode

--aggressive-retry enables faster retry intervals (2s min, 30s max) for development.

Troubleshooting

VirtIO Device Not Found

Windows:

  • Check Device Manager for VirtIO Serial Port
  • Ensure guest tools installed
  • Run --diagnose to see device enumeration

Linux:

  • Check /dev/virtio-ports/ exists
  • Verify VM XML has <channel type='unix'> with org.infinibay.agent
  • Ensure process has read/write access to /dev/virtio-ports/... (run as root or adjust udev rules/group membership)

Connection Timeouts

  • Check circuit breaker status (may be open)
  • Verify device path correct (--device override)
  • Check host backend is sending responses
  • Review logs: RUST_LOG=debug

High Error Rate

  • Connection quality degraded (check latency/error rate)
  • Device I/O issues (check VM host load)
  • Network issues (if VirtIO over network)

Service Won't Start (Windows)

  • Run as Administrator
  • Check "Log on as a service" right
  • Review Event Viewer for service errors

Commands Not Executing

  • Check command type spelling
  • Verify timeout sufficient for operation
  • Review stderr in response
  • Unsafe commands are disabled by default; confirm they are enabled if the request is an UnsafeCommand

Diagnostic Commands

# Check VirtIO devices (Linux)
ls -la /dev/virtio-ports/
cat /sys/class/virtio-ports/vport*/name

# Check service status
systemctl status infiniservice  # Linux
sc query InfiniService          # Windows

# View logs
journalctl -u infiniservice -f  # Linux
# Windows: Event Viewer → Application

# Debug mode
RUST_LOG=debug infiniservice --console --debug

Installation

Prerequisites

Windows:

  • VirtIO guest tools (virtio-win)
  • Administrator rights for service installation

Linux:

  • VirtIO serial guest driver (virtio_console kernel module)
  • Read/write permissions on /dev/virtio-ports/ (run as root or adjust udev rules)

Installation Steps

Windows:

  1. Run install-windows.ps1 PowerShell script (creates service automatically)
  2. Or manually: Copy infiniservice.exe to C:\Program Files\Infinibay\
  3. Create service: sc create InfiniService binPath= "C:\Program Files\Infinibay\infiniservice.exe" start= auto
  4. Start service: sc start InfiniService

Linux:

  1. Run install-linux.sh script (sets up systemd service automatically)
  2. Or manually: Copy infiniservice to /opt/infiniservice/
  3. Create systemd service file: /etc/systemd/system/infiniservice.service
  4. Set permissions: chmod +x /opt/infiniservice/infiniservice
  5. Enable and start: systemctl enable --now infiniservice

VM Configuration

Add VirtIO serial channel to VM XML:

<channel type='unix'>
  <source mode='bind' path='/var/lib/libvirt/qemu/org.infinibay.agent.sock'/>
  <target type='virtio' name='org.infinibay.agent'/>
  <address type='virtio-serial' controller='0' bus='0' port='1'/>
</channel>

Verification:

# Check device exists
ls -la /dev/virtio-ports/org.infinibay.agent  # Linux
# Windows: Device Manager → Ports (COM & LPT) → VirtIO Serial Port

# Check service running
systemctl status infiniservice  # Linux
sc query InfiniService          # Windows

Performance Considerations

Resource Usage

Target Metrics:

  • CPU: <1% average
  • Memory: 15-30 MB
  • Disk I/O: Minimal (metrics read-only)

Optimization Techniques

  • Zero-copy operations where possible
  • Buffered I/O to reduce syscalls
  • Selective refresh (update only changed data)
  • Async I/O (non-blocking)

Collection Intervals

  • Default: 30s (balanced)
  • Fast mode: 2s (initial IP collection)
  • Degraded: 120s (poor connection quality)

Configurable via config or CLI.
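
How the three intervals interact can be sketched as a single selection function. The function name and parameters are hypothetical, and the precedence (fast mode first, then degraded, then default) is an assumption; the fast-mode limit of 5 attempts comes from the Main Loop description.

```rust
use std::time::Duration;

/// Chooses the delay before the next metrics collection.
fn next_collection_interval(
    has_ip: bool,
    fast_attempts_used: u32,
    connection_poor: bool,
    default_interval: Duration,
) -> Duration {
    const FAST_ATTEMPTS: u32 = 5;

    // Fast mode: retry quickly until an IP address is seen or attempts run out.
    if !has_ip && fast_attempts_used < FAST_ATTEMPTS {
        return Duration::from_secs(2);
    }
    // Degraded mode: back off while connection quality is poor.
    if connection_poor {
        return Duration::from_secs(120);
    }
    default_interval
}
```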

Security Architecture

Privilege Requirements

  • Windows: Administrator or service account with "Log on as a service" right
  • Linux: Read/write permissions on /dev/virtio-ports/ (run as root or configure udev rules)

Safe vs Unsafe Commands

  • Safe: Pre-validated, no shell injection risk (enabled by default)
  • Unsafe: Raw shell commands, security risk (disabled by default)

Never enable unsafe commands in production without strict access controls.

VirtIO Trust Model

VirtIO channel is trusted - host controls guest. No encryption on VirtIO serial (relies on hypervisor isolation).

Resource Limits

  • Timeouts on all operations (prevent hangs)
  • Memory limits on response buffers
  • Command queue bounds

Audit Logging

All operations logged with timestamps, command types, and results. Review logs for security audits.

Log Locations:

  • Linux: journalctl -u infiniservice
  • Windows: Event Viewer → Application → InfiniService

Future Enhancements

Planned Features

  • Plugin System: Extensible metric collectors
  • Compression: Reduce data transfer size
  • Encryption: End-to-end message encryption (if VirtIO shared)
  • Multi-Channel: Multiple VirtIO channels for different purposes
  • Hot Reload: Config changes without restart

Scalability Improvements

  • Metric Streaming: Real-time updates (not just 30s intervals)
  • Command Pipelining: Batch command execution
  • Adaptive Collection: Dynamic interval based on change rate
  • Distributed Tracing: Track operations across VMs