DGX Spark TensorRT-LLM Benchmark Results

Container vs Native Execution Performance Analysis

Phase 1: Memory Efficiency Study | November 2025

🖥️ Test Hardware

System: NVIDIA DGX Spark (ProMax GB10)

GPU: NVIDIA GB10 (Grace Blackwell, Compute Capability 12.1)

Memory: 119.64 GB Unified (Shared CPU+GPU)

CPU Architecture: Arm (Cortex-X925 + Cortex-A725, 20 cores)

Test Runs: 60 total (3 models × 2 environments × 10 iterations)
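
The run count above is the product of the three test dimensions. As a minimal sketch, the test matrix can be enumerated as follows. The model labels are placeholders, since the models are not named here, and `build_run_matrix` is a hypothetical helper, not part of the benchmark tooling.

```python
from itertools import product

# Placeholder labels: the three benchmarked models are not named in this summary.
MODELS = ["model_a", "model_b", "model_c"]
ENVIRONMENTS = ["container", "native"]
ITERATIONS = 10


def build_run_matrix(models, environments, iterations):
    """Return one (model, environment, iteration) tuple per benchmark run."""
    return [
        (model, env, i)
        for model, env in product(models, environments)
        for i in range(1, iterations + 1)
    ]


runs = build_run_matrix(MODELS, ENVIRONMENTS, ITERATIONS)
print(len(runs))  # 3 models * 2 environments * 10 iterations = 60
```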

Memory Savings: 20-31 GB (native execution uses less memory than the container)

KV Cache Advantage: 1.6-2.7x (more KV cache allocated in native mode)

Performance Impact: ~0% (effectively identical throughput)

Total Benchmarks: 60 (3 models × 2 environments × 10 iterations)
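
For readers who want to reproduce the headline figures from their own runs, the sketch below shows one way they could be derived. It assumes definitions that this summary does not spell out: memory savings as container peak minus native peak, KV cache advantage as native allocation divided by container allocation, and performance impact as the relative throughput difference. The function name, dictionary keys, and input numbers are all hypothetical; the inputs are toy values, not measurements from this study.

```python
from statistics import mean


def compare_environments(container, native):
    """Compare averaged per-run measurements for two environments.

    Each argument is a dict of lists keyed by 'peak_mem_gb',
    'kv_cache_gb', and 'tokens_per_s' (hypothetical key names).
    """
    c = {key: mean(values) for key, values in container.items()}
    n = {key: mean(values) for key, values in native.items()}
    return {
        # Assumed definition: container peak minus native peak.
        "memory_saved_gb": c["peak_mem_gb"] - n["peak_mem_gb"],
        # Assumed definition: native KV cache relative to container KV cache.
        "kv_cache_ratio": n["kv_cache_gb"] / c["kv_cache_gb"],
        # Assumed definition: relative throughput change, in percent.
        "throughput_delta_pct": 100.0
        * (n["tokens_per_s"] - c["tokens_per_s"])
        / c["tokens_per_s"],
    }


# Toy inputs for illustration only; these are not benchmark results.
container_runs = {
    "peak_mem_gb": [8.0, 8.0],
    "kv_cache_gb": [1.0, 1.0],
    "tokens_per_s": [40.0, 40.0],
}
native_runs = {
    "peak_mem_gb": [6.0, 6.0],
    "kv_cache_gb": [3.0, 3.0],
    "tokens_per_s": [40.0, 40.0],
}

result = compare_environments(container_runs, native_runs)
print(result)
```

Averaging across iterations before comparing keeps a single noisy run from dominating the ratio; a fuller analysis would also report the spread across iterations.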

📊 Peak Memory Usage Comparison

💾 KV Cache Allocation Comparison

⚡ Throughput Performance

🔍 Key Findings

✅ Recommendations