Pebble and MiTAC Computing Unlock 28% More AI Throughput on AMD Instinct™ MI350X Within the Same Power Envelope

Jun 27, 2026
USA

Pebble Sonar™ software recovered stranded power to bring two additional GPUs online while serving Llama 3.1 405B, adding frontier-model capacity with zero additional power consumed

San Jose, California — June 26, 2026 — Pebble (MyPebble Inc.), the company building the intelligence layer between AI infrastructure and the power grid, today announced results from a joint demonstration with MiTAC Computing showing that its Pebble Sonar™ software increased AI inference throughput by 28% on a power-constrained AMD Instinct™ MI350X cluster, without drawing a single additional watt.

In an industry where power, not chips, is increasingly the ceiling on how much compute an operator can deploy, the demonstration targeted the constraint directly. A cluster of MiTAC G8825Z5 servers with AMD Instinct MI350X GPUs was held to a strict 6-kilowatt power ceiling while serving Meta's Llama 3.1 405B, one of the largest publicly available models. Pebble Sonar first profiled the live workload and recovered roughly 15% of power headroom that the default configuration had been wasting as heat rather than useful compute. That reclaimed budget was then used to bring two additional GPUs online inside the same 6 kW envelope.

The result: total throughput climbed from 4,582 to 5,866 tokens per second, a 28% gain, with energy efficiency improving 14.7%, all while total rig power stayed pinned to the 6 kW ceiling. Latency and thermals held steady, with 99th-percentile time-to-first-token statistically unchanged and GPU temperatures well within operating limits.
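As a quick check of the headline figure, the relative gain follows directly from the two throughput numbers quoted above. A minimal Python sketch (the function and variable names are illustrative and are not part of Pebble Sonar):

```python
def percent_gain(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Throughput figures as stated in this release, in tokens per second.
baseline_tps = 4582
tuned_tps = 5866

gain = percent_gain(baseline_tps, tuned_tps)
print(f"{gain:.1f}% more throughput")  # prints "28.0% more throughput"
```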

“Operators are being told the only way to serve more AI is to buy more power, and the wait for new grid capacity can stretch for years,” said Pradeep Gaddam, CEO of Pebble. “This demonstration shows another path. The headroom is already sitting inside the power you have. Sonar finds it, and turns stranded GPUs into deployable capacity at zero additional energy cost.”

“MiTAC is committed to delivering maximum performance-per-watt across our AI infrastructure. Validating these significant, software-driven efficiency gains by deploying Pebble Sonar on the MiTAC G8825Z5 server racks featuring AMD Instinct MI350X GPUs proves that smarter power management is the key to unlocking sustainable, next-generation compute,” said Raymond Huang, GM, MiTAC Computing Technology Corp., USA.

Pebble Sonar runs alongside the inference stack, continuously profiling real-time power draw, GPU utilization, clock frequencies, and token throughput across every GPU in the rig. From that telemetry it builds a live model of the workload's power-efficiency curve and identifies the operating point at which each watt produces the most useful compute, jointly tuning both hardware power caps and serving-engine parameters rather than treating them as independent knobs.
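The idea of choosing an operating point from telemetry can be illustrated with a small sketch. This is not Pebble Sonar's implementation, which the release does not describe in detail. It assumes a hypothetical set of profiled samples, each pairing a per-GPU power cap with a serving-engine batch limit, and simply picks the feasible pair with the highest tokens per watt. Every name and number below is invented for illustration and none of it comes from the demonstration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One profiled configuration (all fields are hypothetical)."""
    power_cap_w: int     # per-GPU power cap that was tried
    max_batch: int       # serving-engine batch limit that was tried
    gpu_power_w: float   # measured average per-GPU power draw
    tokens_per_s: float  # measured per-GPU throughput


def tokens_per_watt(sample: Sample) -> float:
    return sample.tokens_per_s / sample.gpu_power_w


def best_operating_point(samples, rig_budget_w: float, num_gpus: int) -> Sample:
    """Return the most efficient sample that fits the per-GPU share of the budget."""
    per_gpu_budget = rig_budget_w / num_gpus
    feasible = [s for s in samples if s.gpu_power_w <= per_gpu_budget]
    if not feasible:
        raise ValueError("no profiled configuration fits the power budget")
    # Hardware and serving knobs are scored together, as one joint setting.
    return max(feasible, key=tokens_per_watt)


# Invented telemetry, for illustration only.
samples = [
    Sample(power_cap_w=1000, max_batch=64, gpu_power_w=950.0, tokens_per_s=500.0),
    Sample(power_cap_w=800, max_batch=64, gpu_power_w=780.0, tokens_per_s=470.0),
    Sample(power_cap_w=800, max_batch=128, gpu_power_w=790.0, tokens_per_s=510.0),
    Sample(power_cap_w=600, max_batch=128, gpu_power_w=595.0, tokens_per_s=400.0),
]

best = best_operating_point(samples, rig_budget_w=3200.0, num_gpus=4)
print(best.power_cap_w, best.max_batch)
```

Scoring each (power cap, batch limit) pair as a single configuration reflects the point made above: the two settings interact, so tuning them independently can miss the most efficient combination. A production system would also have to weigh latency targets and total throughput, which this sketch ignores.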

In a companion test on the same AMD Instinct MI350X platform, Sonar applied the same approach to an 8-GPU Llama 3.1 70B deployment, improving tokens per watt by 82.4% by identifying a more efficient per-GPU operating point, evidence that the method scales across model sizes from 70B to frontier-scale 405B. Both findings point to the same conclusion: the next efficiency frontier in AI inference is not a bigger server, but smarter power management. For operators running frontier models on AMD Instinct hardware, that means more AI output from the infrastructure and the power they already own, without new procurement cycles, infrastructure upgrades, or model changes.

The demonstration was conducted on AMD Instinct MI350X GPUs provided by MiTAC Computing. Full methodology and results are available in the accompanying technical case study at gopebble.com.

About Pebble

Pebble (MyPebble Inc.) builds the intelligence layer between AI infrastructure and the power grid. Its flagship product, Pebble Sonar™, continuously profiles GPU clusters in real time, identifying the optimal power operating point for each workload and unlocking throughput gains that static, hardware-default configurations leave on the table. A second product, Pebble Flex™, enables GPU clusters to participate in grid flexibility programs, automatically reducing and restoring power in response to utility signals without interrupting AI workloads, turning stranded grid capacity into a deployable asset. By treating power not as a fixed constraint but as a dynamic resource to be optimized, Pebble helps operators serve more AI from the infrastructure they already own. Learn more at gopebble.com.

About MiTAC Computing

MiTAC Computing Technology Corp., a subsidiary of MiTAC Holdings, specializes in AI, HPC, cloud, and edge computing. MiTAC Computing employs rigorous methodologies to ensure uncompromising quality across barebones, systems, racks, and cluster levels, fully achieving performance and integration. With a worldwide presence and end-to-end capabilities from R&D and manufacturing to global support, MiTAC Computing provides agile, customized platforms for hyperscale data centers, HPC, and AI applications. Learn more at www.mitaccomputing.com.

Media Contacts

MiTAC Computing

Raymond Huang

[email protected]

Phone: 510-651-8868 ext. 6915

www.mitaccomputing.com

Pebble (MyPebble Inc.)

Pradeep Gaddam

[email protected]

240-505-9239

gopebble.com 
