Pebble and MiTAC Computing Unlock 28% More AI Throughput on AMD Instinct™ MI350X Within the Same Power Envelope
Pebble Sonar™ software recovered stranded power to bring two additional GPUs online while serving Llama 3.1 405B, adding frontier-model capacity with zero additional power consumed
San Jose, California – June 26, 2026 – Pebble (MyPebble Inc.), the company building the intelligence layer between AI infrastructure and the power grid, today announced results from a joint demonstration with MiTAC Computing showing that its Pebble Sonar™ software increased AI inference throughput by 28% on a power-constrained AMD Instinct™ MI350X cluster, without drawing a single additional watt.
In an industry where power, not chips, is increasingly the ceiling on how much compute an operator can deploy, the demonstration targeted that constraint directly. A cluster of MiTAC G8825Z5 servers with AMD Instinct MI350X GPUs was held to a strict 6-kilowatt power ceiling while serving Meta's Llama 3.1 405B, one of the largest publicly available models. Pebble Sonar first profiled the live workload and recovered roughly 15% of power headroom that the default configuration had been wasting as heat rather than useful compute. That reclaimed budget was then used to bring two additional GPUs online inside the same 6 kW envelope.
The result: total throughput climbed from 4,582 to 5,866 tokens per second, a 28% gain, with energy efficiency improving 14.7%, all while total rig power stayed pinned to the 6 kW ceiling. Latency and thermals held steady, with 99th-percentile time-to-first-token statistically unchanged and GPU temperatures well within operating limits.
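The headline percentage follows directly from the two throughput figures above. As a quick check, here is a minimal Python sketch; the helper name `percent_gain` is illustrative and not part of any Pebble tooling.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, expressed in percent."""
    return (after - before) / before * 100.0


# Throughput figures quoted in the release, in tokens per second.
baseline_tps = 4582
tuned_tps = 5866

gain = percent_gain(baseline_tps, tuned_tps)
print(f"{gain:.1f}% more throughput")  # prints "28.0% more throughput"
```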
“Operators are being told the only way to serve more AI is to buy more power, and the wait for new grid capacity can stretch for years,” said Pradeep Gaddam, CEO of Pebble. “This demonstration shows another path. The headroom is already sitting inside the power you have. Sonar finds it, and turns stranded GPUs into deployable capacity at zero additional energy cost.”
“MiTAC is committed to delivering maximum performance-per-watt across our AI infrastructure. Validating these significant, software-driven efficiency gains by deploying Pebble Sonar on the MiTAC G8825Z5 server racks featuring AMD Instinct MI350X GPUs proves that smarter power management is the key to unlocking sustainable, next-generation compute,” said Raymond Huang, GM, MiTAC Computing Technology Corp., USA.
Pebble Sonar runs alongside the inference stack, continuously profiling real-time power draw, GPU utilization, clock frequencies, and token throughput across every GPU in the rig. From that telemetry it builds a live model of the workload's power-efficiency curve and identifies the operating point at which each watt produces the most useful compute, jointly tuning both hardware power caps and serving-engine parameters rather than treating them as independent knobs.
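To make the idea of a joint operating point concrete, the following Python sketch picks, from a set of profiled samples, the combination of power cap and serving parameter that yields the most tokens per watt while staying under a power budget. It is an illustration only, not Pebble Sonar's implementation: the `Sample` type, the `best_operating_point` function, the choice of batch size as the serving-engine parameter, and every number in the sample list are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One profiled configuration (hypothetical structure for illustration)."""
    power_cap_w: int     # hardware power cap that was applied
    max_batch: int       # serving-engine batch limit that was applied
    power_w: float       # measured average power draw, in watts
    tokens_per_s: float  # measured throughput

    @property
    def tokens_per_watt(self) -> float:
        # Throughput per watt of measured draw.
        return self.tokens_per_s / self.power_w


def best_operating_point(samples, budget_w):
    """Return the sample with the highest tokens per watt within the budget."""
    feasible = [s for s in samples if s.power_w <= budget_w]
    if not feasible:
        raise ValueError("no sample fits within the power budget")
    return max(feasible, key=lambda s: s.tokens_per_watt)


# Made-up measurements, NOT results from the demonstration.
samples = [
    Sample(power_cap_w=1000, max_batch=64, power_w=980.0, tokens_per_s=900.0),
    Sample(power_cap_w=800, max_batch=64, power_w=790.0, tokens_per_s=820.0),
    Sample(power_cap_w=800, max_batch=128, power_w=795.0, tokens_per_s=870.0),
    Sample(power_cap_w=600, max_batch=128, power_w=600.0, tokens_per_s=610.0),
]

best = best_operating_point(samples, budget_w=900.0)
print(best.power_cap_w, best.max_batch)  # prints "800 128"
```

Because the search runs over both settings at once, it can find combinations that tuning the power cap and the batch limit separately might miss, which is the point of treating them as a joint problem.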
In a companion test on the same AMD Instinct MI350X platform, Sonar applied the same approach to an 8-GPU Llama 3.1 70B deployment, improving tokens per watt by 82.4% by identifying a more efficient per-GPU operating point. The result suggests that the method scales across model sizes, from 70B to frontier-scale 405B. Both findings point to the same conclusion: the next efficiency frontier in AI inference is not a bigger server, but smarter power management. For operators running frontier models on AMD Instinct hardware, that means more AI output from the infrastructure and the power they already own, without new procurement cycles, infrastructure upgrades, or model changes.
The demonstration was conducted on AMD Instinct MI350X GPUs provided by MiTAC Computing. Full methodology and results are available in the accompanying technical case study at gopebble.com.
About Pebble
Pebble (MyPebble Inc.) builds the intelligence layer between AI infrastructure and the power grid. Its flagship product, Pebble Sonar™, continuously profiles GPU clusters in real time, identifying the optimal power operating point for each workload and unlocking throughput gains that static, hardware-default configurations leave on the table. A second product, Pebble Flex™, enables GPU clusters to participate in grid flexibility programs, automatically reducing and restoring power in response to utility signals without interrupting AI workloads, turning stranded grid capacity into a deployable asset. By treating power not as a fixed constraint but as a dynamic resource to be optimized, Pebble helps operators serve more AI from the infrastructure they already own. Learn more at gopebble.com.
About MiTAC Computing
MiTAC Computing Technology Corp., a subsidiary of MiTAC Holdings, specializes in AI, HPC, cloud, and edge computing. MiTAC Computing employs rigorous methodologies to ensure uncompromising quality at the barebones, system, rack, and cluster levels, delivering full performance and integration. With a worldwide presence and end-to-end capabilities from R&D and manufacturing to global support, MiTAC Computing provides agile, customized platforms for hyperscale data centers, HPC, and AI applications. Learn more at www.mitaccomputing.com.
Media Contacts
MiTAC Computing
Raymond Huang
Phone: 510-651-8868 ext. 6915
www.mitaccomputing.com
Pebble (MyPebble Inc.)
Pradeep Gaddam
Phone: 240-505-9239
gopebble.com


