I got one of these GB10s, but the ASUS variety. So far fairly happy with it. Most days I don't remember I'm on ARM.
It's pretty performant and snappy, about the same speed as my other mini PC, a Ryzen 9 7940HS Minisforum UM 790 Pro, but with double the number of cores and many times the RAM.
But models that are both large and dense are a bit slow.
Running a local LLM will cost a lot of money for something much slower than the API providers, though.
I'm not sure I agree on the cost aspect though. For high-volume production workloads the API bills scale linearly and can get painful fast. If you can amortize the hardware over a year and keep the data local for privacy, the math often works out in favor of self-hosting.
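As a rough back-of-the-envelope sketch of that amortization argument (every number here is an illustrative assumption, not a measurement or a quote):

```python
# Break-even sketch: self-hosting vs. API, per month.
# All figures below are assumed for illustration only.

hardware_cost = 4_000.0       # assumed one-time cost of a GB10-class box, USD
amortize_months = 12          # write the hardware off over a year
power_cost_month = 15.0       # assumed electricity at modest utilization, USD

api_price_per_mtok = 5.0      # assumed blended API price, USD per million tokens
tokens_per_month = 200_000_000  # assumed high-volume production workload

self_host_month = hardware_cost / amortize_months + power_cost_month
api_month = tokens_per_month / 1_000_000 * api_price_per_mtok

# Monthly token volume above which self-hosting is cheaper
break_even_tokens = self_host_month / api_price_per_mtok * 1_000_000

print(f"self-hosting: ~${self_host_month:,.0f}/month")
print(f"API:          ~${api_month:,.0f}/month")
print(f"break-even:   ~{break_even_tokens / 1e6:,.0f}M tokens/month")
```

Under those assumptions the API side comes to ~$1,000/month against ~$350/month amortized, with break-even around 70M tokens/month; change the assumed prices and the conclusion moves with them.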
On the same hardware, running gpt-oss-120b at 128k context, I fed it a longer version of the input (a whole novel, 97k tokens): prompt processing ran at 1650 tok/s and generation at 27 tok/s. Just fast enough, IMO.
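For a sense of what those throughput numbers mean in wall-clock terms (the prompt size and tok/s figures come from the comment above; the 1k-token output length is just an assumed example):

```python
# Wall-clock estimate from the quoted throughput numbers.
prompt_tokens = 97_000   # the whole-novel input from the comment
prefill_tps = 1650       # quoted prompt-processing speed, tok/s
decode_tps = 27          # quoted generation speed, tok/s
output_tokens = 1_000    # assumed output length, for illustration

prefill_s = prompt_tokens / prefill_tps  # ~59 s to ingest the novel
decode_s = output_tokens / decode_tps    # ~37 s to generate the answer

print(f"prefill: {prefill_s:.0f}s, decode: {decode_s:.0f}s, "
      f"total: {prefill_s + decode_s:.0f}s")
```

So roughly a minute to read the whole book and another half-minute or so to answer, which squares with "just fast enough".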
If I were primarily interested in that, I would probably have bought one of the cheaper Strix Halo machines.
It's also just a decent non-Mac ARM64 workstation with large quantities of RAM. Which in 2026 is a bit of a unicorn.
Thank you for doing these. Earned a star and a watch from me on both! Minor sponsor donation as gratitude.
Would be sick to have an RSS feed for your data releases.