Has Google turned things around in quantum and AI computing?

Google recently announced three new hardware platforms spanning quantum, AI and Arm computing. And these chips are really impressive.

Okay, I admit it. I underestimated Google’s ability to drive significant innovation in computing. Yes, the company showed great potential, but it seemed that few developments crossed the competitive threshold and it did not gain much traction in the market, perhaps due more to the marketing than the product. However, recent announcements made it clear that this is no longer the case, and the stock responded by soaring over 11% in the last two trading days. It looks like the market has decided that hardware is cool again.

Google introduced the groundbreaking Willow chip for quantum computing, the new Trillium AI chip that powers the newly released Gemini 2.0 AI model, and made generally available its first Arm-based server chip, called Axion, to compete with Amazon’s Graviton and to displace some Intel and AMD x86 cloud instances. Let’s take a look.

Willow Quantum AI Computing

Google is competing with Amazon, IBM, Microsoft and others to advance the state of quantum computing. While quantum remains a global research project, the potential is breathtaking and could reshape the computing landscape in the next decade to solve currently intractable problems.

The 105-qubit Willow largely replaces Google’s 53-qubit Sycamore by supporting more qubits and improving error correction, making the qubits more usable. Google has grouped qubits to reduce error rates, one of the main factors currently limiting quantum computing to the experimental stage.

Modern quantum computers typically have an error rate of one in a hundred to one in a thousand operations. Traditional binary computers produce a memory error only about once in a billion operations. So if you do the math, quantum computers have error rates roughly a million to ten million times higher than classical computers, because quantum circuits are susceptible to a variety of disturbances, such as temperature, electromagnetic radiation and vibration.
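
To make that gap concrete, here is a minimal back-of-the-envelope sketch in Python, using only the round numbers quoted above (not measured figures):

```python
# A rough sketch using the round numbers quoted in this article:
# qubit error rates of 1e-2 to 1e-3 versus roughly 1e-9 for classical memory.
quantum_error_rates = (1e-2, 1e-3)   # one error per 100 to 1,000 operations
classical_error_rate = 1e-9          # about one memory error per billion operations

for q in quantum_error_rates:
    print(f"qubits are ~{q / classical_error_rate:,.0f}x more error-prone")
# -> ~10,000,000x and ~1,000,000x, i.e. roughly a million to ten million times higher
```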

Typically, error rates increase as quantum systems are scaled to larger numbers of qubits. But that’s not the case with Willow: the larger the array of qubits becomes, the better Willow performs. “Every time we increased our logical qubit – or our grouping – from a 3 x 3 to a 5 x 5 to a 7 x 7 array of physical qubits, the error rate did not increase,” Google researcher Michael Newman said. “It went down. Every time we increased the size, it decreased by a factor of two.” While that is still modest compared with the roughly one-in-a-billion error rates of traditional computers, it puts Google on the path to a profitable commercial operation in the next five years as qubit counts exceed 1,000.
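
A small sketch of that scaling behavior, assuming a purely hypothetical starting error rate (not a Willow measurement) and the factor-of-two suppression per step that Newman describes:

```python
# Sketch of the "factor of two per step" error suppression described above.
# The starting rate is a hypothetical placeholder, not a Willow figure.
start_error = 1e-2   # assumed logical error rate for the 3 x 3 array
for halvings, size in enumerate(["3 x 3", "5 x 5", "7 x 7"]):
    print(f"{size} array: ~{start_error / 2**halvings:.4%} logical error rate")
# Each enlargement of the array halves the logical error rate; extrapolating that
# trend to 1,000+ qubits is what makes commercially useful error rates look reachable.
```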

Willow also shows significantly improved coherence, the ability of qubits to maintain their state over time. Willow demonstrated a doubling of memory coherency, another key step in reducing inherent error rates.

Willow Quantum achieves what supercomputers simply cannot

In a Nature article announcing Willow, Google claimed that the new system could solve a mathematical problem that has puzzled scientists for decades: so-called random circuit sampling. Willow ran the RCS benchmark in less than five minutes, while the giant Frontier supercomputer at Oak Ridge National Laboratory would theoretically take 10 septillion years (a 1 followed by 25 zeros).
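
For a sense of scale, here is a trivial sketch that turns those claims into a single ratio, taking the quoted figures at face value:

```python
# How the claimed RCS runtimes compare, taking the quoted numbers at face value.
frontier_years = 1e25                               # 10 septillion years
frontier_minutes = frontier_years * 365.25 * 24 * 60
willow_minutes = 5
print(f"claimed gap: ~{frontier_minutes / willow_minutes:.0e}x")
# -> on the order of 1e30x; for reference, the universe is only ~1.4e10 years old.
```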

Willow is a research tool that reaches a significant milestone toward quantum supremacy, or utility in commercial applications that are beyond the reach of today’s largest supercomputers. But like other quantum computers, Willow has very few if any commercial applications. Give the industry a few years and that could change. However, the stock market didn’t care about this caveat and rewarded Google with a significant price increase of 7% on the day of the announcement.

Trillium: The 6th generation of Google TPU for AI

Google today announced the general availability of Trillium, the newest and most advanced TPU in the company’s nine-year effort to offer a competitive AI accelerator. Trillium is designed for scale and is available on Google Cloud with over 100,000 Trillium chips per Jupiter network fabric, allowing a single distributed training job to scale to hundreds of thousands of accelerators. Trillium was used to train the new Gemini 2.0 AI model and to serve inference for it.
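
As a rough sketch of what that scale means, using only the figures quoted in this article (the 256-chip pod size is described in the next paragraph):

```python
# Rough scale arithmetic from the figures quoted in this article.
chips_per_pod = 256          # Trillium pod size (see below)
chips_per_fabric = 100_000   # Trillium chips per Jupiter network fabric
print(f"~{chips_per_fabric // chips_per_pod} pods per Jupiter fabric")
# -> roughly 390 pods stitched together on one fabric, which is how a single
# distributed training job can span hundreds of thousands of accelerators.
```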

Trillium comes in a 256-chip pod and scales to hundreds of thousands of chips, making it possible to train AI models with trillions of parameters. Trillium offers the following advantages compared to its predecessor:

  • More than four times the training performance
  • Up to 3x increase in inference throughput
  • A 67% increase in energy efficiency
  • An impressive 4.7x increase in peak computing power per chip
  • Double the high-bandwidth memory (HBM) capacity
  • Double the interchip interconnect bandwidth
  • 100,000 Trillium chips in a single Jupiter network fabric
  • Up to 2.5x improvement in training performance per dollar and up to 1.4x improvement in inference performance per dollar

Although TPUs have always been great accelerators, they have struggled to gain significant market share because they are available only on Google Cloud Platform. However, Apple has now stated that Apple Intelligence was trained on Google TPUs, and the TPU holds the largest share among the cloud providers’ in-house accelerators, ahead of those from Microsoft and Amazon AWS.

But where Trillium really shines is with mixture-of-experts models, or MoEs. These models stress the network connecting the accelerators, and Trillium pods are almost four times faster on them than same-sized pods of the previous TPU v5e. As MoEs continue to grow, this could help Google gain more market share for its cloud.
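
A minimal sketch (toy sizes, plain NumPy as a stand-in for a real sharded training framework) of why MoE models lean so heavily on the interconnect: the router sends most tokens to experts that live on other chips, so activations must cross the network on every MoE layer.

```python
import numpy as np

rng = np.random.default_rng(0)

num_chips = 4          # hypothetical setup: one expert per accelerator
tokens_per_chip = 8

# The router assigns every token on every chip to one expert
# (random top-1 routing, purely for illustration).
assignments = [rng.integers(0, num_chips, size=tokens_per_chip) for _ in range(num_chips)]

# "All-to-all" dispatch: count how many token activations must leave their home chip.
cross_chip = sum(int(np.sum(a != chip)) for chip, a in enumerate(assignments))
total = num_chips * tokens_per_chip
print(f"{cross_chip}/{total} token activations traverse the interconnect per MoE layer")
# With uniform routing, about (num_chips - 1) / num_chips of all tokens move over the
# network every MoE layer, so interconnect bandwidth -- not just per-chip compute --
# gates MoE training throughput.
```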

Google finally joins the Arm race

Google recently released its own Arm chip, Axion, following Amazon AWS and Microsoft Azure, which have already released their Arm server CPUs. Even though the technology is less glamorous than AI and quantum processors, Arm has steadily increased server CPU market share. TrendForce predicts that the penetration rate of Arm architecture in data center servers will reach 22% by the end of 2025, posing a significant threat to x86 CPU market share.

Axion is a well-designed server CPU with 35% better performance, 60% better price/performance and 65% better power efficiency than “current generation” x86 chips. Now that it is generally available, we can expect Axion to get a significant share of the Google Cloud compute fleet for internal and customer applications.

Where does this leave Google?

We believe that Google now has a portfolio of complete in-house systems that allows it to compete with, and even best, other cloud providers and chip companies. Google has a fast Arm-based CPU, a fast and scalable AI accelerator that excels at the latest workloads like training MoE models, a quantum computing platform that can match or surpass most competitors, and the AI software stack and AI models developers need to attract the ecosystem and end users to its cloud computing platform.

Disclosures: This article reflects the author’s opinion and should not be construed as a recommendation to buy or invest in any of the companies mentioned. Cambrian-AI Research is fortunate to count many, if not most, semiconductor companies among our customers, including Blaize, BrainChip, Cadence Design, Cerebras, D-Matrix, Eliyan, Esperanto, Flex, GML, Groq, IBM, Intel, Nvidia, Qualcomm Technologies, SiFive, SiMa.ai, Synopsys, Ventana Microsystems, Tenstorrent and numerous investment customers. Like many in the tech industry, while we hold Nvidia in our portfolio, we do not otherwise hold any investment positions in any of the companies mentioned in this article and have no plans to initiate any in the near future. For more information, please visit our website at http://www.cambrian-ai.com.
