Why Stanford Researchers Are Using CuBox Clusters to Train Neural Networks

There’s not much to the quiet hum coming from Stanford’s research buildings. Mostly cooling fans. The sound of someone occasionally unplugging a cable. But behind those walls, racks of NVIDIA GPUs are powering nearly every recent advance in artificial intelligence, and the researchers who run them are quietly shaping what gets tested next.

This wasn’t always the case. In 2008, most computer science departments regarded neural networks as something of a dead end. Gently or harshly, graduate students were told to look elsewhere. Then, in 2012, a group at the University of Toronto used Fei-Fei Li’s ImageNet dataset, built quietly at Stanford after she had moved her project from Princeton, to train a model known as AlexNet on NVIDIA GPUs. The result was so lopsided that the field changed in less than a year.

The math hadn’t really changed. The algorithm at the center of it all, backpropagation, was described by Rumelhart, Hinton, and Williams in Nature in 1986. For roughly twenty-five years, the concept had existed, largely unaltered. What was lacking was hardware that could run it at any significant scale. Originally designed to render explosions in video games, GPUs have proven to be nearly ideal for the dense matrix math that neural networks require. It’s the kind of coincidental fit that science historians like.
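To make the “dense matrix math” point concrete, here is a minimal sketch of one bias-free linear layer, showing that both the forward pass and the backpropagated gradients reduce to matrix products. It is an illustration only: the helper names (`matmul`, `linear_forward`, `linear_backward`) are hypothetical, and plain Python lists stand in for the GPU libraries a real lab would use.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def linear_forward(x, w):
    """Forward pass of a bias-free linear layer: y = x @ w."""
    return matmul(x, w)

def linear_backward(x, w, grad_y):
    """Backward pass: both gradients are again matrix products."""
    grad_w = matmul(transpose(x), grad_y)  # dL/dw = x^T @ dL/dy
    grad_x = matmul(grad_y, transpose(w))  # dL/dx = dL/dy @ w^T
    return grad_x, grad_w

# Toy data (made up for illustration): 2 inputs with 2 features each.
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[0.5], [-1.0]]             # 2 features -> 1 output
y = linear_forward(x, w)
grad_y = [[1.0], [1.0]]         # gradient of the loss L = sum of all outputs
grad_x, grad_w = linear_backward(x, w, grad_y)
```

Every step is a multiply-and-add over a grid of numbers, which is the same workload a GPU performs when it shades pixels. Stacking more layers only chains more of these products together.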

Stanford’s relationship with GPU clusters has shifted from experimental to existential. In a 2021 paper with NVIDIA and Microsoft Research, Stanford researchers reported using thousands of GPUs in parallel to train language models with up to a trillion parameters, at an aggregate throughput of about 502 petaFLOP/s. It’s easy to scroll past numbers like that, but it helps to remember that a single high-end GPU in 2012 peaked at only a few teraFLOP/s, on the order of a hundred-thousandth of that figure.
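One way to make the aggregate figure less abstract is to spread it across the cluster. The sketch below does that arithmetic. The GPU count is an assumption chosen for illustration, since the text says only “thousands,” and the function name is hypothetical.

```python
PETA = 10**15
TERA = 10**12

def per_gpu_teraflops(aggregate_flops, gpu_count):
    """Average throughput per GPU, in teraFLOP/s."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return aggregate_flops / gpu_count / TERA

aggregate = 502 * PETA      # aggregate throughput quoted in the text
assumed_gpu_count = 3072    # ASSUMPTION: the text says only "thousands"

average = per_gpu_teraflops(aggregate, assumed_gpu_count)
print(f"{average:.1f} teraFLOP/s per GPU (under the assumed count)")
```

Swapping in a different count changes the per-GPU average proportionally, so the result is only as good as the assumed cluster size.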

The university’s approach seems to be deliberately different from what is happening in industry. “Move fast and break things” doesn’t really apply here. Yuyan Wang, an assistant professor in the Graduate School of Business, came to Stanford from front-line AI work at Google DeepMind and Uber because, in her words, not enough is known about why these systems work. She wanted a slower pace. Seen in this light, the GPU clusters are not trophies but tools for examining what is actually happening inside the models.

That probing matters more than it might seem. Chelsea Finn’s robotics lab uses GPU clusters to train Mobile ALOHA, a low-cost robot that can cook shrimp. The robot’s learned policies were shaped on racks humming somewhere down a campus corridor, but the hardware in the demo videos isn’t glamorous: just wires, brackets, and a small apron of cooling lines.

It’s difficult to ignore how quickly the picture has changed. Ten years ago, GPU clusters were a niche concern for a small number of labs. Now, from climate modeling with Aditi Sheshadri to therapeutics work with Brian Trippe, they are the bottleneck for almost every significant AI question Stanford is asking. Everyone understands that researchers without enough computing power simply cannot compete on some problems.

Whether this is good for the field is still up for debate. Some at Stanford worry about the energy footprint, the concentration of computing power, and the risk that promising research gets crowded out by whatever fits on the available hardware. Others believe the clusters are the only reliable way to test the bigger claims being made about machine intelligence.

There’s an odd irony in watching this unfold. The closest thing the world has to a microscope for the mind is a machine designed to render pixels. Researchers at Stanford appear to be aware of this, and they are making the most of every minute of GPU time they can get.
