July 3, 2024

Interconnects 101

Our Intra-Thematic AI Primers on Bits & Bottlenecks


Preface

AI will soon be expanding beyond just the datacenter infrastructure “picks and shovels” plays, in my opinion. The explosion in use cases has already begun and will likely result in many “wow, I get it now” moments in the next 6 months. People are building and the technology is improving, even if in ways that are not readily apparent to casual end users - it soon will be. We might even see the first genuine AI-driven improvement in a public company’s bottom line within the next couple of quarters!

In the meantime, generalists are left to keep up with a rapidly advancing semiconductor landscape that can easily leave you behind - a function of AI’s position on the “Disruption-Continuation” spectrum we mentioned in our One Year Anniversary piece.

Citrindex One-Year Anniversary, June 2, 2024

The shame is that it often robs you of the ability to truly understand the whole picture while ensuring you’re properly positioned - a picture that will only become more important as these narratives take hold in the real world. The data center buildout is underway, but the beneficiaries will now skew towards whoever can provide the most marginal benefit.

Getting long AI via data center infrastructure in 2023 only required second-order thinking. But technology only marches forward, never back. While one has indeed so far been able to get away with an understanding of AI that only goes GPU-deep, I believe that will rapidly begin to change.

In my opinion, performance in SMH will narrow and we’ll see a broadening out elsewhere (before the end of the year).

That doesn’t mean there won’t be significant outperformance in our trusty Phase 1 AI beneficiary names, though. We’ll likely see a lot more dispersion in the “AI Semiconductor” space, so it’s important to grasp the drivers that are most likely to see continued benefit in order to keep us asymmetrically positioned as the AI theme progresses. 

For that reason, we’re introducing our first “Intra-Thematic Primer”. These will go beyond simply updating what’s occurred in a specific theme since publication to highlight areas that we believe will play an increasingly important role in the theme and should be understood.

This article is meant to serve as an introduction to the business of High Performance Computing infrastructure through the lens of Interconnects.

Our call is that interconnects are likely to be more insulated from the risk of moderating AI datacenter capex, retaining upside from efforts in custom silicon (for inference now and, potentially, for training in the future). This is an area we believe is ripe for innovation and one that will only become more important as we progress through AI’s development.

Without further ado…


Interconnects 101: Our First Intra-Thematic Primer

Have you ever wondered what makes a GPU cluster so special?

Why are 8, 72, or 32,768 GPUs together better than 1?

How do peripheral devices (like GPUs) connect and communicate with each other to work together?

For that matter, have you ever wondered how GPUs even work or why they work better for AI?

It might surprise you that the methods employed in hyperscale data centers are essentially souped-up versions of the standards and technologies that drive our personal machines.

This cast of characters includes Ethernet, fiber optics, and PCIe (Peripheral Component Interconnect Express), which you might know from your home internet, TV, and high-end consumer electronics, respectively.
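To make that last point concrete: on an ordinary Linux machine, a discrete GPU is simply another device on the PCIe bus, enumerable like any other peripheral. The snippet below is a minimal sketch of our own (assuming a Linux host and its standard /sys/bus/pci interface - nothing specific to data center gear) that lists the display controllers hanging off that bus:

```python
from pathlib import Path

# On Linux, every PCIe device is exposed under /sys/bus/pci/devices.
# A discrete GPU shows up here like any other peripheral: the same
# protocol that links it to a desktop CPU links it, in souped-up
# form, to the CPUs in a hyperscale server.
PCI_DEVICES = Path("/sys/bus/pci/devices")

for dev in sorted(PCI_DEVICES.iterdir()):
    vendor = (dev / "vendor").read_text().strip()    # e.g. "0x10de" is NVIDIA
    pci_class = (dev / "class").read_text().strip()  # e.g. "0x030000"
    # PCI class code 0x03xxxx marks a display controller, i.e. a GPU.
    if pci_class.startswith("0x03"):
        print(f"{dev.name}: vendor={vendor} class={pci_class}  <- a GPU on the PCIe bus")
```

The exact same enumeration happens inside an AI server - just with more lanes, more devices, and far faster links layered on top.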

In order to elaborate on why interconnect technology plays such an important role in the progression of AI compute, we first need to address a question I know some of you have been happily ignoring while riding your long NVDA position to the moon:

Before diving into the connections: why do we even use GPUs for artificial intelligence?
