Wikipedia Archive — Reference Article
| Cerebras Systems | |
| --- | --- |
| Type | Private |
| Founded | 2016 |
| Founders | Andrew Feldman, Gary Lauterbach, Sean Lie, Michael James, Jean-Philippe Fricker, Dhiraj Malhotra |
| Headquarters | Sunnyvale, California, United States |
| Industry | Artificial Intelligence, Semiconductors, High-performance Computing |
| Products | CS-1, CS-2, CS-3, Wafer Scale Engine (WSE) |
| Website | cerebras.net |
Cerebras Systems is an American artificial intelligence (AI) company specialising in the design and manufacture of deep learning hardware and software. Founded in 2016 and headquartered in Sunnyvale, California, the company is best known for producing the Wafer Scale Engine (WSE), the largest chip ever manufactured, designed specifically for accelerating neural network training and inference workloads.[1]
The company was co-founded by Andrew Feldman and Gary Lauterbach, veterans of the semiconductor industry, along with several other engineers. Cerebras has raised significant venture capital funding and has attracted attention from the AI research community and enterprise customers seeking alternatives to conventional GPU-based computing clusters.[2]
Cerebras' flagship product line — the CS-1, CS-2, and CS-3 compute systems — integrates the WSE chip directly into a rack-mounted unit capable of training large language models and other deep neural networks, which the company claims delivers substantially higher speed and energy efficiency than traditional multi-GPU setups.[3]
Cerebras Systems was founded in 2016 by Andrew Feldman, who had previously co-founded and served as CEO of SeaMicro (acquired by AMD in 2012), and Gary Lauterbach, along with Jean-Philippe Fricker, Sean Lie, Michael James, and Dhiraj Malhotra. The founding team shared a vision that the conventional approach of using arrays of graphics processing units (GPUs) to train neural networks was fundamentally inefficient, due to the overhead of communicating data between discrete chips over relatively slow interconnects.[4]
The company operated in stealth mode for several years before publicly revealing its technology at the Hot Chips conference in August 2019, when it announced the Wafer Scale Engine (WSE-1). The announcement generated significant media and industry attention because the chip was dramatically larger than any commercially available processor at the time, occupying an entire silicon wafer rather than a small die.[5]
In 2020, Cerebras released the CS-1, its first commercial compute system incorporating the WSE-1. The following year saw the announcement of the WSE-2 and the CS-2 system, offering substantially improved transistor counts, core counts, and on-chip memory bandwidth. The CS-3 system, incorporating the WSE-3, was announced in 2024, further extending the performance envelope of the platform.[6]
The Wafer Scale Engine is the defining technology developed by Cerebras. Unlike conventional chips, which are diced from a silicon wafer into individual dies, the WSE uses the entire wafer as a single, monolithic processor. This approach eliminates the chip-to-chip communication latency that plagues multi-GPU systems, placing all compute cores and on-chip memory on a single substrate.[7]
The first generation WSE-1, announced in 2019, measured approximately 46,225 mm² and contained 1.2 trillion transistors, 400,000 AI-optimised compute cores, and 18 gigabytes of on-chip SRAM. For comparison, leading GPU dies at the time measured around 800 mm².[8]
The second generation WSE-2, introduced in 2021, grew to 2.6 trillion transistors, 850,000 cores, and 40 gigabytes of on-chip SRAM, manufactured on TSMC's 7 nm process node. The WSE-3, announced in 2024, further expanded to 4 trillion transistors and 900,000 cores using a 5 nm process.[9]
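The scale gap these figures describe can be checked with simple arithmetic. The sketch below uses only numbers quoted above (the ~800 mm² GPU figure is an approximation for the era, not an exact measurement):

```python
# Back-of-envelope comparison using figures quoted in the text.
WSE1_AREA_MM2 = 46_225      # WSE-1 die area (an entire wafer)
GPU_DIE_AREA_MM2 = 800      # approximate leading GPU die circa 2019

area_ratio = WSE1_AREA_MM2 / GPU_DIE_AREA_MM2
print(f"WSE-1 is roughly {area_ratio:.0f}x the area of a contemporary GPU die")

# Transistor and core counts per generation, as stated above.
generations = {
    "WSE-1": {"transistors": 1.2e12, "cores": 400_000},
    "WSE-2": {"transistors": 2.6e12, "cores": 850_000},
    "WSE-3": {"transistors": 4.0e12, "cores": 900_000},
}
t_growth = generations["WSE-2"]["transistors"] / generations["WSE-1"]["transistors"]
print(f"WSE-1 to WSE-2 transistor growth: about {t_growth:.2f}x")
```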
Manufacturing wafer-scale chips presents significant engineering challenges, particularly around defect tolerance. Cerebras employs redundancy techniques where defective cores are identified and disabled during production, ensuring a sufficient number of functional cores in every shipped unit.[10]
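As a rough illustration of this redundancy approach, the following toy model fabricates more cores than the shipped specification requires, randomly marks some defective, and checks whether enough survive. All numbers here (core counts, defect rate, spare margin) are hypothetical; Cerebras' actual yield-management scheme is not public in this detail:

```python
import random

def harvest_cores(fabricated: int, defect_rate: float, spec: int, seed: int = 0):
    """Toy yield model: randomly mark cores defective, disable them,
    and check whether enough functional cores remain to meet spec."""
    rng = random.Random(seed)
    functional = [rng.random() >= defect_rate for _ in range(fabricated)]
    good = sum(functional)
    return good, good >= spec

# Hypothetical numbers: fabricate 2% more cores than the shipped spec,
# against a 1% per-core defect rate.
good, shippable = harvest_cores(fabricated=1_020_000, defect_rate=0.01,
                                spec=1_000_000, seed=42)
print(good, shippable)
```

With a 2% core surplus against a 1% defect rate, virtually every wafer clears the shipped specification; the spare margin is what makes wafer-scale yield viable despite inevitable defects.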
The CS-1, CS-2, and CS-3 are rack-mounted compute appliances designed to house and power the WSE chips. Because a wafer-scale chip consumes substantially more power than a conventional processor and requires specialised cooling, each CS unit integrates custom power delivery and liquid cooling systems engineered to handle the thermal load.[11]
The CS systems connect to external host servers for data ingestion and model orchestration, while the WSE chip itself handles the compute-intensive matrix and tensor operations required for neural network training. The systems are compatible with popular deep learning frameworks including PyTorch and TensorFlow through Cerebras' software stack.[12]
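This division of labour (host servers handle data ingestion and orchestration, the wafer handles the heavy tensor math) can be sketched in miniature. The code below is a generic illustration, not Cerebras' software stack; a plain dot product stands in for real tensor operations:

```python
from typing import Iterator, List

def host_ingest(dataset: List[List[float]], batch_size: int) -> Iterator[List[List[float]]]:
    """Host-side role: stream batches of input data to the accelerator."""
    for i in range(0, len(dataset), batch_size):
        yield dataset[i:i + batch_size]

def device_compute(batch: List[List[float]], weights: List[float]) -> List[float]:
    """Accelerator-side role: the compute-intensive matrix work.
    A per-sample dot product stands in for tensor operations here."""
    return [sum(w * x for w, x in zip(weights, sample)) for sample in batch]

weights = [0.5, -1.0, 2.0]
dataset = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0], [2.0, 0.0, 1.0], [1.0, 1.0, 1.0]]

outputs = []
for batch in host_ingest(dataset, batch_size=2):       # host feeds data
    outputs.extend(device_compute(batch, weights))     # device computes
print(outputs)
```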
The company also introduced MemoryX and SwarmX technologies to address the memory capacity constraints inherent in on-chip SRAM, enabling large language models with billions or trillions of parameters to be trained across multiple CS-2 or CS-3 systems by offloading model weights to external memory pools and coordinating work across a cluster of wafer-scale processors.[13]
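The weight-streaming idea behind this can be sketched conceptually: weights reside in an external pool and are brought on-chip one layer at a time, so total model size is decoupled from on-chip SRAM capacity. All names and mechanics below are illustrative, not Cerebras' actual MemoryX API:

```python
# Conceptual sketch of weight streaming: layer weights live in an external
# memory pool and are streamed into a small on-chip buffer one layer at a
# time, so the full model never has to fit in on-chip SRAM.
external_pool = {      # stands in for a MemoryX-style external memory pool
    "layer0": 2.0,
    "layer1": 0.5,
    "layer2": -1.0,
}

def forward(x: float) -> float:
    """Run a forward pass, streaming one layer's weights at a time.
    Scalar multiplication stands in for a full layer's tensor maths."""
    activation = x
    for name in ("layer0", "layer1", "layer2"):
        onchip_weight = external_pool[name]      # stream weights on-chip
        activation = activation * onchip_weight  # compute with them
        # the on-chip buffer is then reused for the next layer
    return activation

print(forward(3.0))
```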
Cerebras Systems has raised multiple rounds of venture funding from prominent investors. Key backers include Benchmark, Eclipse Ventures, Foundation Capital, Altimeter Capital, and others. Before its Series F, the company had raised over $475 million in total funding; the Series F round, announced in November 2021, added a further $250 million at a valuation of approximately $4 billion.[14]
In 2024, Cerebras filed for an initial public offering (IPO) on the NASDAQ under the ticker symbol "CBRS". The filing attracted significant attention due to the company's rapid revenue growth driven by enterprise and government customers deploying its AI hardware, as well as its high-profile partnership with G42, a UAE-based AI and cloud computing conglomerate.[15]
Cerebras hardware has been deployed across a range of scientific and commercial AI workloads, including the training and inference of large neural networks.
Cerebras has also offered cloud-based access to its CS systems, enabling organisations to use wafer-scale compute via an API without the capital expenditure of dedicated hardware, positioning the company to compete with GPU cloud providers.[16]
The primary competitive landscape for Cerebras includes established GPU vendors and other makers of specialised AI accelerators.
Despite strong competition, Cerebras has maintained a differentiated position through the sheer scale of its wafer-level integration, which provides uniquely high memory bandwidth and ultra-low latency communication between cores that is difficult to replicate with multi-chip architectures.[17]