I was at SC19 last week and as always there was lots to see on the expo floor and at the show in general. Two expo booths that I thought were especially interesting were:
- Zapata Computing systems – a quantum computing programming for hire outfit and
- Cerebras – a new AI wafer scale accelerator chip that sported 400K+ cores in a single package.
Zapata Computing, quantum coding for hire
We’ve been on a sort of quantum thread this past month or so (e.g., see our Quantum computing – part 2 and part 1, The race for quantum supremacy posts). Zapata Computing was at the edge of the exhibit floor in a small booth pretty much just one guy (Michael Warren) and their booth with some handouts. Must have had something on the booth about quantum computing, because I stopped by
Warren said they have ~20 PhDs, from around the world working for them and provide quantum coding for hire. Zapata works with organizations to either get them up to speed on quantum programing or write quantum programs themselves under contract for clients and help run them on quantum computers.
Zapata’s quantum algorithms are designed to run on any type of quantum computer such as ion trap, superconducting qubit, quantum annealers, etc. They also work with Microsoft Azure Quantum, IBM Q, Rigetti, and Honeywell systems to run quantum programs for customers. Notably missing from this list was Google and Honeywell is new to me but seem active in quantum computing.
Zapata has their own Orquestra quantum toolkit. We have discussed quantum software development kits like IBM Q Qiskit previously but Microsoft has their own, QDK and Rigetti has Forrest SDK. So, presumably, Orquestra front ends these other development kits. Couldn’t find anything on Honeywell but it’s likely they have their own development kit as well or make use of others.
In talking to the Warren at the show, Zapata is working to come up with a quantum computing cloud, which can be used to run quantum code on any of these quantum computers with the click of a button. Warren sounded like this was coming out soon.
Some of the Zapata Computing quantum programs they have developed for clients include: logistic simulations, materials design, chemistry simulations, etc.
Warren didn’t mention the cost of running on quantum computers but he said that some companies are more forthright with pricing than others. It seemed Rigetti had a published price list to use their systems but others seemed to want to negotiate price on a per use basis.
It seems only a matter of time before quantum computing becomes just like GPUs. Just another computational accelerator that works well for some workloads but not others. Zapata Computing and Orquestra are just steps along this path.
AI accelerator chips have also been a hot topic for us (see our posts on Google TPU, GraphCore’s system, and the Mythic’s and Syntiant’s AI accelerators). But none,. with the possible exception of GraphCore, has taken this on to quite the same level as Cerebras.
Cerebras offers a wafer scale chip that is embedded into their CS-1 system. The chip has 400K cores, 18GB of (very fast) SRAM (memory), 100Pb/sec (peta-bits or 10**15 bits per second) of bandwidth and draws ~20kW. Their CS-1 system fits in a standard rack taking up 15U of space.
The on-chip fabric is called SWARM which supports a 2D mesh. The SWARM mesh is entirely configurable, to support optimal neural network connectivity. I assume this means that any core can talk directly (with 0 hops) to any other core on the chip through a configuration setup.
The high speed on chip SRAM supports up to 9PB/sec of memory bandwidth and can be accessed in a single clock cycle. They call the cores Sparse Linear Algebra Compute (SLAC) cores and say that they are optimized to support ML-DL computations, which we assume meansfloating point aritmetic.
Although you can’t really see the (wafer scale) chip in the picture above, it’s located in the section between the copper plate and the copper heat sink and is starts at the copper line between the two. CS-1 consumes a lot of power and much of its design is to provide proper cooling. One can view some of that on the left side of the picture above.
As for software, Cerebras CS-1 supports TensorFlow and PyTorch as well as standard C++. Their Cerebras Software Platform stack, consists of two layers: the Cerebras Intermediate Representation and Cerebras Graph Compiler (CGC) that feeds their Cerebras Wafer Scale Engine (WSE). The CGC maps neural network nodes to cores on the WSE and probably configures SWARM to provide NN core to NN core connectivity.
It’s great to see hardware innovation again. There was a time where everyone thought that software alone was going to kill off hardware innovation. But the facts are that both need to innovate to take computing forward. Cerebras didn’t tell me any PetaFlop rate for their system and but my guess it would beat out the 2PFlop GraphCore2 (GC2) system but it’s only a matter of time before GC3 comes out. That being said, what could be beyond wafer scale integration?
I enjoy going to SC19 for all the leading edge technology on display. They have some very interesting cooling solutions that I don’t ever see anywhere else. And the student competition is fun. Teams of students running HPC workloads around the clock, on donated equipment, from Monday evening until Wednesday evening. With (by SC19) spurious fault injection to see how they and their systems react to the faults to continue to perform the work needed.
For every SC conference, they create an SCinet to support the show. This year it supported Tb/sec of bandwidth and the WiFi for the floor and conference. All the equipment and time that goes into creating SCinet is donated.
Unfortunately, I didn’t get a chance to go to keynotes or plenary sessions. I did attend one workshop on container use in HPC and it was completely beyond me. Next years, SC20 will be in Atlanta.