Photonic [AI] computing seeing the light of day – part 2

Read an interesting article in Analytics India Magazine (MIT Researchers Make New Chips That Work On Light) about a startup out of MIT focused on using photonics for AI/ML/DL activities. Not exactly neuromorphic chips, but using analog photonics interactions to perform computational intensive operations required by todays deep neural net training.

We’ve written about photonics computing before ( see Photonic computing seeing the light of day [-part 1]). That post was about spin outs from Princeton and MIT back in 2019. We showed a bit more on how photonics can perform multiplication and other computations with less power.

The article (noted above) talked about LightIntelligence, an MIT spinout/ startup that’s been around since ~2017, but there’s another company in the same space, also out of MIT called LightMatter that just announced early access to their hardware system.

The CEOs of both companies collaborated on a paper (#1&2 authors of the 10 author paper) written back in 2017 on Deep Learning with Coherent Nanophotonic Circuits. This seemed to be the event that launched both companies.

LightMatter just received $80M in Series B funding ( bringing total funding to $113M) last month and LightIntelligence seems to have $40M in total funding So both have decent funding but, LightMatter seems further ahead in funding and product technology.


LightMatter Envise Photonics-RISC AI processing chip

LightMatter Envise AI chip uses standard RISC electronic cores together with Photo Arithmetic Units for accelerated AI computations. Each Envise chip has 500MB of SRAM for large models, offers 400Gbps chip to chip interconnect fabric, 256 RISC cores, a Graph processor, 294 photonic arithmetic units and PCIe 4.0 connectivity.

LightMatter has just announced early access for their Envise AI photonics server. It’s an 4U, AI server with 16 Envise chips, 2 AMD EPYC CPUs, (16×400=)6.4Tpbs optical fabric for inter-chip communications, 1TB of DDR4 DRAM, 3TB of NVMe SSD and supports 2-200GbE SmartNICs for outside communications.

Envise also offers Idiom Software that interfaces with standard AI frameworks to transform models for photonics computing to use Envise hardware . Developers select Envise hardware to run their AI models on and Idiom automatically re-compiles (IdCompile) their model into more parallelized, photonics operations. Idiom also has a model profiler (IdProfiler) to help debug and visualize photonic models in operation (training or inferencing?) on Envise hardware. Idiom also offers an AI model library (IdML) which provides a PyTorch frontend to help compress and quantize a standard set of AI models.

LightMatter also announced their Passage optical interconnect chip that supplies 100Tbps optical switch for photonics, CPU or GPU processing. It’s huge, 8″x8″ and built on 5nm/7nm node process. Passage can connect up to 48 photonics, CPU or GPU chips that are built onto of it (one can see the space for each of these 48 [sub-]chips on the chip). LightMatter states that 40 Passage (photonic/optical) lanes are the width of one optical fibre. Passage chips are sampling now.

LightMatter Passage photonics-transistor chip (carrier) that provides a photonics programmable interconnect for inter-[photonics-electronic-]chip communications.


They don’t appear to be announcing any specific hardware just yet but they are at work in creating the world largest integrated photonics processing system. But LightIntelligence have published a number of research papers focused on photonic approaches to CNNs, RNNs/LSTMs/GRUs, Recurrent ISING machines, statistical computing, and invisibility cloaking.

Turns out the processing power needed to provide invisibility cloaking is very intensive and as its all pixels, photonics offers serious speedups (for invisibility, see Nature article, behind paywall).

Photonics Recurrent ISLING Sampler (PRIS)

LightIntelligence did produce a prototype photonics processor in 2019. And they believe the will have de-risked 80-90% of their photonics technology by year end 2021.

If I had to guess, it would appear as if LightIntelligence is trying to re-imagine deep learning taking a predominately all photonics approach.

Why photonics for AI DL

It turns out that one can use the interaction/interference between two light beams to perform matrix multiplication and other computations a lot faster, with a lot less power than using standard RISC (or CISC) electronic processor architectures. Typical GPUs run 400W each and multi-GPU training activities are commonplace today.

The research documented in the (Deep learning using nanophotonics) paper was based on using an optical FPGA which we have talked about before (See Photonics or Optical FPGAs on the horizon) to prototype the technology back in 2017.

Can photonics change the technology underpinning AI or computing?

If by using photonics, one could speed up AI inferencing by 3-5X and do it with 5-6X less power, you might have a market. These are LightMatter Envise performance numbers on ResNet50 with ImageNet and BERT-Base with SQUAD v1.1 against NVIDIA DGX-A100 (state of the art) AI processing system.

The challenge to changing the technology behind multi-million/billion/trillion dollar industry is that it’s not sufficient to offer a product better than the competition. One has to offer a technology that’s better enough to fund the building of a new (multi-million/billion/trillion dollar) ecosystem surrounding that technology. In order to do that it’s got to be orders of magnitude faster/lower power/better so that commercial customers adopt it en masse.

I like where LightMatter is going with their Passage chip. But their Envise server doesn’t seem fast enough to give them enough traction to build a photonics ecosystem or to fund Envise 2, 3, 4, etc. to change the industry.

The 2017 (Deep learning using nanophotonics) paper predicted that an all optical/photonics implementation of CNN would use 3 orders of magnitude less power for small models and that advantage would only go up for larger models (not counting power for data movement, photo detectors, etc.). Now if that’s truly feasible and maybe it takes a more photonics intensive processor to get there, then photonics technology could truly transform the AI or for that matter the computing industry.

But the other thing that LightIntelligence and LightMatter may be counting on is the slowdown in Moore’s law which may inhibit further advances in electronics processing power. Whether the silicon industry is ready to throw in the towel yet on Moore’s law is TBD.


Photo Credit(s):

A “few exabytes-a-day” from SKA

A number of radio telescopes, positioned close together pointed at a cloudy sky
VLA by C. G. P. Grey (cc) (from Flickr)

ArsTechnica reported today on the proposed Square Kilometer Array (SKA) radio telescope and it’s data requirements. IBM is in collaboration with the Netherlands Institute for Radio Astronomy (ASTRON) to help develop the SKA called the DOME project.

When completed in ~2024, the SKA will generate over an exabyte a day (10**18) of raw data.  I reported in a previous post how the world was generating an exabyte-a-day, but that was way back in 2009.

What is the SKA?

The new SKA telescope will be a configuration of “millions of radio telescopes” which when combined together will create a telescope with an aperture of one square kilometer, which is no small feet.  They hope that the telescope will be able to shed some light on galaxy evolution, cosmology and dark energy.  But it will go beyond that to investigating “strong-field tests of gravity“, “origins and evolution of cosmic magnetism” and search for life on other planets.

But the interesting part from a storage perspective is that the SKA will be generating a “few exabytes a day” of radio telescopic data for every full day of operation.   Apparently the new radio telescopes will make use of a new, more sensitive detector able to generate data of up to 10GB/second.

How much data, really?

The team projects final storage needs at between 300 to 1500 PB per year. This compares to the LHC at CERN which consumes ~15PB of storage per year.

It would seem that the immediate data download would be the few exabytes and then it would be post- or inline-processed into something more mangeable and store-able.  Unless they have some hellaciously fast processing, I am hard pressed to believe this could all happen inline.  But then they would need at least another “few exabytes” of storage to buffer the data feed before processing.

I guess that’s why it’s still a research project.  Presumably, this also says that the telescope won’t be in full operation every day of the year, at least at first.

The IBM-ASTRON DOME collaboration project

The joint research project was named for the structure that covers a major telescope and for a famous Swiss mountain.  Focus areas for the IBM-ASTRON DOME project include:

  • Advanced high performance computing utilizing 3D chip stacks for better energy efficiency
  • Optical interconnects with nanophotonics for high-speed data transfer
  • Storage for both high access performance access and for dense/energy efficient data storage.

In this last focus area, IBM is considering the use of phase change memories (PCM) for high access performance and new generation tape for dense/efficient storage.  We have discussed PCM before in a previous post as an alternative to NAND based storage today (see Graphene Flash Memory).  But IBM has also been investigating MRAM based race track memory as a potential future storage technology.  I would guess the advantage of PCM over MRAM might be access speed.

As for tape, IBM has already demonstrated in their labs technologies for a 35TB tape. However storing 1500 PB would take over 40K tapes per year so they may need another even higher capacities to support SKA tape data needs.

Of course new optical interconnects will be needed to move this much data around from telescope to data center and beyond.  It’s likely that the nanophotonics will play some part as an all optical network for transceivers, amplifiers, and other networking switching gear.

The 3D chip stacks have the advantage of decreasing chip IO and more dense packing of components will make efficient use of board space.  But how these help with energy efficiency is another question.  The team projects very high energy and cooling requirements for their exascale high performance computing complex.

If this is anything like CERN, datasets gathered onsite are initially processed then replicated for finer processing elsewhere (see 15PB a year created by CERN post.  But moving PBs around like SKA will require is way beyond today’s Internet infrastructure.


Big science like this gives a whole new meaning to BIGData. Glad I am in the storage business.  Now just what exactly is nanophotonics, mems based phote-electronics?