Photonic [AI] computing seeing the light of day – part 2

Read an interesting article in Analytics India Magazine (MIT Researchers Make New Chips That Work On Light) about a startup out of MIT focused on using photonics for AI/ML/DL activities. These aren't exactly neuromorphic chips; rather, they use analog photonic interactions to perform the computationally intensive operations required by today's deep neural net training.

We've written about photonics computing before (see Photonic computing seeing the light of day [-part 1]). That post covered spin-outs from Princeton and MIT back in 2019 and showed a bit more on how photonics can perform multiplication and other computations with less power.

The article (noted above) talked about LightIntelligence, an MIT spinout/startup that's been around since ~2017. But there's another company in the same space, also out of MIT, called LightMatter, that just announced early access to their hardware system.

The CEOs of both companies collaborated on a paper (they are the #1 and #2 authors of the 10-author paper) written back in 2017 on Deep Learning with Coherent Nanophotonic Circuits. This seems to be the event that launched both companies.

LightMatter just received $80M in Series B funding (bringing total funding to $113M) last month, while LightIntelligence appears to have $40M in total funding. So both have decent funding, but LightMatter seems further ahead in both funding and product technology.

LightMatter

LightMatter Envise Photonics-RISC AI processing chip

The LightMatter Envise AI chip uses standard RISC electronic cores together with photonic arithmetic units for accelerated AI computations. Each Envise chip has 500MB of SRAM for large models and offers a 400Gbps chip-to-chip interconnect fabric, 256 RISC cores, a graph processor, 294 photonic arithmetic units and PCIe 4.0 connectivity.

LightMatter has just announced early access for their Envise AI photonics server. It's a 4U AI server with 16 Envise chips, 2 AMD EPYC CPUs, a (16×400Gbps=) 6.4Tbps optical fabric for inter-chip communications, 1TB of DDR4 DRAM, 3TB of NVMe SSD, and support for 2 200GbE SmartNICs for outside communications.

Envise also offers Idiom software that interfaces with standard AI frameworks to transform models to use Envise photonics hardware. Developers select Envise hardware to run their AI models on, and Idiom automatically re-compiles (IdCompile) their model into more parallelized, photonics operations. Idiom also has a model profiler (IdProfiler) to help debug and visualize photonic models in operation (training or inferencing?) on Envise hardware. And Idiom offers an AI model library (IdML) which provides a PyTorch frontend to help compress and quantize a standard set of AI models.
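I don't have access to Idiom's actual API, so the IdCompile/IdML calls aren't shown here. But the compress/quantize step IdML reportedly automates is similar in spirit to what stock PyTorch already exposes. Here's a minimal sketch using PyTorch's own dynamic quantization on a standard ResNet50, just to illustrate the kind of model transformation involved:

```python
import torch
import torchvision.models as models

# A stock ResNet50 (in practice you would load pretrained weights here)
model = models.resnet50().eval()

# Quantize its Linear layers to int8 with PyTorch's built-in dynamic quantization.
# This is only a stand-in illustration -- Idiom/IdML would apply its own
# photonics-aware compression and quantization passes instead.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Confirm the transformed model still runs
with torch.no_grad():
    out = quantized(torch.randn(1, 3, 224, 224))
print(out.shape)   # torch.Size([1, 1000])
```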

LightMatter also announced their Passage optical interconnect chip, which supplies a 100Tbps optical switch for photonics, CPU or GPU processing. It's huge, 8″x8″, and built on a 5nm/7nm node process. Passage can connect up to 48 photonics, CPU or GPU chips that are built on top of it (one can see the space for each of these 48 [sub-]chips on the chip). LightMatter states that 40 Passage (photonic/optical) lanes fit in the width of one optical fibre. Passage chips are sampling now.

LightMatter Passage photonics-transistor chip (carrier) that provides a photonics programmable interconnect for inter-[photonics-electronic-]chip communications.

LightIntelligence

They don't appear to be announcing any specific hardware just yet, but they are at work creating the world's largest integrated photonics processing system. LightIntelligence has also published a number of research papers focused on photonic approaches to CNNs, RNNs/LSTMs/GRUs, recurrent Ising machines, statistical computing, and invisibility cloaking.

Turns out the processing power needed to provide invisibility cloaking is very intensive, and as it's all pixels, photonics offers serious speedups (for invisibility, see the Nature article, behind paywall).

Photonic Recurrent Ising Sampler (PRIS)

LightIntelligence did produce a prototype photonics processor in 2019. And they believe they will have de-risked 80-90% of their photonics technology by year end 2021.

If I had to guess, it would appear as if LightIntelligence is trying to re-imagine deep learning by taking a predominantly all-photonics approach.

Why photonics for AI/DL

It turns out that one can use the interaction/interference between two light beams to perform matrix multiplication and other computations a lot faster, and with a lot less power, than standard RISC (or CISC) electronic processor architectures. Typical GPUs run at 400W each and multi-GPU training activities are commonplace today.
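The basic trick (as described in the Coherent Nanophotonic Circuits paper) is that any weight matrix M can be factored via SVD into M = U Σ V†; the unitary parts U and V† map onto meshes of Mach-Zehnder interferometers, while Σ is just per-channel attenuation/amplification of the optical signals. Here's a minimal numpy sketch of that decomposition principle; the actual photonic hardware (MZI phase settings, etc.) is of course not modeled:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))          # a small neural-net weight matrix
x = rng.normal(size=4)               # an input activation vector

# Factor the weight matrix: M = U @ diag(s) @ Vh
# U and Vh are unitary -> implementable as lossless Mach-Zehnder interferometer meshes
# diag(s) is just per-channel gain/attenuation on the optical signals
U, s, Vh = np.linalg.svd(M)

y_photonic = U @ (s * (Vh @ x))      # what the photonic mesh would compute, stage by stage
y_electronic = M @ x                 # the ordinary electronic matrix-vector product

print(np.allclose(y_photonic, y_electronic))   # True -- same result, different physics
```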

The research documented in that paper was based on using an optical FPGA, which we have talked about before (see Photonics or Optical FPGAs on the horizon), to prototype the technology back in 2017.

Can photonics change the technology underpinning AI or computing?

If by using photonics one could speed up AI inferencing by 3-5X and do it with 5-6X less power, you might have a market. Those are LightMatter's Envise performance numbers on ResNet50 with ImageNet and BERT-Base with SQuAD v1.1 against the NVIDIA DGX-A100 (state of the art) AI processing system.

The challenge to changing the technology behind a multi-million/billion/trillion dollar industry is that it's not sufficient to offer a product better than the competition. One has to offer a technology that's better enough to fund the building of a new (multi-million/billion/trillion dollar) ecosystem surrounding that technology. In order to do that, it's got to be orders of magnitude faster/lower power/better so that commercial customers adopt it en masse.

I like where LightMatter is going with their Passage chip. But their Envise server doesn’t seem fast enough to give them enough traction to build a photonics ecosystem or to fund Envise 2, 3, 4, etc. to change the industry.

The 2017 nanophotonics paper predicted that an all-optical/photonics implementation of a CNN would use 3 orders of magnitude less power for small models, and that advantage would only go up for larger models (not counting power for data movement, photo detectors, etc.). Now if that's truly feasible, and maybe it takes a more photonics-intensive processor to get there, then photonics technology could truly transform the AI industry, or for that matter the computing industry.

But the other thing that LightIntelligence and LightMatter may be counting on is the slowdown in Moore's Law, which may inhibit further advances in electronic processing power. Whether the silicon industry is ready to throw in the towel on Moore's Law yet is TBD.

Comments?


AI inferencing using light alone

Researchers at UCLA have taken a trained DL neural network and implemented it as a series of passive, optical-only, 3D-printed diffraction gratings to perform Fashion-MNIST object classification. They did the same with MNIST handwritten digit and ImageNet DL neural network classifiers.


Experimental testing of 3D-printed D2NNs. (A and B) After the training phase, the final designs of five different layers (L1, L2, …, L5) of the handwritten digit classifier, fashion product classifier, and the imager D2NNs are shown. To the right of the network layers, an illustration of the corresponding 3D-printed D2NN is shown. (C and D) Schematic (C) and photo (D) of the experimental terahertz setup. An amplifier-multiplier chain was used to generate continuous-wave radiation at 0.4 THz, and a mixer-amplifier-multiplier chain was used for the detection at the output plane of the network. RF, radio frequency; f, frequency.

See the article on SlashGear, 3D printed all-optical diffractive deep learning neural network…. The research article is only available on the Optical Society of America's website/magazine (see Residual D2NN: training diffractive deep neural networks via learnable light shortcuts, behind a hard paywall). However, I did find a follow-on article on arXiv (see Analysis of Diffractive Optical Neural Networks and Their Integration with Electronic Neural Networks) that discussed how to integrate D2NN approaches with an electronic NN to create a hybrid inference engine. And an earlier Science article (see All-optical machine learning using diffractive deep neural networks) that was available described earlier versions of D2NN technology for MNIST digit classification, Fashion-MNIST classification and ImageNet object classification.

How does it work

Apparently the researchers trained a normal (electronics-based) deep learning neural network on MNIST, Fashion-MNIST and ImageNet and then converted the resultant trained NNs into a set of multiple diffraction grids. They did some computer simulation of the D2NN and, once satisfied it worked and achieved decent accuracy, 3D printed the diffraction plates.

All-optical D2NN-based classifiers. These D2NN designs were based on spatially and temporally coherent illumination and linear optical materials/layers. (a) D2NN setup for the task of classification of handwritten digits (MNIST), where the input information is encoded in the amplitude channel of the input plane. (b) Final design of a 5-layer, phase-only classifier for handwritten digits. (c) Amplitude distribution at the input plane for a test sample (digit ‘0’). (d-e) Intensity patterns at the output plane for the input in (c); (d) is for MSE-based, and (e) is softmax-cross-entropy (SCE)-based designs. (f) D2NN setup for the task of classification of fashion products (Fashion-MNIST), where the input information is encoded in the phase channel of the input plane. (g) Same as (b), except for fashion product dataset. (h) Phase distribution at the input plane for a test sample. (i-j) Same as (d) and (e) for the input in (h). λ refers to the illumination source wavelength. Input plane represents the plane of the input object or its data, which can also be generated by another optical imaging system or a lens, projecting an image of the object data onto this plane.

In their D2NN, they start with coherent (laser) light in the THz spectrum, use it to illuminate the input plane (I assume an image of the object/digit/fashion accessory), and pass it through multiple plates of diffraction grids onto a THz detector, which is used to detect the illuminated spot that indicates the classification.
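For the curious, the forward pass of a D2NN is usually simulated as alternating free-space propagation and per-pixel phase modulation by each printed layer. Below is a minimal numpy sketch of that idea; the angular-spectrum propagation is standard optics, but the grid size, spacing and input object are made-up placeholders rather than the paper's actual parameters:

```python
import numpy as np

def propagate(field, wavelength, dx, distance):
    """Free-space propagation of a complex field using the angular spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)                  # spatial frequencies (cycles per metre)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)  # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

def d2nn_forward(input_field, phase_masks, wavelength, dx, spacing):
    """Alternate free-space propagation with per-pixel phase modulation by each printed plate."""
    field = input_field
    for phase in phase_masks:
        field = propagate(field, wavelength, dx, spacing)
        field = field * np.exp(1j * phase)        # each 3D-printed plate shifts the phase pixel by pixel
    field = propagate(field, wavelength, dx, spacing)
    return np.abs(field) ** 2                     # the THz detector only sees intensity

# Toy example: 5 random phase plates, 200x200 pixels of 400 um at 0.4 THz (0.75 mm wavelength)
rng = np.random.default_rng(0)
masks = [rng.uniform(0, 2 * np.pi, (200, 200)) for _ in range(5)]
img = np.zeros((200, 200)); img[80:120, 80:120] = 1.0     # a dummy "input object"
out = d2nn_forward(img, masks, wavelength=0.75e-3, dx=0.4e-3, spacing=3e-3)
print(out.shape)   # classification = which detector region receives the most intensity
```

In the real device the phase masks are the result of training, not random, and the class is read off as the detector region with the highest intensity.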

The article in Science has a supplementary materials download that shows how the researchers converted NN weights into a diffraction grating. Essentially, each pixel on the diffraction grating either transmits, refracts, or reflects a light path, and this represents the connections between layers. It's unclear whether the 5 or 6 plates used in the D2NN correspond to the NN layers, but it's certainly possible.

And for the life of me I can't understand what they mean by “Residual D2NN”, other than perhaps using a trained (residual) NN and converting this to a D2NN.

Some advantages of D2NN

3D printing diffraction gratings means any lab (or anyone) could do this. The 3D printers they used had a spatial resolution of 600 dpi, with 0.1mm accuracy (almost consumer-grade 3D printers). In any case, being able to print these in a matter of hours, while not as easy as changing an all-digital NN, seems like an easy way to try out the approach.

For example, for the MNIST digit classifier they used a pixel size of 400um, and each diffraction layer they created was equivalent to 200×200 neurons. That means the 5-layer D2NN has about 0.2M neurons, with each layer optically fully connected to the next, giving (200×200)² × 5 = 8B connections in the MNIST D2NN. In the image classifier, each diffraction layer had 300×300 neurons. So D2NNs seem to scale very well.
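A quick back-of-the-envelope check of those numbers (using only the figures quoted above):

```python
# Back-of-the-envelope check of the MNIST D2NN numbers quoted above
neurons_per_layer = 200 * 200            # one 3D-printed layer = 200 x 200 phase pixels
layers = 5

print(layers * neurons_per_layer)        # 200_000 "neurons" in the whole D2NN
print(layers * neurons_per_layer ** 2)   # 8_000_000_000 layer-to-layer optical connections
```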

Being an all-passive optical device, the system operates entirely in parallel. That is, the researchers indicated that the D2NN devices operate at the speed of light and would perform the inferencing activity in the time it takes a camera to capture the image.

Also, the device uses very little energy (I assume just the energy for the THz generator, the input plane detector and the THz detector at the end).

The researchers also claimed the device was cheap to manufacture; it could be created for less than $50. (It's unclear if this included all the electronics or just the D2NN diffraction gratings and holder.) And once you have locked in a D2NN that you want to use, it could be manufactured in volume very cheaply (sort of like stamping out CD platters). Finally, the number of neural network nodes and layers can be scaled up to a large number of layers and nodes per layer while still fitting on the diffraction gratings. In contrast, all-electronic NNs require more compute power as you scale up network layers and nodes per layer.

The other (arXiv) article talked about potentially using a hybrid optical-electronic DNN approach, with some layers being D2NN and others being purely digital (electronic). Such a system could potentially be used where some portion of the NN is more stable/more compute intensive than the rest and where the final output classification layer(s) is more changeable and much smaller/less compute intensive. Such a hybrid system could make use of the all-optical D2NN to efficiently and quickly compress the input space and then have the electronic final classification layer provide the final classification step.
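As a rough illustration of that split (not the arXiv paper's actual architecture), here's a hedged PyTorch sketch where the optical front end is modeled as a frozen, untrainable projection of the detector intensities and only a small electronic head is trainable:

```python
import torch
import torch.nn as nn

class HybridD2NNClassifier(nn.Module):
    """Hypothetical hybrid: a fixed (already fabricated) optical front end, modeled
    here as a frozen projection of detector intensities, plus a small trainable
    electronic classification head."""
    def __init__(self, n_detector_pixels=200 * 200, n_features=256, n_classes=10):
        super().__init__()
        # stand-in for the passive D2NN front end; frozen, i.e. not retrainable in the field
        self.optical_front_end = nn.Linear(n_detector_pixels, n_features, bias=False)
        for p in self.optical_front_end.parameters():
            p.requires_grad = False
        # the small, cheap-to-retrain electronic output stage
        self.electronic_head = nn.Linear(n_features, n_classes)

    def forward(self, detector_intensities):
        feats = torch.relu(self.optical_front_end(detector_intensities))
        return self.electronic_head(feats)

# e.g. classify a batch of simulated detector readouts
logits = HybridD2NNClassifier()(torch.rand(8, 200 * 200))
print(logits.shape)   # torch.Size([8, 10])
```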

The Oracle

Combine a handful of D2NNs into a device that accepts speech input and provides speech output, add, say, an offline copy of Wikipedia, Google Books, etc. with a search engine that could be used to retrieve responses to questions asked, and you would have an oracle device. You would ask a question and the device would respond with the best answer it could find (in its databases).

If this could be made out of all-passive optical components and use natural sunlight/electronic illumination to perform its functionality, such an all-optical, question-to-answer oracle would be very useful to the populations of the world. It could be manufactured in volume very cheaply and would cost almost nothing to operate.

A couple of other tweaks: if we could collapse the multiple-grating D2NNs into a single multi-layer plate/platter and make these replaceable in the device, that would allow the oracle's information base to be updated periodically.

Then, if we could embed such a device into a Long Now Clock that reflects sunlight onto the disk every solstice or equinox, we could have a quarterly oracle device that could last for 1000s of years. It would provide answers to queries one day every quarter. And that would be quite the oracle…


Photonics + Nonlinear optical crystals = Quantum computing at room temp

Read an article the other day in ScienceDaily (Path to quantum computing at room temp), which was reporting on a Phys.Org article (Researchers see path to quantum computing at room temp). Both articles were discussing recent research, documented in a Physical Review Letters paper (Controlled-Phase Gate Using Dynamically Coupled Cavities and Optical Nonlinearities, behind paywall), being done at the Army Research Laboratory. Army and MIT researchers used photonic circuits and non-linear optical (NLO) crystals to provide quantum entanglement between photon waves. I found a pre-print version of the paper on Arxiv.org (Controlled-Phase Gate Using Dynamically Coupled Cavities and Optical Nonlinearities).

NLO Crystals

Nonlinear optics (source: Wikipedia Nonlinear Optics article) uses NLO crystals which, when exposed to high electrical fields and high-intensity light, can modify or modulate light polarization, frequency, phase and path. For example:

Comparison of a phase-conjugate mirror with a conventional mirror. With the phase-conjugate mirror the image is not deformed when passing through an aberrating element twice.
  • Frequency doubling or tripling, where one can double or triple the frequency of light (with two [or three] photons destroyed and a new one created).
  • Cross-phase modulation, where the phase of one light wave can be shifted by the presence (intensity) of another.
  • Cross-polarized wave generation, where a new wave is generated with a polarization vector perpendicular to that of the original wave.
  • Phase-conjugate mirrors, where light beams interact to exactly reverse “the propagation direction and phase variability” of a beam of light.

The Wikipedia article discusses a dozen more effects like these that NLO crystals can have on photons.
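As a quick numerical sanity check on the frequency-doubling entry above (textbook constants only, nothing from the paper): two pump photons at 1064 nm combine into one photon at twice the frequency, i.e. half the wavelength, which is the familiar 532 nm green of a frequency-doubled Nd:YAG laser.

```python
# Energy conservation in second-harmonic generation (frequency doubling):
# two photons at the pump wavelength combine into one photon at half the wavelength.
h = 6.62607015e-34      # Planck constant, J*s
c = 2.99792458e8        # speed of light, m/s

pump_wavelength = 1064e-9                  # e.g. an Nd:YAG pump laser
pump_photon_energy = h * c / pump_wavelength

shg_photon_energy = 2 * pump_photon_energy
shg_wavelength = h * c / shg_photon_energy
print(shg_wavelength * 1e9)                # ~532 nm -- the familiar green laser pointer
```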

Quantum photon traps using NLO

MIT and Army researchers have theorized that there is another NLO crystal effect which can create a quantum photon trap. The researchers believe they can engineer NLO crystal cavities that act as photon traps. With such an NLO crystal and photonic circuits, a trap could hold the value of either a photon inside or no photon inside, but as it's a quantum photon trap, it takes on both values at the same time.

Using photon trap NLO crystals, the researchers believe these devices could serve as room temperature qubits and quantum (photonic) gates.
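Since the paper is about a controlled-phase gate, here's what that gate does at the state-vector level, as a tiny numpy sketch. This is just the textbook two-qubit CZ unitary, not a model of the proposed photonic hardware: applied to two qubits that each start in an equal superposition, it produces an entangled output state.

```python
import numpy as np

# Controlled-phase (CZ) gate: flips the sign of the |11> amplitude, leaves the rest alone
CZ = np.diag([1, 1, 1, -1]).astype(complex)

# Two qubits (e.g. photon-in-trap / no-photon-in-trap), each in (|0> + |1>)/sqrt(2)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
state_in = np.kron(plus, plus)          # a separable product state

state_out = CZ @ state_in
print(state_out)                        # (|00> + |01> + |10> - |11>)/2 -- now entangled
```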

The researchers state that with recent advances in nano-fabrication and the development of ultra-confined NLO crystals, experimental demonstrations of the photonic qubits and quantum gates appear feasible.

Quantum computing today

As our blog readers may recall, quantum computers today can take many approaches, but they all require extremely cold temperatures (a few kelvin) to work. Even at those temperatures, quantum computing today is extremely susceptible to noise and other interference.

A quantum computer based on photonics and NLO crystals, operating at room temperature, would be much more energy efficient, could support many more qubits and would be much less susceptible to noise. Such a quantum computer could result in quantum computing being as ubiquitous as GPU, TPU/IPU or FPGA computational resources are today.

Ubiquitous quantum computing would turn our world over. Digital information security today depends on key-exchange mathematics (factoring, discrete logarithms) that is extremely hard for digital computers. Quantum computers with sufficient qubits have no difficulty with such mathematics. Blockchains rely on similar technology, so they too would be at risk.

Standards organizations are working on security based on quantum-proof (post-quantum) algorithms, but to date we have yet to see widespread descriptions, let alone implementations, of quantum-proof security in mainstream information security schemes.

If what the researchers propose pans out, advances in photonic quantum computing could force a restart of information security across our world.


Using jell-o (hydrogel) for new form of photonics computing

Read an article the other day which blew me away, Researchers Create “Intelligent” Interaction Between Light and Material – New Form of Computing, which discussed the use of a hydrogel (like raspberry jell-o) that can be used both as a photonic switch for optical communications and as a modifiable material to create photonic circuits. The research paper on the topic is also available from PNAS: Opto-chemo-mechanical transduction in photoresponsive gels elicits switchable self-trapped beams with remote interactions.

Apparently researchers have created this gel (see B in the graphic above) which, when exposed to laser light, interacts to a) trap the beam within a narrow cylinder and/or b) when exposed to parallel beams, boost the intensity of one of the beams. They still have some work to do to show more interactions on laser beam(s), but the trapping of the laser beams is well documented in the PNAS paper.

Jell-o optical fibres

Most laser beams broaden as they travel through space, but when a laser beam is sent through the new gel it becomes trapped in a narrow volume, almost as if sent through a pipe.

The beam-trapping experiment used a hydrogel cube of ~4mm per side. They sent a focused laser beam with a ~20um diameter through a 4mm empty volume and measured the beam's spread to be ~130um in diameter. Then they did the same experiment, only this time shining the laser beam through the hydrogel cube, and over time (>50 seconds) the beam diameter narrows to ~22um. In effect, the gel over time constructs (drills) a self-made optical fibre or cylindrical microscopic waveguide for the laser beam.

A similar process works with multiple laser beams going through the gel. More below on what happens with 2 parallel laser beams.

The PNAS article has a couple of movies showing the effect from the side of the hydrogel, with single and multiple laser beams.

Apparently, as the beam propagates through the hydrogel, it alters the opto-mechanical properties of the material such that the refractive index within the beam diameter becomes higher than outside it. Over time, as this material change takes place, the beam diameter narrows back down to almost the size of the incoming beam. The light-responsive molecules in the gel that drive this refractive index change are called chromophores.
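To get a feel for why a light-induced refractive index bump leads to self-trapping, here's a toy 1D split-step beam propagation sketch in numpy. It is only a crude stand-in: the real gel response is photochemical and builds up over tens of seconds, whereas this models an instantaneous, saturable index change, and every parameter below is an assumed placeholder rather than a value from the paper.

```python
import numpy as np

# Toy 1D split-step beam propagation with an intensity-dependent, saturable
# refractive index change -- a crude stand-in for the gel's photochemical response.
n_points, window = 4096, 2e-3                    # 2 mm transverse window (assumed)
x = np.linspace(-window / 2, window / 2, n_points)
dx = x[1] - x[0]
wavelength, n0, dn_max = 532e-9, 1.33, 5e-4      # assumed background index and max light-induced change
k0 = 2 * np.pi / wavelength

field = np.exp(-(x / 10e-6) ** 2).astype(complex)  # ~20 um wide input beam
kx = 2 * np.pi * np.fft.fftfreq(n_points, dx)
dz, steps = 10e-6, 400                           # propagate 4 mm in 10 um steps

for _ in range(steps):
    # diffraction half of the step (paraxial, done exactly in Fourier space)
    field = np.fft.ifft(np.fft.fft(field) * np.exp(-1j * kx**2 / (2 * k0 * n0) * dz))
    # self-focusing half: the index rises (and saturates) where the light is intense
    intensity = np.abs(field) ** 2
    dn = dn_max * intensity / (1.0 + intensity)
    field *= np.exp(1j * k0 * dn * dz)

rms = np.sqrt(np.sum(x**2 * np.abs(field)**2) / np.sum(np.abs(field)**2))
print(f"rms half-width after 4 mm: {rms * 1e6:.1f} um")  # much narrower than with dn_max = 0
```

Setting dn_max to zero turns the nonlinearity off and the beam spreads out by ordinary diffraction, which is the contrast the experiment demonstrates.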

It appears that the self-trapping effectiveness is a function of the beam intensity. That is, higher-intensity incoming laser beams (6.0W in C above) cause the exit beam to narrow, while lower-intensity (0.37W) incoming laser beams don't narrow as much.

This self-created optical waveguide (fibre) through the gel can be reset or reversed (>45 times) by turning off the laser and leaving the gel in darkness for a time (200 seconds or so). This allows the material to be re-used multiple times to create other optical channels, or to create the same one over and over again.

Jell-o optical circuits

It turns out that when two parallel laser beams are sent through the gel, their distance apart changes how they interact, even though they don't cross.

When the two beams are around 200um apart, they each self-channel down to a diameter of ~40um (incoming beams at ~20um). But the intensity of the two beams is not the same at the exit as it was at the entrance to the gel. One beam's intensity is boosted by a factor of 12 or so and the other's by a factor of 9, an asymmetric intensity boost. It's unclear how the higher-intensity beam is selected, but if I read the charts right, the more intensely boosted beam is turned on after the less intensely boosted beam (so the 2nd one in gets the higher boost).

When one of the beams is disabled (turned off/blocked), the intensity of the remaining beam is boosted on the order of 20X. This boosting effect can be reversed by illuminating (turning back on/unblocking) the blocked laser. But, oddly, the asymmetric boosting is no longer present after this point. The process can seemingly revert back to the 20X intensity boost just by disabling the other laser beam again.

When the two beams are within 25um of each other, they emerge with the same (or close to similar) intensity (symmetric boosting), and as you block one beam the other increases in intensity, but not as much as with the farther-apart beams (only 9X).

How to use this effect to create an optical circuit is beyond me, but they haven't documented any experiments where the beams collide, or are close together but at 90-180 degrees from one another. And what happens when a 3rd beam is introduced? So there's much room for more discovery.

~~~~

Just in case you want to try this at home, here is the description of how to make the gel from the PNAS article: “The polymerizable hydrogel matrix was prepared by dissolving acrylamide:acrylic acid or acrylamide:2-hydroxyethyl methacrylate (HEMA) in a mixture of dimethyl sulfoxide (DMSO):deionized water before addition of the cross-linker. Acrylated SP (for tethered samples) or hydroxyl-substituted SP was then added to the unpolymerized hydrogel matrix followed by an addition of a catalyst. Hydrogel samples were cured in a circular plastic mold (d = 10 mm, h = 4 mm thick).”

How long it will take to get the gel from the lab to your computer is anyone's guess. It seems to me they have quite a ways to go to be able to simulate the “NOR” or “NAND” universal logic gates widely used to create electronic circuits today.
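For context on why NAND (or NOR) specifically matters: it's a universal gate, meaning every other Boolean gate can be built from it alone, so an optical material that could implement just that one gate would, in principle, be enough for arbitrary logic. A quick Python illustration:

```python
def nand(a: bool, b: bool) -> bool:
    return not (a and b)

# NAND is universal: NOT, AND and OR can all be composed from it alone
def not_(a):    return nand(a, a)
def and_(a, b): return not_(nand(a, b))
def or_(a, b):  return nand(not_(a), not_(b))

# Truth-table check
for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), "| NAND:", int(nand(a, b)),
              "AND:", int(and_(a, b)), "OR:", int(or_(a, b)))
```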

On the other hand, using the gel in optical communications may come earlier. Having a self-trapping optical channel seems useful for a number of applications. And the intensity boosting effect would seem to provide an all-optical amplifier.

I see two problems:

  1. The time it takes to establish a self-trapping channel, ~50 seconds, is long, and it will probably take longer as you increase the size of the material.
  2. The size of the material seems large for optical (or electronic) circuitry. 4mm may not sound like much, but it's astronomical compared to the nanometres used in electronic circuitry today.

The size may not be a real concern, as the movies don't seem to show the beam, once trapped, changing across the material, so maybe a 1mm, or even 1um, cube of material could be used instead. The time is a more significant problem. But then again, there may be another gel recipe that acts more quickly. Still, getting from 50 seconds down to something like 50 nanoseconds is nine orders of magnitude, so there's a lot of work to do here.

Comments?

Photo Credit(s): all charts are from the PNAS article, Opto-chemo-mechanical transduction in photoresponsive gel…