New chip architecture with CPU, storage & sensors in one package

Read an article the other day in MIT news, (3D chip combines computing and data storage) about a new 3D chip out of Stanford and MIT research, which includes CPU, RRAM (resistive RAM) storage class memories and sensors in one single package. Such a chip architecture vastly minimizes the off chip bottleneck to access storage and sensors.

Chip componentry

The chip’s sensors are based on carbon nanotubes. Aside from a layer of silicon at the bottom, all the rest of transistors used in the chip are also based off of carbon nanotube FET (field effect transistors).

The RRAM storage class memory is a based on a dielectric material which uses electrical resistance to store non-volatile data.

The bottom layer is a silicon based CPU. On top of the silicon is a carbon nanotube layer. Next comes the RRAM and the top layer is more carbon nanotubes making up the sensor array.

Architectural benefits

One obvious benefit is having data storage directly accessible to the CPU is that there’s no longer a need to go off chip to access data. The 2nd major advantage to the chip architecture is that the sensor array can write directly to RRAM storage, so there’s no off chip delay to provide sensor readout and storage.

Another advantage to using carbon nanotube FET’s is that they can be an order of magnitude more energy efficient than silicon transistors. Moreover, RRAM has the potential to be much denser than DRAM.

Finally, another major advantage is that this can all be built in one 3D chip because carbon nanotube and RRAM fabrication can be done at relatively cooler temperatures (~200C) vs. silicon fabrication which requires relatively high temperatures (1000C). Silicon cannot be readily fabricated in multiple layers because of the high temperatures required which will harm lower layers. But you could fabricate the lowest layer in silicon and then the rest as either carbon nanotube FETs or RRAM without harming the silicon layer.

Transistor/RRAM counts

The chip as fabricated has a million RRAM cells (bits?) and 2 million nanotube FETs. In contrast, in 2014, Intel’s 15-core Xeon Ivy Bridge EX had 4.3B transistors and current DRAM chips offer 64Gb. So there’s a ways to go before carbon nanotube and RRAM densities can get to a level available from silicon today.

However, as they have a bottom layer of silicon they can have all the CPU complexity of an Intel processor and still build RRAM and carbon nanotubes FETs on top of that. Which makes this chip architecture compatible with current CMOS fabrication techniques and a very interesting addition to current CPU architectures.


Unclear to me why they stopped at 4 layers (1-silicon FET, 1 carbon nanotubes FET, 1 RRAM and 1 carbon nanotubes FET [sensor array]). If they can do 4 why not do 5 or more. That way they could pack in even more RRAM storage and perhaps more sensor layers.

Also, not sure what the bottom most layer of carbon nanotubes is doing. If I had to hazard a guess, it’s being used for RRAM control logic. But I could be wrong.

I could see how these chips could be used for very specialized sensor applications, with a limited need for data storage. The researchers claim many types of sensors can be created using carbon nanotubes. If that’s the case, maybe we might see these sorts of chips showing up all over the place.


Photo Credit(s): Three dimensional integration of nanotechnologies for computing and data storage on a single chip, Nature magazine. 

Zipline delivers blood 7X24 using fixed wing drones in Rwanda

Read an article the other day in MIT Tech Review (Zipline’s ambitious medical drone delivery in Africa) about a startup in Silicon Valley, Zipline, that has started delivering blood by drones to remote medical centers in Rwanda.

We’ve talked about drones before (see my Drones as a leapfrog technology post) and how they could be another leapfrog 3rd world countries into the 21st century. Similar, to cell phones, drones could be used to advance infrastructure without having to go replicate the same paths as 1st world countries such as building roads/hiways, trains and other transport infrastructure.

The country

Rwanda is a very hilly but small (10.2K SqMi/26.3 SqKm) and populous (pop. 11.3m) country in east-central Africa, just a few degrees south of the Equator. Rwanda’s economy is based on subsistence agriculture with a growing eco-tourism segment.

Nonetheless, with all
its hills and poverty roads in Rwanda are not the best. In the past delivering blood supplies to remote health centers could often take hours or more. But with the new Zipline drone delivery service technicians can order up blood products with an app on a smart phone and have it delivered via parachute to their center within 20 minutes.

Drone delivery operations

In the nest, a center for drone operations, there is a tent housing the blood supplies, and logistics for the drone force. Beside the tent are a steel runway/catapults that can launch drones and on the other side of the tent are brown inflatable pillows  used to land the drones.

The drones take a pre-planned path to the remote health centers and drop their cargo via parachute to within a five meter diameter circle.

Operators fly the drones using an iPad and each drone has an internal navigation system. Drones fly a pre-planned flightaugmented with realtime kinematic satellite navigation. Drone travel is integrated within Rwanda’s controlled air space. Routes are pre-mapped using detailed ground surveys.

Drone delivery works

Zipline drone blood deliveries have been taking place since late 2016. Deliveries started M-F, during daylight only. But by April, they were delivering 7 days a week, day and night.

Zipline currently only operates in Rwanda and only delivers blood but they have plans to extend deliveries to other medical products and to expand beyond Rwanda.

On their website they stated that before Zipline, delivering blood to one health center would take four hours by truck which can now be done in 17 minutes. Their Muhanga drone center serves 21 medical centers throughout western Rwanda.

Photo Credits:

Axellio, next gen, IO intensive server for RT analytics by X-IO Technologies

We were at X-IO Technologies last week for SFD13 in Colorado Springs talking with the team and they showed us their new IO and storage intensive server, the Axellio. They want to sell Axellio to customers that need extreme IOPS, very high bandwidth, and large storage requirements. Videos of X-IO’s sessions at SFD13 are available here.

The hardware

Axellio comes in 2U appliance with two server nodes. Each server supports  2 sockets of Intel E5-26xx v4 CPUs (4 sockets total) supporting from 16 to 88 cores. Each server node can be configured with up to 1TB of DRAM or it also supports NVDIMMs.

There are two key differentiators to Axellio:

  1. The FabricExpress™, a PCIe based interconnect which allows both server nodes to access dual-ported,  2.5″ NVMe SSDs; and
  2. Dense drive trays, the Axellio supports up to 72 (6 trays with 12 drives each) 2.5″ NVMe SSDs offering up to 460TB of raw NVMe flash using 6.4TB NVMe SSDs. Higher capacity NVMe SSDS available soon will increase Axellio capacity to 1PB of raw NVMe flash.

They also probably spent a lot of time on packaging, cooling and power in order to make Axellio a reliable solution for edge computing. We asked if it was NEBs compliant and they told us not yet but they are working on it.

Axellio can also be configured to replace 2 drive trays with 2 processor offload modules such as 2x Intel Phi CPU extensions for parallel compute, 2X Nvidia K2 GPU modules for high end video or VDI processing or 2X Nvidia P100 Tesla modules for machine learning processing. Probably anything that fits into Axellio’s power, cooling and PCIe bus lane limitations would also probably work here.

At the frontend of the appliance there are 1x16PCIe lanes of server retained for networking that can support off the shelf NICs/HCAs/HBAs with HHHL or FHHL cards for Ethernet, Infiniband or FC access to the Axellio. This provides up to 2x100GbE per server node of network access.

Performance of Axellio

With Axellio using all NVMe SSDs, we expect high IO performance. Further, they are measuring IO performance from internal to the CPUs on the Axellio server nodes. X-IO says the Axellio can hit >12Million IO/sec with at 35µsec latencies with 72 NVMe SSDs.

Lab testing detailed in the chart above shows IO rates for an Axellio appliance with 48 NVMe SSDs. With that configuration the Axellio can do 7.8M 4KB random write IOPS at 90µsec average response times and 8.6M 4KB random read IOPS at 164µsec latencies. Don’t know why reads would take longer than writes in Axellio, but they are doing 10% more of them.

Furthermore, the difference between read and write IOP rates aren’t close to what we have seen with other AFAs. Typically, maximum write IOPs are much less than read IOPs. Why Axellio’s read and write IOP rates are so close to one another (~10%) is a significant mystery.

As for IO bandwitdh, Axellio it supports up to 60GB/sec sustained and in the 48 drive lax testing it generated 30.5GB/sec for random 4KB writes and 33.7GB/sec for random 4KB reads. Again much closer together than what we have seen for other AFAs.

Also noteworthy, given PCIe’s bi-directional capabilities, X-IO said that there’s no reason that the system couldn’t be doing a mixed IO workload of both random reads and writes at similar rates. Although, they didn’t present any test data to substantiate that claim.

Markets for Axellio

They really didn’t talk about the software for Axellio. We would guess this is up to the customer/vertical that uses it.

Aside from the obvious use case as a X-IO’s next generation ISE storage appliance, Axellio could easily be used as an edge processor for a massive fabric of IoT devices, analytics processor for large RT streaming data, and deep packet capture and analysis processing for cyber security/intelligence gathering, etc. X-IO seems to be focusing their current efforts on attacking these verticals and others with similar processing requirements.

X-IO Technologies’ sessions at SFD13

Other sessions at X-IO include: Richard Lary, CTO X-IO Technologies gave a very interesting presentation on an mathematically optimized way to do data dedupe (caution some math involved); Bill Miller, CEO X-IO Technologies presented on edge computing’s new requirements and Gavin McLaughlin, Strategy & Communications talked about X-IO’s history and new approach to take the company into more profitable business.

Again all the videos are available online (see link above). We were very impressed with Richard’s dedupe session and haven’t heard as much about bloom filters, since Andy Warfield, CTO and Co-founder Coho Data, talked at SFD8.

For more information, other SFD13 blogger posts on X-IO’s sessions:

Full Disclosure

X-IO paid for our presence at their sessions and they provided each blogger a shirt, lunch and a USB stick with their presentations on it.


Quantum computing at our doorsteps

Read an article the other day in MIT’s Technical Review, Google’s new chip is a stepping stone to quantum computing… about Google’s latest endeavor to create quantum computers. Although, digital logic or classical electronic computation has been around since mid last century, quantum logic does things differently and there are many problems that are easier to compute with quantum computing that take much longer to solve with digital computing.

Qubits are weird

Classical or digital electronic computation follows the more physical mechanistic view of the world (for the most part) and quantum computing follows the quantum mechanical view of the world. Quantum computing uses quantum bits or Qubits and the device that Google demonstrated has a 2X3 matrix of qubits, 6 in total.

Unlike a bit, which (theoretically)is a two state system that can only take on the values of 0 and 1, a qubit is a two level system but it can take on an infinitely many number of different states in reality. In practice, with a qubit, there are always two states that are distinguishable from one another but they can be any two states of the infinitely many states they can take on.

Also, reading out the state value of a qubit can be a probabilistic endeavor and can impact the “value” of the qubit that is read out afterwards.

There’s more to quantum computing and I am certainly no expert. So if your interested, I suggest starting with this Arxiv article.

Faster quantum algorithms

In any case some difficult and time consuming arenas of classical computation seem to be easier and faster with quantum computation. For example,

  • Factoring large numbers – in classical computation this process takes an amount of time that is exponential to the number of bits in the “large number”, where “B” is number of bits and “E” epsilon is a constant >0, the best current algorithms take O([1+E]**B) time. But Shor’s quantum factorization algorithm takes only O(B**3) time, which is considerably faster for large numbers. This is important because RSA cryptography and most key exchange algorithms in use today, base their security on the difficulty of factoring large numbers. (See Wikipedia article on Integer Factorization for more information.
  • Searching an unstructured list – in classical computation for a list of N items, it takes on the O(N). But Grover’s quantum search algorithm only takes O(sort[N]) which is considerably faster for large lists. (See Arxiv paper for more information.)

Using the Shor factorization algorithm, they were able to factor the number 15 with 7 qubits.

There are many quantum algorithms available today (see the Quantum Algorithm Zoo at NIST) with more showing up all the time.  Suffice it to say that quantum computing will be a more time efficient and thus, more effective approach to certain problems than classical computing.

Quantum computers starting to scale

Now back to the chip. According to the article the new Googl chip implements a 2X3 matrix of qubits.

For those old enough to remember, this was called an Octal or 3-bit number, ranging from 0 to 7, and two octals can range from 0..64. Octals were used for a long time to represent digital information for some (mostly mini-computers) computers. This is in contrast to most computing nowadays ,which uses Hexadecimal numbers or 4-bit numbers ranging from 0..15, and with two hexadecimal numbers ranging from 0..255.

Why are octals important? Well if quantum computing can scale up multiple octal numbers, then they can start representing really large numbers. According to the article Google chose 2X3 qubit structure because it’s more easy to scale.

I assume all the piping surrounding the chip package in the above photo are cooling ports. It seems that quantum computing only works at very cold temperatures. And if this is a two octals computer, scaling these up to multiple octals is going to take lots of space.

How quickly will it scale?

For some history, Intel introduced their 4004 (4-bit) computing chip in 1971 (Wikipedia), their 8-bit Intel 8008 in 1972 (Wikipedia), their 16-bit Intel 8086 between 1976-78. So in 7 years we went from a 4-bit computer to a 16 bit computer whose (x86) architecture continues on today and rules the world.

Now the Intel 4004 had 16 4-bit registers, had a data/instruction bus that could address 4096 4-bit words, 3-level subroutine stack and was a full fledged 4 bit computer. It’s unclear what’s in Google’s chip. But if we consider that this 2×3-qubit computer, which has multiple 2×3 qubit registers, a qubit storage bus, multi-level qubit subroutine (register) stack, etc. Then we are well on our way to quantum computing being added to the worlds computational capabilities in less than 10 years.

And of course, Googles not the only large organization working on quantum computing.


So there you have it, Google and others are in the process of making your cryptography obsolete, rapidly speeding up unstructured searching and doing multiple other computations lots faster than today.

Photo Credit(s): from the MIT Technical Review article.


The fragility of public cloud IT

I have been reading AntiFragile again (by Nassim Taleb). And although he would probably disagree with my use of his concepts, it appears to me that IT is becoming more fragile, not less.

For example, recent outages at major public cloud providers display increased fragility for IT. Yet these problems, although almost national in scope, seldom deter individual organizations from their migration to the cloud.

Tragedy of the cloud commons

The issues are somewhat similar to the tragedy of the commons. When more and more entities use a common pool of resources, occasionally that common pool can become degraded. But because no-one really owns the common resources no one has any incentive to improve the situation.

Now the public cloud, although certainly a common pool of resources, is also most assuredly owned by corporations. So it’s not a true tragedy of the commons problem. Public cloud corporations have a real incentive to improve their services.

However, the fragility of IT in general, the web, and other electronic/data services all increases as they become more and more reliant on public cloud, common infrastructure. And I would propose this general IT fragility is really not owned by any one person, corporation or organization, let alone the public cloud providers.

Pre-cloud was less fragile, post-cloud more so

In the old days of last century, pre-cloud, if a human screwed up a CLI command the worst they could happen was to take out a corporation’s data services. Nowadays, post-cloud, if a similar human screws up a CLI command, the worst that can happen is that major portions of the internet services of a nation go down.

Strange Clouds by michaelroper (cc) (from Flickr)

Yes, over time, public cloud services have become better at not causing outages, but they aren’t going away. And if anything, better public cloud services just encourages more corporations to use them for more data services, causing any subsequent cloud outage to be more impactful, not less

The Internet was originally designed by DARPA to be more resilient to failures, outages and nuclear attack. But by centralizing IT infrastructure onto public cloud common infrastructure, we are reversing the web’s inherent fault tolerance and causing IT to be more susceptible to failures.

What can be done?

There are certainly things that can be done to improve the situation and make IT less fragile in the short and long run:

  1. Use the cloud for non-essential or temporary data services, that don’t hurt a corporation, organization or nation when outages occur.
  2. Build in fault-tolerance, automatic switchover for public cloud data services to other regions/clouds.
  3. Physically partition public cloud infrastructure into more regions and physically separate infrastructure segments within regions, such that any one admin has limited control over an amount of public cloud infrastructure.
  4. Divide an organizations or nations data services across public cloud infrastructures, across as many regions and segments as possible.
  5. Create a National Public IT Safety Board, not unlike the one for transportation, that does a formal post-mortem of every public cloud outage, proposes fixes, and enforces fix compliance.

The National Public IT Safety Board

The National Transportation Safety Board (NTSB) has worked well for air transportation. It relies on the cooperation of multiple equipment vendors, airlines, countries and other parties. It performs formal post mortems on any air transportation failure. It also enforces any fixes in processes, procedures, training and any other activities on equipment vendors, maintenance services, pilots, airlines and other entities that can impact public air transport safety. At the moment, air transport is probably the safest form of transportation available, and much of this is due to the NTSB

We need something similar for public (cloud) IT services. Yes most public cloud companies are doing this sort of work themselves in isolation, but we have a pressing need to accelerate this process across cloud vendors to improve public IT reliability even faster.

The public cloud is here to stay and if anything will become more encompassing, running more and more of the worlds IT. And as IoT, AI and automation becomes more pervasive, data processes that support these services, which will, no doubt run in the cloud, can impact public safety. Just think of what would happen in the future if an outage occurred in a major cloud provider running the backend for self-guided car algorithms during rush hour.

If the public cloud is to remain (at this point almost inevitable) then the safety and continuous functioning of this infrastructure becomes a public concern. As such, having a National Public IT Safety Board seems like the only way to have some entity own IT’s increased fragility due to  public cloud infrastructure consolidation.


In the meantime, as corporations, government and other entities contemplate migrating data services to the cloud, they should consider the broader impact they are having on the reliability of public IT. When public cloud outages occur, all organizations suffer from the reduced public perception of IT service reliability.

Photo Credits: Fragile by Bart Everson; Fragile Planet by Dave Ginsberg; Strange Clouds by Michael Roper

Hardware vs. software innovation – round 4

We, the industry and I, have had a long running debate on whether hardware innovation still makes sense anymore (see my Hardware vs. software innovation – rounds 1, 2, & 3 posts).

The news within the last week or so is that Dell-EMC cancelled their multi-million$, DSSD project, which was a new hardware innovation intensive, Tier 0 flash storage solution, offering 10 million of IO/sec at 100µsec response times to a rack of servers.

DSSD required specialized hardware and software in the client or host server, specialized cabling between the client and the DSSD storage device and specialized hardware and flash storage in the storage device.

What ultimately did DSSD in, was the emergence of NVMe protocols, NVMe SSDs and RoCE (RDMA over Converged Ethernet) NICs.

Last weeks post on Excelero (see my 4.5M IO/sec@227µsec … post) was just one example of what can be done with such “commodity” hardware. We just finished a GreyBeardsOnStorage podcast (GreyBeards podcast with Zivan Ori, CEO & Co-founder, E8 storage) with E8 Storage which is yet another approach to using NVMe-RoCE “commodity” hardware and providing amazing performance.

Both Excelero and E8 Storage offer over 4 million IO/sec with ~120 to ~230µsec response times to multiple racks of servers. All this with off the shelf, commodity hardware and lots of software magic.

Lessons for future hardware innovation

What can be learned from the DSSD to NVMe(SSDs & protocol)-RoCE technological transition for future hardware innovation:

  1. Closely track all commodity hardware innovations, especially ones that offer similar functionality and/or performance to what you are doing with your hardware.
  2. Intensely focus any specialized hardware innovation to a small subset of functionality that gives you the most bang, most benefits at minimum cost and avoid unnecessary changes to other hardware.
  3. Speedup hardware design-validation-prototype-production cycle as much as possible to get your solution to the market faster and try to outrun and get ahead of commodity hardware innovation for as long as possible.
  4. When (and not if) commodity hardware innovation emerges that provides  similar functionality/performance, abandon your hardware approach as quick as possible and adopt commodity hardware.

Of all the above, I believe the main problem is hardware innovation cycle times. Yes, hardware innovation costs too much (not discussed above) but I believe that these costs are a concern only if the product doesn’t succeed in the market.

When a storage (or any systems) company can startup and in 18-24 months produce a competitive product with only software development and aggressive hardware sourcing/validation/testing, having specialized hardware innovation that takes 18 months to start and another 1-2 years to get to GA ready is way too long.

What’s the solution?

I think FPGA’s have to be a part of any solution to making hardware innovation faster. With FPGA’s hardware innovation can occur in days weeks rather than months to years. Yes ASICs cost much less but cycle time is THE problem from my perspective.

I’d like to think that ASIC development cycle times of design, validation, prototype and production could also be reduced. But I don’t see how. Maybe AI can help to reduce time for design-validation. But independent FABs can only speed the prototype and production phases for new ASICs, so much.

ASIC failures also happen on a regular basis. There’s got to be a way to more quickly fix ASIC and other hardware errors. Yes some hardware fixes can be done in software but occasionally the fix requires hardware changes. A quicker hardware fix approach should help.

Finally, there must be an expectation that commodity hardware will catch up eventually, especially if the market is large enough. So an eventual changeover to commodity hardware should be baked in, from the start.


In the end, project failures like this happen. Hardware innovation needs to learn from them and move on. I commend Dell-EMC for making the hard decision to kill the project.

There will be a next time for specialized hardware innovation and it will be better. There are just too many problems that remain in the storage (and systems) industry and a select few of these can only be solved with specialized hardware.


Picture credit(s): Gravestones by Sherry NelsonMotherboard 1 by Gareth Palidwor; Copy of a DSSD slide photo taken from EMC presentation by Author (c) Dell-EMC

4.5M IO/sec@227µsec 4KB Read on 100GBE with 24 NVMe cards #SFD12

At Storage Field Day 12 (SFD12) this week we talked with Excelero, which is a startup out of Israel. They support a software defined block storage for Linux.

Excelero depends on NVMe SSDs in servers (hyper converged or as a storage system), 100GBE and RDMA NICs. (At the time I wrote this post, videos from the presentation were not available, but the TFD team assures me they will be up on their website soon).

I know, yet another software defined storage startup.

Well yesterday they demoed a single storage system that generated 2.5 M IO/sec random 4KB random writes or 4.5 M IO/Sec random 4KB reads. I didn’t record the random write average response time but it was less than 350µsec and the random read average response time was 227µsec. They only did these 30 second test runs a couple of times, but the IO performance was staggering.

But they used lots of hardware, right?

No. The target storage system used during their demo consisted of:

  • 1-Supermicro 2028U-TN24RT+, a 2U dual socket server with up to 24 NVMe 2.5″ drive slots;
  • 2-2 x 100Gbs Mellanox ConnectX-5 100Gbs Ethernet (R[DMA]-NICs); and
  • 24-Intel 2.5″ 400GB NVMe SSDs.

They also had a Dell Z9100-ON Switch  supporting 32 X 100Gbs QSFP28 ports and I think they were using 4 hosts but all this was not part of the storage target system.

I don’t recall the CPU processor used on the target but it was a relatively lowend, cheap ($300 or so) dual core, Intel standard CPU. I think they said the total target hardware cost $13K or so.

I priced out an equivalent system. 24 400GB 2.5″ NVMe Intel 750 SSDs would cost around $7.8K (Newegg); the 2 Mellanox ConnectX-5 cards $4K (Neutron USA); and the SuperMicro plus an Intel Cpu around $1.5K. So the total system is close to the ~$13K.

But it burned out the target CPU, didn’t it?

During the 4.5M IO/sec random read benchmark, the storage target CPU was at 0.3% busy and the highest consuming process on the target CPU was the Linux “Top” command used to display the PS status.

Excelero claims that the storage target system consumes absolutely no CPU processing to service an 4K read or write IO request. All of IO processing is done by hardware (the R(DMA)-NICs, the NVMe drives and PCIe bus) which bypasses the storage target CPU altogether.

We didn’t look at the host cpu utilization but driving 4.5M IO/sec would take a high level of CPU power even if their client software did most of this via RDMA messaging magic.

How is this possible?

Their client software running in the Linux host is roughly equivalent to an iSCSI initiator but talks a special RDMA protocol (patent pending by Excelero, RDDA protocol) that adds an IO request to the NVMe device submission queue and then rings the doorbell on the target system device and the SSD then takes it off the queue and executes it. In addition to the submission queue IO request they preprogram the PCIe MSI interrupt request message to somehow program (?) the target system R-NIC to send the read data/write status data back to the client host.

So there’s really no target CPU processing for any NVMe message handling or interrupt processing, it’s all done by the client SW and is handled between the NVMe drive and the target and client R-NICs.

The result is that the data is sent back to the requesting host automatically from the drive to the target R-NIC over the target’s PCIe bus and then from the target system to the client system via RDMA across 100GBE and the R-NICS and then from the client R-NIC to the client IO memory data buffer over the client’s PCIe bus.

Writes are a bit simpler as the 4KB write data can be encapsulated into the submission queue command for the write operation that’s sent to the NVMe device and the write IO status is relatively small amount of data that needs to be sent back to the client.

NVMe optimized for 4KB IO

Of course the NVMe protocol is set up to transfer up to 4KB of data with a (write command) submission queue element. And the PCIe MSI interrupt return message can be programmed to (I think) write a command in the R-NIC to cause the data transfer back for a read command directly into the client’s memory using RDMA with no CPU activity whatsoever in either operation. As long as your IO request is less than 4KB, this all works fine.

There is some minor CPU processing on the target to configure a LUN and set up the client to target connection. They essentially only support replicated RAID 10 protection across the NVMe SSDs.

They also showed another demo which used the same drive both across the 100Gbs Ethernet network and in local mode or direct as a local NVMe storage. The response times shown for both local and remote were within  5µsec of each other. This means that the overhead for going over the Ethernet link rather than going local cost you an additional 5µsec of response time.

Disaggregated vs. aggregated configuration

In addition to their standalone (disaggregated) storage target solution they also showed an (aggregated) Linux based, hyper converged client-target configuration with a smaller number of NVMe drives in them. This could be used in configurations where VMs operated and both client and target Excelero software was running on the same hardware.

Simply amazing

The product has no advanced data services. no high availability, snapshots, erasure coding, dedupe, compression replication, thin provisioning, etc. advanced data services are all lacking. But if I can clone a LUN at lets say 2.5M IO/sec I can get by with no snapshotting. And with hardware that’s this cheap I’m not sure I care about thin provisioning, dedupe and compression.  Remote site replication is never going to happen at these speeds. Ok HA is an important consideration but I think they can make that happen and they do support RAID 10 (data mirroring) so data mirroring is there for an NVMe device failure.

But if you want 4.5M 4K random reads or 2.5M 4K random writes on <$15K of hardware and happen to be running Linux, I think they have a solution for you. They showed some volume provisioning software but I was too overwhelmed trying to make sense of their performance to notice.

Yes it really screams for 4KB IO. But that covers a lot of IO activity these days. And if you can do Millions of them a second splitting up bigger IOs into 4K should not be a problem.

As far as I could tell they are selling Excelero software as a standalone product and offering it to OEMs. They already have a few customers using Excelero’s standalone software and will be announcing  OEMs soon.

I really want one for my Mac office environment, although what I’d do with a millions of IO/sec is another question.