NVIDIA Triton Giant Model Inference, a step too far

At GTC this week, NVIDIA announced a new capability for its AI suite called Triton Giant Model Inference. This solution addresses the current and future problem of trying to perform inferencing with models whose parameters exceed the memory of a single GPU card.

During the GTC keynote, NVIDIA showed a chart indicating that model parameter counts are on an exponential climb (just eyeballing it here, but roughly 10X every year since 2018). Current models, like OpenAI’s GPT-3, have 175B parameters. Such a model would require ~350GB of GPU memory to perform inferencing on the whole model.

The fact that NVIDIA’s A100 currently sports 80GB of GPU memory means that GPT-3 would need to be cut up or partitioned to run on NVIDIA GPUs. Hence the need (from NVIDIA’s perspective) for a mechanism that allows multi-GPU inferencing: their Triton Giant Model Inference (GMI) engine.
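The arithmetic behind those numbers can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate: it assumes 2 bytes per parameter (e.g. FP16 weights), counts only the weights (no activations or workspace), and treats 1GB as 10^9 bytes. The function names are mine, not anything from NVIDIA.

```python
import math

def inference_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GB (10^9 bytes).

    bytes_per_param=2 is an assumption (16-bit weights)."""
    return num_params * bytes_per_param / 1e9

def gpus_needed(model_gb: float, gpu_gb: float = 80.0) -> int:
    """Minimum number of GPUs whose combined memory can hold the weights."""
    return math.ceil(model_gb / gpu_gb)

gpt3_gb = inference_memory_gb(175e9)   # 175B parameters at 2 bytes each
gpt3_gpus = gpus_needed(gpt3_gb)       # how many 80GB A100s that implies
print(f"GPT-3 weights: ~{gpt3_gb:.0f}GB, needs at least {gpt3_gpus} x 80GB GPUs")
```

Under these assumptions the weights alone come to 350GB, or at least five 80GB cards, which is why a single GPU can’t hold the model.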

Why do we need GMI

It’s unclear what needs to be done to perform inferencing with a 175B parameter model today, but my guess is it involves a lot of manual splitting up of the model into different layers/partitions, running the layers/partitions on separate GPUs and gluing the output of one portion to the input of the next. Such activity would be a complex, manual undertaking and would inherently slow down model inferencing and add to inferencing latencies.
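To make my guess concrete, here’s a minimal Python sketch of what that manual splitting and gluing might look like. It is purely illustrative and is not how Triton GMI (or any real framework) works: the layer sizes, the greedy packing rule and the toy “layers” are all assumptions of mine.

```python
from typing import Callable, List, Sequence

Layer = Callable[[List[float]], List[float]]

def partition_layers(layer_sizes_gb: Sequence[float], gpu_gb: float) -> List[List[int]]:
    """Greedily pack consecutive layers onto GPUs, never exceeding gpu_gb per GPU."""
    stages: List[List[int]] = [[]]
    used = 0.0
    for idx, size in enumerate(layer_sizes_gb):
        if size > gpu_gb:
            raise ValueError(f"layer {idx} does not fit on a single GPU")
        if used + size > gpu_gb and stages[-1]:
            stages.append([])   # current GPU is full, start the next one
            used = 0.0
        stages[-1].append(idx)
        used += size
    return stages

def run_pipeline(stages: List[List[int]], layers: Sequence[Layer], x: List[float]) -> List[float]:
    """Run each stage in order, feeding one stage's output into the next."""
    for stage in stages:
        for idx in stage:
            x = layers[idx](x)
        # In a real system the activations would be copied to the next GPU here,
        # which is where the extra inferencing latency comes from.
    return x

# Hypothetical model: 10 layers of 35GB each (350GB total) on 80GB GPUs.
sizes = [35.0] * 10
toy_layers: List[Layer] = [lambda v: [e + 1.0 for e in v] for _ in sizes]
stages = partition_layers(sizes, gpu_gb=80.0)
result = run_pipeline(stages, toy_layers, [0.0])
print(stages, result)
```

With these made-up sizes the model lands on five GPUs, two layers apiece, and every hand-off between stages is a copy that a single-GPU model would never pay for.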

With Triton GMI, NVIDIA appears able to supply automated multi-GPU inferencing for models that exceed single GPU memory. Whether such models can span (DGX) servers or not was not revealed, but even within a single DGX server there are four A100s, which provides an aggregate of 320GB of GPU memory. Of course, it’s very likely future Ampere GPUs will allow for more memory.

Why consider a step too far

Here’s my point: artificial general intelligence (AGI, reasoning at human levels and beyond) is coming sooner or later. My (and perhaps humanity’s) preference is to have this happen later rather than sooner. Hopefully, this will give us more time to understand how to design/engineer/control AGI so that it doesn’t harm humanity or the earth. (See my post on Existential event risk… for more information on the risks of superintelligence.)

One way to control or delay the emergence of AGI is to limit model size. Now NVIDIA, Google and others have already released capabilities that allow them to train models that exceed the size of one GPU.

Alas, the only thing left is to consider limiting the size of models that can be used to perform inferencing. I fear that Triton GMI pretty much opens up the flood gates to inferencing with models of any size. This will provide for more and more sophisticated AI/ML/DL models and will uncap model sizes in the near future.

Limiting the size of inferencing models would give us (humanity) a little more time to understand how to control AGI. But all this presupposes that any AGI will require more parameters than current DNN models. I think this is a safe assumption, but I’m no expert.

Will delaying NVIDIA Triton GMI really help

I was not briefed on the internals of GMI, but possibly it makes use of DGX NVLink and NVIDIA software to automatically partition a DNN and deploy it over the four A100 GPUs in a DGX.

NVIDIA is not the only organization working on advancing DNN training and inferencing capabilities. It’s very likely that more than one of the others (Google, Facebook, AWS, etc.) have identified model size as a problem for inferencing and are working on their own solutions. So delaying GMI will not be a long term fix.

But maybe if we could just delay this capability from reaching the market for 2 to 5 years it would have a follow on impact of delaying the emergence of AGI.

Is this going to stop someone or some organization from achieving AGI? Probably not. Could it delay some person, organization or government from getting there? Maybe. Perhaps it will give humanity enough time to come up with other ways to control AGI. But I fear that the more technology moves on, the more our options for controlling AGI diminish.

Don’t get me wrong. I think AI, DL NN and NVIDIA (Google, DeepMind, Facebook and others) have done a great service to help mankind succeed over this next century. And I in no way wish to hold back this capability. And a “good” AGI has the potential to help everyone on this earth in more ways than I can imagine.

But achieving AGI is a step function and once unleashed it may be difficult to control. Anything we can do today to a) delay the emergence of AGI and b) help to control it, is IMHO, worthy of consideration.


Photo Credits:

  • from NVIDIA GTC Keynote by Jensen Huang, CEO
  • From Hackernoon article, Can Bitcoin AGI develops to benefit humanity

Phonons, the next big technology underpinning integrated circuits

Often science and industry seem to advance by investigating phenomena that are a side effect of something else we are trying to accomplish. Optical fibers have been in use for decades now and have always had a problem called Brillouin scattering, where photons interact with the surrounding cladding and generate small vibrations, or sound packets. This feedback causes light to disperse along the length of the fibre, and the sound packets it creates are called phonons, aka hypersound.

As a recent article I read in Science Daily (Wired for sound a third wave emerges in integrated circuits) describes it, the first wave of ICs was based on electronics and was developed after WW II, the second wave was based on photons and came about largely at the start of this century, and now the third wave is emerging based on sound, phonons.

The research team at The University of Sydney Nano Institute has published over 70 papers on Brillouin scattering, and Prof. Benjamin J. Eggleton recently published a summary of their research in a Nature Photonics paper (Brillouin integrated photonics, behind paywall). One can also download the deck he presented as a summary of the paper at an OSA Optoelectronics Technical Group webinar last year.

It appears as if the Brillouin scattering technology is particularly useful for (microwave) photonics computing. In the Science Daily article, the professor says that the big advance here is in the control of light and sound over small distances. He goes on to say that “Brillouin scattering of light helps us measure material properties, transform how light and sound move through materials, cool down small objects, measure space, time and inertia, and even transport optical information.”

I believe that, from a photonics IC perspective, transforming how light, other electromagnetic radiation, and sound move through materials is exciting. New technology for measuring material properties, cooling down small objects, and measuring space, time and inertia is also of interest, but not as important in our view.

What’s a phonon

As discussed earlier, phonons are packets of sound vibration above 100 MHz that come about due to optical photons’ interaction with the cladding. As photons bounce off the cladding they generate phonons within the material. Such bouncing creates optical and acoustical waves, or phonons.

There has been, and continues to be, a lot of research on how to create “Stimulated Brillouin Scattering” (SBS) on silicon CMOS devices, but lately the team has found an effective hybrid (Silicon, SiO2, & As2S3) formula to generate SBS at will at chip scale.

What can you do with SBS phonons

Essentially SBS phonons can be used to measure, monitor, alter and increase the flow of electromagnetic (EM) waves in a substance or wave guide. I believe this can be light, microwaves, or just about anything on the EM spectrum. Nothing was mentioned about X-Rays, but it’s just another band of EM radiation.

With SBS, one can supply microwave filters, phase shifters and sources, recover carrier signal in coherent optical communications, store (or delay) light, create lasers and measure, at the sub-mm scale, optical material characteristics. Although the article discusses cooling down materials, I didn’t see anything in the deck describing this.

As SBS technologies are optical-acoustical devices, they are immune to EMI (electromagnetic interference) and EMPs (electromagnetic pulses), and consume less energy than electronic circuits performing similar functions.

We’ve talked about photonic computing before (see our Photonic computing, seeing the light of day post). But to make photonics a real alternative to electronic computing, it needs a lot of optical management devices. We discussed a couple in the blog post mentioned above, but SBS opens up another dimension of ways to control photonic data flow and processing.

It’s unclear why the research into SBS seems to come mostly out of Australian universities. However, their research is being (at least partially) funded by a number of US DoD entities.

It’s unclear whether SBS will ultimately be one of those innovations in the long run, which enables a new generation of (photonic) IC technologies. But the team has shown that with SBS they can do a lot of useful work with optical/microwave transmission, storage and measurement.

It seems to me that to construct full photonic computing, we need an optical DRAM device. Storing light (with SBS) is a good first step, but any optical store/memory device needs to be randomly accessible, and store Kb, Mb or Gb of optical data, in chip size areas and persist (dynamic refreshing is ok).

The continued use of DRAM for this would make the devices susceptible to EMI, EMP and consume more energy. Maybe something could be done with an all optical 3DX that would suffice as a photonics memory device. Then it could be called Optical DC PM.

So, ICs with electronics, photonics and now phononics are in our future.


Shedding light on all optical neural networks

Read a couple of articles in the past week or so on all optical neural networks (see All optical neural network (NN) closes performance gap with electronic NN and New design advances optical neural networks that compute at the speed of light using engineered matter).

All optical NN solutions operate faster and use less energy to inference than standard all electronic ones. However, in reality they are more of a hybrid solution, as they depend on the use of standard ML DL to train a NN. They then use 3D printing and other lithographic processes to create a series of diffraction layers for an all optical NN that matches the trained NN.

The latest paper (see: Class-specific Differential Detection in Diffractive Optical Neural Networks Improves Inference Accuracy) describes a significant advance beyond the original solution (see: All-Optical Machine Learning Using Diffractive Deep Neural Networks, Ozcan’s original paper).

How (all optical) Diffractive Deep NNs (DDNNs) work for inferencing

In the original Ozcan discussion, a DDNN consists of a coherent light source (laser), an image, a bunch of refractive and reflective diffraction layers and photo detectors. Each neural network node is represented by a point (pixel?) on a diffractive layer. Node to node connections are represented by the light’s path moving through the diffractive layer(s).

In Ozcan’s paper, the light flowing through the diffraction layer is modified and passed on to the next diffraction layer. This passing of the light through the diffraction layer is equivalent to the mathematical bias (neural network node FP multiplier) in the trained NN.

The previous challenge was how to fabricate the diffraction layers, which took a lot of hand work. But with the advent of 3D printing and other lithographic techniques, creating a diffraction layer is nowadays relatively easy to do.

In DDNN inferencing, one exposes (via a coherent beam of light) the first diffraction layer to the input image data, then that image is transformed into a different light pattern which is sent down to the next layer. At some point the last diffraction layer converts the light hitting it into classification patterns which are then detected by photo detectors. Alternatively, the classification pattern can be sent down an all optical computational path (see our Photonic computing sees the light of day post and Photonic FPGAs on the horizon post) to perform some function.
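For those who think better in code, here’s a toy Python sketch of that inferencing flow. It is not the code from Ozcan’s paper: it’s a 1D, scalar-wave simplification in which each “neuron” on a layer multiplies the incoming light by a phase term and light spreads from every point on one plane to every point on the next. The spacing, distance and wavelength values (in arbitrary units) and all the function names are assumptions of mine.

```python
import cmath
import math
from typing import List, Sequence

def propagate(field: Sequence[complex], spacing: float, distance: float,
              wavelength: float) -> List[complex]:
    """Huygens-style propagation: each point on the next plane sums
    spherical wavelets arriving from every point on the current plane."""
    k = 2 * math.pi / wavelength
    n = len(field)
    out = []
    for i in range(n):
        total = 0j
        for j in range(n):
            r = math.hypot(distance, (i - j) * spacing)   # path length j -> i
            total += field[j] * cmath.exp(1j * k * r) / r
        out.append(total)
    return out

def ddnn_forward(image: Sequence[float], phase_masks: Sequence[Sequence[float]],
                 spacing: float = 1.0, distance: float = 10.0,
                 wavelength: float = 0.75) -> List[float]:
    """Pass an input 'image' through the diffraction layers; return detector intensities."""
    field = [complex(a) for a in image]
    for mask in phase_masks:
        field = propagate(field, spacing, distance, wavelength)
        # Each neuron multiplies the incoming wave by its own (trained) phase term.
        field = [f * cmath.exp(1j * phi) for f, phi in zip(field, mask)]
    field = propagate(field, spacing, distance, wavelength)
    return [abs(f) ** 2 for f in field]   # photo detectors see intensity only

image = [1.0, 0.0, 0.0, 1.0]
masks = [[0.0, 1.0, 1.0, 0.0], [0.5, 0.0, 0.0, 0.5]]   # made-up "trained" phases
intensities = ddnn_forward(image, masks)
print(intensities)
```

The point of the sketch is that, once the phase values are fixed (fabricated), “inferencing” is nothing more than light propagating and being detected. There’s no arithmetic for the device itself to perform.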

In the original paper, they showed results of a DDNN for a completely connected, 5 layer NN, with 0.2M neurons and 8B connections in total. They also showed results from a sparsely connected, 5 layer NN, with 0.45M neurons and <0.1B connections.

Note that there are significant power advantages in exposing an image to a series of diffraction gratings and detecting the classification using a photo detector vs. an all electronic NN, which takes an image, uses photo detectors to convert it into an electrical (pixel series) signal and then processes it through NN layers, performing FP arithmetic at each layer node until it reaches the classification layer.

Furthermore, the DDNN operates at the speed of light. The all electronic network seems to operate at FP arithmetic speeds X the number of layers. And that is only if it could all be done in parallel (with GPUs and 1000s of computational engines). If it can’t be done in parallel, one would need to add another factor X the number of nodes in each layer. Let’s just say this is much slower than the speed of light.

Improving DDNN accuracy

The team at UCLA and elsewhere took on the task to improve DDNN accuracy by using more of the optical technology and techniques available to them.

In the new approach they split the image optical data path to create a positive and a negative classifier, and use a differential classifier engine as the last step to determine the image’s classification.
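Here’s a small Python sketch of how such a differential last step might work. It’s one plausible reading rather than the paper’s exact method: I’m assuming each class gets a positive and a negative detector reading, and that the class score is their difference normalized by their sum. The names and the sample intensities are made up.

```python
from typing import List, Sequence

def differential_scores(pos: Sequence[float], neg: Sequence[float]) -> List[float]:
    """Per-class score: (I+ - I-) / (I+ + I-). The normalization is an assumption."""
    scores = []
    for p, n in zip(pos, neg):
        total = p + n
        scores.append((p - n) / total if total > 0 else 0.0)
    return scores

def classify(pos: Sequence[float], neg: Sequence[float]) -> int:
    """Pick the class whose positive detector most outweighs its negative one."""
    scores = differential_scores(pos, neg)
    return max(range(len(scores)), key=scores.__getitem__)

# Made-up detector intensities for a 3-class problem.
positive = [0.2, 0.9, 0.4]
negative = [0.6, 0.1, 0.4]
scores = differential_scores(positive, negative)
winner = classify(positive, negative)
print(scores, winner)
```

One thing a differential scheme buys you is that scores can go negative, even though light intensity on any single detector never can.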

It turns out that the new DDNN performed much better than the original DDNN on standard MNIST, Fashion MNIST and another standard AI benchmark.

DDNN inferencing advantages, disadvantages and use cases

Besides the obvious power efficiencies and speed efficiencies of optical DDNN vs. electronic NNs for inferencing, there are a few other advantages:

  • All optical data paths are less noisy – In an electronic inferencing path, each transformation of an image to a pixel file will add some signal loss. In an all optical inferencing engine, this would be eliminated.
  • Smaller inferencing engine – In an electronic inferencing engine one needs CPUs, memory, GPUs, PCIe busses, networking and all the power and cooling to make it work. For an all optical DDNN, one needs a laser, diffraction layers and a set of photo detectors. Yes, there’s some electronics involved, but not nearly as much as in an all electronic NN. And an all electronic NN with ~0.45M nodes, 5 layers and 0.1B connections would take a lot of memory and compute to support. Their DDNN to perform this task took up about 9 cm (3.6″) squared by ~3 to 5 cm (1.2″-2.0″) deep.

But there are some problems with the technology.

  • No re-training or training support – there’s almost no way to re-train the optical DDNN without re-fabricating the DDNN diffraction layers. I suppose additional layers could be added on top of or below the bottom layers, sort of like a corrective lens. Also, if perhaps there was some sort of way to (chemically) develop diffraction layers during training steps then it could provide an all optical DL data flow.
  • No support for non-optical classifications – there’s much more to ML DL NN functionality than optical classification. Perhaps if there were some way to transform non-optical data into optical images then DDNNs could have a broader applicability.

The technology could be very useful in any camera, lidar, sighting scope, telescope image and satellite image classification activities. It could also potentially be used in heads-up displays to identify items of interest in the optical field.

It would also seem easy to adapt DDNN technology to classify analog sensor data as well. It might also lend itself to use in space, at depth and in other extreme environments where all electronic NN gear might not survive for very long.


Photo Credit(s):

Figure 1 from All-Optical Machine Learning Using Diffractive Deep Neural Networks

Figure 2 from All-Optical Machine Learning Using Diffractive Deep Neural Networks

Figure 2 from Class-specific Differential Detection in Diffractive Optical Neural Networks Improves Inference Accuracy

Figure 3 from Class-specific Differential Detection in Diffractive Optical Neural Networks Improves Inference Accuracy

Intel’s new DL Boost for DL AI inferencing

I was at a TechFieldDay Extra (TFDx) at Intel’s Data Centric Innovation Conference last week in San Francisco. It was a lavish affair with many industry analysts in attendance besides the TFDx crew.

At the event Intel announced a number of new products including the availability of their next generation scalable Xeon processor chips, new Optane DC PM (DIMM) and software, new Ethernet (800) NIC cards, a new FPGA line (10nm) and DL (deep learning inferencing) Boost functionality.

I was most interested in the DL Boost and Optane DC PM solutions. For this post I focus on DL Boost.

DL Boost for DL inferencing on Xeon

Intel’s DL Boost technology provides a new integer 8 bit precision (INT 8) matrix multiply & summation instruction which can be used to speed up DL inferencing operations. As those who have been following along with my AI-DL-machine learning (ML) blog posts (the latest being Learning Machine Learning part 3) probably know, deep learning is machine learning that processes data to create a neural network made up of a number of layers and a number of nodes, each of which represents a floating point weight used to transform inputs into outputs.

All DL AI projects involve at least two phases: model training and model inferencing (prediction, classification, AI result, etc.). Although both of these activities involve matrix calculations, model training involves a lot more of these compute intensive operations than inferencing. In fact, while training typically is done on GPUs or other special purpose compute hardware (TPU, IPUs, etc.) inferencing can typically be done on standard off the shelf CPUs.

Historically, inferencing used floating point matrix multiplication and summation functionality, taking input from sensors, logs, photos, etc. and performing the model logic to create an output.

Intel believes (with industry analyst agreement) that over time, 50% or more of the DL AI workload is going to involve inferencing. Hence, the focus on this end of the AI workload, at least for now.

For example, speech recognition AI can take a long time to process audio recordings and use reinforcement learning to train a recognition model. But, once trained, you could use that recognition AI model in anything from smart speakers, to speech to text dictation machines, to voice response systems, etc. In all of these, the recognition model is passed a voice recording (or voice in real time) and processes it to create a text version of the speech.

But all of this has historically been done in floating point (FP) 32 (bit precision) or FP 16. Google’s TPU is capable of doing this with less precision, but to my knowledge, up to this point, it’s always been floating point.

What is DL Boost

What Intel has done with DL Boost is to create a new x86 instruction which can perform an integer (INT) 8 (bit precision) matrix multiplication and summation in fewer cycles than it took before. Intel believes that if customers were to modify their trained AI neural network models to move from FP 32 (or 16) to INT 8, they could perform inferencing much faster on Xeon Cascade Lake CPUs than they could before and not have to rely on GPUs for this activity at all.

Yes, this does require hand optimization of the trained AI neural network. Some of this may be automated, but not all. Intel claims the precision loss, if done properly, is less than a few percent and its impact on AI inferencing correctness is negligible.
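To get a feel for what moving from FP to INT 8 means, here’s a minimal Python sketch. It doesn’t use the DL Boost instruction (it’s plain Python) and it isn’t Intel’s conversion tooling. It just shows one common approach, symmetric quantization with a single scale factor per vector, which is my assumption, along with made-up weights and inputs.

```python
from typing import List, Sequence, Tuple

def quantize(values: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats onto the INT 8 range [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0   # avoid a zero scale
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def int8_dot(a_q: Sequence[int], a_scale: float,
             b_q: Sequence[int], b_scale: float) -> float:
    """Integer multiply and summation, converted back to floating point at the end."""
    acc = sum(x * y for x, y in zip(a_q, b_q))     # stays in integers throughout
    return acc * a_scale * b_scale

weights = [0.5, -1.0, 0.25, 0.75]   # made-up FP weights for one node
inputs = [1.0, 2.0, -1.0, 0.5]      # made-up FP inputs

exact = sum(w * x for w, x in zip(weights, inputs))
w_q, w_scale = quantize(weights)
x_q, x_scale = quantize(inputs)
approx = int8_dot(w_q, w_scale, x_q, x_scale)
print(f"FP result {exact:.4f}, INT 8 result {approx:.4f}")
```

For this tiny example the INT 8 answer lands well within 1% of the FP one, but whether that holds across a whole production network is exactly the question that has to be tested model by model.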

At the moment, for all the DL modeling I have done, I have never looked at the trained model’s weights, leaving this to TensorFlow/Keras to manage for me. But I’m not creating production level DL AI systems (yet). So, I don’t know what it would take to modify my AI models to use INT 8 nor what level of degradation in correctness would ensue. But I also don’t have Cascade Lake Xeon CPUs available.

Some potential problems here:

  1. Manual activity to hand tune the INT 8 neural network is not going to be that popular, except for those organizations where inferencing requires GPUs.
  2. Most production DL AI models undergo some form of personalization for a user or implementation instance, which would require a further FP to INT conversion for each user/implementation.
  3. Most production DL AI models also undergo periodic retraining to fine tune the model with the latest data that has been accumulated. This would also require further FP to INT conversion after each training cycle.

In the end, there’s an advantage for production AI inferencing, for models that don’t require substantial retraining/personalization as they don’t change that often. And there’s a definite cost advantage to using DL Boost INT 8 for those AI inferencing workloads that must use GPUs today to perform in real time or under other performance constraints.

But hand converting neural networks reminds me of creating assembly code for modules that can impact performance. This is normally reserved for only a select few modules or functions that are executed a lot. However, DL models are much more monolithic and, by definition, less modular. Identifying which models (or model layers) within a production DL AI solution are performance sensitive, and hand optimizing them to work on CPUs rather than GPUs, seems like a hard task.

It would be better from my perspective to create a single FP 16 matrix multiplication instruction. Alternatively, create some software that would automatically convert any DL AI model (or model layer) from FP to INT 8. That way DL Boost optimization would be just another step in the model training process and could be automatically generated to see if A) it loses too much sensitivity and B) if it’s worthwhile using CPU inferencing.



Screaming IOP performance with StarWind’s new NVMeoF software & Optane SSDs

Was at SFD17 last week in San Jose, where we heard from StarWind SAN (@starwindsan) about the latest NVMeoF storage system they have been working on. Videos of their presentation are available here. StarWind is an amazing company from Ukraine that has been developing software defined storage.

They have developed their own NVMe SPDK for Windows Server. Intel doesn’t currently offer SPDK for Windows, so they built their own. They also developed their own initiator (CentOS Linux) for NVMeoF. The target system was a multicore server running Windows Server with a single Optane SSD that they used to test their software.

Extreme IOP performance consumes cores

During their development activity they tested various configurations. At the start of their development they used a Windows Server with their NVMeoF target device driver. With this configuration and on a bare metal server, they found that they could max out the Optane SSD at 550K 4K random write IOPs at 0.6msec to a single Optane drive.

When they moved this code directly to run under a Hyper-V environment, they were able to come close to this performance at 518K 4K write IOPS at 0.6msec. However, this level of IO activity pegged 100% of 8 cores on their 40 core server.

More IOPs/core performance in user mode

Next they decided to optimize their driver code and move as much as possible into user space and out of kernel space. They continued to use Hyper-V. With this level of code, they were able to achieve the same performance as bare metal, or ~551K 4K random write IOPS at 0.6msec RT and 2.26 GB/sec. However, they were now able to do this while pegging only 2 cores. They expect to release this initiator and target software in mid October 2018!

They converted this functionality to run under ESX/VMware and were able to see much the same 2 cores pegged, ~551K 4K random write IOPS at 0.6msec RT and 2.26 GB/sec. They will have the ESXi version of their target driver code available sometime later this year.
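The relationship between those IOPS, bandwidth and core numbers is simple enough to check in a few lines of Python. I’m assuming a “4K” IO is 4096 bytes and a GB is 10^9 bytes, and the function names are mine.

```python
def throughput_gb_per_sec(iops: float, block_bytes: int = 4096) -> float:
    """Bandwidth implied by an IOPS rate at a fixed block size (GB = 10^9 bytes)."""
    return iops * block_bytes / 1e9

def iops_per_core(iops: float, cores: int) -> float:
    """How many IOPS each pegged core is delivering."""
    return iops / cores

user_mode_bw = throughput_gb_per_sec(551_000)      # user mode driver
kernel_per_core = iops_per_core(518_000, 8)        # Hyper-V, kernel mode, 8 cores
user_per_core = iops_per_core(551_000, 2)          # Hyper-V, user mode, 2 cores

print(f"{user_mode_bw:.2f} GB/sec")
print(f"{kernel_per_core:,.0f} vs {user_per_core:,.0f} IOPS/core")
```

So moving to user mode took them from roughly 65K IOPS per core to roughly 275K IOPS per core, or better than 4X the work out of each core.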

Their initiator was running CentOS on another server. When they decided to test how far they could push their initiator, they were able to drive 4 Optane SSDs at up to ~1.9M 4K random write IOP performance.

At SFD17, I asked what they could have done at 100 usec RT and Max said about 450K IOPS. This is still surprisingly good performance. With 4 Optane SSDs and consuming ~8 cores, you could achieve 1.8M IOPS and ~7.4GB/sec. Doubling the Optane SSDs, and given sufficient initiator and target cores, one could achieve ~3.6M IOPS and ~14.8GB/sec.

Optane based super computer?

ORNL Summit super computer, the current number one supercomputer in the world, has a sustained throughput of 2.5 TB/sec over 18.7K server nodes. You could do much the same with 337 CentOS initiator nodes, 337 Windows server nodes and ~1350 Optane SSDs.

This assumes that StarWind’s initiator and target NVMeoF systems can scale, but they’ve already shown they can do 1.8M IOPS across 4 Optane SSDs on a single initiator server. And I assume a single target server with 4 Optane SSDs and at least 8 cores could service that IO. Multiplying this by 4 or 400 shouldn’t be much of a concern except for the increasing networking bandwidth.
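The node and SSD counts above are just a division, sketched here in Python. The assumptions are mine: each initiator/target pair with 4 Optane SSDs sustains ~7.4GB/sec, scaling is perfectly linear, and 1TB is 1000GB.

```python
def node_pairs_for(target_gb_per_sec: float, per_pair_gb_per_sec: float) -> float:
    """Initiator/target pairs needed to match a bandwidth, assuming linear scaling."""
    return target_gb_per_sec / per_pair_gb_per_sec

SUMMIT_GB_PER_SEC = 2.5 * 1000   # 2.5 TB/sec sustained
PAIR_GB_PER_SEC = 7.4            # one initiator + one target with 4 Optane SSDs
SSDS_PER_TARGET = 4

pairs = node_pairs_for(SUMMIT_GB_PER_SEC, PAIR_GB_PER_SEC)
ssds = pairs * SSDS_PER_TARGET
print(f"~{pairs:.1f} pairs, ~{ssds:.0f} Optane SSDs")
```

The ratio works out to just under 338, so 337 pairs gets you to a hair below 2.5 TB/sec and roughly 1350 SSDs.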

Of course, with Starwind’s Virtual SAN, there’s no data management, no data protection and probably very little in the way of logical volume management. And the ORNL Summit supercomputer is accessing data as files in a massive file system. The StarWind Virtual SAN is a block device.

But if I wanted to rule the supercomputing world, in a somewhat smallish data center, I might be tempted to put together 400 StarWind NVMeoF target storage nodes with 4 Optane SSDs each, convert their initiator code to work on IBM Spectrum Scale nodes and let her rip.


New website monetization approaches

Historically, websites have made money by selling wares, services or advertising. In the last two weeks it seems like two new business models are starting to emerge. One is more publicly supported and the other less so.

Europe’s new copyright law

According to an article I read recently (This newly approved European copyright law might break the Internet), Article 11 of Europe’s new Copyright Directive (not quite law yet) will require search engines, news aggregators and other users of Internet content to pay a “link tax” to copyright holders of anything they link to. As a long time blogger, podcaster and content provider, I find this new copyright policy very intriguing.

The article proposes that this will bankrupt small publishers as larger ones will charge less for the traffic. But presently, I get nothing for links to my content. And I’d be delighted to get any amount. In fact, I’d match any large publisher’s link tax amount that the market demands.

But my main concern is the impact this might have on site traffic. If aggregators pay a link tax, why would they want to use content that charges any tax? Yes, at some point aggregators need content. But there are many websites full of content, and certainly there would be some willing to forego tax fees for more traffic.

I also happen to be a copyright user. Most of my blog posts are based on articles I read on the web. I usually link to an article in the first one or two paragraphs (see above and below) of a post and may refer (and link) to more that go deeper into a subject. Will I have to pay a link tax to the content owner?

How much of a link tax is anyone’s guess. I’m not sure it would amount to much. But a link tax, if done judiciously might even raise the quality of the content on the web.

Browsers of the world, lay down your blockchains

The second article was a recent research paper (Digging into browser based crypto mining). Researchers at RWTH Aachen University had developed a new method to associate mined blocks with mining pools as a way to unearth browser-based mined crypto coins. With this technique they estimated that 1.18% of all Monero blocks were mined by CoinHive using participant browsers to mine the coin, or ~$250K/month from browser mining.

I see this as stealing compute power. But with that much coin being generated, it might be a reasonable way for an honest website to make some cash from people browsing their web pages. The browsing party would need to be informed of the mining operation in the page’s information, sort of like “we use cookies” today.

Just think, someone creates a WP plugin to do ETH mining and when activated, a WP website pops up a message that says “We mine coins while you browse – OK?”.

In another twist perhaps the websites could share the ETH mined on their browser with the person doing the browsing, similar to airline/hotel travel awards. Today most travel is done on corporate dime, but awards go to the person doing the traveling. Similarly, employees could browse using corporate computers but they would keep a portion of the ETH that’s mined while they browse away… Sounds like a deal.

Other monetization approaches

We’ve tried Google AdSense and other advertising but it only generated pennies a month. So, it wasn’t worth it.

We also sell research and occasionally someone buys some (see SCI Research Shop). And I do sell services but not through my website.


Not sure a link tax will fly. It would be a race to the bottom, and anyone that charged a tax would suffer from fewer links until they decided to charge a $0 link tax.

Maybe if every link had a tax associated with it, whether the site owner wanted it or not there could be a level playing field. Recording, paying/receiving and accounting for all these link tax micro payments would be another nightmare altogether.

But a WP plugin that announces and mines crypto coins with a user’s approval and splits the profit with them might work. Corporate wouldn’t like it, but employees would just be browsing websites, so where’s the harm in that?

Browse a website and share the mined crypto coin with site owner. Sounds fine to me.

Photo Credit(s): Strasburg – European Parliament|Giorgio Barlocco

Crypto News Daily – Telegram cancels ICO…

Photo of Bitcoin, Etherium and Litecoin|QuoteInspector

Hyperloop One in Colorado?

Read a couple of articles last week (TechCrunch, ArsTechnica & Denver Post) about Colorado becoming a winner in the Hyperloop One Global Challenge. The Colorado Department of Transportation (DoT) has joined with Hyperloop One to commission a study on Hyperloop transportation across the Front Range, from Cheyenne, WY to Pueblo, CO.

There’s been talk forever about adding a passenger train in Colorado from Fort Collins to Pueblo but every time they look at it they can’t make the economics work. How’s this different?

Transportation and the Queen city of the Prairie

Transportation has always been important to Denver. It was the Denver Pacific railroad from Denver to Cheyenne that first linked Denver to the rest of the nation. But even before that there was a stage coach line (Leavenworth & Pike’s Peak Express) that went through Denver to reduce travel time. Denver is currently the largest city within 500 miles and second only to Phoenix as the most populous city in the mountain west.

Denver International Airport is a major hub and the world’s sixth busiest airport. Denver is a crossroads for major north-south and east-west highways through the mountain west. Both the BNSF and Union Pacific railroads serve Denver, and Denver is one of the major stops on the Amtrak passenger train from San Francisco to Chicago.

Why Hyperloop?

Hyperloop can provide much faster travel, even faster than airplanes. Hyperloop can go up to 760 mph (1200 km/h) and should average 600 mph (970 km/h) from point to point.

Further, it could potentially require less security.  Hyperloop can go above or below ground. But in either case a terrorist act shouldn’t be as harmful as one on a plane thats traveling at 20 to 30,000 feet in the air.

And because it can go above or below ground it could potentially make use current transportation right of way corridors for building its tubes. Although to go west, it’s going to need a new tunnel or two through the mountains.

Stops along the way

The proposed Hyperloop track will run through Greeley and as far west as Vail, for a total of 360 miles. There are about 10 urban centers between and west of Cheyenne and Pueblo (Cheyenne, Fort Collins, Greeley, Longmont-Boulder, Denver, Denver Tech Center [DTC], West [Denver] metro, Silverthorne/Dillon, Vail, Colorado Springs and Pueblo).

Cheyenne and Pueblo are 213 miles apart, a ~3.5 hr drive, with Denver at about the halfway point. With Hyperloop, Denver to either location should take ~10 minutes without stops, and the total trip, Cheyenne to Pueblo, should be ~21 minutes.
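As a rough check on those trip times, here is a minimal Python sketch. It assumes the tube holds the 600 mph average quoted above for the whole distance, ignores stops and acceleration, and treats Denver as exactly the midpoint of the route; the function and variable names are made up for illustration.

```python
# Rough point-to-point travel times, assuming a constant average speed.
AVERAGE_SPEED_MPH = 600  # average Hyperloop speed quoted in the post


def travel_minutes(distance_miles, speed_mph=AVERAGE_SPEED_MPH):
    """Return travel time in minutes, ignoring stops and acceleration."""
    return distance_miles / speed_mph * 60


cheyenne_to_pueblo_miles = 213
# Assumption: Denver sits at exactly the halfway point of the route.
denver_leg_miles = cheyenne_to_pueblo_miles / 2

full_trip = travel_minutes(cheyenne_to_pueblo_miles)
denver_leg = travel_minutes(denver_leg_miles)

print(f"Cheyenne to Pueblo: {full_trip:.1f} min")
print(f"Denver to either end: {denver_leg:.1f} min")
```

Under these assumptions the full trip works out to a little over 21 minutes and each Denver leg to a little over 10, which lines up with the estimates above.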

Yes but is there any demand

I would think the way to get a handle on any potential market is to examine airline traffic between these cities. Airplanes can travel at close to these speeds and the costs are public.

But today there’s not much airline traffic between Cheyenne, Denver and Pueblo. Flights to Vail are mostly seasonal. I could only find one flight from Denver to Cheyenne over a week, one flight between Cheyenne and Pueblo, and 16 flights between Denver and Pueblo. The airplanes used on these trips only hold 9 passengers, so maybe that would amount to a maximum of 162 air travelers a week.

The other approach to estimating potential passengers is to use highway traffic between these destinations. Yes, the interstate (I-25) from Cheyenne through Denver to Pueblo is constantly busy and needs another lane or two in each direction to handle peak travel. And travel to Vail is very busy during weekends. But how many of these people would be willing to forgo a car and travel by Hyperloop?

I travel on tollroads to get to the Denver Airport and it’s a lot faster than traveling non-tollroad highways. But the cost for me is a business expense and it’s not that frequent. These days there’s not much traffic on my tollroad corridor and, even at rush hour, there are very few times when one has to slow down. But there are plenty of people coming to the airport each day from the northwest and southeast Denver suburbs who could use these tollroads but don’t.

And what can you do in Pueblo, Cheyenne or Denver, for that matter, without a car? It depends on where you end up. The current stops in Denver include the Denver International Airport, DTC, or West Metro (Golden?). Denver, Golden, Boulder, Vail, Greeley and Fort Collins all have compact downtowns with decent transportation. But for the rest of the stops along the way, you will probably want access to a car to get anywhere. There’s always Uber and Lyft and, worst case, renting a car.

So maybe Hyperloop would compete for all air travel and some portion of the car travel along the Cheyenne to Denver to Pueblo corridor. It just may not be large enough.
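The 162 figure can be reproduced with a short Python sketch. It assumes every flight counted above flies full with 9 seats; the dictionary layout and names are only illustrative.

```python
# Weekly flights found between each city pair (figures from the post).
weekly_flights = {
    ("Denver", "Cheyenne"): 1,
    ("Cheyenne", "Pueblo"): 1,
    ("Denver", "Pueblo"): 16,
}
SEATS_PER_PLANE = 9  # assumption: every seat on every flight is filled

total_flights = sum(weekly_flights.values())
max_weekly_passengers = total_flights * SEATS_PER_PLANE

print(f"{total_flights} flights, up to {max_weekly_passengers} travelers a week")
```

That is 18 flights and at most 162 travelers a week, a ceiling rather than an estimate of actual demand.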

Other alternative routes

Why stop at Cheyenne? What about Jackson, WY or Billings, MT? And why Pueblo? What about Santa Fe and Albuquerque in NM? You could conceivably go down to Brownsville, TX and extend up to Calgary and Edmonton in Alberta, Canada, if it made sense. I suppose it’s a question of how many people for what distance.

I would think that going east-west would be more profitable. Say Kansas City to Salt Lake City with Denver in between. With this corridor: 1) the distances are longer (Kansas City to Salt Lake City is 910 mi [~1465 km]); 2) the metropolitan areas are much larger; and 3) the air travel between them is more popular.

There are currently 10 winners of Hyperloop One’s Global Challenge Contest. The other routes in the USA include Texas (Dallas, Houston & San Antonio), Florida (Miami to Orlando), & the Midwest (Chicago IL to Columbus OH to Pittsburgh PA). There are other North American winners in Canada and Mexico, and more in Europe and India.

Hyperloop One will “commit meaningful business and engineering resources and work closely with each of the winning teams/routes to determine their commercial viability.” All this means that each of the winners will be examined professionally to see if it makes economic sense.

Of the 10 winners, Colorado’s route has the smallest population, almost by a factor of 2. Not sure why we are even in contention, but maybe it’s the ease of building the tubes that makes us a good candidate.

In any case, the public-private partnership has begun to work on the feasibility study.


Photo Credit(s): 7 hyperloop facts Elon Musk would love us to know, Detechter

Take a ride on Hyperloop…, Daily Mail


Mesosphere, Kubernetes and the coming container orchestration consensus

Read a story this past week in TechCrunch, Mesosphere adds Kubernetes support, about how Mesosphere with their own container orchestration software (called Marathon) will now support Google Kubernetes clusters and container orchestration services.

Mesosphere uses their own DC/OS (data center/operating system) to provide service discovery, resource management and networking for container cluster deployments across multiple machines.

DC/OS sounds similar to Kubo, discussed in last week’s post, VMworld2017 forecast, cloudy with high chance of containers, although Kubo was an open source development led by Pivotal to run Kubernetes clusters.

Kubernetes (and Docker) wins

This is indicative of the impact Kubernetes cluster operations are having on the container space. For now, the only holdout in container orchestration without Kubernetes is Docker with their Docker Swarm Engine.

Why add Kubernetes when Mesosphere already had a great container cluster orchestration service? It seems that as the container market matures, more and more applications are being developed for Kubernetes clusters rather than other container orchestration software.

Although Mesosphere is the current leader in container orchestration both in containers run and revenue (according to their CEO), the move to Kubernetes clusters is likely to accelerate their market adoption/revenues and ultimately help keep them in the lead.

Marathon still lives on

It turns out that Marathon also orchestrates non-container application deployments.

Marathon can also support stateful apps, like database machines with persistent storage (unlike Docker containers, which typically run stateless apps). These are closer to more typical enterprise applications. This is probably why Mesosphere has done so well up to now.

Marathon also supports both Docker and Mesos containers. Mesos containers depend on Apache Mesos, a specially developed distributed systems kernel, based on Linux, for containers.

So Mesosphere will continue to fund development and support for Marathon, even while it rolls out Kubernetes. This will allow them to continue to support their customer base and move them forward into the Kubernetes age.


I see an eventual need for both stateless and stateful apps in the enterprise data center. And that might just be Mesosphere’s key value proposition: the ability to support apps of the future (containers, stateless) and apps of today (stateful) within the same DC/OS.

Picture credit(s): Enormous container ship by Ruth Hartnup