Disk rulz, at least for now

Last week WDC announced their next generation technology for hard drives, MAMR or Microwave Assisted Magnetic Recording. This is in contrast to HAMR, Heat (laser) Assisted Magnetic Recording. Both techniques add energy so that data can be written as smaller bits on a track.

Disk density drivers

Current hard drive technology uses PMR or Perpendicular Magnetic Recording, with or without SMR (Shingled Magnetic Recording) and TDMR (Two Dimensional Magnetic Recording), both of which we have discussed in prior posts.

The problem with PMR-SMR-TDMR is that the maximum achievable disk density is starting to flatline as it approaches the “writeability limit” of the head-media combination.

That is, even with TDMR, SMR and PMR heads, the highest density that can be achieved is ~1.1Tb/sq.in. The writeability limit for current PMR head-media technology is ~1.4Tb/sq.in. As a result, most disk density increases over the past few years have been accomplished by adding platters and heads to hard drives.

MAMR and HAMR both seem able to get disk drives to >4.0Tb/sq.in. densities by adding energy to the magnetic recording process, which allows the drive to record more data in the same (grain) area.

There are two factors which drive disk drive density (Tb/sq.in.): Bits per inch (BPI) and Tracks per inch (TPI). Both SMR and TDMR were techniques to add more TPI.

I believe MAMR and HAMR increase BPI beyond what's available today by writing data on smaller magnetic grain sizes (pitch in the chart) and thus more bits in the same area. At 7nm grain sizes or below, PMR becomes unstable, but HAMR and MAMR can record on grain sizes of 4.5nm, which would equate to >4.5Tb/sq.in.
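
To make the BPI x TPI arithmetic concrete, here's a back-of-the-envelope sketch in Python. The KBPI/KTPI numbers are my own illustrative assumptions, not WDC's actual figures, and the grain-pitch scaling is just the simple area argument, ignoring everything else that changes between PMR and MAMR/HAMR.

```python
# Illustrative areal density arithmetic (assumed numbers, not WDC's specs).

def areal_density_tb_per_sqin(kbpi: float, ktpi: float) -> float:
    """Areal density (Tb/sq.in.) = (bits per inch) * (tracks per inch)."""
    return kbpi * 1e3 * ktpi * 1e3 / 1e12

# Roughly today's PMR+SMR+TDMR ceiling (~1.1 Tb/sq.in.) could come from
# something like ~2,200 KBPI x ~500 KTPI (my assumed split, for illustration):
print(areal_density_tb_per_sqin(2200, 500))   # -> 1.1

# If bits scale with grain area, shrinking grain pitch from ~7nm to ~4.5nm
# buys roughly a (7/4.5)^2 factor before any other head/media improvements:
print((7.0 / 4.5) ** 2)                       # -> ~2.42x
```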

HAMR hurdles

It turns out that HAMR, because it uses heat to add energy, heats the media to much higher temperatures than what's normal for a disk drive, something like 400C-700C. Normal operating temperature for a disk drive is ~50C. HAMR heat levels will play havoc with drive reliability. The view from WDC is that HAMR has 100X worse reliability than MAMR.

In order to generate that much heat, HAMR needs a laser to expose the area to be written. Of course the laser has to be in the head to be effective. Having to add a laser and optics will increase the cost of the head, increase the steps to manufacture the head, and require new suppliers/sourcing organizations to supply the componentry.

HAMR also requires a different media substrate. It's unclear why, but HAMR seems to require a glass substrate, with the magnetic media (many layers) deposited on top of the glass. This requires a new media manufacturing line, probably new suppliers, and getting glass to disk drive specifications (flatness-bumpiness, rotational integrity, vibrational integrity) will take time.

There are probably more than a half dozen other issues with having laser light inside a hard disk drive, but suffice it to say that HAMR was going to be a very difficult transition to perform right while continuing to provide today's drive reliability levels.

MAMR merits

MAMR uses microwaves to add energy to the spot being recorded. The microwaves are generated by a Spin Torque Oscillator (STO), which is a solid state device compatible with CMOS fabrication techniques. This means that the MAMR head assembly (PMR & STO) can be fabricated on current head lines and within current head mechanisms.

MAMR doesn't add heat to the recording area; it uses microwaves to add energy. As such, there's no temperature change in MAMR recording, which means the reliability of MAMR disk drives should be about the same as today's disk drives.

MAMR uses today's aluminum substrates. So, current media manufacturing lines and suppliers can be used, and media specifications shouldn't have to change much (?) to support MAMR.

MAMR has just about the same maximum recording density as HAMR, so there's no density benefit to going with HAMR if MAMR works as expected.

WDC’s technology timeline

WDC says they will have sample MAMR drives out next year and production drives out in 2019. They also predict an enterprise 40TB MAMR drive by 2025. They have high confidence in this schedule because of MAMR's compatibility with current drive media and head manufacturing processes.

WDC discussed their IP position on HAMR and MAMR. They have 400+ issued HAMR patents with another 100+ pending and 75 issued MAMR patents with 46 more pending. Quantity doesn’t necessarily equate to quality, but their current IP position on both MAMR and HAMR looks solid.

WDC believes that by 2020, ~90% of enterprise data will be stored on hard drives. However, this is predicated on achieving a continuing, 10X cost differential between disk drives and (QLC 3D) flash.

What comes after MAMR is the subject of much speculation. I've written about one alternative that uses liquid nitrogen temperatures with molecular magnets, which I called CAMR (cold assisted magnetic recording), but it's way too early to tell.

And we have yet to hear from the other big disk drive leader, Seagate. It will be interesting to hear whether they follow WDC’s lead to MAMR, stick with HAMR, or go off in a different direction.

Comments?

 

Photo Credit(s): WDC presentation

A tale of two storage companies – NetApp and Vantara (HDS-Insight Grp-Pentaho)

It was the worst of times. The industry changes had been gathering for almost a decade and by this time were starting to hurt.

The cloud was taking over all new business and some of the old. Flash's performance was making high performance easy and reducing storage requirements commensurately. Software defined storage was displacing low end and midrange arrays, which was fine for margins but injurious to revenues.

Both companies had user events in Vegas over the last month: NetApp Insight 2017 last week and the Hitachi NEXT 2017 conference two weeks ago.

As both companies respond to industry trends, they provide an interesting comparison to watch companies in transition.

Company role

  • NetApp's underlying theme is to change the world with data, and they want to change themselves to help companies do this.
  • Vantara's philosophy is that data and processing are ultimately moving into the Internet of Things (IoT) and they want to be wherever the data takes them.

Hitachi Vantara is a brand new company that combines Hitachi Data Systems, Hitachi Insight Group and Pentaho (an analytics acquisition) into one organization to go after the IoT market. Pentaho will continue as a separate brand/subsidiary, but HDS and Insight Group cease to exist as separate companies/subsidiaries and are now inside Vantara.

NetApp sees transitions occurring in the way IT conducts business but ultimately, a continuing and ongoing role for IT. NetApp’s ultimate role is as a data service provider to IT.

Customer problem

  • Vantara believes the main customer issue is the need to digitize the business. Because competition is emerging everywhere, the only way for a company to succeed against this interminable onslaught is to digitize everything. That is, digitize your manufacturing/service production, sales, marketing, maintenance, any and all customer touch points, across your whole value chain, and do it as rapidly as possible. If you don't, your competition will.
  • NetApp sees customers today have three potential concerns: 1) how to modernize current infrastructure; 2) how to take advantage of (hybrid) cloud; and 3) how to build out the next generation data center. Modernization is needed to free capital and expense from traditional IT for use in Hybrid cloud and next generation data centers. Most organizations have all three going on concurrently.

Vantara sees the threat of startups, regional operators and more advanced digitized competitors as existential for today’s companies. The only way to keep your business alive under these onslaughts is to optimize your value delivery. And to do that, you have to digitize every step in that path.

NetApp views the threat to IT as originating from LoB/shadow IT applications born and grown in the cloud, or from other groups creating next gen applications using capabilities outside of IT.

Product direction

  • NetApp is looking mostly towards the cloud. At their conference they announced a new Azure NFS service powered by NetApp. They already had Cloud ONTAP and NPS, both current cloud offerings: software defined storage running in the cloud and a co-lo hardware offering directly attached to public cloud (Azure & AWS), respectively.
  • Vantara is looking towards IoT. At their conference they announced Lumada 2.0, an Industrial IoT (IIoT) product framework using plenty of Hitachi software functionality and intended to bring data and analytics under one software umbrella.

NetApp is following a path laid down years ago when they devised the data fabric. Now, they are integrating and implementing data fabric across their whole product line, with the ultimate goal that wherever your data goes, the data fabric will be there to help you with it.

Vantara is broadening their focus from IT products and solutions to IoT. It's not so much abandoning present day IT as looking forward to the day when present day IT is just one cog in an ever expanding, completely integrated digital entity that the new organization becomes.

They both had other announcements: NetApp announced ONTAP 9.3, Active IQ (AI applied to predictive service) and FlexPod SF ([H]CI with SolidFire storage), and Vantara announced a new IoT turnkey appliance running Lumada and a smart data center (IoT) solution.

Who’s right?

They both are.

Digitization is the future, the sooner organizations realize and embrace this, the better for their long term health. Digitization will happen with or without organizations and when it does, it will result in a significant re-ordering of today’s competitive landscape. IoT is one component of organizational digitization, specifically outside of IT data centers, but using IT resources.

In the meantime, IT must become more effective and efficient. This means it has to modernize to free up resources to support (hybrid) cloud applications and supply the infrastructure needed for next gen applications.

One could argue that Vantara is positioning themselves for the long term and NetApp is positioning themselves for the short term. But that denies the possibility that IT will have a role in digitization. In the end both are correct and both can succeed if they deliver on their promise.

Comments?

 

Two paths to better software

Read an article last week in the Atlantic, The coming software apocalypse, about some of the problems in how we develop software today.

Most software development today is editing text files. Some of these text files have 1,000s of lines and are connected to other text files with 1,000s of more lines which are connected to other text files with 1,000s of lines, etc. Pretty soon you have millions of lines of code all interacting with one another.

The problem

Been there done that and it’s not pretty. We even spent some time trying to reduce the code bloat by macro-izing some of it, and that just made it harder to understand, but reduced the lines of code.

The problem is much worse now. We have software everywhere you look, from the escalators and elevators you take up and down between floors, to the cars you drive around town, to the trains and airplanes you travel between cities on.

All of these literally have millions of lines of code controlling them, and are gaining many more each year. How can they all possibly be correct?

Well you can test the s&*t out of them. But you can’t cover every path in a lifetime or ten of testing a million line program. And even if you could, changing a single line would generate another 100K or more paths to test. So testing was never a true answer.

Two solutions

The article talks about two approaches that have some merit to solve the real problem.

  • Model based development, a new development and coding environment. In this approach you're not so much coding as playing with a model of the behavior you're looking for. Say you were coding robot control logic: rather than editing 1000s of lines of Java text, you work with a model of your robot and its environment on half the screen, and on the other half, model parameters (dials, sliders, arrow keys, etc.) and logic (sequences) that you manipulate to do what the robot needs to do. Sort of like Scratch on steroids (see my post on 10 years of Scratch), with the sprite being whatever you need to code for, be it a jet engine, automobile, elevator, whatever. The playground would be a realtime/real life simulation of the entity under control of the code and you would code by setting parameters and defining sequences. But the feedback would be immediate!
  • TLA+, a formal design verification approach. Formal methods have been around since the early 70s. They are used to rigorously specify a design of some code or a whole system. The idea is that if you can specify a provably correct design, then the code (derived from that design) has the potential to be more correct. Yes, there's still the translation from design to code that's error prone, but the likelihood is that these errors will be smaller in scope than having a design that's wrong.

Model based development

One can already find model based development in Apple's new application development language Swift, the ANSYS SCADE suite based on Esterel Technologies, and the Light Table software development environment.

I have never used any of them but they all look interesting. Esterel was developed for safety critical, real-time aerospace applications. Light Table was a Kickstarter project started by a leading engineer of Microsoft's Visual Studio, the leading IDE. Apple Swift was developed to make it much easier to develop iOS apps.

TLA+

TLA+ takes a bit of getting used to. All formal methods depend on advanced mathematics and sophisticated logic, and require an adequate understanding of both in order to use them properly. TLA+ was developed by Leslie Lamport and stands for temporal logic of actions.

TLA+ specifications identify the set of all correct system actions. I would call it a formal pseudo code.

There's apparently a video course, a hyperbook and a book on the language. It's being used at AWS and on Microsoft Xbox and Azure. (See the Wikipedia TLA+ article for more information.)

There's the PlusCal algorithm (specification) language, which is translated into a TLA+ specification that can then be checked by the automated TLC model checker. There's also TLAPS, an automated TLA+ proof system, although it doesn't support all of the TLA+ primitives. And there's a whole TLA+ toolbox that bundles these and other tools to make TLA+ easier to use.
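
To give a flavor of what an automated model checker like TLC does, namely exhaustively exploring every reachable state and checking an invariant in each one, here's a tiny Python sketch of the same idea on a made-up example (two clients sharing three locks). It is not TLA+ or TLC, just an illustration of exhaustive state checking.

```python
from collections import deque

# A toy "model checker": breadth-first exploration of all reachable states,
# checking an invariant in every one. TLC does this (far more cleverly) for
# TLA+ specs; this is only a Python illustration of the idea, not TLA+.

def check(initial_states, next_states, invariant):
    seen, queue = set(), deque(initial_states)
    while queue:
        state = queue.popleft()
        if state in seen:
            continue
        seen.add(state)
        if not invariant(state):
            return f"invariant violated in state {state}"
        queue.extend(next_states(state))
    return f"OK: invariant holds in all {len(seen)} reachable states"

# Hypothetical example: two clients sharing a pool of 3 locks.
# State = (locks held by A, locks held by B); each step acquires or releases one.
def next_states(state):
    a, b = state
    succ = []
    if a + b < 3:
        succ += [(a + 1, b), (a, b + 1)]   # either client acquires a lock
    if a > 0:
        succ.append((a - 1, b))            # A releases a lock
    if b > 0:
        succ.append((a, b - 1))            # B releases a lock
    return succ

print(check([(0, 0)], next_states, lambda s: s[0] + s[1] <= 3))
```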

~~~~

We dabbled in formal specification methods on our million+ line storage system at a former employer. It worked well and cleaned up an integrity critical area of the product. Alas, we didn't expand its use to other areas of the product and it sort of fell out of favor. But it worked when and where we applied it.

Of course this was before automated formal methods of today, but even manual methods of specification precision can be helpful to think out what a design has to do to be correct.

I have no doubt that both TLA+ formal methods and model based development approaches and more are required to truly vanquish the coming software apocalypse.

At least until artificial intelligence starts developing all our code for us.

Comments?

Photo Credits: Six easy pieces of quantitatively analyzing open source, SAP Research;

Spaghetti code still existed, Toolbox.com;

How to write apps with Swift, MacWorld;

Modeling the dining philosophers problem in TLA+, Metadata blog

 

Compressing information through the information bottleneck during deep learning

Read an article in Quanta Magazine (New theory cracks open the black box of deep learning) about a talk (see 18: Information Theory of Deep Learning, YouTube video) given a month or so ago by Professor Naftali (Tali) Tishby on his theory that all deep learning convolutional neural networks (CNN) exhibit an “information bottleneck” during deep learning. This information bottleneck results in compressing the information present in, for example, an image, and only working with the relevant information.

The Professor and his researchers used a simple AI problem (like recognizing a dog) and trained a deep learning CNN to perform this task. At the start of the training process the CNN nodes at the top were all connected to the next layer, and those were all connected to the next layer and so on until you got to the output layer.

Essentially, the researchers found that during the deep learning process, the CNN went from recognizing all features of an image to over time just recognizing (processing?) only the relevant features of an image when successfully trained.

Limits of deep learning CNNs

In his talk the Professor identifies two modes of operations of a deep learning CNN: the encoder layers and decoder layers. The encoder function identifies relevant information in the input and the decoder function takes this relevant information and maps this to an output.

This view results in two statistics that can characterize any deep learning CNN:

  • Sample complexity, which refers to the mutual information inside the last hidden layer of the encoder function, and
  • Accuracy or generalization error, which refers to the mutual information inside the last hidden layer of the decoder function.

Mutual information is defined as how much of the uncertainty about an input is removed when you have an output that is based on that input. (See the talk for a more formal explanation.)
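
Since the whole argument turns on mutual information, here's a minimal Python sketch of the definition for discrete variables, I(X;Y) = H(X) + H(Y) - H(X,Y), which measures how much uncertainty about X is removed once Y is known. The two tiny joint distributions are my own toy examples; the talk computes this over layer activations.

```python
import numpy as np

# Mutual information I(X;Y) = H(X) + H(Y) - H(X,Y) for a discrete joint
# distribution -- how much uncertainty about X is removed once Y is known.
# Toy numbers of my own; the talk works with layer activations instead.

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """joint[i, j] = P(X=i, Y=j); returns I(X;Y) in bits."""
    h_x = entropy(joint.sum(axis=1))      # entropy of marginal P(X)
    h_y = entropy(joint.sum(axis=0))      # entropy of marginal P(Y)
    h_xy = entropy(joint.flatten())       # joint entropy H(X,Y)
    return h_x + h_y - h_xy

# X and Y perfectly correlated: knowing Y removes all 1 bit of uncertainty in X.
print(mutual_information(np.array([[0.5, 0.0], [0.0, 0.5]])))     # -> 1.0
# X and Y independent: knowing Y removes nothing.
print(mutual_information(np.array([[0.25, 0.25], [0.25, 0.25]]))) # -> 0.0
```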

The professor states that any complex deep learning CNN can be characterized by these two statistics where sample complexity determines the number of samples required and accuracy determines the precision by which the deep learning CNN can properly interpret those samples. The deep black line in the chart represents the limits of accuracy achievable at some number of training events, with some number of hidden layers and some sample set.

What happens during deep learning

Moreover, the professor shows that an interesting characteristic of all CNNs is that they converge in accuracy over time, and that convergence differs based mostly on the number of layers, sample size and training count used.

In the chart, the top row shows 3 CNNs with different amounts of training data (5%, 40% and 80% of the total). The chart shows the end result and the trace of learning within each CNN over the same number of epochs (training cycles). More training data generates more accurate results.

The Professor views the epochs after the farthest right point of the traces (where the trace essentially starts moving up and to the left in the chart) as the compression phase of deep learning.

Statistics of deep learning process

The professor goes on to characterize the deep learning process by calculating the mean and variance of each layer's connection weights.

In the chart he shows a standard “eiffel tower” neural network, with 6 hidden layers, each with fewer neurons (nodes) than the previous layer (12 nodes, 10 nodes, 7 nodes, etc.). And what he plots is the average weights and variance between layers (red lines are the mean and variance of the weights for arcs [connections] between nodes in layer 1 and nodes in layer 2, blue lines the mean and variance of weights for arcs between layers 2 and 3, purple lines the mean and variance of weights for arcs between layers 3 and 4, etc.).

He shows that at the start of training the (randomly assigned) weights for each layer have a normalized mean which is higher than its normalized variance. He calls this the high signal to noise phase (I would say the opposite, it's low signal to noise, more noise than signal). But as training proceeds (over more epochs), there comes a point where the layer mean drops below its variance and the signal to noise ratio changes dramatically. After that point the mean weights and variance of the group of layers start to diverge, or move apart.

The phase (epochs) after the point where the weight means drop below their variances he calls the compression phase of deep learning CNN training.
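
Here's a minimal numpy sketch of the bookkeeping described above: track each layer's weight mean and variance over epochs and report the first epoch where the mean drops below the variance. The history[epoch][layer] format is my own hypothetical convention, not anything from the talk, and the real analysis normalizes and plots these values rather than just finding the crossover.

```python
import numpy as np

# For each layer, track the mean and variance of its connection weights over
# epochs and report the first epoch where the mean drops below the variance --
# the crossover called the start of the compression phase above.

def weight_stats(w):
    """Mean of |weights| and variance of weights for one layer at one epoch."""
    w = np.asarray(w).ravel()
    return np.abs(w).mean(), w.var()

def compression_start(history):
    """history[epoch][layer] = 2D weight matrix; first crossover epoch per layer."""
    n_layers = len(history[0])
    start = [None] * n_layers
    for epoch, layers in enumerate(history):
        for i, w in enumerate(layers):
            mean, var = weight_stats(w)
            if start[i] is None and mean < var:
                start[i] = epoch
    return start
```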

The Professor suggests that every complex deep learning CNN looks the same during training if you perform these calculations. The professor shows charts like this for other deep learning CNNs used on different problems, and they all exhibit some point where their means drop below their variances, after which the means and variances between layers start to differentiate.

Do layer counts and sample size matter?


It turns out that the more hidden layers you have, the sooner (with less training) you reach the compression phase. This chart shows the same problem, with different hidden layer counts. One can see in the traces that not only is accuracy improved with more layers, but the CNN also reaches the compression phase more quickly.

Using his sample complexity and accuracy statistics, the Professor has also shown that there are limits to the accuracy of any deep learning CNN as a function of layer counts, sample size and training event counts.

~~~~

As far as I know, the Professor and his team are the first to try to characterize and understand what happens during deep learning. In doing so, he has shown that the number of layers and the number of samples can be used to predict the speed of learning. And ultimately how accurate any deep learning CNN can be.

Comments?

Hyperloop One in Colorado?

Read a couple of articles last week (TechCrunch, ArsTechnica & Denver Post) about Colorado becoming a winner in the Hyperloop One Global Challenge. The Colorado Department of Transportation (DoT) has joined with Hyperloop One to commission a study on Hyperloop transportation across the Front Range, from Cheyenne, WY to Pueblo, CO.

There’s been talk forever about adding a passenger train in Colorado from Fort Collins to Pueblo but every time they look at it they can’t make the economics work. How’s this different?

Transportation and the Queen city of the Prairie

Transportation has always been important to Denver. It was the Denver Pacific railroad from Denver to Cheyenne that first linked Denver to the rest of the nation. But even before that there was a stage coach line (Leavenworth & Pike's Peak Express) that went through Denver to reduce travel time. Denver is currently the largest city within 500 miles and second only to Phoenix as the most populous city in the mountain west.

Denver International Airport is a major hub and the world’s sixth busiest airport. Denver is a cross road for major north-south and east-west highways through the mountain west. Both the BNSF and Union Pacific railroads serve Denver and Denver is one of the major stops on the Amtrak  passenger train from San Francisco to Chicago.

Why Hyperloop?

Hyperloop can provide much faster travel, even faster than airplanes. Hyperloop can go up to 760 mph (1200 km/h) and should average 600 mph (970 km/h) from point to point.

Further, it could potentially require less security. Hyperloop can go above or below ground. But in either case a terrorist act shouldn't be as harmful as one on a plane that's traveling at 20,000 to 30,000 feet in the air.

And because it can go above or below ground, it could potentially make use of current transportation right of way corridors for building its tubes. Although to go west, it's going to need a new tunnel or two through the mountains.

Stops along the way

The proposed hyperloop track would run through Greeley and as far west as Vail, for a total of 360 miles. Cheyenne and Pueblo have about 10 urban centers between and west of them (Cheyenne, Fort Collins, Greeley, Longmont-Boulder, Denver, Denver Tech Center [DTC], West [Denver] metro, Silverthorne/Dillon, Vail, Colorado Springs and Pueblo).

Cheyenne and Pueblo are 213 miles apart, a ~3.5 hr drive, with Denver at about the halfway point. With Hyperloop, Denver to either location should take ~10 minutes without stops, and the total trip, Cheyenne to Pueblo, should be ~21 minutes.
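
A quick sanity check of those travel times in Python (the 600 mph average comes from the speeds above; the ~60 mph highway average is my own assumption):

```python
# Quick check of the travel time arithmetic (distance and Hyperloop speed from
# the figures above; the ~60 mph highway average is my own assumption).
distance_mi = 213          # Cheyenne to Pueblo
hyperloop_mph = 600        # assumed point-to-point average speed
highway_mph = 60           # rough driving average

print(distance_mi / highway_mph)              # -> ~3.6 hours by car
print(distance_mi / hyperloop_mph * 60)       # -> ~21 minutes end to end
print(distance_mi / 2 / hyperloop_mph * 60)   # -> ~11 minutes Denver to either end
```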

Yes, but is there any demand?

I would think the way to get a handle on any potential market is to examine airline traffic between these cities. Airplanes can travel at close to these speeds and the costs are public.

But today there's not much airline traffic between Cheyenne, Denver and Pueblo. Flights to Vail are mostly seasonal. I could only find one flight from Denver to Cheyenne over a week, one flight between Cheyenne and Pueblo, and 16 flights between Denver and Pueblo. The airplanes used on these trips only hold 9 passengers, so maybe that would amount to a maximum of 162 air travelers a week.

The other approach to estimating potential passengers is to use highway traffic between these destinations. Yes the interstate (I25) from Cheyenne through Denver to Pueblo is constantly busy and needs another lane or two in each direction to handle peak travel. And travel to Vail is very busy during weekends. But how many of these people would be willing to forego a car and travel by Hyperloop?

I travel on tollroads to get to the Denver Airport and it's a lot faster than traveling non-tollroad highways. But the cost for me is a business expense and it's not that frequent. These days there's not much traffic on my tollroad corridor, and at rush hour there are very few times where one has to slow down. But there are plenty of people coming to the airport each day from the northwest and southeast Denver suburbs who could use these tollroads but don't.

And what can you do in Pueblo, Cheyenne or Denver for that matter without a car? It depends on where you end up. The current stops in Denver include Denver International Airport, DTC, or West Metro (Golden?). Denver, Golden, Boulder, Vail, Greeley and Fort Collins all have compact downtowns with decent transportation. But for the rest of the stops along the way, you will probably want access to a car to get anywhere. There's always Uber and Lyft and, worst case, renting a car.

So maybe Hyperloop would compete for all air travel and some portion of the car travel along the Cheyenne to Denver to Pueblo corridor. It just may not be a large enough market.

Other alternative routes

Why stop at Cheyenne, what about Jackson, WY or Billings, MT? And why Pueblo, what about Santa Fe and Albuquerque in NM? And you could conceivably go down to Brownsville, TX and extend up to Calgary and Edmonton in Alberta, Canada, if it made sense. I suppose it's a question of how many people for what distance.

I would think that going east-west would be more profitable. Say Kansas City to Salt Lake City with Denver in between. With this corridor: 1) the distances are longer (Kansas City to Salt Lake City is 910 mi [~1465 km]); 2) the metropolitan areas are much larger; and 3) air travel between them is more popular.

There are currently 10 winners for Hyperloop One’s Global Challenge Contest.  The other routes in the USA include Texas (Dallas, Houston & San Antonio), Florida (Miami to Orlando), & the midwest (Chicago IL to Columbus OH to Pittsburgh PA). But there are others in Canada and Mexico in North America and more in Europe and India.

Hyperloop One will “commit meaningful business and engineering resources and work closely with each of the winning teams/routes to determine their commercial viability.” All this means that each of the winners will be examined professionally to see if it makes economic sense.

Of the 10 winners, Colorado’s route has the least population, almost by a factor of 2. Not sure why we are even in contention, but maybe it’s the ease of building the tubes that makes us a good candidate.

In any case, the public-private partnership has begun to work on the feasibility study.

Comments?

Photo Credit(s): 7 hyperloop facts Elon Musk would love us to know, Detechter

Take a ride on Hyperloop…, Daily Mail

@hyperloop

Mesosphere, Kubernetes and the coming container orchestration consensus

Read a story this past week in TechCrunch, Mesosphere adds Kubernetes support, about how Mesosphere with their own container orchestration software (called Marathon) will now support Google Kubernetes clusters and container orchestration services.

Mesosphere uses their own DC/OS (data center/operating system) to provide service discovery, resource management and networking for container cluster deployments across multiple machines.

DC/OS sounds similar to Kubo discussed in last week’s post, VMworld2017 forecast, cloudy with high chance of containers. Although Kubo was an open source development led by Pivotal to run Kubernetes clusters.

Kubernetes (and Docker) wins

This is indicative of the impact Kubernetes cluster operations is having on the container space. For now, the only holdout in container orchestration without Kubernetes is Docker with their Docker Swarm Engine.

Why add Kubernetes when Mesosphere already had a great container cluster orchestration service? It seems that as the container market matures, more and more applications are being developed for Kubernetes clusters rather than for other container orchestration software.

Although Mesosphere is the current leader in container orchestration both in containers run and revenue (according to their CEO), the move to Kubernetes clusters is likely to accelerate their market adoption/revenues and ultimately help keep them in the lead.

Marathon still lives on

It turns out that Marathon also orchestrates non-container application deployments.

Marathon can also support stateful apps, like database machines with persistent storage (unlike Docker containers, which are stateless apps). These are closer to more typical enterprise applications. This is probably why Mesosphere has done so well up to now.

Marathon also supports both Docker and Mesos containers. Mesos containers depend on Apache Mesos, a specially developed distributed systems kernel, based on Linux, for containers.

So Mesosphere will continue to fund development and support for Marathon, even while it rolls out Kubernetes. This will allow them to continue to support their customer base and move them forward into the Kubernetes age.

~~~~

I see an eventual need for both stateless and stateful apps in the enterprise data center. And that might just be Mesosphere's key value proposition – the ability to support apps of the future (stateless containers) and apps of today (stateful) within the same DC/OS.

Picture credit(s): Enormous container ship by Ruth Hartnup

VMworld2017’s forecast, cloudy with a high chance of containers

Attended VMworld2017 this past week in Vegas and aside from all the parties there was a lot of news, mostly for public cloud users.

In talking with analysts and others at the show, it seems like VMware has recently discovered that they can't fight the cloud, so they better join it. Early this year VMware divested itself of its vCloud Air business to OVH, which removed their own competing cloud offering. Now, VMware's on a different tack, figuring out how to best work with today's public cloud providers and implementing this.

Last year VMware announced an agreement with IBM to supply vCloud Air services on IBM's SoftLayer public cloud. This year, VMware ramps up other public cloud offerings with VMware Cloud on AWS and PKS (Pivotal Container Service) on vSphere.

First up, VMware on the (AWS) cloud

You may recall that earlier this year VMware showed a tech preview of vSphere running in AWS. At VMworld2017 they took the wraps off this service and made it real. At first it's only available in the AWS US West region, but they plan to roll it out to the rest of the US soon and the rest of the world after that.

VMware Cloud on AWS is vSphere, vCenter, NSX, and vSAN running on top of AWS elastic cloud services. Essentially, any VM that you run onprem can be run on AWS using VMware Cloud on AWS.

The AWS EC2 machines you run VMware on are BIG – 2 CPUs, 36 cores (72 hyperthreads), with 512GiB of memory and a local (SSD) cache of 3.6TB/10.7TB raw capacity. VMware Cloud on AWS requires four EC2 instances to run. No information about the networking capabilities, but I assume HIGH SPEED.

The cost for the service is high but you are paying for 7x24x365 AWS EC2 services. For a 3 year “reservation”, it will cost $109.4K/host. That comes out to be about $3K/month/host for 36 months. VMware claims that on a 3 year TCO basis this would be cheaper than running an equivalent configuration onprem.

You can also contract for VMware Cloud on AWS on an hourly basis. You do have to have a VMware login and VMware credits (?) to do so. It’s certainly not as simple as just having a credit card and an AWS login. But the costs for this are $8.361/hour/host. This seems awfully high but there’s no direct comparison to other EC2 machine configurations. Although there is an EC2 X1.16 with 64 vCPUs (hyper thread equivalents), 976GiB DRAM and 1-1920 (GiB) SSD that lists for $6.669/hour – close, but not a complete match.
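
Running the quoted prices through a quick Python sanity check, assuming a host is used full time (~730 hours/month), shows how the 3 year reservation compares to the hourly plan and to the raw EC2 X1.16 list price:

```python
# Sanity check of the quoted prices, assuming a host runs full time
# (~730 hours/month). List prices only, no discounts or data egress.
reservation_3yr = 109_400      # $ per host for a 3 year reservation
hourly_rate = 8.361            # $ per host per hour on the hourly plan
x1_16_rate = 6.669             # $ per hour for the (roughly comparable) EC2 X1.16

hours_per_month = 730          # ~24 * 365 / 12
print(reservation_3yr / 36)            # -> ~$3,039/month/host reserved
print(hourly_rate * hours_per_month)   # -> ~$6,104/month/host on the hourly plan
print(x1_16_rate * hours_per_month)    # -> ~$4,868/month for raw EC2 X1.16
```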

You are running a VMware service on AWS so the billing is done through VMware. And any data you move in or out of the cloud will be billed (through VMware) at whatever AWS would charge for the data egress/import.

It seems that if you “connect” your VMware Cloud on AWS to your onprem vSphere cluster (through stretched layer 2 NSX networking and ? other means), you can vMotion VMs from onprem to AWS and back again. There is a behind the scenes Storage vMotion that also happens, to get the data to AWS so that the VMs can operate properly.

VMware vCenter offers a dashboard of sorts to tell admins whether a particular VM is a good candidate to move to AWS or not. This is based on the VM's connections to other VMs and maybe the amount of data that would need to be moved.


Next, (PKS) containers and more (GCP) cloud

VMware together with Pivotal and Google Cloud announced a tech preview of the Pivotal Container Service (PKS) on vSphere. The new service implements Pivotal Kubo, or Kubernetes container orchestration with BOSH HA infrastructure management, on top of vSphere. PKS also comes with Harbor, a secure, enterprise class container registry from VMware.

This would allow a development team to develop a container micro-services application completely within a VMware environment and run it under vSphere. This seems tailor made for cloud developers.

Kubernetes has worker and master nodes, each of which would run as a VM on vSphere. Inside worker nodes, Kubernetes runs Pods, which have one or more tightly coupled containers that enclose an application and share context.

I was talking with the vSphere team and they had been spending a lot of time making vSphere native services available to PKS. This means that you can use NSX networking and vSAN, VVOLs or VMDK storage for your container (persistent) storage.

Not exactly sure where DevOps fits into PKS on vSphere, but my assumption is that you could run Puppet or Chef or, if you're up to the challenge, vRA to automate application roll out.

There was specific talk of having PKS run on AWS, probably within VMware Cloud on AWS in the future.

Of course, PKS containers that run on vSphere are completely compatible with GKE (Google Container Engine), which runs on Google Cloud Platform.

No information on VMware PKS pricing as of yet.

Where lies Photon and VIC (VMware Integrated Containers)

You may recall that VMware announced Photon last year, which was an open source container framework, along with Photon OS, an OS for Photon containers. These still exist as open source projects and are still being developed, but there was nary a word about Photon this year.

VIC still exists. VIC can support running a container as a VM but is not a real container orchestration engine. Yes, you could potentially run Docker Swarm as a VM, or a number of containers as separate VMs under VIC, but this is not the same as having a fully integrated container orchestration and management service layer in vSphere. That's where PKS fits in.

~~~~

Although timelines weren't discussed, there were a number of discussions that led me to believe that VMware Cloud would be rolled out to other public cloud providers (read Azure and GCP). And how long it would take to roll out to other AWS regions around the world was not discussed. VMware Cloud would really make sense to run on GCP, but Azure might be a bit of a stretch.

Similarly, PKS seems already heading for VMware Cloud on AWS and is already available in native form as GKE on GCP. But Azure already has a native Kubernetes Container Service. And there was no discussion as to whether PKS would be made available on IBM Softlayer or OVH vCloud Air.

Stay tuned more to come as VMware finds its true path to the cloud.

Research reveals ~liquid nitrogen temperature molecular magnets with 100X denser storage


Must be on a materials science binge these days. I read another article this week in Phys.org on “Major leap towards data storage at the molecular level” reporting on a Nature article “Molecular magnetic hysteresis at 60K“, where researchers from University of Manchester, led by Dr David Mills and Dr Nicholas Chilton from the School of Chemistry, have come up with a new material that provides molecular level magnetics at almost liquid nitrogen temperatures.

Previously, molecular magnets only operated at 4 to 14K (degrees Kelvin), based on research done over the last 25 years or so, but this new research shows similar effects operating at ~60K, close to liquid nitrogen temperatures. Nitrogen freezes at 63K and boils at ~77K, so it is liquid somewhere between those temperatures.

What new material

The new material, “hexa-tert-butyldysprosocenium complex—[Dy(Cpttt)2][B(C6F5)4], with Cpttt = {C5H2tBu3-1,2,4} and tBu = C(CH3)3”, dysprosocenium for short, was designed (?) by the researchers at Manchester and was shown to exhibit magnetism at the molecular level at 60K.

The storage effect is hysteresis, which is a material's ability to remember the last (magnetic/electrical/?) field it was exposed to. Magnetic field strength is measured in oersteds.

The researchers claim the new material provides magnetic hysteresis at a sweep level of 22 oersteds. Not sure what “sweep level of 22 oersteds” means but I assume a molecule of the material is magnetized with a field strength of 22 oersteds and retains this magnetic field over time.

Reports of disk’s death, have been greatly exaggerated

While there seems to be no end in sight for the densities of flash storage these days with 3D NAND (see my 3D NAND, how high can it go post or listen to our GBoS FMS2017 wrap-up with Jim Handy podcast), the disk industry lives on.

Disk industry researchers have been investigating HAMR ([laser] heat assisted magnetic recording; see my Disk density hits new record … post) for some time now to increase disk storage density. But to my knowledge, HAMR has not come out in any generally available disk device on the market yet. HAMR was supposed to provide the next big increase in disk storage densities.

Maybe they should be looking at CAMMR, or cold assisted magnetic molecular recording (heard it here, 1st).

According to Dr Chilton, using the new material at 60K in a disk device would increase capacity by 100X. Western Digital just announced a 20TB MyBook Duo disk system for desktop storage and backup. With this new material, at 100X current densities, we could have a 2PB MyBook Duo storage system on your desktop.

That should keep my ever increasing video-photo-music library in fine shape and everything else backed up for a little while longer.

Comments?

Photo Credit(s): Molecular magnetic hysteresis at 60K, Nature article