Google releases new Cloud TPU & Machine Learning supercomputer in the cloud

Last year about this time Google released their 1st generation TPU chip to the world (see my TPU and HW vs. SW … post for more info).

This year they are releasing a new version of their hardware called the Cloud TPU chip and making it available in a cluster on their Google Cloud.  Cloud TPU is in Alpha testing now. As I understand it, access to the Cloud TPU will eventually be free to researchers who promise to freely publish their research and at a price for everyone else.

What’s different between TPU v1 and Cloud TPU v2

The differences between version 1 and 2 mostly seem to be tied to training Machine Learning Models.

TPU v1 didn’t have any real ability to train machine learning (ML) models. It was a relatively dumb (8 bit ALU) chip but if you had say a ML model already created to do something like understand speech, you could load that model into the TPU v1 board and have it be executed very fast. The TPU v1 chip board was also placed on a separate PCIe board (I think), connected to normal x86 CPUs  as sort of a CPU accelerator. The advantage of TPU v1 over GPUs or normal X86 CPUs was mostly in power consumption and speed of ML model execution.

Cloud TPU v2 looks to be a standalone multi-processor device, that’s connected to others via what looks like Ethernet connections. One thing that Google seems to be highlighting is the Cloud TPU’s floating point performance. A Cloud TPU device (board) is capable of 180 TeraFlops (trillion or 10^12 floating point operations per second). A 64 Cloud TPU device pod can theoretically execute 11.5 PetaFlops (10^15 FLops).

TPU v1 had no floating point capabilities whatsoever. So Cloud TPU is intended to speed up the training part of ML models which requires extensive floating point calculations. Presumably, they have also improved the ML model execution processing in Cloud TPU vs. TPU V1 as well. More information on their Cloud TPU chips is available here.

So how do you code a TPU?

Both TPU v1 and Cloud TPU are programmed by Google’s open source TensorFlow. TensorFlow is a set of software libraries to facilitate numerical computation via data flow graph programming.

Apparently with data flow programming you have many nodes and many more connections between them. When a connection is fired between nodes it transfers a multi-dimensional matrix (tensor) to the node. I guess the node takes this multidimensional array does some (floating point) calculations on this data and then determines which of its outgoing connections to fire and how to alter the tensor to send to across those connections.

Apparently, TensorFlow works with X86 servers, GPU chips, TPU v1 or Cloud TPU. Google TensorFlow 1.2.0 is now available. Google says that TensorFlow is in use in over 6000 open source projects. TensorFlow uses Python and 1.2.0 runs on Linux, Mac, & Windows. More information on TensorFlow can be found here.

So where can I get some Cloud TPUs

Google is releasing their new Cloud TPU in the TensorFlow Research Cloud (TFRC). The TFRC has 1000 Cloud TPU devices connected together which can be used by any organization to train machine learning algorithms and execute machine learning algorithms.

I signed up (here) to be an alpha tester. During the signup process the site asked me: what hardware (GPUs, CPUs) and platforms I was currently using to training my ML models; how long does my ML model take to train; how large a training (data) set do I use (ranging from 10GB to >1PB) as well as other ML model oriented questions. I guess there trying to understand what the market requirements are outside of Google’s own use.

Google’s been using more ML and other AI technologies in many of their products and this will no doubt accelerate with the introduction of the Cloud TPU. Making it available to others is an interesting play but this would be one way to amortize the cost of creating the chip. Another way would be to sell the Cloud TPU directly to businesses, government agencies, non government agencies, etc.

I have no real idea what I am going to do with alpha access to the TFRC but I was thinking maybe I could feed it all my blog posts and train a ML model to start writing blog post for me. If anyone has any other ideas, please let me know.


Photo credit(s): From Google’s website on the new Cloud TPU


Disaster recovery from VMware to AWS using Dell EMC Avamar & Data Domain

avI was at Dell EMC World2017 last week and although most of the news was on Dell’s new 14th generation server and Dell-EMC integration progress, Wednesday’s keynote was devoted to storage and non-server infrastructure news.

There was plenty of non-server news but one item that caught my attention was new functionality from Dell EMC Data Protection Division that used Avamar and Data Domain to provide disaster recovery for VMware VMs directly to AWS.

Data Domain (AWS) Cloud DR

Dell EMC Data Domain Cloud DR (DDCDR) is  a new capability that enables DD to backup to AWS S3 object storage and when needed restart the virtual machines within AWS.

DDCDR requires that a customer with Avamar backup and Data Domain (DD) storage install an OVA which deploys an “add-on” to their on-prem Avamar/DD system and install a lightweight VM (Cloud DR server) utility in their AWS domain.

Once the OVA is installed, it will read the changed data and will segment, encrypt, and compress the backup data and then send this and the backup metadata to AWS S3 objects. Avamar/DD policies can be established to control how many daily backup copies are to be saved to S3 object storage. There’s no need for Data Domain or Avamar to run in AWS.

When there’s a problem at the primary data center, an admin can click on a Avamar GUI button and have the Cloud DR server, uncompress, decrypt, rehydrate and restore the backup data into EBS volumes, translate the VMware VM image to an AMI image and then restarts the AMI on an AWS virtual server (EC2) with its data on EBS volume storage. The Cloud DR server will use the backup metadata to select the AWS EC2 instance with the proper CPU and RAM needed to run the application. Once this completes, the VM is running standalone, in an AWS EC2 instance. Presumably, you have to have EC2 and EBS storage volumes resources available under your AWS domain to be able to install the application and restore its data.

For simplicity purposes, the user can control almost all of the required functionality for DDCDR from the Avamar GUI alone. But in case of a site outage, the user can initiate the application DR from a portal supplied by the Cloud DR server utility.

There you have it, simplified, easy to use (AWS) Cloud DR for your VM applications all through Dell EMC Avamar, Data Domain storage and DDCDR. At the moment, it only works with AWS cloud but it’s likely to be available for other public clouds in the near future.


There was much more infrastructure news at Dell EMC World2017. I’ll discuss more details on their new storage offerings in my upcoming Storage Intelligence newsletter, due out the end of this month. If your interested in receiving your own copy of my newsletter, checkout the signup button in the upper right of this page.


[Edits were made for readability and technical accuracy after this post was published. Ed]

The fragility of public cloud IT

I have been reading AntiFragile again (by Nassim Taleb). And although he would probably disagree with my use of his concepts, it appears to me that IT is becoming more fragile, not less.

For example, recent outages at major public cloud providers display increased fragility for IT. Yet these problems, although almost national in scope, seldom deter individual organizations from their migration to the cloud.

Tragedy of the cloud commons

The issues are somewhat similar to the tragedy of the commons. When more and more entities use a common pool of resources, occasionally that common pool can become degraded. But because no-one really owns the common resources no one has any incentive to improve the situation.

Now the public cloud, although certainly a common pool of resources, is also most assuredly owned by corporations. So it’s not a true tragedy of the commons problem. Public cloud corporations have a real incentive to improve their services.

However, the fragility of IT in general, the web, and other electronic/data services all increases as they become more and more reliant on public cloud, common infrastructure. And I would propose this general IT fragility is really not owned by any one person, corporation or organization, let alone the public cloud providers.

Pre-cloud was less fragile, post-cloud more so

In the old days of last century, pre-cloud, if a human screwed up a CLI command the worst they could happen was to take out a corporation’s data services. Nowadays, post-cloud, if a similar human screws up a CLI command, the worst that can happen is that major portions of the internet services of a nation go down.

Strange Clouds by michaelroper (cc) (from Flickr)

Yes, over time, public cloud services have become better at not causing outages, but they aren’t going away. And if anything, better public cloud services just encourages more corporations to use them for more data services, causing any subsequent cloud outage to be more impactful, not less

The Internet was originally designed by DARPA to be more resilient to failures, outages and nuclear attack. But by centralizing IT infrastructure onto public cloud common infrastructure, we are reversing the web’s inherent fault tolerance and causing IT to be more susceptible to failures.

What can be done?

There are certainly things that can be done to improve the situation and make IT less fragile in the short and long run:

  1. Use the cloud for non-essential or temporary data services, that don’t hurt a corporation, organization or nation when outages occur.
  2. Build in fault-tolerance, automatic switchover for public cloud data services to other regions/clouds.
  3. Physically partition public cloud infrastructure into more regions and physically separate infrastructure segments within regions, such that any one admin has limited control over an amount of public cloud infrastructure.
  4. Divide an organizations or nations data services across public cloud infrastructures, across as many regions and segments as possible.
  5. Create a National Public IT Safety Board, not unlike the one for transportation, that does a formal post-mortem of every public cloud outage, proposes fixes, and enforces fix compliance.

The National Public IT Safety Board

The National Transportation Safety Board (NTSB) has worked well for air transportation. It relies on the cooperation of multiple equipment vendors, airlines, countries and other parties. It performs formal post mortems on any air transportation failure. It also enforces any fixes in processes, procedures, training and any other activities on equipment vendors, maintenance services, pilots, airlines and other entities that can impact public air transport safety. At the moment, air transport is probably the safest form of transportation available, and much of this is due to the NTSB

We need something similar for public (cloud) IT services. Yes most public cloud companies are doing this sort of work themselves in isolation, but we have a pressing need to accelerate this process across cloud vendors to improve public IT reliability even faster.

The public cloud is here to stay and if anything will become more encompassing, running more and more of the worlds IT. And as IoT, AI and automation becomes more pervasive, data processes that support these services, which will, no doubt run in the cloud, can impact public safety. Just think of what would happen in the future if an outage occurred in a major cloud provider running the backend for self-guided car algorithms during rush hour.

If the public cloud is to remain (at this point almost inevitable) then the safety and continuous functioning of this infrastructure becomes a public concern. As such, having a National Public IT Safety Board seems like the only way to have some entity own IT’s increased fragility due to  public cloud infrastructure consolidation.


In the meantime, as corporations, government and other entities contemplate migrating data services to the cloud, they should consider the broader impact they are having on the reliability of public IT. When public cloud outages occur, all organizations suffer from the reduced public perception of IT service reliability.

Photo Credits: Fragile by Bart Everson; Fragile Planet by Dave Ginsberg; Strange Clouds by Michael Roper

Docker presents at Cloud Field Day 1 (CFD1)

img_6933Earlier this summer, Docker presented at Cloud Field Day 1 (CFD1) on some of their current technology and upcoming enhancements. (See the video’s here).

As you probably recall, Docker is an implementation of Linux containers which is a way of packaging applications into micro-services that can be built, ship and run across onprem, private and public cloud infrastructure.

Docker containers and Docker Engine

Docker containers combine a base OS image, plus whatever other binaries are needed to run a micro-service into a container which runs ontop of a Docker Engine.  Containers can then be run as a single instance or multiple instances on a Docker Engine.

img_6943Containers are not VMs, they have a fundamentally different architecture. For instance,

  • A VM includes a full OS and App software, it often takes several minutes to boot up and there is a hypervisor underneath it that emulates hardware and other critical services needed to run a VM. But there is no underlying standard OS under the VM layer.
  • A Docker container relies on shared OS resources, which allows for a lighter weight application package using shared resources, which means that instantiation/booting up is much faster, there is no Hypervisor, but a container can run under Linux, Windows or Mac OSs, and containers provide for full stack portability.

In the Docker Hub (srepository for Docker containers) one can find a WordPress container that contains the whole LAMP + WordPress stack in a single container. To run WordPress you would also need a MySQL or compatible database and there’s a MySQL machine container that can be used. You could easily run both the WordPress/LAMP container and the MySQL container in the same Docker Engine, connect the two together and connect the LAMP+Wordpress container to the Internet to fire up a WordPress blog site.

Docker compared VMs to houses and containers to apartments. Docker Engines can run as a VM or on bare metal hardware.

Running Docker containers on desktop, servers and in the cloud

img_6938If you want to experiment with Docker, you can download Docker for Mac or Docker for Windows which can be used install and run a native Docker engine on your desktop.

Windows Server also supports native Docker containers. In VMware one can run Docker containers under vSphere Integrated Containers which supplies Docker API endpoints as standard ESX VMs or you can run Docker containers under Project Photon which is a streamlined, non-ESX hypervisor that also supplies Docker API endpoints.

You can run Docker containers in AWS and Azure as well that integrates with each public cloud’s compute, network and storage services.

Docker Swarm

So you have your Docker engine running, with multiple containers sharing resources and to create an application but your out of compute, storage or networking power on your engine and need to bring on another server or two.  What do you do? With Docker 1.12, you can now use Docker Swarm, which supports multiple Docker Engines.

With Docker Swarm, you have management nodes and worker nodes. Management nodes provide HA services for Docker containers which runs across multiple worker nodes. Worker nodes run Docker Engines with multiple containers.

img_6940Docker Swarms orchestrates the operation of multiple Docker Engines running Docker Services.

A Docker Service is a Docker container running across multiple worker nodes (engines) in a Docker Swarm. Docker services can be run globally (across each worker node) or replicated (some number of Docker Container instances are run across one or more worker nodes). You specify on the Docker Service command which you want and Swarm will insure that the specifications selected are implemented across its worker nodes.

If a worker node goes down, Swarm will detect it and re-start the failed container instances on other worker nodes in the Swarm. Beware, if your container relied on persistent storage, that storage must be also available to all Swarm worker nodes.

Swarm provides a Routing Mesh. When you fire up a container service you can identify a swarm-wide ingress port for a container. Every worker node will listen in on that port to provide a container-aware routing service to route app requests across the Swarm to wherever the containers are currently running.

You can have multiple Swarm management nodes which share the management of the Swarm. Swarm management nodes are either leaders or followers and provide a RAFT consensus model. If the leader node goes down, another management node will take on its leadership role and start managing the Swarm.

There are many other technologies underneath Docker Swarm that are worth a look but suffice it to say it provides a load-balancing, HA service for container execution across multiple engines.

Docker Datacenter

What could possibly be missing? We have Docker Engines that can run multiple containers and Docker Swarms that can run multiple Docker Engines and containers in an HA manner. But we really need something that supports multiple Docker Swarms,  and throw in a private secure Container repository and enterprise support options while you’re at it.

Earlier this year Docker introduced Docker Datacenter, a priced service offering which does just that.  It provides Containers-as-a-Service (CaaS) across multiple Docker Swarms that has commercial support options, a Docker Trusted Repository and integrates it all with enterprise services like LDAP/AD to provide audit logs and other monitoring capabilities for container services execution.

Using Docker Datacenter, developers can have their own multiple development swarms to support engineering activities and ship and store their container images in a secure, private repository and operations can have multiple Swarms which all run the same Docker Container apps in an HA manner.

From an app developer standpoint, it all looks like container instances are running in the same Docker Engine environment across all those implementations. Operations sees a centralized management console (plane) that provides a way to monitor and manage multiple Docker Swarms running everywhere.

Well that’s about it for the update on Docker. There wasn’t much at the sessions on how containers access persistent storage but there’s a Flocker service that offers plugin support for EMC, NetApp and other enterprise SAN storage for Container apps. And there seem to be others out there and available.

You can read/hear more about Docker from these other CFD1 participants:


Full disclosure: Docker gave us a very nice/very long scarf, and two t-shirts decorated with Docker logo and tagline and a number of stickers and pins.

Hitachi and the coming IoT gold rush

img_7137Earlier this week I attended Hitachi Summit 2016 along with a number of other analysts and Hitachi executives where Hitachi discussed their current and ongoing focus on the IoT (Internet of Things) business.

We have discussed IoT before (see QoM1608: The coming IoT tsunami or not, Extremely low power transistors … new IoT applications). Analysts and companies predict  ~200B IoT devices by 2020 (my QoM prediction is 72.1B 0.7 probability). But in any case there’s a lot of IoT activity going to come online, very shortly. Hitachi is already active in IoT and if anything, wants it to grow, significantly.

Hitachi’s current IoT business

Hitachi is uniquely positioned to take on the IoT business over the coming decades, having a number of current businesses in industrial processes, transportation, energy production, water management, etc. Over time, all these industries and more are becoming much more data driven and smarter as IoT rolls out.

Some metrics indicating the scale of Hitachi’s current IoT business, include:

  • Hitachi is #79 in the Fortune Global 500;
  • Hitachi’s generated $5.4B (FY15) in IoT revenue;
  • Hitachi IoT R&D investment is $2.3B (over 3 years);
  • Hitachi has 15K customers Worldwide and 1400+ partners; and
  • Hitachi spends ~$3B in R&D annually and has 119K patents

img_7142Hitachi has been in the OT (Operational [industrial] Technology) business for over a century now. Hitachi has also had a very successful and ongoing IT business (Hitachi Data Systems) for decades now.  Their main competitors in this IoT business are GE and Siemans but neither have the extensive history in IT that Hitachi has had. But both are working hard to catchup.

Hitachi Rail-as-a-Service

img_7152For one example of what Hitachi is doing in IoT, they have recently won a 27.5 year Rail-as-a-Service contract to upgrade, ticket, maintain and manage all new trains for UK Rail.  This entails upgrading all train rolling stock, provide upgraded rail signaling, traffic management systems, depot and station equipment and ticketing services for all of UK Rail.

img_7153The success and profitability of this Hitachi service offering hinges on their ability to provide more cost efficient rail transport. A key capability they plan to deliver is predictive maintenance.

Today, in UK and most other major rail systems, train high availability is often supplied by using spare rolling stock, that’s pre-positioned and available to call into service, when needed. With Hitachi’s new predictive maintenance capabilities, the plan is to reduce, if not totally eliminate the need for spare rolling stock inventory and keep the new trains running 7X24.

img_7145Hitachi said their new trains capture 48K data items and generate over ~25GB/train/day. All this data, will be fed into their new Hitachi Insight Group Lumada platform which includes Pentaho, HSDP (Hitachi Streaming Data Platform) and their Content Analytics to analyze train data and determine how best to keep the trains running. Behind all this analytical power will no doubt be HDS HCP object store used to keep track of all the train sensor data and other information, Hitachi UCP servers to process it all, and other Hitachi software and hardware to glue it all together.

The new trains and services will be rolled out over time, but there’s a pretty impressive time table. For instance, Hitachi will add 120 new high speed trains to UK Rail by 2018.  About the only thing that Hitachi is not directly responsible for in this Rail-as-a-Service offering, is the communications network for the trains.

Hitachi other IoT offerings

Hitachi is actively seeking other customers for their Rail-as-a-service IoT service offering. But it doesn’t stop there, they would like to offer smart-water-as-a-service, smart-city-as-a-service, digital-energy-as-a-service, etc.

There’s almost nothing that Hitachi currently supplies as industrial products that they wouldn’t consider offering in an X-as-a-service solution. With HDS Lumada Analytics, HCP and HDS storage systems, Hitachi UCP converged infrastructure, Hitachi industrial products, and Hitachi consulting services, together they are primed to take over the IoT-industrial products/services market.

Welcome to the new Hitachi IoT world.


Blockchains at IBM

img_6985-2I attended IBM Edge 2016 (videos available here, login required) this past week and there was a lot of talk about their new blockchain service available on z Systems (LinuxONE).

IBM’s blockchain software/service  is based on the open source, Open Ledger, HyperLedger project.

Blockchains explained

1003163361_ba156d12f7We have discussed blockchain before (see my post on BlockStack). Blockchains can be used to implement an immutable ledger useful for smart contracts, electronic asset tracking, secured financial transactions, etc.

BlockStack was being used to implement Private Key Infrastructure and to implement a worldwide, distributed file system.

IBM’s Blockchain-as-a-service offering has a plugin based consensus that can use super majority rules (2/3+1 of members of a blockchain must agree to ledger contents) or can use consensus based on parties to a transaction (e.g. supplier and user of a component).

BitCoin (an early form of blockchain) consensus used data miners (performing hard cryptographic calculations) to determine the shared state of a ledger.

There can be any number of blockchains in existence at any one time. Microsoft Azure also offers Blockchain as a service.

The potential for blockchains are enormous and very disruptive to middlemen everywhere. Anywhere ledgers are used to keep track of assets, information, money, etc, that undergo transformations, transitions or transactions as they are further refined, produced and change hands, can be easily tracked in blockchains.  The only question is can these assets, information, currency, etc. be digitally fingerprinted and can that fingerprint be read/verified. If such is the case, then blockchains can be used to track them.

New uses for Blockchain

img_6995IBM showed a demo of their new supply chain management service based on z Systems blockchain in action.  IBM component suppliers record when they shipped component(s), shippers would record when they received the component(s), port authorities would record when components arrived at port, shippers would record when parts cleared customs and when they arrived at IBM facilities. Not sure if each of these transitions were recorded, but there were a number of records for each component shipment from supplier to IBM warehouse. This service is live and being used by IBM and its component suppliers right now.

Leanne Kemp, CEO Everledger, presented another example at IBM Edge (presumably built on z Systems Hyperledger service) used to track diamonds from mining, to cutter, to polishing, to wholesaler, to retailer, to purchaser, and beyond. Apparently the diamonds have a digital bar code/fingerprint/signature that’s imprinted microscopically on the diamond during processing and can be used to track diamonds throughout processing chain, all the way to end-user. This diamond blockchain is used for fraud detection, verification of ownership and digitally certify that the diamond was produced in accordance of the Kimberley Process.

Everledger can also be used to track any other asset that can be digitally fingerprinted as they flow from creation, to factory, to wholesaler, to retailer, to customer and after purchase.

Why z System blockchains

What makes z Systems a great way to implement blockchains is its securely, isolated partitioning and advanced cryptographic capabilities such as z System functionality accelerated hashing, signing & securing and hardware based encryption to speed up blockchain processing.  z Systems also has FIPS-140 level 4 certification which can provide the highest security possible for blockchain and other security based operations.

From IBM’s perspective blockchains speak to the advantages of the mainframe environments. Blockchains are compute intensive, they require sophisticated cryptographic services and represent formal systems of record, all traditional strengths of z Systems.

Aside from the service offering, IBM has made numerous contributions to the Hyperledger project. I assume one could just download the z Systems code and run it on any LinuxONE processing environment you want. Also, since Hyperledger is Linux based, it could just as easily run in any OpenPower server running an appropriate version of Linux.

Blockchains will be used to maintain the system of record of the future just like mainframes maintained the systems of record of today and the past.



Scality’s Open Source S3 Driver

The view from Scality’s conference room

We were at Scality last week for Cloud Field Day 1 (CFD1) and one of the items they discussed was their open source S3 driver. (Videos available here).

Scality was on the 25th floor of a downtown San Francisco office tower. And the view outside the conference room was great. Giorgio Regni, CTO, Scality, said on the two days a year it wasn’t foggy out, you could even see Golden Gate Bridge from their conference room.


img_6912As you may recall, Scality is an object storage solution that came out of the telecom, consumer networking industry to provide Google/Facebook like storage services to other customers.

Scality RING is a software defined object storage that supports a full complement of interface legacy and advanced protocols including, NFS, CIGS/SMB, Linux FUSE, RESTful native, SWIFT, CDMI and Amazon Web Services (AWS) S3. Scality also supports replication and erasure coding based on object size.

RING 6.0 brings AWS IAM style authentication to Scality object storage. Scality pricing is based on usable storage and you bring your own hardware.

Giorgio also gave a session on the RING’s durability (reliability) which showed they support 13-9’s data availability. He flashed up the math on this but it was too fast for me to take down:)

Scality has been on the market since 2010 and has been having a lot of success lately, having grown 150% in revenue this past year. In the media and entertainment space, Scality has won a lot of business with their S3 support. But their other interface protocols are also very popular.

Why S3?

It looks as if AWS S3 is becoming the defacto standard for object storage. AWS S3 is the largest current repository of objects. As such, other vendors and solution providers now offer support for S3 services whenever they need an object/bulk storage tier behind their appliances/applications/solutions.

This has driven every object storage vendor to also offer S3 “compatible” services to entice these users to move to their object storage solution. In essence, the object storage industry, like it or not, is standardizing on S3 because everyone is using it.

But how can you tell if a vendor’s S3 solution is any good. You could always try it out to see if it worked properly with your S3 application, but that involves a lot of heavy lifting.

However, there is another way. Take an S3 Driver and run your application against that. Assuming your vendor supports all the functionality used in the S3 Driver, it should all work with the real object storage solution.

Open source S3 driver

img_6916Scality open sourced their S3 driver just to make this process easier. Now, one could just download their S3server driver (available from Scality’s GitHub) and start it up.

Scality’s S3 driver runs ontop of a Docker Engine so to run it on your desktop you would need to install Docker Toolbox for older Mac or Windows systems or run Docker for Mac or Docker for Windows for newer systems. (We also talked with Docker at CFD1).

img_6933Firing up the S3server on my Mac

I used Docker for Mac but I assume the terminal CLI is the same for both.Downloading and installing Docker for Mac was pretty straightforward.  Starting it up took just a double click on the Docker application, which generates a toolbar Docker icon. You do need to enter your login password to run Docker for Mac but once that was done, you have Docker running on your Mac.

Open up a terminal window and you have the full Docker CLI at your disposal. You can download the latest S3 Server from Scality’s Docker hub by executing  a pull command (docker pull scality/s3server), to fire it up, you need to define a new container (docker run -d –name s3server -p 8000:8000 scality/s3server) and then start it (docker start s3server).

It’s that simple to have a S3server running on your Mac. The toolbox approach for older Mac’s and PC’S is a bit more complicated but seems simple enough.

The data is stored in the container and persists until you stop/delete the container. However, there’s an option to store the data elsewhere as well.

I tried to use CyberDuck to load some objects into my Mac’s S3server but couldn’t get it to connect properly. I wrote up a ticket to the S3server community. It seemed to be talking to the right port, but maybe I needed to do an S3cmd to initialize the bucket first – I think.

[Update 2016Sep19: Turns out the S3 server getting started doc said you should download an S3 profile for Cyberduck. I didn’t do that originally because I had already been using S3 with Cyberduck. But did that just now and it now works just like it’s supposed to. My mistake]


Anyways, it all seemed pretty straight forward to run S3server on my Mac. If I was an application developer, it would make a lot of sense to try S3 this way before I did anything on the real AWS S3. And some day, when I grew tired of paying AWS, I could always migrate to Scality RING S3 object storage – or at least that’s the idea.


#VMworld day 1, Cloud Foundation and Cross-Cloud Services

The main keynote topic for today at VMworld was how to address the coming cloud tsunami. Pat citing his own researchers believes that 50% of all workloads (OS instances) will be running in public and private cloud by 2021 and by 2030, 50% of all workloads will be running in the Public Cloud alone. So today VMware announced two new offerings: VMware Cloud Foundation and VMware Cross-Cloud Services.

Cloud Foundation

Cloud Foundation appears to be a bundling of VMware’s SDDC, NSX®, Virtual SAN™ (VSAN) and vSphere® solutions, into a single, integrated stack/package that can be sold and licensed together. No pricing was provided at the show but essentially VMware want’s to allow customers a simple way to deploy a VMware private cloud.

VMware states that Cloud Foundation offers customers up to 6-8X faster cloud deployment at a TCO savings of >40%.

VMware also announced a joint partnership with IBM to sell Cloud Foundation services residing on the IBM Cloud to their customer base. This broaden’s the availability of VMware cloud service offerings beyond vCloud and on premises Cloud Foundation environments.

Cross-Cloud Services

IMG_6819Everyone wants to minimize cloud vendor lockin but that’s not possible today except in a few special cases (NetApp Private Storage and similar capabilities from other vendors, cloud storage gateway services, cloud archive services, etc.).

VMware Cross-Cloud Services is the next step down this path, attempting to provide easier workload/data migration, consolidated cost and workload management and security deployment across the public and private cloud boundaries.

Cross-Cloud Services was in tech preview at the show but it’s intended to make use of standard public cloud defined APIs to provide specialized targeted services to allow better cross-cloud migration and management.

The tech preview showed VMware Cross-Cloud Services deploying an NSX gateway in AWS which allowed NSX to control public cloud IP addresses and then once that was done, one could apply security templates to deploy network encryption between apps and its services. VMware used a sniffer to show the before plain text traffic and the after with encrypted traffic, all done in a matter of minutes. They also showed cost trending information for workloads running across the private and public cloud.

Next they showed a demo (movie) of VMware migrating/cloning a simple app to other public and private cloud environments. They had a public cloud Unicycle IOT app running in Ireland/AWS (I think) with a three tier (web, app, database) app structure/instances and then migrated/cloned that single site 3-tier app to be deployed across multiple cloud (web and app tiers) sites with a single database instance running in a private cloud.

I started thinking this is getting us down the path towards cloud virtualization but in the end, it’s much more targeted services, which run in instances/gateways in the public and private cloud to do very specific migration or management activities. Nonetheless a great first step towards more flexible cross-cloud deployment and management.

VMworld Day 2 looks to be more on current products and enhancements, stay tuned.