We saw a recent article in Blocks and Files (Storage facing trillion-row db apocalypse), about a couple of companies which were trying to deal with trillion row database queries without taking weeks to respond. One of those companies was Ocient (@Ocient), a Chicago startup, whose CEO and Co-Founder, Chris Gladwin, was an old friend from CleverSafe (now IBM Cloud Object Storage).
Chris and team have been busy creating a new way to perform data analytics on massive data lakes. It’s has a lot to do with extreme parallelism, high core counts, NVMe SSDs, and sophisticated network and compute flow control. Listen to the podcast to learn more.
The key to Ocient’s approach involves NVMe SSDs which have become ubiquitous over the last couple of years which can be deployed to deal with large data problems. Another key to Ocient is multi-core CPUs, which again seem everwhere and if anything, are almost doubling with every new generation of CPU chip.
We let Chris wax a little too long on the SSD revolution in IOPs, especially as pertains too random 4K reads. Put a 20 or so NVMe SSDs in a server with dual 50 core CPU chips and you have one fast random IO machine.
Another key to Ocient is very sophisticated network and bus data flow management. With all this data running any query on it, involves consuming lots of data that all has to be brought into the CPU. PCIe bandwidth helps, as does NVMe SSDs, but you still need to insure that nothing gets bottlenecked moving all that data around a system/server.
Yet another key to Ocient is parallelism. With one 20 NVMe SSD server and 2-50 core CPUs you’ve got a lot of capability but when you are talking about trillion row databases you need more. So in order to respond to queries in anything a second or so, they throw a lot of NVMe servers at the problem.
I asked how they split the data across all these servers and Chris mentioned that at the moment that’s part of their secret sauce and involves professional services.
Ocient supports full ANSI SQL queries against trillion row databases and replies to those queries in a matter of seconds. And we aren’t just talking about SQL selects, Ocient can do splits, joins and updates to this trillion row database at the same time as the SQL select are going on. Chris mentioned that Ocient can be loading 100K JSON files each second, while still performing SQL queries in near real time against the trillion row database.
Ocient supports Reed-Solomon error correction on database data as well as data compression and encryption.
In addition to SQL queries, Chris mentioned that Ocient supports data load and transform activities. He said that most of this data is being generated from IoT applications and often needs to be cleaned up before it can be processed. Doing this in real time, while handling queries to the database is part of their secret sauce.
Chris said there’s probably not that many organizations that have need for trillion row databases. But ad auctions, telecom routers, financial services already use trillion row databases and they all want to be able to process queries faster on this data. Ocient is betting that there will be plenty more like this over time.
Ocient is available on AWS and GCP as a cloud service, can also be used operating in their own Ocient Cloud or can be deployed on premises. Ocient services are billed on a per core pack (500 cores, I think) subscription model.
Chris Gladwin, CEO and Co-founder, Ocient
Chris is the CEO and Co-Founder of Ocient whose mission is to provide the leading platform the world uses to transform, store, and analyze its largest datasets.
In 2004, Chris founded Cleversafe which became the largest and most strategic object storage vendor in the world (according to IDC.) He raised $100M and then led the company to over a $1.3B exit in 2015 when IBM acquired the company. The technology Cleversafe created is used by most people in the U.S. every day and generated over 1,000 patents granted or filed, creating one of the ten most powerful patent portfolios in the world.
Prior to Cleversafe, Chris was the Founding CEO of startups MusicNow and Cruise Technologies and led product strategy for Zenith Data Systems. He started his career at Lockheed Martin as a database programmer and holds an engineering degree from MIT.