Real-time data analytics from customer interactions

the ghosts in the machine by MelvinSchlubman (cc) (From Flickr)

At a recent EMC product launch in New York, there was a customer question and answer session for industry analysts with four of EMC’s leading edge customers. One customer, Marco Pacelli, was the CEO of ClickFox, a company providing real-time data analytics to retailers, telecoms, banks and other high transaction volume companies.

Interactions vs. transactions

Marco was very interesting, mostly because at first I didn’t understand what his company was doing or how they were doing it.  He made the statement that for every transaction (customer activity that generates revenue) companies encounter, and there are millions of them, there can be literally 10 to 100 distinct customer interactions.  And it’s the information in these interactions that can most help companies maximize transaction revenue, volume and/or throughput.

Tracking and tracing through all these interactions in real time, to try to make sense of the customer interaction sphere, is a new and emerging discipline.  Apparently, ClickFox makes extensive use of GreenPlum, one of EMC’s recent acquisitions, to do all this, but I was more interested in what they were trying to achieve than in the products used to accomplish it.

Banking interactions

For example, it seems that the websites, bank tellers, ATM machines and the myriad of other devices one uses to interact with a bank are all capable of recording any interaction or action we perform. What ClickFox seems to do is track customer interactions across all these mechanisms to trace what transpired that led to any transaction, and determine how it can be done better. The fact that most banking interactions are authenticated to one account, regardless of origin, makes tracking interactions across all facets of customer activity possible.

By doing this, ClickFox can tell companies how to generate more transactions, faster.  If a bank can somehow change their interactions with a customer across websites, bank tellers, ATM machines, phone banking and any other touchpoint, so that more transactions can be done with less trouble, it can be worth lots of money.
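The core idea, as I understand it, can be sketched in a few lines: group every interaction event by account and order it in time, so the full cross-channel path that led to a transaction can be reconstructed. (The event schema, channel names and account IDs below are purely illustrative, not ClickFox’s actual data model.)

```python
from collections import defaultdict

# Hypothetical interaction events: (account_id, channel, action, timestamp).
# Channel and action names are made up for illustration.
events = [
    ("acct42", "website", "login", 1),
    ("acct42", "website", "view_rates", 2),
    ("acct42", "phone", "ask_about_cd", 3),
    ("acct42", "branch", "open_cd", 4),      # the revenue-generating transaction
    ("acct99", "atm", "check_balance", 1),
]

# Because banking interactions authenticate to one account regardless of
# origin, grouping by account reconstructs the cross-channel path that
# preceded each transaction.
paths = defaultdict(list)
for account, channel, action, ts in sorted(events, key=lambda e: e[3]):
    paths[account].append((channel, action))

print(paths["acct42"])
# [('website', 'login'), ('website', 'view_rates'),
#  ('phone', 'ask_about_cd'), ('branch', 'open_cd')]
```

With paths like these in hand, an analyst can ask which interaction sequences most often end in a transaction, and which ones stall out.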

How all that data is aggregated and sent offsite or processed onsite is yet another side to this problem but ClickFox is able to do all this with the help of GreenPlum database appliances.  Moreover, ClickFox can host interaction data and perform analytics at their own secure site(s) or perform their analysis on customer premises depending on company preference.


Marco’s closing comments were something like this: the days of offloading information to a data warehouse, asking a question and waiting weeks for an answer are over; the time when a company can optimize its customer interactions using data just gathered, across every touchpoint it supports, is upon us.

How all this works for non-authenticated interactions was another mystery to me.  Marco indicated in later discussions that it was possible to identify patterns of behavior that led to transactions and that this could be used instead to help trace customer interactions across company touchpoints for similar types of analyses!?  Sounds like AI on top of database machines…
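If I had to guess at the mechanics, pattern-based tracing might look something like the sketch below: given a sequence of actions that historically precedes a transaction, check whether an anonymous session contains that sequence in order. (The pattern, action names and matching approach are my own assumptions, not anything Marco described in detail.)

```python
# A hypothetical behavior pattern that historically precedes a transaction.
pattern = ["view_rates", "ask_about_cd", "open_cd"]

def contains_pattern(session, pattern):
    """True if the session's actions contain the pattern as an
    in-order (not necessarily contiguous) subsequence."""
    it = iter(session)
    # Each `step in it` call advances the iterator past the match,
    # so later steps must occur after earlier ones.
    return all(step in it for step in pattern)

anon_session = ["login", "view_rates", "check_balance", "ask_about_cd", "open_cd"]
print(contains_pattern(anon_session, pattern))  # True
```

A real system would presumably mine these patterns statistically rather than hand-code them, which is where the “AI on top of database machines” flavor comes in.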


Database appliances!?

The Sun Oracle Database Machine by Oracle OpenWorld San Francisco 2009 (cc) (from Flickr)

Was talking with Oracle the other day and discussing their Exadata database system.  They have achieved a lot of success with this product.  All of which got me to wondering whether database specific storage ever makes sense.  I suppose the ultimate arbiter of “making sense” is commercial viability and Oracle and others have certainly proven this, but from a technologist perspective I still wonder.

As I understand it, the Exadata system combines database servers and storage servers in one rack (with extensions to other racks).  They use an InfiniBand interconnect between the database and storage servers and a proprietary storage access protocol between the two.

With their proprietary protocol they can provide hints to the storage servers about what’s coming next and how to manage the database data, which makes the Exadata system a screamer of a database machine.  Such hints can speed up database query processing, store database structures more efficiently, and generally accelerate Oracle database activity.  Given all that, it makes sense to a lot of customers.
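The payoff of such a protocol is easy to illustrate in miniature: if the database server can push a filter down to the storage server, only matching rows cross the interconnect instead of every block. (This is a toy sketch of the general idea, not Oracle’s actual wire protocol or API.)

```python
# Hypothetical rows sitting on a storage server.
storage_rows = [
    {"id": 1, "region": "east", "amount": 100},
    {"id": 2, "region": "west", "amount": 250},
    {"id": 3, "region": "east", "amount": 75},
]

def storage_server_scan(rows, predicate=None):
    """Without a hint, every row is shipped to the database server;
    with one, the storage server filters locally and ships less data."""
    return [r for r in rows if predicate is None or predicate(r)]

# Dumb storage: the database server receives everything and filters itself.
shipped_all = storage_server_scan(storage_rows)
# Smart storage: the predicate travels with the scan request.
shipped_hint = storage_server_scan(storage_rows, lambda r: r["region"] == "east")

print(len(shipped_all), len(shipped_hint))  # 3 2
```

Scale that difference up to terabyte scans over InfiniBand and the “screamer” reputation follows naturally.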

Now, there are other systems which compete with Exadata like Teradata and Netezza (am I missing anyone?) that also support onboard database servers and storage servers.  I don’t know much about these products but they all seem targeted at data warehousing and analytics applications similar to Exadata but perhaps more specialized.

  • As far as I can tell Teradata has been around for years since they were spun out of NCR (or AT&T) and have enjoyed tremendous success.  The last annual report I can find for them shows their ’09 revenue around $772M with net income $254M.
  • Netezza started in 2000 and seems to be doing OK in the database appliance market given their youth.  Their last annual report for ’10 showed revenue of ~$191M and net income of $4.2M.  Perhaps not doing as well as Teradata but certainly commercially viable.

The only reason database appliances or machines exist is to speed up database processing.  If they can do that then they seem able to build a market place for themselves.

Database to storage interface standards

The key question from a storage analyst perspective is: shouldn’t there be some sort of standards committee, like SNIA or others, that works to define a standard protocol between database servers and storage that could be adopted by other storage vendors?  I understand the advantage that proprietary interfaces can supply to an enlightened vendor’s equipment, but there are more database vendors out there than just Oracle, Teradata and Netezza, and there are (at least for the moment) many more storage vendors out there as well.

A decade or so ago, when I was with another storage company, we created a proprietary interface for backup activity.  It sold OK, but in the end it didn’t sell enough to be worthwhile for either the backup or the storage company to continue the approach.  At the time we were looking to support another proprietary interface for sorting but couldn’t seem to justify it.

Proprietary interfaces tend to lock customers in, and most customers will only accept lock-in if there is a significant advantage to your functionality.  But customer lock-in can lull vendors into not investing R&D funding in the latest technology, and over time this effect will cause the vendor to lose any advantage they previously enjoyed.

It seems to me that the more successful companies (with the possible exception of Apple) tend to focus on opening up their interfaces rather than closing them down.  By doing so they introduce more competition which serves their customers better, in the long run.

I am not saying that if Oracle were to standardize/publicize their database server to storage server interface there would be a lot of storage vendors going after that market.  But the high revenues in this market, as evidenced by Teradata and Netezza, would certainly interest a few select storage vendors.  Now, not all of Teradata’s or Netezza’s revenues derive from pure storage sales, but I would wager a significant part does.

Nevertheless, a standard database storage protocol could readily be defined by existing database vendors in conjunction with SNIA.  Once defined, I believe some storage vendors would adopt this protocol along with every other storage protocol (iSCSI, FCoE, FC, FCIP, CIFS, NFS, etc.). Once that occurs, customers across the board would benefit from the increased competition and break away from the current customer lock-in with today’s database appliances.

Any significant success in the market from early storage vendor adopters of this protocol would certainly interest other vendors, inducing a cycle of increased adoption, higher competition, and better functionality.  In the end, database customers worldwide would benefit from the increased price performance available in the open market.  And that makes a lot more sense to me than the database appliances of today.

As to why Apple has excelled within a closed system environment, that will need to wait for a future post.