Over a decade ago a new open source project called Hadoop came to life and started the era of Big Data Analytics. It was a novel project with a scale-out, shared-nothing distributed architecture, which promised to handle the big data challenges not manageable by the standard relational databases of the time. However, it was notoriously hard to install and operate in production.
Two brands became synonymous with commercial Hadoop – Cloudera and Hortonworks – with the promise of helping enterprises leverage Hadoop. These companies, built by talent from the likes of Google, Facebook and Yahoo, grew into impressive public companies: their combined equity value after the merger is estimated at $5.2 billion (give or take the stock price changes).
But while they were battling it out (together with MapR and some others), the Big Data market changed drastically, and the resulting challenges have driven these arch-rivals to join forces. Cloudera and Hortonworks announced a “merger of equals”, though on closer inspection it looks more like a “first among equals” arrangement: Cloudera’s stockholders will own a 60% stake, and the new company will be called “Cloudera Inc.”.
From on-prem to the cloud
In the age of Cloud Computing, enterprises are no longer inclined to build enormous Hadoop clusters in their data centers to crunch the data (especially with Hadoop’s high upfront storage costs and painful upgrades). And so the traditional players found themselves increasingly competing against the public cloud giants Amazon, Google and Microsoft and their big data services. In Gartner’s 2018 Magic Quadrant for Data Management Solutions for Analytics, the main cloud vendors shine in the Leaders quadrant (with Google a hairline away), while Hortonworks, Cloudera and MapR lag behind in the Niche Players quadrant.
From data to insights
Since the emergence of Hadoop, there’s been a big (data) bang which spawned a range of tools, platforms and services aimed at different aspects of analytics, from data lakes to data warehouses, from batch to stream, from time series to graphs and from structured to unstructured. But most disruptive are the data science driven methods of artificial intelligence and machine learning. These shuffled the competitive landscape, introducing many specialized vendors such as KNIME and H2O.ai (both open source), which lead Gartner’s 2018 Magic Quadrant for Data Science and Machine Learning Platforms and keep pushing back even established players such as IBM, Microsoft, SAS and Teradata. Hortonworks and Cloudera are not even on that chart.
From the data center to the edge with IoT
The Internet of Things (IoT) brings a massive surge of data streaming from dispersed locations, ranging from connected cars to industrial automation. This calls for different big data analytics solutions to meet regulatory, security, performance, and geo-location needs, combining both cloud and edge computing. You can read more on that in this blog post.
Joining forces towards the next chapter of big data analytics
Even before the merger, Hortonworks and Cloudera made some moves in the right direction to adapt their strategies to these market changes: they formed partnerships with Amazon, Google, IBM and Microsoft and started operating on their clouds, while maintaining a hybrid cloud differentiation; and they launched advanced analytics and IoT offerings such as Hortonworks DataFlow for streaming and IoT workloads, and Cloudera Data Science Workbench.
But it will require more than these individual efforts to keep them competitive. Therefore, joining forces is an important strategic move. With their combined resources and the strong open source DNA shared by both companies, they will be well positioned to take a leadership role in the next chapter of big data analytics.
Follow Horovits on Twitter!