Faded Hadoop

Hadoop Exits

Looking back at the early days of Progressive Gauge, and starting out with some thoughts on Hadoop: SiliconANGLE’s Paul Gillin got it right in the title of his report, The Sun Sets on the Big Data Era.

His impetus was the news of Hewlett Packard Enterprise’s acquisition of MapR’s business assets. One of the original Big 3 independent Hadoop distro providers, MapR fell into HPE’s hands under fire-sale circumstances. Seldom have technology chapters closed so conclusively.

The MapR folks were triers, a good lot – but always in the shadow of front runners Cloudera and Hortonworks.

The MapR managers were early to see the limitations of the Hadoop Distributed File System – but didn’t pick up immediately on MapReduce’s shortcomings, or else they would have picked another name for the company. The original Hadoop comprised MapReduce and HDFS, and then it morphed as time went by. MapReduce’s steady backtracking was never a positive for “MapR.”

MapR was earlier than the others to start building out container versions of its products but, like its distro brethren, it began to face strong competition in the form of big data offerings from leading cloud providers.

As 2019 dawned, MapR mis-gauged the obstacles it faced in the market and the fact that the VC spigot could be turned off.

While Hortonworks and Cloudera had now merged under a single Cloudera umbrella, the benefit for MapR was slight. Moving up from three to two in the Hadoop independents’ market was not of much value, when the leading Cloudera-Hortonworks combo was revamping projections, saying goodbye to its CEO, and obviously struggling too. It was a perfect storm for Hadoop obit writers.

As one who covered the segment for over five years, I feel more than a little melancholy about this – the only surprise being the rapidity of Hadoop’s downfall. The seeds were there in the first western Hadoop Summit I attended. There, a skeptical wag told me there was no clear future for Hadoop east of Berkeley.

People were looking for signs that large-volume distributed processing could successfully spread to enterprises. Why couldn’t the value Hadoop and its kin showed in the big coastal data centers of Yahoo, Google and Facebook be replicated in other organizations? It’s a question still being asked.

But the fact was that Hadoop big data architectures consistently faced big hurdles:

*Prototypes were one thing, but production implementations were usually a bridge too far.
*Hadoop in the data center met the tsunami of the capex-inspired flight to the cloud.
*SQL-style analytics were not out-of-the-box with Hadoop; the assembly required was daunting.
*Whatever shortcomings Hadoop had could be fixed with new Hadoop ecosystem components; but the endless parade of such components became numbing.
*As “big data” morphed into “machine learning,” the cavalcade of component changes proved exhausting.

With leading lights of Hadoop making headlines for the wrong reasons, now’s the time to find an alternative name for whatever promise Hadoop betokens. So, what is a likely outcome for MapR within HPE?

The first order of business will be calming the waters. Uncertainty has been the norm since May, when MapR’s CEO dutifully alerted the State of California’s employment board to the company’s shortfall and to the fact that it had been seeking a buyer for some time. At the same time, the MapR culture is a technical one, which should augur a ready fit with HPE.

While it is clear the company was not able to grow its user base as it haphazardly shifted its sales organization last year, there are indications it still had long-time users on board. Hello, HPE upsell!

The bad news is HPE has cloud troubles of its own, which it is seeking to address with a hybrid/multicloud architecture known as GreenLake. MapR’s platform could become part of the HPE GreenLake ecosystem, HPE Primera and HPE InfoSight, or other uber HPE architectures to be named later.

Going back to 2017, MapR had a partnership with BlueData, one of the earliest vendors to run with Hadoop container and microservices technology for large-scale machine learning and analytics. BlueData was bought by HPE at the end of 2018, and it seems to form some of the underpinnings of HPE InfoSight. So, expect MapR and BlueData to form the basis for HPE big data going forward.

But, take caution – despite its vaunted technical credentials, HPE’s predecessor H-P organization historically had a bit of difficulty with most things categorized as “software.”

For users, the big data saga is another one for the books. New built-for-purpose engines arise, such as Hadoop, and then not-so-gradually they add bells and whistles, until they are more than tools; they become full-fledged ways of life. That is not all for the best.

Place the enterprise data warehouse, the single-view-of-the-customer system and SOA in the same line of overreaching architectures. It’s been famously said that software is eating the world — perhaps the key is bite-sized portions, instead of full courses.

– Jack Vaughan