Hadoop to complement or replace data warehouse?
As the market for enterprise Hadoop heats up, the battle lines between two suppliers - Cloudera and
Hortonworks - become more clearly defined with every week that passes.
For these two suppliers, there is everything to fight for and much at stake in a battle that is being
played out on several different fronts. For example, there is the issue of having the financial backing
that a pre-IPO start-up needs to persuade enterprise customers that it is a safe bet. Cloudera seems
to have won that tussle, with the late March 2014 announcement of $740m in financing from
chipmaker Intel.
On the matter of customer acquisition, six-year-old Cloudera probably has a slight lead over three-
year-old Hortonworks, but only just. Analysts estimate Cloudera's base of paying subscribers at
around 350, while Hortonworks' CEO Rob Bearden says his company has acquired 250 customers
over the past five quarters.
But the most significant point of disagreement between Cloudera and Hortonworks lies in their
answers to a single question - and the one that, arguably, matters most to enterprise customers:
should Hadoop complement or replace traditional enterprise data warehouse (EDW) investments?
At Hortonworks, vice-president of marketing David McJannet says it is the former: Hadoop is a
valuable addition to existing analytic technologies. "A unique aspect of our approach is the fact that
we're not trying to compete with the data warehousing incumbents," he says. "This is a pivotal
philosophical difference we have with other folk in this market.
"There is one [supplier] in the market that will tell you, 'Just throw out Teradata and stick it all in
Hadoop', but it's just not realistic to do that."
And who would offer that advice to customers? "Cloudera," McJannet answers, flatly. "That's their
leading message: that the data warehouse is dead."
At Cloudera, chief strategy officer Mike Olson tells a slightly different story - but his broad message
is that, over time, many analytic workloads will move out of the EDW and into Hadoop.
He points out that Hortonworks has a close go-to-market partnership with data warehousing
company Teradata, making it impossible for the company to present Hadoop as an EDW competitor.
Even McJannet acknowledges that Hortonworks' reseller agreements with Teradata, Microsoft, SAP
and Hewlett-Packard mean that a combined 1,000 extra enterprise software sales executives are out
there in the market, selling on Hortonworks' behalf.
But Olson's remarks reveal a strong inclination to position the EDW as a vulnerable "older
technology" whose days are numbered.
"Everyone wants to know if Cloudera is after Teradata," he says. "The fact of the matter is that the
opportunity here is not to knock over old guys and steal their wallets. The opportunity is to monetise
vast amounts of data using tools that weren't previously available."
[...] or Hadoop.
"Enterprises pay for an enterprise data warehouse because they need the superior performance it
provides, but they either end up putting a lot of cold data on a very expensive platform, or leaving
potentially valuable data on the cutting-room floor," he says. "Hadoop may offer low-cost storage for
data, but it simply isn't fast enough to replace EDW right now. It can't hold a candle to Teradata, for
example, in terms of performance, whatever the Hadoop [suppliers] say."
But Gualtieri insists Hadoop is a threat to the EDW, even if it does have a long way to go,
particularly in terms of hardware performance (and this is where Cloudera's funding from
Intel could reap the biggest benefits for suppliers).
SQL on Hadoop
For enterprise customers, the area to watch is the development of tools that enable organisations to
run SQL queries against Hadoop, says Gualtieri. Here, Cloudera offers Impala, while Hortonworks'
Stinger project, now complete, has seen engineers work to improve the performance of the Apache
Hive tool.
"SQL on Hadoop is the hottest area of Hadoop innovation right now," he adds. "Most organisations
analyse only about 12% of the data they hold, so the race is on to get faster, interactive SQL working
well on Hadoop."
In fact, SQL on Hadoop may be the most important deciding factor in Hadoop's progress from now
on. Not only does it give suppliers a chance to differentiate by developing the fastest, easiest-to-use
SQL tools for Hadoop, but it may also unlock a new range of use cases for enterprise customers yet
to make a move on Hadoop.
And in the process, it may also pose the biggest threat to the data warehouse by enabling those
customers to move a significant share of data and query loads over to Hadoop and away from their
venerable EDW.
This was first published in April 2014