Making Sense of NoSQL and Big Data Amidst High Expectations
Filed in Cloud Industry Insights by Gerardo Dada | September 6, 2012 3:30 pm

In the last six months there has been a dramatic increase in interest in NoSQL and Big Data. You have probably heard that “NoSQL is the future of databases,” or that “Big Data is a key technology that will allow businesses to get much smarter.”

Analysts are bold in their predictions. Gartner, for example, predicts that “Big Data will deliver transformational benefits to enterprises within two to five years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric.”

In the same report, Gartner places Big Data near the “Peak of Inflated Expectations” in the hype cycle, which can be defined as a phase that generates high amounts of enthusiasm and unrealistic expectations (i.e. what most people would call a buzzword). Given the current hype, it is useful to take a step back, understand where these technologies can be useful and try to distinguish hype from reality.

One aspect of the vision for Big Data is related to business intelligence applications, which seek to empower businesses and organizations to derive intelligence and insights that will enable them to act smarter, resulting in a significant competitive advantage. New forms of processing are needed to deal with the three core characteristics of Big Data (from Gartner’s own definition): high volume, high velocity and/or high variety of data.

Solutions such as Hadoop and NoSQL technologies facilitate storage and analysis of very large, unstructured data sets that have been challenging to manage with traditional SQL databases. While these technologies remove a significant part of the burden associated with business intelligence efforts, there are two key problems that still need to be addressed.

The first problem is that it is still highly complex to source and integrate enterprise data. Extracting, de-duplicating and correlating data about, say, customers and profitability continue to be monumental tasks, particularly because they tend to involve a large number of source databases and information systems with potentially different definitions for the same piece of information.

The second problem is probably the harder one to solve because it goes beyond technology and into the skills available to the organization. Having access to an incredible amount of data and the ability to run complex queries solves only part of the problem. To produce business value, one must derive insights from the data and be able to act on them. For example, marketers seldom act on the data available to them. In my experience, most marketers (with the possible exception of those at companies that extract direct revenue from website visitors via online retail or advertising) rarely look at web analytics data and therefore fail to act on any insights that these tools may offer.

Regardless of the challenges, Big Data can be incredibly powerful when properly applied, but it will require expertise and skills that may not exist today in many enterprises (which is expected with any new technology). In addition, the tools used to visualize, query and summarize data will need to mature. Given the interest in this technology, I expect both of the challenges discussed above to be solved quickly by the industry.

What seems to be lacking is a deep understanding of the type of problems Big Data is designed to solve. Big Data or NoSQL technologies will not replace traditional databases that are designed to maintain relationships between structured data sets and to perform operations such as transaction processing that require the ACID guarantees provided by SQL databases (Atomicity, Consistency, Isolation and Durability of transactions). SQL databases will continue to be fundamental technology tools for many, many years.

From a market perspective, Microsoft SQL Server’s revenue is roughly $2.5 billion and grew by 20 percent in the last year. Meanwhile, the total revenue for NoSQL databases, according to The451, reached just $20 million in 2011. The451 expects the total NoSQL market to grow to $215 million by 2015, which is still less than half the growth in license revenue that SQL Server saw in 2011. I use this comparison only to highlight the sheer volume of problems that are still the sweet spot for enterprise mission-critical applications backed by relational databases.

The main point is to set the right expectations. As is usually the case, it is about selecting the right tool for the job, as GigaOM points out in the article “MongoDB or MySQL why not both?” Because NoSQL databases give organizations the advantages of scale and flexibility of data structures, they are a good tool for managing large amounts of data where the relationship between the data elements is less important.

As the Wikipedia article states: “NoSQL database management systems are useful when working with a huge quantity of data and the data’s nature does not require a relational model for the data structure.” To choose the right tool for your data problem, you should try to understand the business requirements across three dimensions: size, variety of the type of data (unstructured versus highly structured data) and velocity of ingestion and removal of data.
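
As a rough illustration of weighing those three dimensions, here is a minimal Python sketch. It is not a method from any vendor or analyst: the `Workload` fields, the `suggest_store` function and both thresholds are hypothetical, chosen only to make the reasoning concrete. The one rule taken directly from the discussion above is that a need for ACID transactions points to a relational database.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a data problem; every field is illustrative."""
    size_tb: float           # size: approximate data volume in terabytes
    highly_structured: bool  # variety: True if the data fits a fixed relational schema
    writes_per_second: int   # velocity: rate of ingestion and removal of data
    needs_acid: bool         # True if operations require ACID transactions


# Assumed cut-offs for illustration only; they are not industry figures.
LARGE_SIZE_TB = 10.0
HIGH_VELOCITY_WPS = 10_000


def suggest_store(w: Workload) -> str:
    """Return 'relational', 'nosql' or 'either' as a first-pass suggestion."""
    # Transaction processing that needs ACID stays with a relational database.
    if w.needs_acid:
        return "relational"
    # Count how many of the three dimensions push toward NoSQL.
    pressure = sum([
        w.size_tb >= LARGE_SIZE_TB,
        not w.highly_structured,
        w.writes_per_second >= HIGH_VELOCITY_WPS,
    ])
    if pressure >= 2:
        return "nosql"
    if pressure == 1:
        return "either"
    return "relational"


orders = Workload(size_tb=0.5, highly_structured=True,
                  writes_per_second=200, needs_acid=True)
web_logs = Workload(size_tb=50.0, highly_structured=False,
                    writes_per_second=50_000, needs_acid=False)
print(suggest_store(orders))    # relational
print(suggest_store(web_logs))  # nosql
```

A real decision would also weigh the skills, tooling and cost considerations discussed elsewhere in this post; the sketch only shows that the three dimensions can be assessed explicitly rather than by instinct.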
In addition, NoSQL databases can often be deployed using commodity hardware, making them an affordable technology to deploy from a hardware requirements perspective.

I propose that there are three key aspects of NoSQL and Big Data technologies that we should remember:

1. Organizations should choose database technologies based on the business requirements and the problem at hand. This requires understanding the virtues and challenges of each technology, and fighting our natural inclination to favor “cool technologies.”
2. No single information management technology is the right solution for all needs, whether it is SQL, NoSQL or any other. NoSQL and Big Data offer organizations very high value for specific business and technology problems that involve large amounts of varied data types with a high velocity of change. Some examples include log analysis, transaction analysis, very large data sets and many applications that require computations and analysis that are impractical to perform in a relational database.
3. Most organizations will struggle to realize the utopian vision that Big Data will deliver unlimited customer insights and “automatic” business value. It is not a panacea. It is still important to ask the right questions, to be able to act on insights and to develop the right skills across the organization.

Rackspace has been involved in NoSQL technologies for quite some time (interestingly, a fellow Racker coined the term NoSQL). Our own IT department has deployed a NoSQL cluster on its own OpenStack private cloud to provide the business intelligence our management team needs (stay tuned for details).

At Rackspace we operate under a fundamental principle of openness, which means that we should support the technology choices of our customers.
Whether you need MySQL as a service, SQL Server on dedicated or cloud infrastructure or a NoSQL cluster using technologies from our partners (such as Mongo or Infochimps), we aspire to offer you the right tool for your job.

Endnotes:
1. predicts: http://www.forbes.com/sites/louiscolumbus/2012/08/04/hype-cycle-for-cloud-computing-