Also (not as a replacement for these three) the 4 Vs: volume, velocity, variety, value. I saw this in an Oracle paper, but it's all over the web (search for "big data" and "4 Vs"); some attribute it originally to a Forrester blog. Here's the Oracle white paper, with a paragraph on each V: http://www.oracle.com/technetwork/database/bi-datawarehousing/wp-big-data-with-oracle-521209.pdf
Enterprise Search and Big Data
Benefits, Challenges & Uses
Cathy McKnight
Digital Clarity Group
August 2012
Presenter
Cathy McKnight
• Partner & Principal Analyst, Digital Clarity Group
• Industry consultant 15+ years
• Runner, gardener, photographer, social media enthusiast
What are we talking about?

Big Data
Big Data is a general term used to describe the voluminous amount of unstructured and semi-structured data a company creates -- data that would take too much time and cost too much money to load into a relational database for analysis. Although Big Data doesn't refer to any specific quantity, the term is often used when speaking about petabytes and exabytes of data.
~ TechTarget

Enterprise Search
Enterprise search is the organized retrieval of structured and unstructured data within an organization. Properly implemented, enterprise search creates an easily navigated interface for entering, categorizing and retrieving data securely, in compliance with security and data retention regulations.
~ TechTarget
What is Big Data, really?
• Large
• Unstructured
• Real-time
Just how large is Big?

Unit            Just how many zeros is that?          Reality Check
Kilobyte (kB)   1 000 bytes                           ½ page of text
Megabyte (MB)   1 000 000 bytes                       Small novel
Gigabyte (GB)   1 000 000 000 bytes                   Printed paper filling up a pick-up
Terabyte (TB)   1 000 000 000 000 bytes               50K trees made into paper and printed
Petabyte (PB)   1 000 000 000 000 000 bytes           2 PB = All content in U.S. academic research libraries
Exabyte (EB)    1 000 000 000 000 000 000 bytes       5 EB = All words ever spoken
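To make the unit ladder above concrete, here is a minimal Python sketch that converts a raw byte count into the decimal units from the table. The function name `humanize` and the `UNITS` list are illustrative assumptions, not part of the original slides, and the sketch uses powers of 1 000 (as the table does) rather than powers of 1 024.

```python
# Decimal (SI) byte units, matching the powers of 1 000 used in the table.
# UNITS and humanize() are hypothetical names chosen for this illustration.
UNITS = ["bytes", "kB", "MB", "GB", "TB", "PB", "EB"]


def humanize(num_bytes: int) -> str:
    """Format a byte count using decimal units (1 kB = 1 000 bytes)."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1 000,
        # or at the largest unit we know about.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize(2 * 10**15))  # the "U.S. academic research libraries" figure
print(humanize(5 * 10**18))  # the "all words ever spoken" figure
```

Running it on the two "Reality Check" figures shows how many steps of 1 000 separate a petabyte from an exabyte.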
And Big is getting Bigger
“Between the birth of the world and 2003, there were five exabytes of information created. We [now] create five exabytes every two days.”
Eric Schmidt, CEO, Google, August 2010
What does unstructured mean?
• Text heavy
• Irregular/ambiguous
• Difficult to search and/or analyze
• Sources:
  – Email
  – Social
  – Rich data files
Real time means NOW!
• More data, coming in faster
• Decision windows getting smaller
• Valuable to worthless in a matter of minutes ... no, seconds ... no, milliseconds
But there is new(s) in Big Data
Big can now be made small(er). Search (and other) technology today lets us make Big Data smaller and more manageable.
How has/does it affect business?
Data is becoming the new raw material of business: an economic input almost on a par with capital and labour. “Every day I wake up and ask, ‘how can I flow data better, manage data better, analyse data better?’” says Rollin Ford, the CIO of Wal-Mart.
Source: Data, Data Everywhere, The Economist, February 25, 2010
Big Data challenges and frustrations
Working with Big Data is akin to …
But it does have potential benefits
Big Data Analysis can be used to identify:
– Trends
– Patterns
– Risk