There is no such thing as ‘Big Data’
The hitchhiker’s guide to the data universe
Kostas Perifanos, Senior Search and Analytics Engineer
Data Analytics & Visualization Team
19.06.2013

The ‘Big Data’ buzzword
Big Data is the top IT buzzword:
• 4Vs: Volume, Velocity, Variety, Veracity
• [Predictive] Analytics
• Social Media Data
• Networking / Graph
• Buzz Buzz Buzz…

The ‘Big Data’ buzzword
A not so formal definition

But what’s this all about?
Is this the whole picture?
• Everybody is trying to do “Big Data”
• Before “Big”, don’t forget “Data”
“Most companies believe they have ‘Big Data Problems’. What they actually have is ‘Big, Data Problems’.”

Dealing with (potentially massive) datasets
The Paradigm Shift – Engineering
The traditional RDBMS/SQL world is perfectly fine for the majority of the problems we are trying to solve. On top of that, we now have NoSQL technologies to deal with structured, unstructured or semi-structured data.
• Escaping common pitfalls:
– Before “Big”, don’t forget “Data”
– Avoid sub-optimal premature decisions

Dealing with (potentially massive) datasets
The Paradigm Shift – Engineering plus Data Science
Choose your tools wisely…
• Try to avoid over-engineering, but do keep scaling in mind
• Define and store all necessary information
• Let the data speak for itself
• Choose your tools wisely
• Data will guide your engineers to choose the appropriate tools
• Most data problems can be solved on a (decent) laptop
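
To illustrate the laptop point, here is a minimal sketch, not taken from the talk: a per-key aggregation that reads one row at a time, so memory use depends on the number of distinct keys rather than on file size. The function name `streaming_totals`, the column names and the sample rows are all hypothetical.

```python
import csv
import io
from collections import defaultdict


def streaming_totals(rows, key_field, value_field):
    """Sum value_field per key_field, consuming rows one at a time."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_field]] += float(row[value_field])
    return dict(totals)


# Hypothetical sample data standing in for a large file on disk.
# With a real file, pass csv.DictReader(open(path, newline="")) instead.
sample = io.StringIO("country,amount\nGR,10.5\nDE,4\nGR,2.5\n")
totals = streaming_totals(csv.DictReader(sample), "country", "amount")
print(totals)
```

Because the reader is consumed lazily, the same function works unchanged on a file far larger than RAM, which is often enough before reaching for a cluster.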

Dealing with (potentially massive) datasets
Asking the right questions
• Know your own business
• Are the questions supported by the existing data?
• What will be the impact if additional data are required?
• What will be the expected/desired profit for your organization?

Dealing with massive datasets
Building “Big Data” teams
• Your asset is your Team – Engineers / Data Scientists / Visualization experts
• Use open source where applicable – contribute back to the community
• Use Lean/Agile Methodologies
– Maintain a strong Product Vision
• Attend conferences and trainings, follow trends
• Invest in Learning. Always.

‘It is a capital mistake to theorize before one has data.’
Sir Arthur Conan Doyle