The Age of Big Data

"Big Data" gets thrown around a lot, but what does it really mean?

This presentation, from SES Chicago, breaks down three of the main elements of Big Data: Volume, Variety, and Velocity.

Published in: Technology, Business
  • Name, VP of Product and Marketing for Conductor, the leader in SEO platform technology. Excited to talk about Big Data. This morning, we heard Avinash talk about how to not just measure, but learn from the data points we are measuring. So what do I know about this? At Conductor, we’re not just a consumer of the benefits of big data, we’re also building big data solutions for our customers, which is near and dear to my heart. Every day our platform gathers data on our customers, their websites, and their competitors’ websites, and provides insight, reporting and workflow around how to get better. We gather enormous amounts of data while we do this.
  • 1 EB = 1,000,000,000,000,000,000 B = 10^18 bytes = 1,000,000,000 gigabytes = 1,000,000 terabytes = 1,000 petabytes
  • Up to 50% of all internet traffic is non-human. And that’s through browser type
  • The structure of the data itself. The structure of the container that hosts the data. The structure of the access method used to access the data. Some people use the term ‘multi-structured’ to describe this.
  • Need to capture and store data in bursts. Storage is cheap, but that has led to some sloppy data storage problems. The challenge is to query that data in a responsive enough way. How long is the delay until the data is accessible?
  • Hadoop was created by Doug Cutting and Michael J. Cafarella. Doug, who was working at Yahoo at the time, named it after his son's toy elephant. It was originally developed to support distribution for the Nutch search engine project. 2004-2006: gestation; the GFS and MapReduce papers are published and directly address Nutch's scaling issues. The other two classes are NoSQL and Massively Parallel Processing (MPP) data stores.
  • The evidence is clear: Data-driven decisions tend to be better decisions. I
  • In 2001 PASSUR began offering its own arrival estimates as a service called RightETA. It calculated these times by combining publicly available data about weather, flight schedules, and other factors with proprietary data the company itself collected, including feeds from a network of passive radar stations it had installed near airports to gather data about every plane in the local sky. PASSUR started with just a few of these installations, but by 2012 it had more than 155. Every 4.6 seconds it collects a wide range of information about every plane that it "sees." This yields a huge and constant flood of digital data. What's more, the company keeps all the data it has gathered over time, so it has an immense body of multidimensional information spanning more than a decade. RightETA essentially works by asking itself "What happened all the previous times a plane approached this airport under these conditions? When did it actually land?" After switching to RightETA, the airline virtually eliminated gaps between estimated and actual arrival times. PASSUR believes that enabling an airline to know when its planes are going to land and plan accordingly is worth several million dollars a year at each airport. It's a simple formula: Using big data leads to better predictions, and better predictions yield better decisions.
  • Tell the story of me, and how I got here… NASA’s Big Data Challenge: We have deep space spacecraft that send back data on the order of MB/s. Then we have earth orbiters that can send back data in GB/s. We are planning missions today that will easily stream more than 24 TB a day. That’s roughly 2.4 times the entire Library of Congress, EVERY DAY. For one mission. Sandy knocked out our washing machine, so I went online and saw some reviews. That information was stored in a DB, but while I was looking at it, it also generated recommendations for me, based on what I’d already looked at, where I was, and what it thought I was more likely to purchase based on previous visitors. My wife and I bought the refrigerator and scheduled delivery. I got a text message from Chase 5 minutes later asking if I had really purchased it.
  • Identify me as a visitor. Look up my profile (whether my password is right or not, what I’m looking for). Select content that matches the visitor’s preferences. Assemble that page. Remember, in the background it has to build a description of me, compare it to other folks, and query its entire product database to find what I want, my points, what product manuals they should pass along, and reviews that are going to be helpful to me. I went into the store, and now it’s tying my online buying behavior (my profile) to my offline purchase.
  • I don’t buy expensive stuff at Best Buy all that often
  • My recommendations: People I may know. And amazingly! Bob Sacco posted an article about how Big Data can add years to a CMO's tenure
  • http://www.kaushik.net/avinash/multi-channel-attribution-definitions-models/
  • http://community.advertising.microsoft.com/msa/en/atlas/b/blog/archive/2011/08/15/the-challenge-of-adopting-conversion-attribution-modeling.aspx Lori Goode
  • http://www.kaushik.net/avinash/multi-channel-attribution-definitions-models/ An example of MCA-AMS is the ability to understand that a search I did on my tablet computer while watching a television commercial resulted in a click on a paid search ad to a camera site which logged into my memory which later caused me to read reviews of the camera on my Nexus S while stuck in traffic and that finally caused a sale for Sony when I got home and happened to be on my laptop. The most common attribution models bundled into even the simplest web analytics tools are: Last click, first click, and even distribution. If you are lucky, you have access to a more sophisticated tool which would include: Adjustable, based on mathematical algorithms, time decay model. But most of what you'll get out of playing with these models is a deep and profound appreciation for how they'll, even in their most shining moment, give you directional guidance how to adjust your media spend (shift dollars/euros/pesos from Search to Display or from Display to Email or… other combinations). You'll realize (even if you use the greatest customized model created by your most magnificent consultant at an equally magnificent cost to you) that success then will come not from that rough output, but rather from your ability to take that rough output, make changes, observe the impact (over weeks, or months if you are small sized), identify insights and be less wrong over time. If you happen to be in a larger company, say you spend more than $10 million on digital marketing per year, you'll quickly see, having learned to be less wrong over time, that the question you want to answer with multi-channel attribution modeling is not "who gets how much credit" but rather "how can I optimally balance my digital marketing portfolio."
  • Approximately 80% of costs for a data project are spent on prepping the data, mostly cleaning up data quality issues. If you’re budgeting, don’t make the very common mistake of spending money on frameworks that are only useful once you have clean data.
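
The attribution models named in the notes above (last click, first click, even distribution, time decay) can be illustrated with a short Python sketch. This is only a rough illustration under stated assumptions, not how any particular analytics tool computes credit: the function names, the channel labels in `journey`, and the `half_life` parameter are all hypothetical, and the time decay weighting shown here (halving credit every `half_life` steps back from the conversion) is just one plausible choice.

```python
from typing import Dict, List


def last_click(path: List[str]) -> Dict[str, float]:
    """Give all credit to the final touchpoint before the conversion."""
    return {path[-1]: 1.0}


def first_click(path: List[str]) -> Dict[str, float]:
    """Give all credit to the touchpoint that started the journey."""
    return {path[0]: 1.0}


def even_distribution(path: List[str]) -> Dict[str, float]:
    """Split credit equally across every touchpoint in the path."""
    share = 1.0 / len(path)
    credit: Dict[str, float] = {}
    for channel in path:
        credit[channel] = credit.get(channel, 0.0) + share
    return credit


def time_decay(path: List[str], half_life: float = 2.0) -> Dict[str, float]:
    """Weight touchpoints closer to the conversion more heavily.

    A touchpoint k steps before the conversion gets weight
    0.5 ** (k / half_life); weights are normalized to sum to 1.
    (half_life is an assumed, illustrative parameter.)
    """
    n = len(path)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    total = sum(weights)
    credit: Dict[str, float] = {}
    for channel, weight in zip(path, weights):
        credit[channel] = credit.get(channel, 0.0) + weight / total
    return credit


# Hypothetical journey, ordered from first touch to the converting touch.
journey = ["paid_search", "display", "email", "paid_search"]

for model in (last_click, first_click, even_distribution, time_decay):
    print(model.__name__, model(journey))
```

Running the four models on the same journey makes the point from the notes concrete: each model hands out credit differently, so the output is directional guidance for shifting spend rather than a definitive answer.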
  • Transcript

    • 1. The Age of Big Data and the Modern Marketer | Chicago | November 12–16 | Seth Dotterer, VP, Marketing and Product, Conductor
    • 2. Chicago | November 12–16, 2012 | #SESCHI Link: http://www.youtube.com/watch?v=QV3t-3QIf1E
    • 3. “Data underpins our economy and our society - data about how much is being spent and where, data about how schools, hospitals and police are performing, data about where things are and data about the weather.” Tim Berners-Lee, Director of W3C.
    • 4. What is Big Data?
    • 5. A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.
    • 6. Definition of Big Data: Volume - Variety - Velocity
    • 7. Volume
    • 10. 700 million Facebook users, 250 million Twitter users, 156 million public blogs
    • 11. Volume: An average of 294 billion e-mails are sent every day. 2.4 billion people online. @dotterer
    • 12. The machines also talk to each other
    • 13. Variety
    • 14. Structured Data
    • 15. Variety • Structured Data • Unstructured Data • Even Structured Data has issues • Unstructured (mostly)
    • 16. Velocity
    • 17. Wal-Mart generates more than 1M transactions an hour into databases estimated at more than 2.5 petabytes
    • 20. Hadoop has its roots as a search engine • Apache Hadoop, the leading open source solution, was inspired by Google’s MapReduce and the Google File System and developed by a Yahoo employee. • Now, Hadoop and the resulting and related technologies and toolsets are used to create distributed processing across clusters of computers. These scale up to thousands of machines that each can store and process information, and can take over when any of their neighbor machines fails.
    • 21. “Big data is the fuel. It is like oil. If you leave it in the ground, it doesn’t have value. But when we find ways to ingest, curate, and analyze the data in new and different ways, such as in Watson, Big Data becomes very interesting.” Stephen Gold, VP of Marketing for IBM’s Watson
    • 29. How Search Marketers can use big data • Think Flow, not Batch. • PPC Vendors like Marin, Kenshoo, Adobe all have big data optimization • Leverage cross-discipline (SEO position data to drive Bids, and vice-versa) • Mining and teasing out the tactics and strategies that are working, vs. what is noise • Opportunity forecasting
    • 30. Big Data Means… How Cross Channel Marketing is changing how independent silos are working: No More Silos
    • 34. How Cross Channel Marketing is changing how we keep score
    • 36. Are you referring to MCA-O2S, MCA-AMS or MCA-ADC? http://www.kaushik.net/avinash/multi-channel-attribution-definitions-models/ • MCA-ADC - Multi-Channel Attribution, Across Digital Channels • MCA-O2S - Multi-Channel Attribution, Online to Store • MCA-AMS - Multi-Channel Attribution, Across Multiple Screens
    • 37. “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it - that's going to be a hugely important skill in the next decades.” Hal Varian, Google’s Chief Economist
    • 41. Don’t forget the clean up
    • 42. Takeaways • Big Data is what YOU say it is, so that you can solve your problems. Don't lose sight of that, or blindly 'trust the data'. • The more frequently (or faster) you analyze your data, the more likely it is to be valuable. Think less about ‘batch’ and more about flow. • Find new sources of data; don’t just dump more of the same into it. • Keep data as long as you can. • Become (or hire) a statistician: Big Data and advanced analytics programs have emerged, sprouting up at USC, N.C. State, NYU and elsewhere. • Figuring out what the data tells you is hard. • Convincing others of what the data tells you is harder. Visualization helps.
    • 43. Thanks! sdotterer@conductor.com @dotterer
    • 44. Photo Credits • http://www.flickr.com/photos/29691859@N03/2873017925/sizes/l/ • http://www.flickr.com/photos/baqueroguapo/5567237076/sizes/l/in/photostream/ • http://www.flickr.com/photos/markusperl/7689989242/sizes/l/in/photostream/ • http://www.flickr.com/photos/ragfield/6728728687/lightbox/ • http://www.flickr.com/photos/ppix/2305078608/sizes/l/in/photostream/ • http://www.flickr.com/photos/matneym/4145370899/sizes/l/in/photostream/
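
Slide 20 describes Hadoop as distributed processing built on the MapReduce idea. As a rough illustration of that idea only, here is a single-process Python sketch of the map, shuffle, and reduce steps applied to a word count. It does not use the Hadoop API; the function names and the sample documents are assumptions made for this example, and on a real cluster each document (or split) would be handled by a different machine.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: List[str]) -> Dict[str, int]:
    """Run map over every document, then shuffle and reduce the results."""
    pairs: List[Tuple[str, int]] = []
    for document in documents:
        # In a distributed setting, each map call could run on its own worker.
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


# Hypothetical input documents.
counts = word_count(["big data is big", "data in flow"])
print(counts)
```

Because each map call only looks at its own document and each reduce call only looks at one key, the work can be spread across many machines, which is the property the slide attributes to Hadoop clusters.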
