An Introduction to Big Data

CUSO Seminar on Big Data, Switzerland
Prof. Philippe Cudré-Mauroux
eXascale Infolab
http://exascale.info/

Published in: Data & Analytics

  1. An Introduction to BIG DATA: CUSO Seminar on Big Data; Prof. Dr. Philippe Cudré-Mauroux; http://exascale.info; May 22, 2014; Fribourg, Switzerland
  2. On the Menu Today • Big Data: Context • Big Data: Buzzwords – 3 Vs of Big Data • Big Data Landscape • Hadoop • Big Data in Switzerland
  3. Instant Quiz • 3 Vs of Big Data? • CAP? • Hadoop? • Spark?
  4. Exascale Data Deluge • Science – Biology – Astronomy – Remote Sensing • Web companies – eBay – Yahoo • Financial services, retail companies, governments, etc. © Wired 2009 ➡ New data formats ➡ New machines ➡ Peta- & exa-scale datasets ➡ Obsolescence of traditional information infrastructures
  5. The Web as the Main Driver © Qmee
  6. Big Data Central Theorem: Data + Technology → Actionable Insight → $$
  7. Big Data Buzz • Between now and 2015, the firm [Gartner] expects big data to create some 4.4 million IT jobs globally; of those, 1.9 million will be in the U.S. Applying an economic multiplier to that estimate, Gartner expects each new big-data-related IT job to create work for three more people outside the tech industry, for a total of almost 6 million more U.S. jobs. • Growth in the Asia Pacific Big Data market is expected to accelerate rapidly in two to three years' time, from a mere US$258.5 million last year to in excess of $1.76 billion in 2016, with highest growth in the storage segment.
  8. Big Data as a New Class of Asset • The Age of Big Data (NYTimes Feb. 11, 2012) http://www.nytimes.com/2012/02/12/sunday-review/big-datas-impact-in-the-world.html “Welcome to the Age of Big Data. The new megarich of Silicon Valley, first at Google and now Facebook, are masters at harnessing the data of the Web — online searches, posts and messages — with Internet advertising. At the World Economic Forum last month in Davos, Switzerland, Big Data was a marquee topic. A report by the forum, “Big Data, Big Impact,” declared data a new class of economic asset, like currency or gold.”
  10. The 3 Vs of Big Data • Volume – Amount of data • Velocity – Speed of data in and out • Variety – Range of data types and sources • [Gartner 2012] "Big Data are high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization"
  11. What can you do with the data • Reporting – Post Hoc – Real time • Monitoring (fine-grained) • Exploration • Finding Patterns • Root Cause Analysis • Closed-loop Control • Model construction • Prediction • … © Mike Franklin
  12. 10 ways big data changes everything • Some concrete examples – http://gigaom.com/2012/03/11/10-ways-big-data-is-changing-everything/2/ 1. Can gigabytes predict the next Lady Gaga? 2. How big data can curb the world’s energy consumption 3. Big data is now your company’s virtual assistant 4. The future of Foursquare is data-fueled recommendations 5. How Twitter data-tracked cholera in Haiti 6. Revolutionizing Web publishing with big data 7. Can cell phone data cure society’s ills? 8. How data can help predict and create video hits 9. The new face of data visualization 10. One hospital’s embrace of big data
  13. Typical Big Data Success Story • Modeling users through Big Data – Online ad sales / placement [e.g., Facebook] – Personalized Coupons [e.g., Target] – Product Placement [e.g., Walmart] – Content Generation [e.g., Netflix] – Personalized learning [e.g., Duolingo] – HR Recruiting [e.g., Gild]
  14. More Data => Better Answers? • Not that easy… • More Rows: Algorithmic complexity kicks in • More Columns: Exponentially more hypotheses • Another formulation of the problem: – Given an inferential goal and a fixed computational budget, provide a guarantee that the quality of inference will increase monotonically as data accrue (without bound) • In other words: => Data should be a resource, not a load © Mike Jordan
  15. Big Data Infrastructures
  16. A Concrete Example: Zynga
  17. Leading the Pack of Wolves: Hadoop • Google: MapReduce paper published in 2004 • Open-source variant: Hadoop • MapReduce = high-level programming model and implementation for large-scale parallel data processing • Currently the most overhyped system in CS
  18. What about Swiss Big Data? • Competitive Research Groups • Swiss Big Data User Group • Swiss companies playing catch-up – Productized Big Data systems at leading telcos & financial companies – Big Data is not a new technology: it's a fact • Deal with it → POCs (proofs of concept) in most banks, insurance companies, and retailers
  19. Tasty Bites of Big Data (1) Thursday afternoon • 13:30-15:00: Big Data Profiling, Felix Naumann (Hasso Plattner Institute) • 15:15-16:45: Realtime Analytics, Christoph Koch (EPFL) • 16:45-17:45: Current Trends and Challenges in Big Data Benchmarking, Kai Sachs (SAP / SPEC)
  20. Tasty Bites of Big Data (2) Friday • 9:00-10:30: Structured Data in Web Search, Alon Halevy (Google) • 10:45-12:15: Human Computation for Big Data, Gianluca Demartini (UNIFR) • 13:30-15:00: Analysing and Querying Big Scientific Data, Thomas Heinis (EPFL) • 15:00-16:30: The Evolution of Big Data Frameworks, Carlo Curino (Microsoft Research)
  21. Social Event, Friday – Beer Tasting! Basse-Ville Fribourg / 15 CHF per person • "Everything You Always Wanted to Know About Beer* (*But Were Afraid to Ask!)" • 18:00 @ Café du Belvédère, Grand-Rue 36 • 19:00 @ Fri-Mousse, Rue de la Samaritaine 19 • Limited places; registration is mandatory at: http://xr.si
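
Slide 17 describes MapReduce as a high-level programming model for large-scale parallel data processing. The slides give no code, so the following is only a minimal single-process sketch of the idea, using the classic word-count task. It does not use Hadoop's API. The names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical and chosen for illustration. A real framework would run the map and reduce steps on many machines and handle the grouping ("shuffle") step over the network.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical helper): emit (word, 1) for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step (hypothetical helper): sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, group by key (the 'shuffle'), then reduce, all in one process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce each key independently; this is the part a cluster would parallelize.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


lines = ["big data big insight", "data is a resource"]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
# → {'a': 1, 'big': 2, 'data': 2, 'insight': 1, 'is': 1, 'resource': 1}
```

Because each key is reduced independently of the others, the reduce calls can in principle run in parallel, which is what makes the model suitable for large clusters.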
