Lift12Fr - Stephane Grumbach
Usage Rights

CC Attribution License

Lift12Fr - Stephane Grumbach: Presentation Transcript

  • BIG DATA? THE GLOBAL IMBALANCE! Stéphane Grumbach, INRIA (slide 1)
  • The digital universe: 2.7 zettabytes (slide 2)
    Data deluge in all sectors of activity
      U.S. Library of Congress: 235 terabytes of data
      Walmart: 2.5 petabytes of data, 1 million customer transactions per hour
      Facebook: 30 petabytes of user data
      Google: processing 20 petabytes a day (2008)
      World: 5 billion people calling, tweeting, browsing on mobile phones
    Exponential increase
      doubles every two years
      35 zettabytes in 2020
      followed by the capacity to store, compute, and communicate
    Prefixes: kilo 10^3, mega 10^6, giga 10^9, tera 10^12, peta 10^15, exa 10^18, zetta 10^21, yotta 10^24
  • The Big Data Industry (slide 3)
    Advertising
      Capture user data
      Generate user profiles
      Target ads
  • The Big Data Industry beyond advertising (slide 4)
      $300 billion/year: US health care
      €250 billion/year: Europe public administration [McKinsey 2011]
    Tremendous economic impact: teraeuros (thousands of billions)
  • First challenge: data harvesting (slide 5)
    70% of the data produced by individuals
      directly produced by users: email, photos, blogs, etc. (less than half)
      indirectly, as digital shadow/footprint: surveillance, web usage, transactions
    The free paradigm of the Web 2.0
      Free services traded for private user data
      Free exploitation of the accumulated data
  • Second challenge: knowledge extraction (slide 6)
    User profiles (business) => ad targeting
    Automatic discovery (science) => Google Flu
      monitoring of flu-related queries
      a search engine company knows everything => biological, sociological data...
    NSA (security) => ambition to handle yottabytes (10^24)!
  • Data: raw material of the 21st century (much like crude oil) (slide 7)
    Oil: extraction from natural reservoirs, transport, refining, consumption at users
    Data: production of data at users, Internet, accumulation in large repositories, data analytics
  • Where are these data? (slide 8)
    Huge concentration of data
      85% of data handled by (large) corporations
      Virtualization/dematerialization of infrastructures: social networks, cloud, ...
    Most of the prominent corporations based in the USA: Google, Facebook, Amazon, Twitter, ...
    Storage capacity of Europe = 70% of that of the USA [McKinsey 2011]
    1/3 of world data stored in the cloud by 2020
  • Geopolitics of big data (slide 9)
    Data from the Web 2.0
      produced by users everywhere in the world
      but accumulated by corporations, most often abroad
    Percentage of national web corporations among the top 25 by country (Alexa.com)
      USA: 100%
      China: 92% (only Google makes it into the top 25)
      France: 36% (but mostly marginal sites, not data intensive): leboncoin, Orange, Free, commentcamarche, lemonde, lequipe, lefigaro, pagesjaunes, sfr
  • Geopolitics of big data (slide 10)
    The top 50 websites worldwide (country share, with each site's rank)
      USA: 72%
      China: 16% (Baidu: 5; QQ: 8; Taobao: 13; Sina: 17; 163: 28; Soso: 29; Sina Weibo: 31; Sohu: 43)
      Russia: 6% (Yandex: 21; VKontakte: 30; Mail: 33)
      Israel: 2% (Babylon: 22)
      UK: 2% (BBC: 46)
      Netherlands: 2% (AVG: 47)
  • Geopolitics of big data (slide 11)
    Diversity of search engines
      USA: Google: 65%; Bing: 15%; Yahoo: 15%
      China: Baidu: 78%; Google: 16%
      Russia: Yandex: 60%; Google: 25%
      UK: Google: 91%; Bing: 5%
      France: Google: 92%; Bing: 3%
    In France,
      Google has a de facto monopoly
      Google knows more about France than INSEE
  • The global imbalance: information asymmetry (slide 12)
    “Since asymmetries of information give rise to market power, and perfect competition is required if markets are to be efficient, it is perhaps not surprising that markets with information asymmetries and other information imperfections are far from efficient.” Joseph E. Stiglitz
  • Impact of the global imbalance (slide 13)
    Regulation
      What legislation over a dematerialized global industry?
      Aren’t the rules defined by those who have the control?
    Business
      How to face monopolistic positions?
      How to handle the information asymmetry?
    Security
      Data at the core of nations' independence
  • The power of data (slide 14)
    Map, Ecological Footprint
    http://www.csa.com/discoveryguides/china/review.php
  • What’s at stake in Europe? (slide 15)
    Suspicion (fear?) regarding data
      concern for privacy protection high in Europe
      active legislative work
      historical reasons?
    Weak industrial/innovation environment
      no strong corporation emerging
    But essential dependence on foreign systems
  • Are there alternatives? (slide 16)
    Dominant (centralized) model
      unclear privacy
      lost property
      active (centralized) business
      little share of business capacity
    Decentralized ‘utopian’ model (Faroo, Yacy, Diaspora)
      high privacy
      real ownership
      little business
  • An alternative path? (slide 17)
    Dominant (centralized) model: unclear privacy, lost property, active (centralized) business, little share of business capacity
    Decentralized ‘utopian’ model (Faroo, Yacy, Diaspora): high privacy, real ownership, little business
    An alternative path?
      active (competitive) business
      symmetry of information
      ownership & privacy
      anti-monopoly
  • An alternative path for Europe? (slide 18)
    The information society
      it is only emerging
      it will continue to evolve
      it will impact political systems
      new business models, new equilibrium will appear
    Europe should embrace the future
  • Thank you (谢谢) (slide 19)
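
The growth figures on slide 2 (2.7 zettabytes, doubling every two years, 35 zettabytes in 2020) can be checked against each other with a few lines of arithmetic. The sketch below is not part of the talk. It assumes the 2.7 ZB figure refers to 2012, the year of the talk, which the slide does not state. The function names `project` and `implied_doubling_time` are illustrative.

```python
import math

# Figures quoted on slide 2 of the transcript.
START_ZB = 2.7    # size of the digital universe, in zettabytes
TARGET_ZB = 35.0  # projected size in 2020, in zettabytes

# Assumption (not stated on the slide): the 2.7 ZB figure refers to 2012,
# the year of the talk.
START_YEAR = 2012
TARGET_YEAR = 2020


def project(start, years, doubling_time):
    """Size after `years` of exponential growth with the given doubling time."""
    return start * 2 ** (years / doubling_time)


def implied_doubling_time(start, target, years):
    """Doubling time (in years) that turns `start` into `target` over `years`."""
    return years * math.log(2) / math.log(target / start)


years = TARGET_YEAR - START_YEAR
print(f"Doubling every 2 years: {project(START_ZB, years, 2.0):.1f} ZB in {TARGET_YEAR}")
print(f"Doubling time implied by {TARGET_ZB:.0f} ZB: "
      f"{implied_doubling_time(START_ZB, TARGET_ZB, years):.2f} years")
```

Under that assumption, strict doubling every two years gives roughly 43 ZB in 2020, and the 35 ZB projection corresponds to a doubling time slightly above two years, so the two statements on the slide agree to within rounding.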