IBM Research – Business Solutions and Mathematical Sciences

For the Love of Big Data
Dr. Bob Sutor
VP, Business Solutions and Mathematical Sciences

What is Big Data?
• Big data is being generated by everything around us.
• Every digital process and social media exchange produces it.
• Systems, sensors and mobile devices transmit it.
• Big data is arriving from multiple sources at amazing velocities, volumes and varieties.
• To extract meaningful value from big data, you need optimal processing power, storage, analytics capabilities, and skills.

© 2014 International Business Machines Corporation


Why do data scientists want more data, rather than less?
• It is there.
• Data is the basis of the models we create to explain, predict, and affect behavior.
• With more data, our models become more sophisticated and, we hope, more accurate.
• How much data is too much data?


What issues can analytics present?
• Are all aspects of privacy, anonymization, and liability understood by the practitioners?
• If I tell you that you cannot look at some data but you can infer the information (e.g., gender) anyway, is that all right?
• What are the rules for working with metadata and summarized data?
• How do we process static, collected data together with more real-time, rapidly changing information such as location?


Approach to policy can determine outcomes
• Reductions in the amount and kinds of data can produce diminished or inaccurate results.
• Policy must take into account the value received by individuals for the use of their personal data.
• Enforced data localization may decrease analytical completeness unless we can move intermediate results or the site of computation.
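
The last point, moving intermediate results rather than raw data, can be illustrated with a minimal sketch. It is not from the presentation: the names `SiteSummary`, `summarize`, and `combine` are hypothetical, and the two sites and their values are made up. Each site reduces its own records to three aggregates, and only those aggregates are combined centrally to obtain a global mean and variance.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class SiteSummary:
    """Intermediate result that leaves a site instead of the raw records."""
    count: int
    total: float
    total_sq: float


def summarize(values: Iterable[float]) -> SiteSummary:
    """Runs where the data lives; only three aggregates are exported."""
    count, total, total_sq = 0, 0.0, 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    return SiteSummary(count, total, total_sq)


def combine(summaries: List[SiteSummary]) -> Tuple[float, float]:
    """Merges per-site summaries into a global mean and population variance."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no records at any site")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return mean, variance


# Made-up measurements held in two separate locations (illustrative only).
site_a = [2.0, 4.0, 6.0]
site_b = [8.0, 10.0]
global_mean, global_variance = combine([summarize(site_a), summarize(site_b)])
```

The combined result matches what a single pooled computation would give, yet no individual record crosses a site boundary. Whether such aggregates may themselves leave a site falls under the earlier question about rules for summarized data, which this sketch does not settle.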



Editor's Notes

  • #3 http://www.ibm.com/big-data/us/en/
  • #4 http://www-03.ibm.com/systems/x/solutions/analytics/bigdata.html