Big Data

Published in: Technology, Business

  1. Big Data
  2. Big Data <ul><li>Astronomy: New telescopes coming online generate petabytes of data per day.</li><li>Large-scale systems modelling in areas such as climate and plasma fusion.</li><li>Data from X-ray observatories and neutron scattering experiments.</li></ul>
  3. Challenges <ul><li>Scale: Data can't be moved for analysis<ul><li>Analyze in situ</li><li>Extract a smaller set of relevant data</li></ul></li><li>High-flux streaming data</li><li>Structured and unstructured data</li><li>Real-time decisions on which data to keep vs. eliminate</li><li>Optimal mix of access vs. compute resources, and how to organize the data</li></ul>
  4. Addressing the challenge <ul><li>Hardware investment</li><li>Research and development of analysis algorithms<ul><li>Data collection methods don't conform to the assumptions made in statistical analysis</li><li>Data might not be independent and identically distributed</li></ul></li><li>Deterministic, scalable algorithms for analysis</li><li>Randomized algorithms robust to certain hardware failures</li><li>Sufficient data but a semantic gap, e.g. video streams</li><li>Lack of enough data to reach defensible conclusions</li></ul>
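
The slides mention real-time decisions on which data to keep and randomized algorithms, but they do not name a specific method. As one possible illustration, and not something taken from the slides, the sketch below uses reservoir sampling (Algorithm R) to keep a fixed-size uniform random sample from a stream whose length is not known in advance. The function name `reservoir_sample`, the parameters `k` and `seed`, and the synthetic `readings` stream are all assumptions made for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Keep a uniform random sample of up to k items from a stream.

    Hypothetical helper: only k items are held in memory at any time,
    so the full stream never has to be stored or moved.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces a kept item
            # only if the slot falls inside the reservoir (probability k/(i+1)).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Synthetic stand-in for a high-flux instrument stream (assumption).
    readings = (x * 0.5 for x in range(100_000))
    sample = reservoir_sample(readings, k=10, seed=42)
    print(len(sample))
```

Each incoming item is either kept or discarded as it arrives, so memory use stays at `k` items regardless of stream length. A uniform sample is only one keep-vs-eliminate policy; a real instrument pipeline would more likely filter on domain-specific relevance.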
