Big Data: tools and techniques for working with large data sets

by Ian Stokes-Rees on Aug 07, 2011

Working with thousands, millions, or billions of high-dimensional data records is increasingly the reality of scientific research. What techniques make this volume of data tractable? How can parallel computing help? In this talk I'll review data management tools and infrastructures, languages, and paradigms that help in this regard. In particular, I'll discuss Hadoop, MapReduce, Python, NumPy, and Globus Online to survey the ways researchers can manage their data and process it in parallel.

Tags

globusonline mapreduce semantic mediawiki grid irods hadoop gridftp cloud big data tools

Upload Details

Uploaded via SlideShare as Apple Keynote

Usage Rights

© All Rights Reserved

Big Data: tools and techniques for working with large data sets — Presentation Transcript