Hadoop, Pig, and Twitter (NoSQL East 2009)
by Kevin Weil on Oct 30, 2009
- 75,445 views
A talk on the use of Hadoop and Pig inside Twitter, focusing on the flexibility and simplicity of Pig, and the benefits of that for solving real-world big data problems.
Upload Details
Uploaded via SlideShare as Apple Keynote
Usage Rights
© All Rights Reserved
Statistics
- Favorites: 191
- Downloads: 1,851
- Comments: 17
- Embed Views
- Views on SlideShare: 73,893
- Total Views: 75,445
I am writing a report on my work with Hadoop in a high-performance computing environment for a national laboratory, and I was wondering whether I could use some of your slides and photos in it.
If you could get back to me, that would be incredible. Thanks in advance.
Tyler Garvin, 1 year ago
We don't read or write data directly from/to MySQL with Pig. The load/store interfaces aren't quite right yet, although they're getting better (follow along in real time at https://issues.apache.org/jira/browse/PIG-966). Instead, we export in batch from MySQL into HDFS at regular intervals, which is good because non-Pig jobs can then access this data too. We also have a small wrapper around Pig that can optionally push the results of Pig jobs back into databases etc. after the job has run. We plan to open source it once it hardens a bit.
2 years ago
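For readers wondering what such a wrapper might look like, here is a minimal sketch in Python. It is not Twitter's wrapper, which the comment says had not been released. The function names (`build_pig_command`, `read_pig_output`, `push_rows`) are hypothetical, SQLite stands in for MySQL, and the sketch assumes the job's tab-separated `part-*` output files have already been copied from HDFS to a local directory.

```python
import sqlite3
from pathlib import Path


def build_pig_command(script, params):
    """Build the argv for running a Pig script with -param substitutions."""
    cmd = ["pig"]
    for name, value in sorted(params.items()):
        cmd += ["-param", f"{name}={value}"]
    cmd += ["-f", script]
    return cmd


def read_pig_output(output_dir):
    """Yield one tuple per line from the tab-separated part-* files.

    Assumption: output_dir is a local copy of the job's output directory.
    """
    for part in sorted(Path(output_dir).glob("part-*")):
        with part.open() as handle:
            for line in handle:
                line = line.rstrip("\n")
                if line:
                    yield tuple(line.split("\t"))


def push_rows(conn, table, columns, rows):
    """Insert rows into an existing table and return how many were inserted.

    Table and column names are interpolated directly, so they must come
    from trusted configuration, never from user input.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = list(rows)
    conn.executemany(sql, rows)
    conn.commit()
    return len(rows)
```

A caller would pass the result of `build_pig_command` to `subprocess.run`, fetch the output directory from HDFS, and then call `push_rows(conn, "counts", ["user", "n"], read_pig_output(local_dir))`. Keeping the database push as a separate, optional step mirrors the comment's point that the results stay in HDFS where other jobs can read them.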