Karmasphere hadoop-productivity-tools

Published in: Education


  1. This slide intentionally left blank.
  2. State-of-the-Art Productivity Tools for Developers & Analysts
     Shevek
  3. About Karmasphere
     ● Productivity suite for Developers and Analysts.
     ● Point-and-drool GUI for Hadoop, Hive, Cascading, Pig.
     ● MapReduce development and debugging on-cluster.
     ● Integrated with Eclipse and NetBeans IDEs.
     ● Interface between a human (you!) and a Hadoop cluster.
     ● Does the boring, tedious or repetitive bits.
     ● Finds the errors fast before you do.
     ● Works anywhere with anything.
     HALP! Karmasphere Hockey sticks!
  4. The Idea
     ● Collect Underpants
     ● ....?
     ● Profit
     But what goes in the middle?
  5. The Problem
     ● Collect Data
     ● Convert to MapReduce
     ● Execute
     ● Debug
     ● Tune
     ● … Profit
     Get someone else to do it!
  6. How long will it take?
     ● Performance
     Of what? Surely not the computer.
  7. Computational Performance
     Time (faster considered better)
     Make this algorithm as fast as you can.
  8. Analytics Performance
     But what about this bit? Or this bit?
     Analytics is slightly different.
  9. Analytics Performance
     But what about this bit? Or this bit?
     That the human understands the problem does not mean that the computer understands the problem.
  10. Analytics Performance
      But what about this bit? Or this bit?
      The computer knowing the answer is not the same as the human understanding the answer.
  11. Common MapReduce Challenges
      ● How do I write a Hadoop job?
      ● Did my job work?
      ● If it didn't throw an exception, it worked. Right?
      ● Did I get the correct answer?
      ● Are you sure?
      ● Do you have enough information to prove that?
      ● … to your accountants or customers?
      ● What happened? or What do I need to know?
      ● Please note, this feature is now officially called the “Job Profiler”, not the “What?! Window.”
  12. Karmasphere Studio
  13. Karmasphere Studio
  14. Common Analytical Tasks
      So common, in fact, that ...
      group, sort, aggregate, intersection, unique, limit, scan, join, function, hash, materialize, condition, set operations, store, cat, index
  15. High Level Languages
      Hive, Pig, Cascading
  16. Cascading
      A workflow-based language
      Perfect for dylsexics like me.
  17. Pig
      An imperative scripting language

          data =
              LOAD '$input'
              AS (query:CHARARRAY,
                  count:INT);
          queries_group =
              GROUP data
              BY query
              PARALLEL $reducers;
          queries_sum =
              FOREACH queries_group
              GENERATE
                  group AS query,
                  SUM(data.count) AS count;
          queries_ordered =
              ORDER queries_sum
              BY count DESC
              PARALLEL $reducers;

      Simple and accessible to all.
  18. Hive
      An SQL-like language

          FROM (
            FROM (
              FROM src src1
              SELECT src1.key AS c1, src1.value AS c2
              WHERE src1.key > 10 AND src1.key < 20
            ) a
            FULL OUTER JOIN (
              FROM src src2
              SELECT src2.key AS c3, src2.value AS c4
              WHERE src2.key > 15 AND src2.key < 25
            ) b
            ON (a.c1 = b.c3)
            SELECT a.c1 AS c1, a.c2 AS c2, b.c3 AS c3, b.c4 AS c4
          ) c
          SELECT c.c1, c.c2, c.c3, c.c4

      I can parse that in my head, honest.
  19. Karmasphere Analyst

          FROM (
            FROM src SELECT src.key, src.value WHERE src.key < 100
            UNION ALL
            FROM src SELECT src.* WHERE src.key > 100
          ) unioninput
          INSERT OVERWRITE DIRECTORY 'union.out'
          SELECT unioninput.*
  20. Karmasphere Analyst
  21. Conclusions
      How long does it take to get your answers?
  22. How to get involved
      ● Getting started as a Hadoop Java Developer?
      ● Download Karmasphere Studio FREE!
      ● Deploying Hadoop jobs in production?
      ● Use Karmasphere Studio Professional Edition.
      ● Want to use high level languages like SQL?
      ● Talk to us about Karmasphere Analyst.
      ● Join the beta programme!
  23. Questions, Errata, Heckling
      ● Some questions suggested by others:
      ● Where can I download Karmasphere Studio Community Edition?
        – Visit http://www.karmasphere.com/ for free downloads and great justice.
      ● What about building production-ready jobs for enterprise deployment?
        – Ask us about introductory offers on Karmasphere Studio Professional Edition.
      ● How can I use graphical SQL on Hadoop?
        – Talk to us about the Karmasphere Analyst Sekrit(!) Beta.
      ● Some questions I thought up:
      ● How do I (something awfully complicated)?
        – Please talk to us, we enjoy the challenges.
      ● Is there any tea on this spaceship?
      ● And some from the audience, please!
      ● I get paid by the answer. I need questions.
  24. Bay Area Hadoop User Group; Karmasphere® Production
      Karmasphere Studio: Productivity Suite for Developers and Analysts
      Shevek, CTO, Karmasphere
      Martin Hall, CEO, Karmasphere
      Darren Aronofsky, Claude Besson, Metallica, Ennio Morricone, J. K. Rowling, Jacqueline Durran, Jim Henson, Industrial Light and Magic
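
Slide 11 asks how to write a Hadoop job and how to tell whether it produced the right answer. As a rough illustration only, here is a plain-Python sketch of the map, shuffle and reduce phases with a simple cross-check on the result. It does not use Hadoop at all; the function names, the word-count task and the sample lines are assumptions made up for this example.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs

def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups, reducer):
    """Apply the reducer once per key."""
    return {key: reducer(key, values) for key, values in groups.items()}

# Hypothetical word-count job used only for illustration.
def word_mapper(line):
    return [(word, 1) for word in line.split()]

def sum_reducer(key, values):
    return sum(values)

lines = ["hadoop hive pig", "hive pig", "pig"]
pairs = map_phase(lines, word_mapper)
counts = reduce_phase(shuffle(pairs), sum_reducer)

# "Did I get the correct answer?": the totals must account for every emitted pair.
assert sum(counts.values()) == len(pairs)
print(counts)  # {'hadoop': 1, 'hive': 2, 'pig': 3}
```

The final assertion is the kind of evidence the slide is asking about: a job that finishes without an exception has not shown that its output is correct.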
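
The Pig script on slide 17 groups rows by query, sums the counts and orders the totals in descending order. A minimal plain-Python sketch of the same steps follows, assuming the input has already been parsed into (query, count) pairs. The function name `top_queries` and the sample rows are hypothetical, and `PARALLEL $reducers` has no counterpart in a single process.

```python
from collections import defaultdict

def top_queries(rows):
    """rows: iterable of (query, count) pairs, like the relation loaded as `data`."""
    totals = defaultdict(int)
    for query, count in rows:   # GROUP data BY query
        totals[query] += count  # SUM(data.count)
    # ORDER queries_sum BY count DESC
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Made-up sample rows for illustration.
sample = [("hadoop", 3), ("pig", 1), ("hadoop", 2), ("hive", 4)]
queries_ordered = top_queries(sample)
print(queries_ordered)  # [('hadoop', 5), ('hive', 4), ('pig', 1)]
```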
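
The Hive query on slide 18 is easier to follow step by step: filter `src` twice (keys between 10 and 20, keys between 15 and 25), then full-outer-join the two results on the key. The sketch below mimics that in plain Python under the simplifying assumption that keys in `src` are unique; the dictionary standing in for `src` and its values are invented for illustration. Rows are sorted only to make the output readable, since the query itself specifies no ordering.

```python
def full_outer_join(src):
    """src: dict of integer key -> value, a hypothetical stand-in for table `src`."""
    a = {k: v for k, v in src.items() if 10 < k < 20}  # subquery a (c1, c2)
    b = {k: v for k, v in src.items() if 15 < k < 25}  # subquery b (c3, c4)
    rows = []
    for k in sorted(a.keys() | b.keys()):
        # A side with no match contributes NULLs, shown here as None.
        c1, c2 = (k, a[k]) if k in a else (None, None)
        c3, c4 = (k, b[k]) if k in b else (None, None)
        rows.append((c1, c2, c3, c4))
    return rows

src = {k: f"val_{k}" for k in (9, 12, 17, 18, 22, 30)}
joined = full_outer_join(src)
for row in joined:
    print(row)
# (12, 'val_12', None, None)
# (17, 'val_17', 17, 'val_17')
# (18, 'val_18', 18, 'val_18')
# (None, None, 22, 'val_22')
```

Only keys 16 to 19 can match on both sides, so every other surviving row carries NULLs for one half.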
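
The union query on slide 19 has two details worth spelling out: neither branch matches a key of exactly 100, and `UNION ALL` keeps duplicate rows. A small plain-Python sketch of that behaviour, using made-up rows and a hypothetical `union_all` helper:

```python
def union_all(rows):
    """rows: list of (key, value) pairs standing in for table `src`."""
    below = [(k, v) for k, v in rows if k < 100]  # first branch
    above = [(k, v) for k, v in rows if k > 100]  # second branch
    return below + above                          # UNION ALL: no de-duplication

rows = [(99, "a"), (100, "b"), (101, "c"), (99, "a")]
unioninput = union_all(rows)
print(unioninput)  # [(99, 'a'), (99, 'a'), (101, 'c')]
```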
