Karmasphere hadoop-productivity-tools
Transcript of "Karmasphere hadoop-productivity-tools"

  1. This slide intentionally left blank.
  2. State-of-the-Art Productivity Tools for Developers & Analysts
     Shevek
  3. About Karmasphere
     ● Productivity suite for Developers and Analysts.
     ● Point-and-drool GUI for Hadoop, Hive, Cascading, Pig.
     ● MapReduce development and debugging on-cluster.
     ● Integrated with Eclipse and NetBeans IDEs.
     ● Interface between a human (you!) and a Hadoop cluster.
     ● Does the boring, tedious or repetitive bits.
     ● Finds the errors fast before you do.
     ● Works anywhere with anything.
     HALP! Karmasphere Hockey sticks!
  4. The Idea
     ● Collect Underpants
     ● ....?
     ● Profit
     But what goes in the middle?
  5. The Problem
     ● Collect Data
     ● Convert to MapReduce
     ● Execute
     ● Debug
     ● Tune
     ● … Profit
     Get someone else to do it!
  6. How long will it take?
     ● Performance
     Of what? Surely not the computer.
  7. Computational Performance
     Time (faster considered better)
     Make this algorithm as fast as you can.
  8. Analytics Performance
     But what about this bit? Or this bit?
     Analytics is slightly different.
  9. Analytics Performance
     But what about this bit? Or this bit?
     That the human understands the problem does not mean that the computer understands the problem.
  10. Analytics Performance
      But what about this bit? Or this bit?
      The computer knowing the answer is not the same as the human understanding the answer.
  11. Common MapReduce Challenges
      ● How do I write a Hadoop job?
      ● Did my job work?
      ● If it didn't throw an exception, it worked. Right?
      ● Did I get the correct answer?
      ● Are you sure?
      ● Do you have enough information to prove that?
      ● … to your accountants or customers?
      ● What happened? or What do I need to know?
      ● Please note, this feature is now officially called the “Job Profiler”, not the “What?! Window.”
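The "no exception means it worked" trap on this slide can be made concrete with a small sketch. This is plain Python, not Hadoop or Karmasphere code: `run_job`, the tab-separated input format and the counter names are all hypothetical, chosen only to illustrate the point. The idea is that a job which quietly skips bad records finishes cleanly, so you need counts of what was read, kept and dropped before you can argue that the answer is correct.

```python
from collections import Counter

def run_job(lines):
    """Toy job: sum counts per query from 'query<TAB>count' lines.

    Malformed lines are skipped rather than raising, so the job always
    "succeeds"; the counters are the only evidence of what really happened.
    """
    counters = Counter()   # hypothetical counter names, for illustration only
    totals = Counter()
    for line in lines:
        counters["records_read"] += 1
        parts = line.rstrip("\n").split("\t")
        try:
            query, count = parts[0], int(parts[1])
        except (IndexError, ValueError):
            # Missing field or non-numeric count: dropped silently.
            counters["records_malformed"] += 1
            continue
        totals[query] += count
        counters["records_kept"] += 1
    return dict(totals), dict(counters)

sample_lines = ["hadoop\t3", "pig\tone", "hive", "hadoop\t2"]
totals, counters = run_job(sample_lines)
```

Here the job completes without an exception, yet half of the input never reached the totals. Comparing `records_read` with `records_kept` is the kind of information you would need to prove the result to anyone else.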
  12. Karmasphere Studio
  13. Karmasphere Studio
  14. Common Analytical Tasks
      So common, in fact, that ...
      group, sort, aggregate, intersection, unique, limit, scan, join, function, hash, materialize, condition, set operations, store, cat, index
  15. High Level Languages
      Hive, Pig, Cascading
  16. Cascading
      A workflow based language
      Perfect for dylsexics like me.
  17. Pig
      An imperative scripting language

          data =
              LOAD '$input'
              AS (query:CHARARRAY,
                  count:INT);
          queries_group =
              GROUP data
              BY query
              PARALLEL $reducers;
          queries_sum =
              FOREACH queries_group
              GENERATE
                  group AS query,
                  SUM(data.count) AS count;
          queries_ordered =
              ORDER queries_sum
              BY count DESC
              PARALLEL $reducers;

      Simple and accessible to all.
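For readers who do not know Pig, the script on this slide groups rows by query, sums the counts in each group and orders the totals from largest to smallest. A minimal single-machine sketch of the same computation in Python follows. It is not what Pig executes on a cluster; the function name `top_queries` and the sample rows are assumptions made up for illustration.

```python
from collections import defaultdict

def top_queries(rows):
    """Sum counts per query and order by total, largest first.

    rows: iterable of (query, count) pairs, mirroring the
    (query:CHARARRAY, count:INT) schema in the Pig script.
    """
    totals = defaultdict(int)
    for query, count in rows:      # GROUP data BY query
        totals[query] += count     # SUM(data.count)
    # ORDER ... BY count DESC. Ties are broken by query name here so the
    # output is deterministic; that tie-break is an addition of this sketch.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample data.
sample_rows = [("hadoop", 3), ("pig", 1), ("hadoop", 2), ("hive", 4), ("pig", 1)]
result = top_queries(sample_rows)
# result == [("hadoop", 5), ("hive", 4), ("pig", 2)]
```

The `PARALLEL $reducers` clauses have no counterpart in this sketch, because they only control how many reducers Pig uses for the grouping and ordering steps.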
  18. Hive
      An SQL-like language

          FROM (
            FROM (
              FROM src src1
              SELECT src1.key AS c1, src1.value AS c2
              WHERE src1.key > 10 AND src1.key < 20
            ) a
            FULL OUTER JOIN (
              FROM src src2
              SELECT src2.key AS c3, src2.value AS c4
              WHERE src2.key > 15 AND src2.key < 25
            ) b
            ON (a.c1 = b.c3)
            SELECT a.c1 AS c1, a.c2 AS c2, b.c3 AS c3, b.c4 AS c4
          ) c
          SELECT c.c1, c.c2, c.c3, c.c4

      I can parse that in my head, honest.
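If you cannot parse the query on this slide in your head, it may help to see what it computes: two overlapping key ranges taken from the same table, joined with a full outer join so that unmatched rows from either side survive with NULLs on the other side. The sketch below reproduces that logic in Python under some assumptions: `src` is an in-memory list of `(key, value)` pairs, keys are integers, and `None` stands in for NULL. The function name and sample data are hypothetical, and Hive makes no promise about row order, whereas this sketch emits rows in a fixed order.

```python
from collections import defaultdict

def full_outer_join_ranges(src):
    """Return (c1, c2, c3, c4) rows as in the Hive query on the slide."""
    a = [(k, v) for k, v in src if 10 < k < 20]   # subquery a: 10 < key < 20
    b = [(k, v) for k, v in src if 15 < k < 25]   # subquery b: 15 < key < 25

    # Index side b by key so each row of a can find its matches.
    b_by_key = defaultdict(list)
    for k, v in b:
        b_by_key[k].append(v)

    rows = []
    matched_keys = set()
    for k, v in a:
        if k in b_by_key:
            matched_keys.add(k)
            for b_value in b_by_key[k]:
                rows.append((k, v, k, b_value))   # a.c1 = b.c3
        else:
            rows.append((k, v, None, None))       # a row with no partner in b
    for k, v in b:
        if k not in matched_keys:
            rows.append((None, None, k, v))       # b row with no partner in a
    return rows

# Hypothetical sample table.
sample_src = [(12, "val_12"), (17, "val_17"), (22, "val_22"), (30, "val_30")]
joined = full_outer_join_ranges(sample_src)
```

With this sample, key 12 appears only on the left, key 17 on both sides, key 22 only on the right, and key 30 is filtered out by both subqueries.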
  19. Karmasphere Analyst

          FROM (
            FROM src SELECT src.key, src.value WHERE src.key < 100
            UNION ALL
            FROM src SELECT src.* WHERE src.key > 100
          ) unioninput
          INSERT OVERWRITE DIRECTORY 'union.out'
          SELECT unioninput.*
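One detail of the query on this slide is easy to miss when reading it as text: the two branches select keys below 100 and keys above 100, so a row whose key is exactly 100 is in neither branch. A small Python sketch makes that visible. It assumes `src` is a list of `(key, value)` pairs with integer keys and that `src.*` expands to those same two columns; the function name and sample data are hypothetical, and writing the result to `'union.out'` is left out.

```python
def union_all_split(src):
    """Mimic the UNION ALL of the two filtered scans of src."""
    below = [(k, v) for k, v in src if k < 100]   # first branch
    above = [(k, v) for k, v in src if k > 100]   # second branch
    # UNION ALL concatenates the branches and keeps duplicates.
    return below + above

# Hypothetical sample table, including a duplicate row and a key of exactly 100.
sample_src = [(99, "a"), (100, "b"), (101, "c"), (99, "a")]
unioned = union_all_split(sample_src)
# unioned == [(99, "a"), (99, "a"), (101, "c")]
```

The duplicate `(99, "a")` survives because `UNION ALL` does not deduplicate, while `(100, "b")` is dropped by both filters.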
  20. Karmasphere Analyst
  21. Conclusions
      How long does it take to get your answers?
  22. How to get involved
      ● Getting started as a Hadoop Java Developer?
      ● Download Karmasphere Studio FREE!
      ● Deploying Hadoop jobs in production?
      ● Use Karmasphere Studio Professional Edition.
      ● Want to use high level languages like SQL?
      ● Talk to us about Karmasphere Analyst.
      ● Join the beta programme!
  23. Questions, Errata, Heckling
      ● Some questions suggested by others:
      ● Where can I download Karmasphere Studio Community Edition?
        – Visit http://www.karmasphere.com/ for free downloads and great justice.
      ● What about building production-ready jobs for enterprise deployment?
        – Ask us about introductory offers on Karmasphere Studio Professional Edition.
      ● How can I use graphical SQL on Hadoop?
        – Talk to us about the Karmasphere Analyst Sekrit(!) Beta.
      ● Some questions I thought up:
      ● How do I (something awfully complicated)?
        – Please talk to us, we enjoy the challenges.
      ● Is there any tea on this spaceship?
      ● And some from the audience, please!
      ● I get paid by the answer. I need questions.
  24. BAY AREA HADOOP USER GROUP; KARMASPHERE® PRODUCTION
      KARMASPHERE STUDIO
      PRODUCTIVITY SUITE FOR DEVELOPERS AND ANALYSTS
      SHEVEK CTO, KARMASPHERE
      MARTIN HALL CEO, KARMASPHERE
      DARREN ARONOFSKY, CLAUDE BESSON, METALLICA, ENNIO MORRICONE, J K ROWLING, JACQUELINE DURRAN, JIM HENSON, INDUSTRIAL LIGHT AND MAGIC