Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility


Published on

This presentation was delivered by Cloudera systems engineer Matt Harris at the Chicago Cloudera User Group (CUG) meeting on December 3, 2013.

Published in: Technology
  • Be the first to comment

Cloudera User Group Chicago - Cloudera Manager: APIs & Extensibility

  1. 1. Cloudera  Manager  –  API’s  &   Extensibility     Ma#  Harris  –  Systems  Engineer   December  2013     1 CONFIDENTIAL  -­‐  RESTRICTED  
  2. 2. Cloudera  Manager   End-­‐to-­‐End  AdministraGon  for  CDH   1 Monitor   2 Diagnose   3 Integrate   4 Manage   Easily  deploy,  configure  &  opGmize  clusters   Maintain  a  central  view  of  all  acGvity   Easily  idenGfy  and  resolve  issues   Use  Cloudera  Manager  with  exisGng  tools   2 ©2013  Cloudera,  Inc.  All  Rights  Reserved.  
  3. 3. IntegraGng  with  your  IT  Mgmt  tools   Datacenter  Opera*ons   Various  op*ons  of  integra*ng  Cloudera  Manager  into  your  exis*ng   Installa;on,   Datacenter  Opera*ons/Tools  Monitoring   Deployment   Aler;ng   Tools   tools   Tools   e.g.  Orion,     •  Cloudera  Manager  API   e.g.  Chef,   e.g  Nagios,   Tivoli,  BMC   Puppet  etc.   SNMP  etc.   •  Introduced  in  CM4  (June  2etc.   012)     •  Installa*on  &  deployment   •  Monitoring   •  SNMP  Alerts   •  Introduced  in  CM4.5  (Feb  2013)   •  Hadoop  Opera*ons   And  more…   Cloudera   •  Monitoring  ‘tsquery’  (Feb  2013)   Manager   •  User-­‐defined  triggers/alarms  (new  for  C5!)   •  Service  extensibility  (new  for  C5!)   3 ©2013  Cloudera,  Inc.  All  Rights  Reserved.  
  4. 4. Cloudera  Manager  (CM)  API     •  •      4 API  access  was  a  new  feature  introduced  in  Cloudera  Manager  4.0,  providing  programmaGc  access  to   cluster  operaGons  (such  as  configuraGon  and  restart)  and  monitoring  informaGon  (such  as  health  and   metrics).     The  CM  API  is  an  HTTP  REST  API,  using  JSON  serializaGon.  The  API  is  served  on  the  same  host  and  port  as   the  CM  web  UI,  and  does  not  require  an  extra  process  or  extra  configuraGon.  API  users  have  the  same   privileges  as  they  do  in  the  web  UI  world.   •  Docs  &  Examples   h#p://   h#ps://   •  Java/Python  clients   h#p://­‐to-­‐ automate-­‐your-­‐hadoop-­‐cluster-­‐from-­‐java/       ©2013Cloudera,  Inc.  All  Rights  Reserved.  
  5. 5. Examples  of  integraGon  with  CM  API   •  Installa;on  &  Deployment   •  •  •  Chef   Puppet   Dell  Crowbar   •  •  h#p://­‐to-­‐deploy-­‐hadoop-­‐clusters-­‐automaGcally-­‐with-­‐dell-­‐crowbar-­‐and-­‐cloudera-­‐manager/ StackIQ   •    h#p://­‐Cluster-­‐Manager-­‐now-­‐integrated-­‐with-­‐Cloudera   •  •  •  WANdisco  –  non-­‐stop  NN  setup   Several  other  customers/partners  leveraging  the  API’s  as  part  of    their  install  &  deployment     process   Monitoring  &  Aler;ng   •  •  Oracle  Enterprise  Manager  (via  Big  Data  Appliance)   Nagios   •  •  •  h#ps://   h#ps://­‐plugins/blob/master/   SNMP  alerts  integraGon  with  IBM  Netcool   Develop  &  Contribute  your  plug-­‐in’s  using  Cloudera   Manager  API     5 ©2013  Cloudera,  Inc.  All  Rights  Reserved.  
  6. 6. Cloudera  Manager  –  Monitoring  via  ‘tsquery’   •  Introduced  as  part  of  CM4.5    release  (Feb  2013)   •  Great  way  to  add  interesGng    charts  (above  &  beyond  what  is  provided  by  default)    and  monitor   metrics  that  are  relevant  to  your  clusters   •  •  The  tsquery  language   s  used  to    Manager  Gme-­‐series  diata  store   specify  statements  for  retrieving  Gme-­‐series  data  from  the  Cloudera     Example:  How  do  I  compare  all  disk  IO  for  all  the  DataNodes  that  belong  to  a  specific  HDFS  service?   select  bytes_read,  bytes_wriZen  where  roleType=DATANODE  and  serviceName=hdfs1   •  Retrieved  Gme-­‐series  data  can  be  plo#ed  via  various  opGons  –  line,  bar,  sca#er,    heat  maps,  table  list   etc.   •  Extending  this  concept  to  create  user-­‐defined  triggers/alarms  (new  for  C5!).     More  details   •  h#p://­‐content/cloudera-­‐docs/CM5/latest/Cloudera-­‐ Manager-­‐DiagnosGcs-­‐Guide/cm5dg_chart_Gme_series_data.html   •  6 ©2013  Cloudera,  Inc.  All  Rights  Reserved.  
  7. 7. Examples  of  Cloudera  Manager  ‘tsquery’   Example1:  How  do  I  track  the     aggregate  Cluster  Disk  IO?     select  dt0(read_bytes_disk_sum),   dt0(write_bytes_disk_sum)  where   category  =  CLUSTER  and  clusterId  =   $CLUSTERID   Example2:  How  do  I  compare  CPU   usage  across  hosts?   select  dt0(total_cpu_user)  /  getHostFact(numCores,  1)  *  100,   dt0(total_cpu_system)  /  getHostFact(numCores,  1)  *  100,   dt0(total_cpu_nice)  /  getHostFact(numCores,  1)  *  100,   dt0(total_cpu_iowait)  /  getHostFact(numCores,  1)  *  100,   dt0(total_cpu_irq)  /  getHostFact(numCores,  1)  *  100,   dt0(total_cpu_so`_irq)  /  getHostFact(numCores,  1)  *  100   Create  &  Contribute  your  ‘tsqueries’!   h#ps://   7 ©2013  Cloudera,  Inc.  All  Rights  Reserved.  
  8. 8. Cloudera  Manager  –  Service  Extensibility   •  Introduced  in  C5   •  SGll  in  Beta!   •  Some  aspects  (espcially  Parcel  mgmt)  available  in  CM4.x   •  Example:  CollaboraGon  with  Syncsort  to  deploy  DMX-­‐h  libraries   •  Single  management  console  for  CDH,  non-­‐CDH  services  and  ISV   applicaGons   •  Similar  look  and  feel  as  exisGng  services   •  Easy  to  write  (Java-­‐free!)   •  Flexible   •  Independent  release  cycle   ©2013Cloudera, Inc. All Rights Reserved.
  9. 9. Analogy  from  OperaGng  Systems  (OS)  world       ISV’s  view  of  OS     Systems  Management   Package   Mgmt   Process/   Resource   Mgmt   Security   Mgmt   Core  OS  kernel   9 ©2013Cloudera,  Inc.  All  Rights  Reserved.   Data   Access   Mgmt  
  10. 10. Bringing  ISV  Apps  to  CDH       ISV’s  view  of  Hadoop     Cloudera  Manager   Parcels   Resource     Mgmt   Security   Mgmt   CDK  API’s   Core  Hadoop/CDH  kernel   10 ©2013Cloudera,  Inc.  All  Rights  Reserved.  
  11. 11. IntegraGng  into  the  Cloudera  Product  Porpolio   Features   Examples   -­‐  Ability  to  easily  package  and  distribute  binaries/ jars  via  “Parcels”   -­‐ InformaGca   -­‐ Syncsort   Resource   Mgmt   -­‐  Ability  to  deploy  applicaGons  as  stand-­‐alone   processes    or  via  YARN*  on  the  Hadoop  grid   -­‐  Resource  isolaGon  of  cluster  resources       -­‐ SAS   -­‐ 0xData   -­‐ Accumulo   Security   Mgmt   Cloudera  Manager   Descrip;on   Package   Mgmt   ISV’s   -­‐  Support  for  Kerberos  Mgmt   -­‐  Role  bases  access  control  for  Tables/Views  in   Hive/Impala  via  Sentry   Data  Access   Mgmt   -­‐  HDFS  and  HBase  API  abstracGon  and   simplificaGon   Systems  Mgmt   Manage   Monitor   Non-­‐CDH  Apps…   Accumulo,   Spark,  Giraph   etc.   -­‐ Deploy  and  upgrade  (rolling)  services  and  pkgs   -­‐ Manage  configuraGons   -­‐ ProacGve  health  checks   -­‐ Track  resource  uGlizaGon     -­‐ Custom  metrics  charts   Diagnose   -­‐ Distributed  log  collecGon  and  searching   -­‐ Tag  and  track  key  events   Integrate   -­‐ Access  operaGonal  tools  via  API   -­‐ Surface  overall  cluster  metrics  to  ISV  dashboard     *  Support  for  YARN  planned  as  part  of  CM5.x  in  FY14   11 ©2013Cloudera,  Inc.  All  Rights  Reserved.  
  12. 12. So..  How  does  it  work?     •  A  JSON  file  that  describes  your  service   •  Set  of  control  scripts   •  Packaged  as  a  JAR  file   •  As  promised,  Java-­‐free   ©2013Cloudera, Inc. All Rights Reserved.
  13. 13. Example:  Cloudera  Manager  Extensions  -­‐  Spark       ©2013Cloudera, Inc. All Rights Reserved.
  14. 14.   Cloudera  Manager  Extensions     ©2013Cloudera, Inc. All Rights Reserved.
  15. 15. Cloudera  Manager  Extensions:  Spark   ©2013Cloudera, Inc. All Rights Reserved.
  16. 16. Cloudera  Manager  Extensions:  Spark   ©2013Cloudera, Inc. All Rights Reserved.
  17. 17. Cloudera  Manager  Extensions:  Spark   ©2013Cloudera, Inc. All Rights Reserved.
  18. 18. The  Code       #!/bin/bash   name  :  “spark”,   roles  :  [{            name  :  "master",   CMD=$1   MASTER_PORT=<read  in  from  ./params.proper;es>            startRunner  :  {                    program  :  "scripts/",                    args  :  [    "start_master",     case  $CMD  in      (start_master)      exec  $SPARK_HOME/scripts/spark-­‐  master"                                                  "./params.proper;es"]                },          ;;                parameters  :  [{        (*)          echo  "$;mestamp  Don't  understand  [$CMD]"                      name  :  "master_port",                      type  :  "port",          ;;   esac                      default  :  7077                  }],                configWriter  :  {                      generators  :  [{                            filename  :  "params.proper;es"                      }]   }]   ©2013Cloudera, Inc. All Rights Reserved.
  19. 19. Next  Steps   •  DocumentaGon  &  SDK  as  part  of  C5  Beta2   or  later  (definitely  before  GA!)   •  Working  with  select  ISV’s  (SAS,  Syncsort,   0xData  etc.)  as  part  of  Beta  to  further  fine-­‐ tune  this  feature     Develop  &  Contribute  your    Cloudera  Manager  service   extensibility  plug-­‐in’s  !   ©2013Cloudera, Inc. All Rights Reserved.
  20. 20. Horizontal Extension Security ISV’s 0xData SAS Syncsort Informatica Revolution API Ops Apps Capacity Mgr Service Extensibility Vertical Extension Vision  of  CM  Extensibility   SLA Mgr Cost Optimizer CDH CM SNMP API Oracle OEM 20 Nagios Dell Chef/ Puppet ©2012Cloudera,  Inc.  All  Rights  Reserved.   Accumulo Spark Giraph
  21. 21. Q&A   ©2013Cloudera, Inc. All Rights Reserved.