HBase and Impala Notes - Munich HUG - 20131017


Talk given during the Munich HUG meetup, 10/17/2013, about how HBase and Impala work together and caveats to watch out for.

  1. HBase and Impala: Use Cases for fast SQL queries
  2. About Me
     • EMEA Chief Architect @ Cloudera (3+ years)
     • Consulting on Hadoop projects (everywhere)
     • Apache Committer: HBase and Whirr
     • O'Reilly Author: HBase – The Definitive Guide
       • Now in Japanese! 日本語版も出ました!
     • Contact: lars@cloudera.com, @larsgeorge
  3. Agenda
     • "Introduction" to HBase
     • Impala Architecture
     • Mapping Schemas
     • Query Considerations
  4. Intro To HBase (Slides 4 to 250)
  5. What is HBase? This is HBase!
  6. What is HBase? This is HBase! Really though… RTFM!
     (there are at least two good books about it)
  7. IOPS vs Throughput Mythbusters
     It is all physics in the end: you cannot solve an I/O problem without
     reducing I/O in general. Parallelize access and read/write sequentially.
  8. HBase: Strengths & Weaknesses
     Strengths:
     • Random access to small(ish) key-value pairs
     • Rows and columns stored sorted lexicographically
     • Adds table and region concepts to group related KVs
     • Stores and reads data sequentially
     • Parallelizes across all clients
     • Non-blocking I/O throughout
  9. Using HBase Strengths
 10. HBase "Indexes"
     • Use primary keys, aka the row keys, as a sorted index
       • One sort direction only
       • Use a "secondary index" to get reverse sorting
         • Lookup table or same table
     • Use secondary keys, aka the column qualifiers, as a sorted index
       within the main record
       • Use prefixes within a column family or separate column families
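The "reverse sorting" point can be sketched with a common row-key trick (a hypothetical helper, not from the slides): since HBase sorts row keys lexicographically in one direction only, storing an inverted timestamp makes the newest entries sort first.

```python
import struct

MAX_LONG = 2**63 - 1

def reverse_ts_key(entity_id: str, ts: int) -> bytes:
    """Build a row key where newer timestamps sort first.

    HBase sorts row keys in one direction only, so we append
    (MAX_LONG - ts) as a fixed-width big-endian long: larger
    original timestamps become smaller stored values and therefore
    sort earlier in a scan."""
    return entity_id.encode() + struct.pack(">q", MAX_LONG - ts)

keys = sorted(reverse_ts_key("user1234", ts) for ts in (100, 200, 300))
# The key built from ts=300 now sorts first.
```

The fixed-width big-endian encoding matters: variable-width or little-endian encodings would break the lexicographic ordering the trick relies on.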
 11. HBase: Strengths & Weaknesses
     Weaknesses:
     • Not optimized (yet) for 100% of the possible throughput of the
       underlying storage layer
       • And HDFS is not fully optimized either
     • Single-writer issue with WALs
     • Single-server hot-spotting with non-distributed keys
 12. HBase Dilemma
     Although HBase can host many applications, they may require completely
     opposite features: Events / Time Series vs. Entities / Message Store
 13. Opposite Use-Cases
     • Entity Store
       • Regular (random) updates and inserts in existing entities
       • Causes entity details to be spread over many files
       • Needs to read a lot of data to reconstitute the "logical" view
       • Writing is often nicely distributed (can be hashed)
     • Event Store
       • One-off inserts of events such as log entries
       • Access is often a scan over partitions by time
       • Reads are efficient due to the sequential write pattern
       • Writes need to be taken care of to avoid hotspotting
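The "writes can be hashed" and "avoid hotspotting" points can be illustrated with key salting (a sketch under assumed bucket count, not from the slides): prefixing monotonically increasing keys with a hash-derived salt spreads writes across several key ranges instead of one hot region.

```python
import hashlib

NUM_BUCKETS = 16  # assumed salt-bucket count; in practice tune to region count

def salted_key(key: str) -> str:
    """Prefix a key with a stable hash-derived salt so that
    monotonically increasing keys (e.g. timestamp-based event keys)
    are spread across NUM_BUCKETS contiguous key ranges instead of
    all landing on a single "hot" region server."""
    bucket = int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_BUCKETS
    return f"{bucket:02d}-{key}"

print(salted_key("event-000001"))
```

The trade-off: reads for a time range must now issue one scan per bucket, which is why the slides stress that entity and event workloads pull the schema in opposite directions.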
 14. Impala Architecture
 15. Beyond Batch
     For some things MapReduce is just too slow.
     Apache Hive:
     • MapReduce execution engine
     • High latency, low throughput
     • High runtime overhead
     Google realized this early on:
     • Analysts wanted fast, interactive results
 16. Dremel
     Google paper (2010): "scalable, interactive ad-hoc query system for
     analysis of read-only nested data"
     • Columnar storage format
     • Distributed scalable aggregation
     "capable of running aggregation queries over trillion-row tables
     in seconds"
     http://research.google.com/pubs/pub36632.html
 17. Impala: Goals
     • General-purpose SQL query engine for Hadoop
     • For analytical and transactional workloads
     • Support queries that take ms to hours
     • Run directly with Hadoop
       • Collocated daemons
       • Same file formats
       • Same storage managers (NN, metastore)
 18. Impala: Goals
     • High performance
       • C++
       • Runtime code generation (LLVM)
       • Direct access to data (no MapReduce)
     • Retain user experience
       • Easy for Hive users to migrate
     • 100% open source
 19. Impala: Architecture
     • impalad
       • Runs on every node
       • Handles client requests (ODBC, Thrift)
       • Handles query planning & execution
     • statestored
       • Provides name service
       • Metadata distribution
       • Used for finding data
 20. Impala: Architecture
 21. Impala: Architecture
 22. Impala: Architecture
 23. Impala: Architecture
 24. Mapping Schemas: HBase to Typed Schema
 25. Binary to Types
     • HBase only has binary keys and values
     • Hive and Impala share the same metastore, which adds types to each
       column
       • Can use the Hive or Impala shell to change metadata
     • The row key of an HBase table is mapped to a column in the metastore,
       i.e. on the SQL side
       • Impala prefers the "string" type to better support comparisons
         and sorting
 26. Defining the Schema

     CREATE TABLE hbase_table_1(key string, value string)
     STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
     WITH SERDEPROPERTIES(
       "hbase.columns.mapping" = ":key,cf1:val"
     )
     TBLPROPERTIES("hbase.table.name" = "xyz");
 27. Defining the Schema
     The "hbase.columns.mapping" SERDE property maps HBase columns to
     SQL fields:

     CREATE TABLE hbase_table_1(key string, value string)
     STORED BY "org.apache.hadoop.hive.hbase.HBaseStorageHandler"
     WITH SERDEPROPERTIES(
       "hbase.columns.mapping" = ":key,cf1:val"
     )
     TBLPROPERTIES("hbase.table.name" = "xyz");
 28. Mapping Options
     • Can create a new table or map to an existing one
       • CREATE TABLE vs. CREATE EXTERNAL TABLE
     • Creating a table through Hive or Impala does not set any table or
       column family properties
       • Typically not a good idea to rely on defaults
       • Better to specify compression, TTLs, etc. on the HBase side and
         then map it as an external table
 29. Mapping Options
     SERDE properties to map columns to fields:
     • hbase.columns.mapping
       • A matching count of entries is required (on the SQL side only)
       • Spaces are not allowed (as they are valid characters in HBase)
       • The ":key" mapping is a special one for the HBase row key
       • Otherwise: column-family-name:[column-name][#(binary|string)]
     • hbase.table.default.storage.type
       • Can be string (the default) or binary
       • Defines the default type
       • Binary means data is treated like the HBase Bytes class does
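The mapping-string rules above can be made concrete with a small illustrative validator (a hypothetical helper in Python, not Impala's actual parser): one entry per SQL column, no spaces, and exactly one ":key" entry.

```python
def parse_columns_mapping(mapping: str, sql_columns: list[str]) -> dict[str, str]:
    """Illustrative check of the hbase.columns.mapping rules
    (hypothetical helper, not Impala code): the comma-separated
    entries must match the SQL column count, contain no spaces,
    and include exactly one ":key" entry for the row key."""
    entries = mapping.split(",")
    if len(entries) != len(sql_columns):
        raise ValueError("entry count must match the SQL column count")
    if any(" " in e for e in entries):
        raise ValueError("spaces are not allowed in the mapping")
    if sum(e.split("#")[0] == ":key" for e in entries) != 1:
        raise ValueError('exactly one ":key" entry is required')
    return dict(zip(sql_columns, entries))

print(parse_columns_mapping(":key,cf1:val", ["key", "value"]))
# {'key': ':key', 'value': 'cf1:val'}
```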
 30. Mapping Limits
     • Only one (1) ":key" is allowed
       • But it can be inserted in the SQL schema at will
     • Access to HBase KV versions is not supported (yet)
       • Always returns the latest version by default
       • This is very similar to what a database user expects
     • HBase columns not mapped are not visible on the SQL side
     • Since row keys in HBase are unique, results may vary
       • Inserting duplicate keys updates the row while the count of rows
         stays the same
       • INSERT OVERWRITE does not delete existing rows but rather updates
         them (HBase is mutable after all!)
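The duplicate-key behavior above is essentially map semantics, which a plain-Python analogy (not actual Impala/HBase code) makes obvious: writing the same row key again replaces the row, and the row count is unchanged.

```python
# Analogy only: an HBase-backed table behaves like a key-value map,
# not an append-only table.
table = {}  # row key -> row data

def upsert(key, value):
    """Writing an existing key updates the row in place; the
    number of rows stays the same (HBase is mutable)."""
    table[key] = value

upsert("98", "val_98")
upsert("98", "val_98_updated")  # same key again: update, not append
assert len(table) == 1 and table["98"] == "val_98_updated"
```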
 31. Query Considerations
 32. HBase Table Scan

     $ hbase shell
     hbase(main):001:0> list
     xyz
     1 row(s) in 0.0530 seconds

     hbase(main):002:0> describe "xyz"
     DESCRIPTION                                                    ENABLED
     {NAME => 'xyz', FAMILIES => [{NAME => 'cf1', COMPRESSION =>    true
     'NONE', VERSIONS => '3', TTL => '2147483647', BLOCKSIZE =>
     '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}]}
     1 row(s) in 0.0220 seconds

     hbase(main):003:0> scan "xyz"
     ROW        COLUMN+CELL
     0 row(s) in 0.0060 seconds

     The table was created but is still empty.
 33. HBase Table Scan
     Insert data from an existing table into the HBase-backed one:

     INSERT OVERWRITE TABLE hbase_table_1
     SELECT * FROM pokes WHERE foo=98;

     Verify on the HBase side:

     hbase(main):009:0> scan "xyz"
     ROW        COLUMN+CELL
     98         column=cf1:val, timestamp=1267737987733, value=val_98
     1 row(s) in 0.0110 seconds
 34. Pro Tip: http://gethue.com/
 35. HBase Scans under the Hood
     Impala uses Scan instances under the hood, just as the native Java API
     does. This allows for all scan optimizations, e.g. predicate push-down,
     like:
     • Start and stop row
     • Server-side filters
     • Scanner caching (but not batching yet)
 36. Configure HBase Scan Details
     In impala-shell:

     set hbase_cache_blocks=true;
     set hbase_cache_blocks=false;

     Same as calling setCacheBlocks(true) or setCacheBlocks(false).

     set hbase_caching=1000;

     Same as calling setCaching(rows).
 37. HBase Scans under the Hood
     Back to physics: a scan can only perform well if as little data as
     possible is read.
     • Need to issue queries that are known not to be full table scans
     • This requires careful schema design!
     Typical use-cases are:
     • OLAP cube: read report data from a single row
     • Time series: read fine-grained, time-partitioned data
 38. OLAP Example
     • Facebook Insights uses HBase to keep an OLAP cube live, i.e. fully
       materialized
     • Each row reflects one tracked page and contains all its data points
       • All dimensions with a time-bracket prefix, plus TTLs
     • During report time only one or very few rows are read
     • The design favors read over write performance
     • Could also think about a hybrid system:
       • CEP + HBase + HDFS (Parquet)
 39. Time Series Example
     • OpenTSDB writes the metric events bucketed by metric ID and then
       timestamp
       • Helps using all servers in the cluster equally
     • During reporting/dashboarding the data is read for specific metrics
       within a specific time frame
     • Sorted data translates into effective use of Scan with start and
       stop rows
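The metric-ID-then-timestamp bucketing can be sketched as follows (a simplified sketch, not OpenTSDB's real key format; the 3-byte metric ID and base-hour bucketing are assumptions): because all events of one metric sort contiguously by time, a time-bounded report becomes a Scan with a start and stop row.

```python
import struct

def metric_key(metric_id: int, ts: int) -> bytes:
    """Simplified OpenTSDB-style row key (a sketch, not the real
    format): a 3-byte metric ID followed by a 4-byte base-hour
    timestamp, so one metric's events sort contiguously by time."""
    base_hour = ts - (ts % 3600)
    return metric_id.to_bytes(3, "big") + struct.pack(">I", base_hour)

def scan_range(metric_id: int, start_ts: int, end_ts: int):
    """Start/stop rows for a time-bounded scan over one metric:
    the scan touches only the rows in [start, stop), never the
    whole table."""
    return metric_key(metric_id, start_ts), metric_key(metric_id, end_ts + 3600)

start, stop = scan_range(42, 1381996800, 1382000400)
```

This is the same "few selective rows instead of a full table scan" principle the Final Notes slides stress for the SQL side.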
 40. Final Notes
     Since HBase scan performance is mainly influenced by the number of rows
     scanned, you need to issue queries that are selective, i.e. scan only
     certain rows and not the entire table. This requires WHERE clauses with
     the HBase row key in them:

     SELECT f1, f2, f3 FROM mapped_table
     WHERE key >= "user1234" AND key < "user1235";

     "Scan all rows for user 1234, i.e. that have a row key starting with
     user1234" (it might be a composite key!)
 41. Example
 42. Final Notes
     Not using the primary HBase index, aka the row key, results in a full
     table scan and might take much longer when you have a large table:

     SELECT f1, f2, f3 FROM mapped_table
     WHERE f1 = "value1" OR f20 < "200";

     This will result in a full table scan. Remember: it is all just physics!
 43. Final Notes
     Impala also uses HBase's SingleColumnValueFilter to reduce transferred
     data:
     • Filters out entire rows by checking a given column value
     • Does not skip rows, since no index or Bloom filter is available to
       help identify the next match
     Overall this helps, yet it cannot do any magic (physics again!)
 44. Final Notes
     Some advice on tall-narrow vs. flat-wide table layout: store data in a
     tall and narrow table, since there is currently no support for scanner
     batching (i.e. intra-row scanning). Mapping, for example, one million
     HBase columns into SQL is futile.
     This is still true for Hive's Map support, since the entire row has to
     fit into memory!
 45. Outlook
     Future work:
     • Composite keys: map multiple SQL fields into a single composite HBase
       row key
     • Expose KV versions to the SQL schema
     • Better predicate pushdown
       • Advanced filters or indexes?
 46. Questions? @larsgeorge / lars@cloudera.com