HUG slides on NFS and ODBC


Published in: Technology, Education

  1. Using Standard File-Based Applications and SQL-Based Tools with Hadoop
     ©MapR Technologies
  2. Who am I?
     • Keys Botzum
     • kbotzum@maprtech.com
     • Senior Principal Technologist, MapR Technologies
     • http://www.mapr.com/company/events/speaking/dc-hug-9-18-12
  3. The MapR Distribution for Apache Hadoop
     • The open, enterprise-grade distribution for Apache Hadoop
       - Open source components
         - Hive, Pig, Cascading, HBase, ZooKeeper, Oozie, Flume, Sqoop, Whirr, …
       - Enhancements to make Hadoop more open and enterprise-grade
     • Growing fast and a recognized leader
  4. MapR in the Cloud
     • Available as a service with Amazon Elastic MapReduce (EMR)
       - http://aws.amazon.com/elasticmapreduce/mapr
     • Available as a service with Google Compute Engine
  5. MapR
     • Make Hadoop more open (this presentation)
     • Make Hadoop enterprise-grade
       - High Availability
       - Scalability
       - Management tools: Web, CLI, REST
       - Data Protection: snapshots & mirroring
       - Performance
  6. Not All Applications Use the Hadoop APIs
     • Applications and libraries that use files and/or SQL
       - 30 years
       - 100,000s of applications
       - 10,000s of libraries
       - 10s of programming languages
       - These are not legacy applications; they are valuable applications
     • Applications and libraries that use the Hadoop APIs
  7. Hadoop Needs Industry-Standard Interfaces
     • Hadoop API
       - MapReduce and HBase applications
       - Mostly custom-built
     • NFS
       - File-based applications
       - Supported by most operating systems
     • ODBC
       - SQL-based tools
       - Supported by most BI applications and query builders
  8. NFS
  9. Your Data is Important
     • HDFS-based Hadoop distributions do not (cannot) properly support NFS
     • Your data is important: it drives your business, so make sure you can access it
       - Why store your data in a system which cannot be accessed by 95% of the world’s applications and libraries?
     • Access to HDFS source code != access to your data
 10. The NFS Protocol
     • RFC 1813
     • Very simple protocol
     • Random reads/writes
       - Read count bytes from offset offset of file file
       - Write buffer data to offset offset of a file file
     • HDFS does not support random writes so it cannot support NFS

         WRITE3res NFSPROC3_WRITE(WRITE3args) = 7;

         struct WRITE3args {
             nfs_fh3     file;
             offset3     offset;
             count3      count;
             stable_how  stable;
             opaque      data<>;
         };

         READ3res NFSPROC3_READ(READ3args) = 6;

         struct READ3args {
             nfs_fh3  file;
             offset3  offset;
             count3   count;
         };
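The READ and WRITE calls above both address a file by byte offset, which is what a file-based application relies on when it updates a file in place. Below is a minimal sketch of that access pattern from the application side, assuming a POSIX system where `os.pread` and `os.pwrite` are available. The helper names `read_at` and `write_at` are illustrative, not part of any MapR or NFS API, and the demo runs against a temporary local directory rather than a real NFS mount.

```python
import os
import tempfile


def write_at(path: str, offset: int, data: bytes) -> int:
    """Write data at a byte offset, similar in spirit to NFS WRITE(file, offset, data)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        # pwrite does not move the file position; it writes at the given offset.
        return os.pwrite(fd, data, offset)
    finally:
        os.close(fd)


def read_at(path: str, offset: int, count: int) -> bytes:
    """Read up to count bytes from a byte offset, similar to NFS READ(file, offset, count)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, count, offset)
    finally:
        os.close(fd)


def demo() -> bytes:
    # A temporary directory stands in for a directory on an NFS mount (assumption).
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "records.dat")
        write_at(path, 0, b"AAAAAAAAAA")  # initial contents, 10 bytes
        write_at(path, 4, b"zz")          # random write into the middle of the file
        return read_at(path, 2, 6)        # random read of 6 bytes starting at offset 2


if __name__ == "__main__":
    print(demo())
```

The second `write_at` call is the operation the slide says HDFS cannot serve: it changes bytes in the middle of an existing file instead of appending.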
 11. Hadoop Was Designed to Support Multiple Storage Layers
     • MapReduce
     • Hadoop FileSystem API: o.a.h.fs.FileSystem interface
       - S3: o.a.h.fs.s3native.NativeS3FileSystem
       - HDFS: o.a.h.hdfs.DistributedFileSystem
       - Local File System: o.a.h.fs.LocalFileSystem
       - FTP: o.a.h.fs.ftp.FTPFileSystem
       - MapR storage layer: com.mapr.fs.MapRFileSystem
     • NFS interface
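The idea behind this slide is that callers program against one file system interface while the storage layer is chosen by implementation class. The sketch below illustrates that pattern in Python. It is an analogy only, not Hadoop's actual Java API: the class names, the `mem` scheme, and the `get_filesystem` and `copy` helpers are all invented for illustration.

```python
import os
import tempfile
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class FileSystem(ABC):
    """Hypothetical minimal interface that every storage layer implements."""

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class LocalFileSystem(FileSystem):
    """Storage layer backed by the local disk."""

    def read(self, path: str) -> bytes:
        with open(path, "rb") as handle:
            return handle.read()

    def write(self, path: str, data: bytes) -> None:
        with open(path, "wb") as handle:
            handle.write(data)


class InMemoryFileSystem(FileSystem):
    """Storage layer backed by a dictionary, standing in for a second backend."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


# The URI scheme selects the implementation; callers never name a concrete class.
_REGISTRY: dict[str, FileSystem] = {
    "file": LocalFileSystem(),
    "mem": InMemoryFileSystem(),
}


def get_filesystem(uri: str) -> tuple[FileSystem, str]:
    parsed = urlparse(uri)
    return _REGISTRY[parsed.scheme or "file"], parsed.path


def copy(src_uri: str, dst_uri: str) -> None:
    """Copy between any two registered storage layers through the shared interface."""
    src_fs, src_path = get_filesystem(src_uri)
    dst_fs, dst_path = get_filesystem(dst_uri)
    dst_fs.write(dst_path, src_fs.read(src_path))


def demo() -> bytes:
    with tempfile.TemporaryDirectory() as directory:
        src = "mem:///reports/summary.txt"
        dst = "file://" + os.path.join(directory, "summary.txt")
        fs, path = get_filesystem(src)
        fs.write(path, b"item relationships")
        copy(src, dst)
        dst_fs, dst_path = get_filesystem(dst)
        return dst_fs.read(dst_path)
```

Adding another storage layer in this sketch means registering one more class under a new scheme; `copy` and its callers stay unchanged.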
 12. One NFS Gateway
     • What about scalability and high availability?
 13. Multiple NFS Gateways
 14. Multiple NFS Gateways with Load Balancing
 15. Multiple NFS Gateways with NFS HA (VIPs)
 16. Customer Examples: Import/Export Data
     • Network security vendor
       - Network packet captures from switches are streamed into the cluster
       - New pattern definitions are loaded into online IPS via NFS
     • Online measurement company
       - Clickstreams from application servers are streamed into the cluster
     • SaaS company
       - Exporting a database to Hadoop over NFS
     • Ad exchange
       - Bids and transactions are streamed into the cluster
 17. Customer Examples: Productivity and Operations
     • Retailer
       - Operational scripts are easier with NFS than HDFS + MapReduce
         - chmod/chown, file system searches/greps, perl, awk, tab-complete
       - Consolidate object store with analytics
     • Credit card company
       - User and project home directories on Linux gateways
         - Local files, scripts, source code, …
         - Administrators manage quotas, snapshots/backups, …
     • Large Internet company recommendation system
       - Web servers serve MapReduce results (item relationships) directly from the cluster
     • Email marketing company
       - Object store with HBase and NFS
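The retailer example mentions file system searches and greps as operational tasks that are simpler over NFS. A minimal sketch of such a script follows, assuming the cluster is mounted as an ordinary directory. The function name `grep_tree` is illustrative, and the demo builds a temporary directory instead of reading a real mount.

```python
import os
import re
import tempfile


def grep_tree(root: str, pattern: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for every line under root matching pattern."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", encoding="utf-8", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if regex.search(line):
                            relative = os.path.relpath(path, root)
                            matches.append((relative, lineno, line.rstrip("\n")))
            except OSError:
                continue  # skip files that cannot be opened
    return matches


def demo() -> list[tuple[str, int, str]]:
    # The temporary directory stands in for a mounted cluster path (assumption).
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "logs"))
        with open(os.path.join(root, "logs", "app.log"), "w", encoding="utf-8") as handle:
            handle.write("start\nERROR disk full\nstop\n")
        with open(os.path.join(root, "notes.txt"), "w", encoding="utf-8") as handle:
            handle.write("nothing here\n")
        return grep_tree(root, r"ERROR")
```

Nothing in the script is Hadoop-specific, which is the point of the slide: the same code that searches a local directory would search cluster data once it is reachable through a mount.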
 18. ODBC
 19. ODBC
     • ODBC: Open DataBase Connectivity
       - Open standard API for accessing a SQL-based backend
       - Developed by Microsoft and Simba Technologies in 1992
     • Flagship API for SQL-based BI and reporting
       - Excel, Tableau, MicroStrategy, Crystal Reports, …
     • Advanced ODBC drivers use the latest 3.52 specification
 20. MapR ODBC Driver
     • MapR provides a Hive ODBC 3.52 driver
       - Developed in partnership with ODBC inventor Simba Technologies
       - Compliant with latest ODBC 3.52 specification
         - 32- and 64-bit platform support
         - Windows and Linux
     • Enables direct SQL access to MapR-stored data by translating SQL to HiveQL
     • SQLizer enables seamless connectivity
       - Provides ANSI SQL-92 front-end
       - Targeted for existing apps that generate standard SQL queries
       - Transforms SQL query into HiveQL query
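From a tool's point of view, the driver described above means that ordinary SQL is sent over a standard connection and rows come back. Here is a minimal sketch of that client side, written against Python's generic DB-API so it can run anywhere. The in-memory `sqlite3` database is only a stand-in backend, and the `clicks` table and its data are invented for the demo. With an ODBC driver installed, one would presumably pass a connection from the third-party `pyodbc` package instead (for example `pyodbc.connect("DSN=...")` with a DSN configured for the driver), which is an assumption and is not exercised here.

```python
import sqlite3


def run_query(connection, sql: str, params: tuple = ()) -> list[dict]:
    """Run a SQL query on any DB-API connection and return rows as dictionaries."""
    cursor = connection.cursor()
    try:
        if params:
            cursor.execute(sql, params)
        else:
            cursor.execute(sql)
        # cursor.description holds one entry per result column; item 0 is the column name.
        columns = [column[0] for column in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


def demo() -> list[dict]:
    # sqlite3 stands in for the SQL backend; the table and rows are made up.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE clicks (page TEXT, hits INTEGER)")
    connection.executemany(
        "INSERT INTO clicks VALUES (?, ?)",
        [("home", 3), ("cart", 1), ("home", 2)],
    )
    rows = run_query(
        connection,
        "SELECT page, SUM(hits) AS total FROM clicks GROUP BY page ORDER BY page",
    )
    connection.close()
    return rows
```

`run_query` never mentions the backend, so an application written this way only needs a different connection object to change data sources.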
 21. Example: Tableau
 22. Example: Open source query builder (Kaimon)
 23. Example: Microsoft Excel
 24. In Summary
     • Open standards are important
     • Supporting existing applications and tools that support those standards is valuable
       - Preserves investment in tools
       - Preserves investment in custom applications that preceded Hadoop
       - Leverages skills you already have
 25. Join MapR
     • Join the fastest growing Hadoop company
     • Open positions in every discipline
       - Engineers
       - Solution Architects
       - Product Management
     • Email jobs@mapr.com
 26. Time for Questions
     • Download slides or send me an email
       - http://www.mapr.com/company/events/speaking/dc-hug-9-18-12
     • Download MapR to learn more
       - www.mapr.com/download
