Efficient In-situ Processing of Various Storage Types on Apache Tajo
Hadoop Summit 2015 San Jose
Hyunsik ...
Agenda
• Tajo Overview
• Various Storage Support
• Motivation
• Design Consideration
• What we did/are doing
An overview of Apache Tajo
Tajo: A Data Warehouse System
• Data Warehouse System
• Apache Top-level project
• Low latency, and long run...
Master

Efficient in situ processing of various storage types on apache tajo

5,965 views

Published in: Data & Analytics


  1. Efficient In-situ Processing of Various Storage Types on Apache Tajo. Hadoop Summit 2015 San Jose. Hyunsik Choi, Gruter Inc.
  2. Agenda
     • Tajo Overview
     • Various Storage Support
     • Motivation
     • Design Consideration
     • What we did/are doing
  3. An overview of Apache Tajo
  4. Tajo: A Data Warehouse System
     • Data Warehouse System
     • Apache Top-level project
     • Low latency, and long running batch queries in a single system
     • ~100 ms up to several hours
     • Fault tolerance
     • Features
     • ANSI SQL compliance
     • Mature SQL features: Joins, Group by, Sort, Multiple distinct aggregations and Window function
     • Partitioned table support
     • Java/Python UDF support
     • JDBC driver and Java-based asynchronous API
     • SQL data type and Nested type support
     • Direct JSON support
  5. Master
