Presto in my use case
Presto Meetup 2015/01/20
http://eventdots.jp/event/276987
@wyukawa
Agenda
•  Log Analysis System Overview
•  Why do I select Presto?
•  How do I use Presto?
•  How do I manage Presto?
•  My impressions of Presto
Log Analysis System Overview
•  Hadoop, Hive of HDP2.1
•  Azkaban 2.6.4
•  Presto 0.89
•  Cognos 10.2.1
•  Prestogres 0.4.8
•  DB: MySQL 5.5
•  ETL with Python 2.7.7
•  InfiniDB
•  Pentaho
•  Saiku
•  Shib
Why do I select Presto?
•  people I know (on Twitter) use it
•  comparing it with other tools is a bother
•  Let's try!
How do I use Presto?
•  batch with Hive on MR2 (not Tez)
•  select only with Presto
•  execute ad hoc queries
•  reporting by Cognos with Prestogres
•  create Presto views
http://d.hatena.ne.jp/wyukawa/20140824/1408881620
What is Cognos?
•  Commercial BI tool by IBM
•  Pros
•  authorization management
•  flexible reporting (but not easy)
•  Cons
•  slow rendering speed
•  no permanent link
•  error messages are difficult to understand
•  web site (especially download site) is bad
•  Windows and IE are necessary
What is Prestogres?
[Diagram: BI tool -> Prestogres (patched pgpool-II + PostgreSQL) -> Presto]
Why do I use Prestogres?
•  Interesting
•  MySQL can become a bottleneck
•  Hadoop is easier to scale than MySQL
•  reduce maintenance cost of multiple storage systems
•  but not achieved
Prestogres in my use case
•  Presto 0.89
•  Cognos 10.2.1
•  Prestogres 0.4.8
•  Prestogres ODBC Driver
It is not easy to connect Cognos to Presto.
Thanks! > @frsyuki
http://d.hatena.ne.jp/wyukawa/20140623/1403521909
Problem in my use case
•  Cognos doesn't issue a WHERE clause with Prestogres+ODBC
•  select … from … where yyyymmdd='20150120' …
•  slow rendering because there is no predicate pushdown
•  workaround is to use bigint, not string, in the WHERE clause
•  solution (not deployed in production. BTW, Thanks! > @frsyuki)
•  Cognos 10.2.2
•  patched(protocolVersion=2) PostgreSQL JDBC Driver
•  Prestogres 0.6.3
•  Presto 0.86
•  increase dentry cache
http://d.hatena.ne.jp/wyukawa/20141216/1418706186
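The bigint workaround on this slide can be sketched as a small query builder. This is only an illustration under assumptions, not the author's code: the table name `access_log` and both helper functions are hypothetical; only the `yyyymmdd` column and the value 20150120 come from the slide.

```python
from datetime import date


def partition_predicate(day: date, as_bigint: bool = True) -> str:
    """Return a WHERE predicate on the yyyymmdd column (hypothetical helper)."""
    key = day.strftime("%Y%m%d")
    if as_bigint:
        # bigint comparison: the form the slide's workaround uses
        return f"yyyymmdd = {int(key)}"
    # string comparison: the form the slide reports as problematic
    return f"yyyymmdd = '{key}'"


def build_query(table: str, day: date, as_bigint: bool = True) -> str:
    """Build a count query filtered by day; `table` is an assumed name."""
    predicate = partition_predicate(day, as_bigint)
    return f"SELECT count(*) FROM {table} WHERE {predicate}"


if __name__ == "__main__":
    print(build_query("access_log", date(2015, 1, 20)))
    print(build_query("access_log", date(2015, 1, 20), as_bigint=False))
```

Whether the bigint form helps in another setup depends on how the column is typed and on the BI tool, so treat this as a starting point only.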
How do I manage Presto?
•  How to deploy Presto?
•  use Ansible
•  Presto Setting
•  task.cpu-timer-enabled=false
•  How to monitor Presto?
•  GrowthForecast + jstat2gf, JMX
•  slow query visualization by nata2
http://d.hatena.ne.jp/wyukawa/20140803/1407052542
http://d.hatena.ne.jp/wyukawa/20141218/1418882574
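The jstat-to-GrowthForecast monitoring idea can be sketched roughly as follows. This is not how jstat2gf itself is implemented; it is a minimal sketch assuming `jstat -gcutil` output with one header line and one value line, and GrowthForecast's `POST /api/<service>/<section>/<graph>` endpoint with a `number` parameter. The host, the service/section/graph names, and the sample numbers are made up for illustration.

```python
import urllib.parse
import urllib.request


def parse_jstat_gcutil(output: str) -> dict:
    """Parse `jstat -gcutil <pid>` output: a header line plus one value line."""
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    header, values = lines[0].split(), lines[-1].split()
    return {name: float(value) for name, value in zip(header, values)}


def growthforecast_request(base_url, service, section, graph, number):
    """Build (but do not send) a POST request that updates one graph."""
    url = f"{base_url}/api/{service}/{section}/{graph}"
    # The value is rounded to an integer before posting.
    body = urllib.parse.urlencode({"number": int(round(number))}).encode()
    return urllib.request.Request(url, data=body, method="POST")


# Made-up sample in the Java 7 column layout (assumption).
SAMPLE = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  45.12  60.30  72.80  99.10   1200   35.400     3    1.200   36.600
"""

if __name__ == "__main__":
    stats = parse_jstat_gcutil(SAMPLE)
    req = growthforecast_request(
        "http://localhost:5125", "presto", "worker01", "old_gen_pct", stats["O"]
    )
    print(req.full_url, req.data)
    # urllib.request.urlopen(req)  # uncomment to actually send the value
```

In practice the parser would be fed the output of `jstat` run against the Presto server process on each node, one graph per metric.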
My impressions of Presto
•  stable
•  frequent version upgrades
•  easy to install
•  easy to upgrade
•  but upgrades failed at 0.80, 0.87
•  leverage effect
