
Piwik elasticsearch kibana at OSC Tokyo 2016 Spring


View Piwik tracking data through kibana and elasticsearch.

Published in: Technology


  1. 1. Piwik fluentd
      YAMAMOTO Takashi
      yamachan@piwikjapan.org
      @yamachan5593
      Piwik Japan Team
      Feb 27th, 2016, at Open Source Conference Tokyo
  2. 2. OpenSolaris
      https://osdn.jp/projects/jposug/
      Piwikjapan /OSC
      https://osdn.jp/projects/piwik-fluentd/
  3. 3. Piwik Piwik tracker
      125.54.155.180 - - [21/Feb/2016:08:46:13 +0900] "GET
      /piwik.php?action_name=example.com%2F%E5%A0%B1%E5%91
      &idsite=1&rec=1&r=047899&h=23&m=46&s=16
      &url=http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_
      &_id=4e5ded8520370239&_idts=1435710334&_idvc=387
      &_idn=0&_refts=0&_viewts=1455979574&send_image=0
      &pdf=1&qt=0&realp=1&wma=1&dir=1&fla=1&java=1&gears=0
      &ag=1&cookie=1&res=1366x768 HTTP/1.1" 204 -
      "http://jpvlad.com/index.php?topic=eventresult_ja"
      "Mozilla/5.0 (WindowsNT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
      Chrome/28.0.1500.63 Safari/537.36"
      elasticsearch kibana
  4. 4. 4 of 46
  5. 5. Piwik Tracker Piwik
      host           IP
      user agent
      referer
      Piwik Tracker
      idsite         Piwik Web
      action_name    Web
      _id            ID
      res            PC
      pdf            Web pdf ?
      java           java ?
      fla            flash ?
      cookie         cookie ?
      _viewts
      Supported Query Parameters [1]
      [1] http://developer.piwik.org/api-reference/tracking-api
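The tracker fields listed on this slide all travel in the query string of the piwik.php request. As a minimal sketch of how they can be pulled apart, the Python below parses a shortened version of the sample request from slide 3. The function name `tracker_params` and the shortened request string are assumptions made for illustration; they are not part of the slides.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened form of the sample request path on slide 3 (illustrative subset).
request = ("/piwik.php?action_name=example.com%2F&idsite=1&rec=1"
           "&_id=4e5ded8520370239&res=1366x768&pdf=1&java=1&fla=1&cookie=1")


def tracker_params(path):
    """Return the tracking parameters of a piwik.php request as a flat dict."""
    query = urlsplit(path).query
    # parse_qs URL-decodes values and returns lists; keep the first value per key.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = tracker_params(request)
print(params["idsite"], params["_id"], params["res"], params["action_name"])
# prints: 1 4e5ded8520370239 1366x768 example.com/
```

Every value comes back as a string, which matches the later observation in the deck that fields arrive in elasticsearch as strings unless a mapping says otherwise.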
  6. 6.
      1. Piwik, fluentd, elasticsearch, kibana
      2. Piwik Piwik
         Piwik PHP GET
      3. Piwik fluentd elasticsearch
         elasticsearch fluentd URL decode
      4. kibana elasticsearch
  7. 7. 7 of 46
  8. 8. RedHat7 CentOS7, Scientific Linux 7
      RedHat6 RedHat6
      RedHat6 · · · CentOS6, Scientific Linux 6
      Piwik
      Piwik Web [2]
      fluentd, elasticsearch, kibana
      Piwik
      [2] http://www.piwikjapan.org/ /3985
  9. 9. fluentd ∼ 1
      fluentd td-agent
      td-agent 2.x 1.x
      ruby RPM
      fluentd ruby
      RedHat6 ruby 1.9.3
      RedHat7 ruby 2.0
      td-agent 2.x ruby 2.2
      fluentd
      fluentd RPM elasticsearch
  10. 10. fluentd ∼ 2
      ruby 2.2.4
      1. ruby RedHat CentOS, Scientific Linux 6 7
      2. td-agent RPM
      3. SRPM rpm
         $ sudo yum groupinstall "Development tools"
      4. “CentOS 6 ruby RPM [3]” ruby223.spec
      5. RPM Ctrl+C
         $ rpmbuild -bp ruby223.spec
         Ctrl+C ~/rpmbuild
         $ mv ruby223.spec rpmbuild/SPECS/ruby224.spec
         224
      [3] http://www.torutk.com/projects/swe/wiki/CentOS 6 ruby RPM
  11. 11. fluentd ∼ 3
      ruby 2.2.4
      1. ~/rpmbuild/SPECS/ruby224.spec
         %define rubyver 2.2.4
      2. “Ruby 2.2.4 [4]” ruby-2.2.4.tar.bz2
      3. ruby-2.2.4.tar.bz2 ~/rpmbuild/SOURCES
      4. RPM
         $ cd ~/rpmbuild/SPECS
         $ rpmbuild -ba ruby224.spec
         $ sudo rpm -ivh ~/rpmbuild/RPMS/x86_64/ruby-2.2.4-1.el7.x86_64.rpm
         RedHat6 el6
         $ ruby -v
         ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]
      [4] https://www.ruby-lang.org/ja/news/2015/12/16/ruby-2-2-4-released/
  12. 12. fluentd ∼ 4
      1. epel
         $ sudo yum install http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
         RedHat6
         $ sudo yum install http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
      2.
         $ sudo yum install gecode gecode-devel fakeroot
  13. 13. fluentd ∼ 5
      1. RedHat6 git
         $ wget http://dl.marmotte.net/rpms/redhat/el6/x86_64/git-1.8.3.1-3.el6/git-1.8.3.1-3.el6.src.rpm
         $ cp git-1.8.3.1-3.el6.src.rpm ~/rpmbuild/SRPMS/
         $ rpmbuild --rebuild ~/rpmbuild/SRPMS/git-1.8.3.1-3.el6.src.rpm
         $ sudo yum install perl-TermReadKey
         $ sudo rpm -ivh ~/rpmbuild/RPMS/x86_64/git-1.8.3.1-3.el6.x86_64.rpm
      git 1.8 “-c” git 1.8 epel
  14. 14. fluentd ∼ 6
      ruby fluentd
      1. bundle
         $ sudo gem install bundler
      2. github clone
         $ cd ~
         $ git clone git@github.com:treasure-data/omnibus-td-agent.git
         $ cd ~/omnibus-td-agent
      3. treasure-data/omnibus-td-agent [5] multipart-post Gemfile
      [5] https://github.com/treasure-data/omnibus-td-agent
  15. 15. fluentd ∼ 7
      multipart-post
      ~/omnibus-td-agent/Gemfile gem 'pedump' · · · [6]
         source 'https://rubygems.org'
         # Use Berkshelf for resolving cookbook dependencies
         gem 'berkshelf', '~> 3.0'
         gem 'pedump', git: 'https://github.com/ksubrama/pedump', branch: 'patch-1'
         #
         # Install omnibus software
         #gem 'omnibus', '~> 5.0'
      [6] https://github.com/piwikjapan/omnibus-td-agent/blob/master/Gemfile
  16. 16. fluentd ∼ 8
      elasticsearch, record-reformer, norikra RPM
      norikra
      ~/omnibus-td-agent/plugin_gems.rb
         download "fluent-plugin-norikra", "0.2.2"
         download "fluent-plugin-elasticsearch", "1.3.0"
         download "fluent-plugin-record-reformer", "0.8.0"
  17. 17. fluentd ∼ 9
      norikra
      norikra
      norikra-client msgpack-rpc-over-http rack
      2.x 1.6.4
      ~/omnibus-td-agent/core_gems.rb
         download "rack", "1.6.4"
         download "norikra-client", "1.3.1"
  18. 18. fluentd ∼ 10
      [7]
         $ sudo mkdir -p /opt/td-agent /var/cache/omnibus
         $ sudo chown yamachan:yamachan /opt/td-agent
         $ sudo chown yamachan:yamachan /var/cache/omnibus
      yamachan:yamachan id
      [7] https://github.com/treasure-data/omnibus-td-agent
  19. 19. fluentd ∼ 11:
      1. [8]
         $ cd ~/omnibus-td-agent
         $ bundle install --binstubs
         sudo
         $ bin/gem_downloader core_gems.rb
         $ bin/gem_downloader plugin_gems.rb
         $ bin/omnibus build td-agent2
      [8] https://github.com/treasure-data/omnibus-td-agent
  20. 20. fluentd ∼
      1. pkg
         $ cd ~/omnibus-td-agent/pkg
         $ sudo yum install td-agent-2.3.1-0.el7.x86_64.rpm
      2. RedHat6 td-agent-2.3.1-0.el6.x86_64.rpm
  21. 21. elasticsearch
      1. RedHat7, RedHat6
         $ sudo yum install https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/rpm/elasticsearch/2.2.0/elasticsearch-2.2.0.rpm
      2. kuromoji
         $ sudo /usr/share/elasticsearch/bin/plugin install analysis-kuromoji
  22. 22. kibana
      1.
         $ cd ~
         $ git clone git@github.com:piwikjapan/kibana-rpm-packaging.git
         $ cd kibana-rpm-packaging
         $ cp kibana.sysconfig kibana.service ~/rpmbuild/SOURCES
         $ cp kibana.spec ~/rpmbuild/SPECS
         $ wget -P ~/rpmbuild/SOURCES https://download.elastic.co/kibana/kibana/kibana-4.4.1-linux-x64.tar.gz
         $ rpmbuild -ba ~/rpmbuild/SPECS/kibana.spec
      2.
         $ sudo rpm -ivh ~/rpmbuild/RPMS/x86_64/kibana-4.4.1-1.x86_64.rpm
  23. 23. RedHat6 kibana
      “kibana4 [9]”
      [9] http://qiita.com/nagomu1985/items/82e699dde4f99b2ce417
  24. 24.
      1. norikra 26578/tcp
         $ sudo firewall-cmd --zone=public --add-port=26578/tcp --permanent # norikra web
         $ sudo firewall-cmd --zone=public --add-port=5601/tcp --permanent # kibana web
         $ sudo firewall-cmd --zone=public --add-port=24224/udp --permanent # fluentd heartbeat
         $ sudo firewall-cmd --zone=public --add-port=24224/tcp --permanent # fluentd data
  25. 25. RedHat6
      1. norikra 26578/tcp
      2. /etc/sysconfig/iptables
         -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
         -A INPUT -m multiport -p tcp -m tcp --dports 26578,5601,24224 -j ACCEPT
         -A INPUT -m multiport -p udp -m udp --dports 24224 -j ACCEPT
      3.
         $ sudo service iptables reload
  26. 26. td-agent
      Piwik elasticsearch, kibana
      1. Piwik server elasticsearch server
      2. Piwik server elasticsearch server forward
  27. 27. 26 of 46
  28. 28. td-agent ∼ Piwik 1
      Piwik elasticsearch
      td-agent /etc/td-agent/td-agent.conf
      “Piwik elasticsearch [10]”
      [10] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage
  29. 29. td-agent ∼ Piwik 2
      Piwik
      Piwik tag piwiktracker.apache.access
      <source>
        type tail
        format apache
        time_format %d/%b/%Y:%H:%M:%S %z
        pos_file /var/log/td-agent/access_log.pos
        path /var/log/httpd/access_log
        tag piwiktracker.apache.access
      </source>
  30. 30. td-agent ∼ Piwik 3
      Piwik host
      <match piwiktracker.apache.access>
        type forward
        send_timeout 60s
        recover_wait 300s
        heartbeat_interval 1s
        phi_threshold 16
        hard_timeout 60s
        <server>
          name fluentd
          # host: your elasticsearch server, i.e. 10.x.x.x
          host your_elasticsearch_server
          port 24224
          weight 100
        </server>
      </match>
  31. 31. td-agent ∼ Piwik 4
      elasticsearch Tracker
      1. Piwik
      2. Piwik API
      3. filter match piwiktracker.apache.access
      <filter piwiktracker.apache.access>
        type grep
        regexp1 path /piwik.php\?action_name=.*idsite=\d+
      </filter>
      <match piwiktracker.apache.access>
        type record_reformer
        tag piwiktracker.apache.access.urldecode
  32. 32. td-agent ∼ Piwik 5
      elasticsearch fluentd
      “Supported Query Parameters [11]” “ ” “id”
      piwiktracker.apache.access.urldecode
      <match piwiktracker.apache.access>
        type record_reformer
        tag piwiktracker.apache.access.urldecode
        # 29 3
        # ID
        idsite ${path[/piwik.php\?action_name=.*idsite=(\d+)/,1]}
        # ID
        piwikid ${path[/piwik.php\?action_name=.*_id=([a-z\d]+)/,1]}
        # flash ?
        fla ${path[/piwik.php\?action_name=.*fla=(\d+)/,1] == "1" ? true : false }
      </match>
      [11] http://developer.piwik.org/api-reference/tracking-api
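The record_reformer expressions on this slide index the `path` string with a regular expression and take the first capture group (Ruby's `String#[regexp, capture]`). The following Python is a rough equivalent for experimenting with the patterns outside fluentd. It assumes the backslash escapes and underscores that the slide text lost in extraction, and the function name `extract_fields` and the sample path are illustrative only.

```python
import re

# Illustrative request path, shortened from the sample log line on slide 3.
path = ("/piwik.php?action_name=example.com%2F&idsite=1&rec=1"
        "&_id=4e5ded8520370239&fla=1")


def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


def extract_fields(path):
    """Mirror the three example fields: site ID, visitor ID, flash flag."""
    fla = first_group(r"/piwik\.php\?action_name=.*fla=(\d+)", path)
    return {
        "idsite": first_group(r"/piwik\.php\?action_name=.*idsite=(\d+)", path),
        "piwikid": first_group(r"/piwik\.php\?action_name=.*_id=([a-z\d]+)", path),
        # The capture is a string, so compare against "1", not the integer 1.
        "fla": None if fla is None else fla == "1",
    }


print(extract_fields(path))
# prints: {'idsite': '1', 'piwikid': '4e5ded8520370239', 'fla': True}
```

Because `.*` is greedy, each pattern picks the last occurrence of its key in the path, which is worth keeping in mind if a parameter name can also appear inside an encoded URL value.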
  33. 33. td-agent ∼ Piwik 6
      elasticsearch fluentd url encode
      piwiktracker.apache.access.store
      <match piwiktracker.apache.access.urldecode>
        type uri_decode
        tag piwiktracker.apache.access.store
        key_names action_name,ref,url,urlref
      </match>
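To illustrate what this step does to a record, here is a small Python sketch that percent-decodes only the keys named in `key_names` and leaves the rest untouched. It uses Python's `urllib.parse.unquote` as a stand-in and does not claim to reproduce the plugin's exact decoding rules; the function name `uri_decode` and the sample record are assumptions.

```python
from urllib.parse import unquote

# Keys listed in key_names on the slide.
KEY_NAMES = ("action_name", "ref", "url", "urlref")


def uri_decode(record, key_names=KEY_NAMES):
    """Return a copy of record with the selected keys percent-decoded."""
    decoded = dict(record)
    for key in key_names:
        if key in decoded:
            decoded[key] = unquote(decoded[key])
    return decoded


record = {
    "url": "http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_",
    "idsite": "1",
}
print(uri_decode(record)["url"])
# prints: http://jpvlad.com/index.php?topic=eventresult_
```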
  34. 34. td-agent ∼ Piwik 7:
      elasticsearch store elasticsearch
      <match piwiktracker.apache.access.store>
        type copy
        <store>
          type elasticsearch
          type_name access_log
          host 127.0.0.1
          port 9200
          logstash_format true
          logstash_prefix apache-log
          logstash_dateformat %Y%m%d
          include_tag_key true
          tag_key @log_name
          flush_interval 10s
        </store>
      </match>
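With `logstash_format true`, the index name is built from `logstash_prefix` and the event time formatted with `logstash_dateformat`. The sketch below shows the resulting name for the sample log line's timestamp. It assumes the plugin formats the date in UTC unless told otherwise (the `utc_index` parameter here models that assumption), so check the plugin documentation for the version in use; `logstash_index` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def logstash_index(prefix, dateformat, event_time, utc_index=True):
    """Build '<prefix>-<formatted date>' the way a logstash-style index is named."""
    if utc_index:
        # Assumption: the date part is computed in UTC, not in the log's timezone.
        event_time = event_time.astimezone(timezone.utc)
    return f"{prefix}-{event_time.strftime(dateformat)}"


# Timestamp of the sample access-log line on slide 3 (JST, +0900).
event_time = datetime.strptime("21/Feb/2016:08:46:13 +0900", "%d/%b/%Y:%H:%M:%S %z")

print(logstash_index("apache-log", "%Y%m%d", event_time))
# prints: apache-log-20160220
print(logstash_index("apache-log", "%Y%m%d", event_time, utc_index=False))
# prints: apache-log-20160221
```

Under the UTC assumption, a hit logged at 08:46 JST lands in the previous day's index, which matters when picking an index pattern and time range in kibana.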
  35. 35. td-agent ∼ Piwik 1
      Piwik elasticsearch
      td-agent /etc/td-agent/td-agent.conf
      “ ”
      “Piwik elasticsearch [12]”
      [12] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage
  36. 36. td-agent ∼ Piwik 2:
      Piwik elasticsearch
      “ ”
      “ ” Piwik forward
      <source>
        tag piwiktracker.apache.access
      </source>
      <match piwiktracker.apache.access>
        tag piwiktracker.apache.access.urldecode
      </match>
      <match piwiktracker.apache.access.urldecode>
        tag piwiktracker.apache.access.store
      </match>
      <match piwiktracker.apache.access.store>
      </match>
  37. 37. elasticsearch 1
      fluentd elasticsearch
      elasticsearch
      string
  38. 38. elasticsearch 2 ∼
      Elasticsearch supports the following simple field types [13]:
      String: string
      Whole number: byte, short, integer, long
      Floating-point: float, double
      Boolean: boolean
      Date: date
      [13] https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html
  39. 39. elasticsearch 3 ∼ Json
      [14] [15]
      “elasticsearch mapping [16]”
      [14] MySQL elasticsearch
      [15]
      [16] https://osdn.jp/projects/piwik-fluentd/wiki/elasticsearch#h2-elasticsearch.20.E3.81.AE.20mapping.20.E8.A8.AD.E5.AE.9A
  40. 40. elasticsearch 4 ∼ Json
      "template": "apache-log-*",
      [17] mapping td-agent.conf
      logstash_prefix apache-log
      logstash_dateformat %Y%m%d
      “apache-log- ”
      "settings": {
      index kuromoji
      “Elasticsearch kuromoji [18]”
      [17] DB
      [18] http://tech.gmo-media.jp/post/70245090007/elasticsearch-kuromoji-japanese-fulltext-search
  41. 41. elasticsearch 5 ∼ Json
      "mappings": { "access_log": {
      "access_log" td-agent.conf type_name access_log [19]
      [19] “_default_”
  42. 42. elasticsearch 6 ∼ Json
      _source _all
      mappings: {
        access_log: {
          _source: {
            enabled: false true
          },
          _all: {
            enabled: false true
          },
  43. 43. elasticsearch 7 ∼ Json
      mappings: {
        access_log: {
          properties: {
            @log_name: {        (see td-agent.conf)
              type: string,
              store: true,
              index: not_analyzed
            },
  44. 44. elasticsearch 8 ∼ Json
      ref: {        (td-agent.conf)
        type: multi_field,
        fields: {
          ref: {
            type: string,
            index: analyzed,
            store: true
          },
          full: {
            type: string,
            index: not_analyzed,
            store: true
          }
        }
      },
  45. 45. elasticsearch 9: ∼ Json
      action_name: {
        type: string,
        analyzer: kuromoji_analyzer,
        store: true
      },
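The mapping fragments on slides 40 to 45 belong to a single index template document. As a hedged sketch of how the pieces fit together, the Python below assembles only the fragments that are legible on the slides and prints them as JSON. The `settings` block that defines `kuromoji_analyzer` and the `_source`/`_all` switches are left out because their values are not recoverable from the slides, so this output is not a complete, loadable template.

```python
import json

# Fragments taken from slides 40-45; anything not shown there is omitted.
template = {
    "template": "apache-log-*",  # matches logstash_prefix in td-agent.conf
    "mappings": {
        "access_log": {  # matches type_name in td-agent.conf
            "properties": {
                "@log_name": {
                    "type": "string", "store": True, "index": "not_analyzed",
                },
                "ref": {
                    "type": "multi_field",
                    "fields": {
                        "ref": {"type": "string", "index": "analyzed", "store": True},
                        "full": {"type": "string", "index": "not_analyzed", "store": True},
                    },
                },
                "action_name": {
                    "type": "string", "analyzer": "kuromoji_analyzer", "store": True,
                },
            }
        }
    },
}

print(json.dumps(template, indent=2))
```

Building the document as a data structure and serializing it avoids the quoting and trailing-comma mistakes that are easy to make when editing the JSON by hand.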
  46. 46. td-agent
      # service td-agent start
      # service elasticsearch start
      # service kibana start
      kibana http://your_elasticsearch_server:5601/
  47. 47. 46 of 46
