pg_statsinfo and pg_stats_reporter are open source PostgreSQL monitoring and reporting tools developed by NTT. pg_statsinfo collects database statistics and activity from PostgreSQL servers, along with operating system resource usage, and stores them in a repository database. pg_stats_reporter visualizes the collected statistics in reports. The tools provide in-depth monitoring of database performance and help identify optimization opportunities.
The Query Optimizer is the “brain” of your Postgres database. It interprets SQL queries and determines the fastest method of execution. Using the EXPLAIN command, this presentation shows how the optimizer interprets queries and determines optimal execution.
This presentation will give you a better understanding of how Postgres optimally executes its queries and what steps you can take to understand and perhaps improve its behavior in your environment.
To listen to the webinar recording, please visit EnterpriseDB.com > Resources > On-Demand Webcasts
If you have any questions please email sales@enterprisedb.com
A guide for what to pay attention to when it comes to monitoring and managing your PostgreSQL database. The content is targeted at application developers as opposed to the typical DBA.
In the age of data science and machine learning, data scientists want access to data sets quickly, but organizations often need to protect private data, whether due to internal policy or government regulations.
In this talk we discuss how to leverage PostgreSQL for managing organization-wide data access while protecting privacy.
Topics include:
Purpose-based data access
Federating data
Foreign data wrappers
Masking
Differential Privacy
Auditing
PostgreSQL is one of the most advanced relational databases, and it offers superb replication capabilities. The most important features include streaming replication, point-in-time recovery, and advanced monitoring.
Part 3 of the SQL Tuning workshop examines the different aspects of an execution plan, from cardinality estimates to parallel execution, and explains what information you should be gleaning from the plan and how it affects the execution. It offers insight into what caused the Optimizer to make the decision it did, as well as a set of corrective measures that can be used to improve each aspect of the plan.
Taming the Tiger: Tips and Tricks for Using Telegraf (InfluxData)
On 17th May as part of InfluxDays EMEA 2021 Virtual Experience, the Technical Services team will be offering a free live InfluxDB training to the first 300 registered attendees. This will be hosted over Zoom with two main trainers and there will be assistants to help participants with the course work. The training will be recorded and made available on the InfluxDays website and the InfluxData YouTube channel.
PGConf APAC 2018 - Monitoring PostgreSQL at Scale (PGConf APAC)
Speaker: Lukas Fittl
Your PostgreSQL database is one of the most important pieces of your architecture - yet the level of introspection available in Postgres is often hard to work with. It's easy to get very detailed information, but what should you really watch out for, report on, and alert on?
In this talk we'll discuss how query performance statistics can be made accessible to application developers, critical entries one should monitor in the PostgreSQL log files, how to collect EXPLAIN plans at scale, how to watch over autovacuum and VACUUM operations, and how to flag issues based on schema statistics.
We'll also talk a bit about monitoring multi-server setups, first going into high availability and read standbys and logical replication, and then reviewing what monitoring looks like for sharded databases like Citus.
The talk will primarily describe free/open-source tools and statistics views readily available from within Postgres.
OSMC 2021 | pg_stat_monitor: A cool extension for better database (PostgreSQL... (NETWAYS)
pg_stat_monitor is a statistics collection tool based on PostgreSQL’s contrib module pg_stat_statements. pg_stat_statements provides only basic statistics, which is sometimes not enough. Its major shortcoming is that it accumulates all the queries and statistics but does not provide aggregated statistics or histogram information, so the user has to calculate the aggregates, which is quite expensive. pg_stat_monitor provides pre-calculated aggregates: it collects and aggregates data on a bucket basis. The size and number of buckets are configured using GUCs (Grand Unified Configuration). The talk will cover the usage of pg_stat_monitor and how it improves on pg_stat_statements.
This is the presentation I delivered at the Hadoop User Group Ireland meetup in Dublin on Nov 28 2015. It covers at a glance the architecture of GPDB and, most importantly, its features. Sorry for the colors - Slideshare is crappy with PDFs.
EnterpriseDB's Best Practices for Postgres DBAs (EDB)
This presentation reviews techniques to become a high performance Postgres DBA such as:
- Day to day monitoring
- Ongoing maintenance tasks including bloat and index maintenance
- Database and OS parameter tuning for performance
- Security practices
- Planning for production deployment
- High availability best practices including strategies for backup and recovery
- Ideas for professional development
To listen to the recording of this presentation, visit Enterprisedb.com > Resources > Webcasts > On Demand Webcasts
How to use postgresql.conf to configure and tune the PostgreSQL server (EDB)
Tuning your PostgreSQL server plays an important role in making sure you get the most out of your server resources, and running with default parameters is not always enough. Using the PostgreSQL server configuration file, postgresql.conf, we can tune the right areas and make the most of the server's resources. The postgresql.conf tuning parameters are classified into different categories, including database connections, memory, optimizer, and logging.
In this webinar, you will learn:
- A basic understanding of postgresql.conf
- The categories and parameters of postgresql.conf
- How to adjust parameters
- Expert tuning recommendations
This talk is divided into multiple parts, starting with a basic introduction to the popular open source database management software PostgreSQL.
Building on that, operational aspects of a PostgreSQL cluster instance will be discussed, with a focus on what can, should, and must be monitored and what kinds of tools and Nagios-related plugins are available.
The final part of the talk gives an insight into the complex and distributed nature of the postgresql.org infrastructure and how Nagios plays a crucial role as the central monitoring and alerting platform for the sysadmin team.
New enhancements for security and usability in EDB 13 (EDB)
EDB 13 is here and it enhances our flagship database server and tools. This webinar will explore its security, usability, and portability updates. Join us to learn how EDB 13 can help you improve your PostgreSQL productivity and data protection.
Webinar highlights include:
- New security features such as SCRAM and the encryption of database passwords and traffic between Failover Manager agents
- Usability updates that automate partitioning, verify backup integrity, and streamline the management of failover and backups
- Portability improvements that simplify running PostgreSQL across on-premise and cloud environments
Closed-Loop Platform Automation by Tong Zhong and Emma Collins (Liz Warner)
Closed-loop automation would dramatically help with the network transformation which is central to our business. Building a general analytics workflow to support various use cases (such as power management, fault prediction, network slicing, etc.) is a critical component in the overall platform.
Task Resource Consumption Prediction for Scientific Applications and Workflows (Rafael Ferreira da Silva)
Presentation held at the Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems 2015 Seminar - Dagstuhl
Estimates of task runtime, disk space usage, and memory consumption are commonly used by scheduling and resource provisioning algorithms to support efficient and reliable scientific application executions. Such algorithms often assume that accurate estimates are available, but such estimates are difficult to generate in practice. In this work, we first profile real scientific applications and workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize task requirements based on these profiles. Our method estimates task runtime, disk space, and peak memory consumption. It looks for correlations between the parameters of a dataset, and if no correlation is found, the dataset is divided into smaller subsets using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict task characteristics of scientific applications based on the collected data. For scientific workflows, we propose an online estimation process based on the MAPE-K loop, where task executions are monitored and estimates are updated as more information becomes available. Experimental results show that our online estimation process results in much more accurate predictions than an offline approach, where all task requirements are estimated prior to workflow execution.
What's New in Postgres Plus Advanced Server 9.3 (EDB)
Learn more about EnterpriseDB's Postgres Plus Advanced Server 9.3!
Highlights of Postgres Plus Advanced Server 9.3 include:
Major Partitioning Enhancements
Materialized Views
New RPM packages
New EDB Failover Manager
New capabilities in Postgres Enterprise Manager 4.0
Best Practices for Becoming an Exceptional Postgres DBA (EDB)
Drawing from our teams who support hundreds of Postgres instances and production database systems for customers worldwide, this presentation provides real-world best practices from the nation's top DBAs. Learn top-notch monitoring and maintenance practices, get resource planning advice that can help prevent, resolve, or eliminate common issues, learn top database tuning tricks for increasing system performance, and ultimately gain greater insight into how to improve your effectiveness as a DBA.
There are multiple reasons to restore a database from a backup. Under rare circumstances, this might even become necessary in a production environment. In that case, the business will probably require you to restore to the latest point in time known to hold valid data.
This presentation examines specific scenarios to achieve minimal or no data loss using point-in-time recovery (PITR) to an exact date and time, a particular transaction, and a savepoint.
Target Audience: This presentation is intended for solution architects and DBAs, who are responsible for designing and supporting Postgres database environments.
Flink Forward San Francisco 2018: Gregory Fee - "Bootstrapping State In Apach... (Flink Forward)
Apache Flink is a popular stream computing framework for real-time stream computing. Many stream compute algorithms require trailing data in order to compute the intended result. One example is computing the number of user logins in the last 7 days. This creates a dilemma where the results of the stream program are incomplete until the runtime of the program exceeds 7 days. The alternative is to bootstrap the program using historic data to seed the state before shifting to use real-time data. This talk will discuss alternatives to bootstrap programs in Flink. Some alternatives rely on technologies exogenous to the stream program, such as enhancements to the pub/sub layer, that are more generally applicable to other stream compute engines. Other alternatives include enhancements to Flink source implementations. Lyft is exploring another alternative using orchestration of multiple Flink programs. The talk will cover why Lyft pursued this alternative and future directions to further enhance bootstrapping support in Flink.
Improve your SQL workload with observability (OVHcloud)
How can we see everything in our perimeter? Better yet, how can we make sure everyone can follow the activity of their databases? Developers are not used to watching their databases' performance closely; how do we share this knowledge and make it accessible? This is the challenge we set ourselves. One year later, we can share our experience. Scaling is not always about throwing resources at a design issue. What if observability was not just a buzzword, but had a real impact on production?
Similar to Introduction of pg_statsinfo and pg_stats_reporter ~Statistics Reporting Tool for DBA~ (20)
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do... (UiPathCommunity)
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti... (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Key Trends Shaping the Future of Infrastructure.pdf (Cheryl Hung)
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Essentials of Automations: Optimizing FME Workflows with Parameters (Safe Software)
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
PHP Frameworks: I want to break free (IPC Berlin 2024) (Ralf Eggert)
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf (91mobiles)
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Accelerate your Kubernetes clusters with Varnish Caching (Thijs Feryn)
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Introduction of pg_statsinfo and pg_stats_reporter ~Statistics Reporting Tool for DBA~
1. Introduction of pg_statsinfo and pg_stats_reporter
~ Statistics Reporting Tool for DBA ~
NTT Open Source Software Center
Mitsumasa KONDO
Copyright(c)2013 NTT Corp. All Rights Reserved.
2. About Me
• Official Company Name
  • Nippon Telegraph and Telephone Corporation
• My Belonging
  • Service Innovation Laboratory, Software Innovation Center (Researcher)
• My Work
  • Middleware development for PostgreSQL
    • pg_statsinfo, pg_stats_reporter
    • High Availability PostgreSQL Cluster using replication with Pacemaker
  • PostgreSQL community development
    • Improvement of disk IO bottlenecks
• Past Work
  • Data mining, Natural Language Processing, Machine Learning, Recommendation, Information Retrieval
  • I am still better at these than at databases :)
• Hobby
  • Photography
  • Pure Audio
3. Today's Introduced Software
• pg_statsinfo
  • Monitors and collects PostgreSQL statistics and activities
• pg_stats_reporter
  • Visualizes the PostgreSQL statistics and activities obtained from pg_statsinfo

[Diagram: pg_statsinfo on DB servers A, B, and C collects database statistics and activity and stores them in a repository database; pg_stats_reporter creates reports from the repository. A sample report created by pg_stats_reporter is shown.]
4. Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
  • What is pg_statsinfo?
  • Feature Introduction
  • Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
  • What is pg_stats_reporter?
  • Feature Introduction
  • Demo
• Visualizing DBT-2 Benchmark using pg_statsinfo and pg_stats_reporter
  • Introduction of DBT-2
  • Visualizing DBT-2 with pg_stats_reporter
  • For more performance
5. Contents (same agenda as slide 4, repeated to introduce the first section)
6. What is pg_statsinfo?
• Monitors and collects PostgreSQL statistics and activities
  • Collects statistics and activities
  • All tables in the pg_catalog schema
  • pg_log information
  • OS resources
• Other Features
  • Report creation from the command line
  • Alert and monitoring function
  • Log management function
  • Automatic repository DB management
• Other related information
  • BSD License
  • Latest version is 2.5.0
    • http://pgfoundry.org/frs/?group_id=1000422
  • Works on PostgreSQL 9.3!
  • Online manual is here
    • http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
7. Architecture of pg_statsinfo
• Programming Language
  • C
• Starting and Pre-Setting Method
  • Start pg_statsinfo via shared_preload_libraries
  • Add the pg_statsinfo configuration to postgresql.conf; it then starts automatically with PostgreSQL
• System Configuration
  • Install pg_statsinfo on the monitored instance
  • No need to install it on the repository database instance
  • The monitored instance and the repository database can be the same instance

[Diagram: the pg_statsinfod daemon collects pg_catalog, pg_log, and OS resource information from the monitored instance and sends database statistics (snapshots) to the repository database.]
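The pre-setting step above amounts to a few lines in postgresql.conf. The fragment below is a sketch only: `shared_preload_libraries` is a standard PostgreSQL parameter, but the `pg_statsinfo.*` parameter names and values are taken from the pg_statsinfo 2.x manual and should be checked against the documentation for your version.

```conf
# Load the pg_statsinfo agent when PostgreSQL starts
shared_preload_libraries = 'pg_statsinfo'

# Illustrative pg_statsinfo settings (verify names against your version's manual)
pg_statsinfo.snapshot_interval = 10min      # default snapshot interval
pg_statsinfo.repository_server = 'host=repo-host dbname=postgres'
```

After editing postgresql.conf, restart the monitored instance so the preloaded library takes effect.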
8. Features of pg_statsinfo 1/5
• Collects statistics and activities in PostgreSQL
  • Gathers all information from PostgreSQL's statistics collector (e.g. pg_catalog)
    • For details of the statistics collector, see the PostgreSQL documentation
      • http://www.postgresql.jp/document/9.2/html/monitoring-stats.html
  • Takes statistics as snapshots at uniform intervals
    • Default: every 10 minutes
  • Analyzes pg_log and extracts activities from the logs
    • Captures activities that are only output to pg_log
      • Checkpoint activities
      • VACUUM activities
  • Collects OS resource information from /proc
    • Samples every 5 seconds; when a snapshot is taken, the averages of the samples are inserted
    • CPU usage (idle, iowait, system, user, load average)
    • Memory usage (memfree, buffers, cached, swap, dirty)
    • Disk usage (IO size, IO time, disk usage size)
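The /proc sampling described above boils down to differencing cumulative counters between two samples. The following Python sketch (not pg_statsinfo's actual C implementation) shows how CPU usage percentages can be derived from two samples of the "cpu" line in /proc/stat:

```python
# Derive CPU usage percentages from two /proc/stat "cpu" line samples,
# the way a collector like pg_statsinfo samples OS resources.
# Field order on the line: user, nice, system, idle, iowait, irq, softirq, ...
def parse_cpu_line(line):
    fields = [int(v) for v in line.split()[1:8]]
    keys = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
    return dict(zip(keys, fields))

def cpu_usage_percent(earlier, later):
    a, b = parse_cpu_line(earlier), parse_cpu_line(later)
    delta = {k: b[k] - a[k] for k in a}
    total = sum(delta.values())
    # Percentage of elapsed jiffies spent in each state between samples.
    return {k: 100.0 * v / total for k, v in delta.items()}

usage = cpu_usage_percent(
    "cpu  100 0 50 800 50 0 0",
    "cpu  200 0 100 1500 100 0 0",
)
```

The counters in /proc/stat are cumulative, so only deltas between two samples are meaningful; averaging such per-interval values over the snapshot window turns 5-second samples into per-snapshot figures.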
9. Features of pg_statsinfo 2/5
• Creates reports on the command line
  • Outputs a text-format report on the command line
    • Example) a database admin or SQL engineer who wants to see database statistics
  • Covers almost all report items created by pg_stats_reporter

Command example: create a report for all monitored instances from 2013-10-01 to now
$ pg_statsinfo -U postgres -B 2013-10-01 -r ALL | less
11. Features of pg_̲statsinfo 3/5
• Auto
m aintenance
r epository
d atabase
f eature
• Delete statistics that stored in repository database
automatically
• Pg_̲statsinfo stored data that are used partitioning method per day.
• So it can use TRUNCATE to delete old data
• Delete data is faster and lower cost
• Note
• When we use in multi monitor instance, giving priority to
shortest maintenance period of stored data configuration
[Diagram: DB server A (maintenance period 1 week) and DB server B (2 weeks) each send database statistics to the repository database; the shortest configured period, 1 week, is applied as the default maintenance period of stored data]
12. Features of pg_statsinfo 4/5
• Log management feature
• Makes PostgreSQL's logs easy to manage
• Log filtering feature
• A separate log level can be set in pg_statsinfo, which means two log levels can be used
• Example) Set PostgreSQL's log level lower to save detailed information, and pg_statsinfo's log level higher so daily reading is easier
• This feature can fix the log file name (e.g. postgresql.log), which is useful with log monitoring software
• Multiple log output feature
• Can output to both syslog and pg_log
• Log level change feature
• The log level of specific log messages can be changed
• Example) Change the log level of a specific message from INFO to LOG
• Log compression and management feature
• Compresses and manages old logs automatically
[Diagram: flow of statistics extraction from pg_log — pg_statsinfod reads pg_log (CSV format), extracts statistics, and writes a formatted log of its own (postgresql.log)]
13. Features of pg_statsinfo 5/5
• Alert and Monitoring Function (Trigger Function)
• Outputs an alert log when a database metric exceeds its alert threshold
• Usage) Monitor the alert log with monitoring software
• The alert function is executed at every snapshot
• The default settings are listed below; set appropriate values for your server
• Settings are changed with UPDATE SQL on the statsrepo.alert table
Alert configuration table (statsrepo.alert):

column                | default | explanation
instid                | -       | Target instance ID
rollback_tps          | 100     | Number of rollbacks per second
commit_tps            | 1000    | Number of commits per second
garbage_size          | 20000   | Garbage record size in the table
garbage_percent       | 30      | Garbage record percentage in the database (%)
garbage_percent_table | 30      | Garbage record percentage in the table (%)
response_avg          | 10      | Average query response time (sec)
response_worst        | 60      | Worst query response time (sec)
enable_alert          | true    | Enable the alert function
14. How to install pg_statsinfo ?
1. Install the RPM file
$ su
# rpm -ivh pg_statsinfo-2.50-1.pg93.rhel6.x86_64.rpm
2. Add configuration to postgresql.conf
# minimum configuration
shared_preload_libraries = 'pg_statsinfo'         # pre-load library setting
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'   # log file name setting (required)
3. Start PostgreSQL normally
$ pg_ctl -D data start
4. If you see the following log messages, the installation succeeded!
server starting
LOG: loaded library "pg_statsinfo"
LOG: pg_statsinfo launcher started
LOG: start
LOG: installing schema: statsinfo
LOG: installing schema: statsrepo_partition
How to install pg_statsinfo is described in the Web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html#install
15. Demo of pg_statsinfo
1. Install
2. Confirmation of install
3. Collect database statistics and activities (snapshot)
4. Create report
16. TIPS of pg_statsinfo
• One snapshot is 300 kB ~ 800 kB in size
• Be careful that snapshots do not fill the disk!
• Performance degradation from installing the software is almost nothing
• But a little does occur: in a DBT-2 benchmark, we confirmed a 2% degradation
• If you'd like a separate repository server, set "pg_statsinfo.repository_server" in postgresql.conf
• The default setting is 'host=localhost port=5432'
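A hypothetical postgresql.conf fragment for pointing a monitored instance at a remote repository (the host name is a placeholder; keywords beyond host/port are assumed to follow libpq connection-string syntax, matching the default shown above):

```
# postgresql.conf on the monitored instance
pg_statsinfo.repository_server = 'host=repo.example.com port=5432 user=postgres dbname=postgres'
```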
• If the repository database requires a password, set it in /var/lib/pgsql/.pgpass
• pg_statsinfo runs as the postgres user
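A minimal .pgpass sketch for the postgres user (host name and password are placeholders):

```
# /var/lib/pgsql/.pgpass — format: hostname:port:database:username:password
repo.example.com:5432:postgres:postgres:secret
```

Note that libpq ignores the file unless it is readable only by its owner (chmod 0600 /var/lib/pgsql/.pgpass).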
17. Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
• What is pg_statsinfo ?
• Feature introduction
• Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
• What is pg_stats_reporter ?
• Feature introduction
• Demo
• Visualizing DBT-2 Benchmark using pg_statsinfo and pg_stats_reporter
• Introduction of DBT-2
• DBT-2 visualized by pg_stats_reporter
• For more performance
18. What is pg_stats_reporter ?
• Visualizes the PostgreSQL statistics and activities obtained by pg_statsinfo
• Report items
• Transaction situation
• Size of database
• OS resources
• Amount of WAL output
• Replication state
• Deadlock information
• Successor to pg_reporter
• Extra information
• BSD License
• Latest version is 2.0.0
[Screenshot: a report from pg_stats_reporter]
• http://pgfoundry.org/frs/?group_id=1000422
• A detailed online manual is here:
• http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
19. Architecture of pg_stats_reporter
• Software
• Apache + PHP + PostgreSQL
• A PHP + PostgreSQL-only combination is also OK
• Requires PostgreSQL 8.3 or later
• Programming languages
• PHP + JavaScript + SQL
• Libraries used
• PHP framework
• Smarty
• User interface
• jQuery, jQuery UI, tablesorter, Superfish
• Graph creation
• dygraphs, jqPlot
20. How to Create Report ? 1/2
• By Web browser
• Only a few clicks are needed to create a report
[Screenshot: ① select the database instance to report on, ② push the "create new report" button, ③ set the period and time of the report]
21. How to Create Report ? 2/2
• By command line
• Works in PHP's standalone mode
• Usage scenes
• Creating a report on the command line
• Creating reports with crond at regular intervals
• If you use only command-line mode, Apache isn't needed
• e.g. if a security policy prohibits installing Apache
• When reports need to be kept long term
• The repository database is only kept for a certain period
• Created reports aren't erased
Command usage: Create a report for 10/1 to 10/8 in report_dir
$ pg_stats_reporter -B 2013-10-01 -E 2013-10-08 -O report_dir
[LOG] Report file created: sample_localhost_5432_1_20131008-1419_20131008-1945.html
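As a sketch of the crond usage mentioned above, a crontab entry for the postgres user might look like this (the output directory and date range are illustrative; note that % must be escaped as \% inside a crontab):

```
# Create a report covering the previous day at 00:05 every morning
5 0 * * * pg_stats_reporter -B "$(date -d yesterday +\%Y-\%m-\%d)" -E "$(date +\%Y-\%m-\%d)" -O /var/lib/pgsql/reports
```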
22. How to Create Report ? 2/2
• Index of reports feature
• Creates the report and an index of reports in the report directory
• Makes it easy to browse and sort reports
[Diagram: the report directory contains the pg_stats_reporter libraries, index.html (the index of reports), and previously created report HTML files]
23. How to install pg_stats_reporter ?
1. Install the pg_stats_reporter RPM and dependency RPMs
$ su
# rpm -ivh httpd-2.2.15-15.el6_2.1.x86_64.rpm
php-5.3.3-3.el6_2.8.x86_64.rpm
php-common-5.3.3-3.el6_2.8.x86_64.rpm
php-pgsql-5.3.3-3.el6_2.8.x86_64.rpm
php-intl-5.3.3-3.el6_2.8.x86_64.rpm
pg_stats_reporter-1.0.0-1.el6.noarch.rpm
2. Set pg_stats_reporter.ini (the configuration file); the default settings are shown below
# vim /etc/pg_stats_reporter.ini
----- configuration of repository database -----
host = localhost
port = 5432
dbname = postgres
username = postgres
password =
3. Start Apache HTTP server
# service httpd start
4. Access the following URL
http://localhost/pg_stats_reporter/pg_stats_reporter.php
Please set SELinux to disabled!!
How to install pg_stats_reporter is described in the Web manual! :)
http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html#install
25. TIPS of pg_stats_reporter
• Works on Android and iPad
• It is based on the jQuery UI library, so the interface design (mostly colors) is easy to change
• The logo picture can also be changed by replacing a file
• Report items can be selected for inclusion in reports
• If you'd like to, set the report items you need in /etc/pg_stats_reporter.ini
• For security
• You can use .htaccess
• Apache's usual security techniques apply here as well
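As one example of the .htaccess approach, basic authentication could be configured roughly like this (paths are placeholders; this is standard Apache configuration, not anything specific to pg_stats_reporter, and it requires AllowOverride AuthConfig for the directory):

```
# .htaccess in the pg_stats_reporter directory
AuthType Basic
AuthName "pg_stats_reporter"
AuthUserFile /etc/httpd/.htpasswd
Require valid-user
```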
26. Contents
• pg_statsinfo ~ Monitor and Collect DB Statistics and Activities ~
• What is pg_statsinfo ?
• Feature introduction
• Demo
• pg_stats_reporter ~ Visualize DB Statistics and Activities ~
• What is pg_stats_reporter ?
• Feature introduction
• Demo
• Visualizing DBT-2 Benchmark using pg_statsinfo and pg_stats_reporter
• Introduction of DBT-2
• DBT-2 visualized by pg_stats_reporter
• For more performance
27. What is DBT-2 ?
• TPC-C benchmark software developed by Open Source Development Labs (OSDL)
• Simulates shopping at a parts wholesaler
• http://www.tpc.org/tpcc/
• The benchmark score counts only responses returned within a fixed time
• Response time is very important!
• An I/O-bottlenecked benchmark
• Main benchmark parameters
• warehouse
• Database size parameter
• Each increment of 1 adds one hundred thousand records
• Mainly used to adjust the database size
• TPW
• Transactions per warehouse
• Clients are prepared according to the warehouse size; default is 10
• Setting a lower TPW makes it a CPU-bottlenecked benchmark
28. Transaction Tendency in DBT-2
• Main bottleneck
• Random read/write
• Almost all SQL plans are index scans
• Random read/write performance and cache/buffer replacement performance are important
• Parallel execution performance is also important
• PostgreSQL is better than other RDBMSs here :)
• Other features
• The SQL plans are very simple
• Most SQL statements use only index scan access
• An ideal benchmark score exists
• If the DB answers all transactions within the time limit, that is the ideal score
• Performance hits its limit when memory reaches about 2x the database size
• The amount of WAL output is less than in pgbench; WAL is not a bottleneck
29. Test Server and Settings of postgresql.conf

Server:    HP DL360 G7
CPU:       Xeon E5640 2.66GHz (1P/4C)
Memory:    DDR3-10600R-9 18GB
RAID card: P410i / 256MB cache
Disk:      4 x 146GB (15krpm), RAID 1+0
postgresql.conf (mainly changed parameters):
max_connections = 300
shared_buffers = 2458MB
work_mem = 1MB
maintenance_work_mem = 64MB
fsync = on
wal_sync_method = fdatasync
full_page_writes = on
wal_buffers = -1
archive_mode = on
checkpoint_segments = 300
checkpoint_timeout = 15min
checkpoint_completion_target = 0.7
random_page_cost = 2.0
effective_cache_size = 9GB
default_statistics_target = 10
log_destination = 'syslog'
autovacuum = on
Warehouse size = 320 (database size is about 40GB) and TPW = 10
30. Visualizing DBT-2 by pg_stats_reporter 1/5
• Transaction situation
• Transaction throughput fluctuates; this comes from benchmark specifications and some implementation-dependent behavior in PostgreSQL
• Performance drops while CHECKPOINT is executing
• CHECKPOINTs were mainly triggered by checkpoint_timeout
• postgresql.conf sets checkpoint_timeout = 15min and checkpoint_segments = 300
31. Visualizing DBT-2 by pg_stats_reporter 2/5
• Amount of WAL output
• 4.6GB of WAL was output from data load to benchmark finish
• During data load, the maximum WAL speed was 54MB/sec
• During the benchmark run, the maximum WAL speed was 12MB/sec
• When a CHECKPOINT starts, the WAL speed rises because of full-page writes
32. Visualizing DBT-2 by pg_stats_reporter 3/5
• CPU usage
• iowait is highest, then idle (this indicates an I/O-bottlenecked situation)
• The final part of the CHECKPOINT causes a high load average
• This is because of ugly consecutive fsync() calls
• PostgreSQL's CHECKPOINT logic is not good :(
33. Visualizing DBT-2 by pg_stats_reporter 4/5
• Updated and heavily accessed tables
• HOT (Heap-Only Tuples) works well!
• The order_line and stock tables receive many accesses
• Each table's cache hit rate is very high, but… (is it really? :( )
34. Visualizing DBT-2 by pg_stats_reporter 5/5
• Query execution situation
• Queries with complicated filter clauses are slow
• Unexpectedly, COMMIT takes a long time!
• This is because committing a long transaction needs a lot of WAL (WAL buffer writing)
• The fsync() phase of the final CHECKPOINT makes queries slower
35. For More Performance
• Use direct_cp as the archive copy command
• When archive mode is used in PostgreSQL, the cp command consumes a large amount of wasted file cache, which lowers performance
• BSD License software
• http://directcp.projects.pgfoundry.org/index.html
• Use SSD
• In general, the database bottleneck is random access; SSD random access is about 10 times faster than a magnetic disk
• If you need a large disk or the cost is too high, using an SSD tablespace for only the hot tables is very efficient
• Use a RAID card with a large cache
• PostgreSQL's CHECKPOINT does not schedule fsync() at all, which causes very heavy disk writes and can even cause failover :(
• A RAID card with a large cache may mitigate this a little
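For reference, the standard cp-based archive setting from the PostgreSQL documentation is shown below; the first tip above suggests substituting direct_cp for cp (the exact direct_cp invocation is not shown on the slide, so check its manual):

```
# postgresql.conf — WAL archiving with plain cp (archive path is a placeholder)
archive_mode = on
archive_command = 'cp %p /mnt/server/archivedir/%f'
```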
36. Summary
• pg_statsinfo
• Monitors and collects PostgreSQL statistics and activities as a time series
• BSD License
• http://pgstatsinfo.projects.pgfoundry.org/pg_statsinfo-ja.html
• Collects all the statistics and activities a DB admin needs
• If you'd like a new report, create a reporting SQL query from the collected information
• pg_stats_reporter
• Visualizes the PostgreSQL statistics and activities collected by pg_statsinfo
• BSD License
• http://pgstatsinfo.projects.pgfoundry.org/pg_stats_reporter-ja.html
• jQuery-based useful interface
• The report index feature is also useful
• The software is easy to improve because it is written in PHP + JavaScript
• It is also easy to submit a patch :)