SlideShare a Scribd company logo
Wim Godden
Cu.be Solutions
@wimgtr
Beyond PHP :
It's not (just) about the code
Who am I ?
Wim Godden (@wimgtr)
Where I'm from
Where I'm from

Recommended for you

My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think

With more and more sites falling victim to data theft, you've probably read the list of things (not) to do to write secure code. But what else should you do to make sure your code and the rest of your web stack is secure ? In this tutorial we'll go through the basic and more advanced techniques of securing your web and database servers, securing your backend PHP code and your frontend javascript code. We'll also look at how you can build code that detects and blocks intrusion attempts and a bunch of other tips and tricks to make sure your customer data stays secure.

phpleuvendata theftrecovery
The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6

With PHP 5.6 out and many production environments still running 5.2 or 5.3, it's time to paint a clear picture on why everyone should move to 5.5 and 5.6 and how to get code ready for the latest version of PHP. In this talk, we'll look at some handy tools and techniques to ease the migration.

phpphp 5.4php 5.5
Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !

This document summarizes a presentation about using caching and reverse proxies like Nginx and Varnish to improve PHP application performance and scalability. It discusses using Edge Side Includes (ESI) and a custom caching language (SCL) to cache dynamic and user-specific content on the edge. Benchmark results show these techniques reducing server requirements by over 80% for some applications like vBulletin forums. The presenter also provides contact information and announces plans to open source the SCL caching language.

phpvarnishnginx
Where I'm from
Where I'm from
Where I'm from
Where I'm from

Recommended for you

The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5

With PHP 5.5 out and many production environments still running 5.2 (or older), it's time to paint a clear picture on why everyone should move to 5.4 and 5.5 and how to get code ready for the latest version of PHP. In this talk, we'll migrate an old piece of code using some standard and some very non-standard tools and techniques.

php 5.5php_codesnifferwebandphp
Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011

The document discusses using caching and tuning techniques to improve scalability for websites. It covers caching full pages, parts of pages, SQL queries, and complex processing results. Memcache is presented as a fast and distributed caching solution. The document also discusses installing and using Memcache, as well as replacing Apache with Nginx as a lighter-weight web server that can serve static files and forward dynamic requests.

cachingtuningphp
Load Data Fast!
Load Data Fast!Load Data Fast!
Load Data Fast!

We all have tasks from time to time for bulk-loading external data into MySQL. What's the best way of doing this? That's the task I faced recently when I was asked to help benchmark a multi-terrabyte database. We had to find the most efficient method to reload test data repeatedly without taking days to do it each time. In my presentation, I'll show you several alternative methods for bulk data loading, and describe the practical steps to use them efficiently. I'll cover SQL scripts, the mysqlimport tool, MySQL Workbench import, the CSV storage engine, and the Memcached API. I'll also give MySQL tuning tips for data loading, and how to use multi-threaded clients.

programmingmysql
My town
My town
My town
Belgium – the traffic

Recommended for you

MySQL under the siege
MySQL under the siegeMySQL under the siege
MySQL under the siege

This document discusses various strategies for optimizing MySQL performance, including: 1. Using partitioning, indexes, and caching to improve query performance. Functional and multi-level partitioning techniques are described. 2. Denormalizing data, adding columns, and flagging rows to reduce the number of joins and optimize for read-heavy workloads. 3. Monitoring tools like the slow query log, server status, and InnoTOP are recommended for identifying bottlenecks and tracking improvements over time. Testing and an iterative process of addressing the biggest problems is advised.

performancedatabasedesign
Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...

The document discusses performance optimization for OpenERP deployments handling high volumes of transactions and data. It provides recommendations around hardware sizing, PostgreSQL and OpenERP architecture, monitoring tools, and analyzing PostgreSQL logs and statistics. Key recommendations include proper sizing based on load testing, optimizing PostgreSQL configuration and storage, monitoring response times and locks, and analyzing logs to identify performance bottlenecks like long-running queries or full table scans.

 
by Odoo
openerpopenerpdays
Varnish, the high performance valhalla?
Varnish, the high performance valhalla?Varnish, the high performance valhalla?
Varnish, the high performance valhalla?

When high performance on a web application is a hard requirement Varnish can be of rescue. But does it’s name, the high-performance HTTP accelerator, really bring what you expect? What are the caveats, pitfalls and problems you introduce when developing your application when the released version is only able to run when there is a Varnish in front? This session will give you some answers, tips and tricks to aid in application design, development with PHP and solutions when there is no Varnish in front of your application.

high performancevarnishesi
Who am I ?
Wim Godden (@wimgtr)
Founder of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of OpenX, PHPCompatibility, PHPConsistent, Nginx
SLIC, ...
Speaker at PHP and Open Source conferences
Cu.be Solutions ?
Open source consultancy
PHP-centered
Training courses
High-speed redundant network (BGP, OSPF, VRRP)
High scalability development
Nginx + extensions
MySQL Cluster
Projects :
mostly IT & Telecom companies
lots of public-facing apps/sites
Who are you ?
Developers ?
Anyone setup a MySQL master-slave ?
Anyone setup a site/app on separate web and database server ?
→ How much traffic between them ?
The topic
Things we take for granted
Famous last words : "It should work just fine"
Works fine today
→ might fail tomorrow
Most common mistakes
PHP code ↔ PHP ecosystem

Recommended for you

Survey of Percona Toolkit
Survey of Percona ToolkitSurvey of Percona Toolkit
Survey of Percona Toolkit

The document summarizes the Percona Toolkit, which contains free and open source command-line tools for MySQL based on Percona's experience developing best practices. Some of the most popular tools are pt-summary, pt-mysql-summary, pt-stalk, pt-archiver, and pt-query-digest, which allow users to summarize MySQL servers, analyze queries from logs, and check for issues. The toolkit can be installed via package repositories or by downloading individual tools.

mysqltoolspercona
Percona toolkit
Percona toolkitPercona toolkit
Percona toolkit

Using MySQL without Maatkit is like taking a photo without removing the camera's lens cap. Professional MySQL experts use this toolkit to help keep complex MySQL installations running smoothly and efficiently. This session will show you practical ways to use Maatkit every day. 

mysqlpercona
MySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB StatusMySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB Status

MySQL exposes a collection of tunable parameters and indicators that is frankly intimidating. But a poorly tuned MySQL server is a bottleneck for your PHP application scalability. This session shows how to do InnoDB tuning and read the InnoDB status report in MySQL 5.5.

mysqlinnodbdatabase
It starts with...
… code !
First up : database
Database queries – complexity
SELECT DISTINCT n.nid, n.uid, n.title, n.type, e.event_start, e.event_start AS
event_start_orig, e.event_end, e.event_end AS event_end_orig, e.timezone,
e.has_time, e.has_end_date, tz.offset AS offset, tz.offset_dst AS offset_dst,
tz.dst_region, tz.is_dst, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst,
tz.offset) HOUR_SECOND AS event_start_utc, e.event_end - INTERVAL
IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc,
e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND +
INTERVAL 0 SECOND AS event_start_user, e.event_end - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS
event_end_user, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset)
HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site, e.event_end -
INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0
SECOND AS event_end_site, tz.name as timezone_name FROM node n INNER
JOIN event e ON n.nid = e.nid INNER JOIN event_timezones tz ON tz.timezone =
e.timezone INNER JOIN node_access na ON na.nid = n.nid LEFT JOIN
domain_access da ON n.nid = da.nid LEFT JOIN node i18n ON n.tnid > 0 AND
n.tnid = i18n.tnid AND i18n.language = 'en' WHERE (na.grant_view >= 1 AND
((na.gid = 0 AND na.realm = 'all'))) AND ((da.realm = "domain_id" AND da.gid = 4)
OR (da.realm = "domain_site" AND da.gid = 0)) AND (n.language ='en' OR
n.language ='' OR n.language IS NULL OR n.language = 'is' AND i18n.nid IS NULL)
AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND
e.event_start <= '2010-03-01 23:59:59') OR (e.event_end >= '2010-01-31 00:00:00'
AND e.event_end <= '2010-03-01 23:59:59') OR (e.event_start <= '2010-01-31
00:00:00' AND e.event_end >= '2010-03-01 23:59:59')) ) GROUP BY n.nid HAVING
(event_start >= '2010-02-01 00:00:00' AND event_start <= '2010-02-28 23:59:59')
OR (event_end >= '2010-02-01 00:00:00' AND event_end <= '2010-02-28 23:59:59')
OR (event_start <= '2010-02-01 00:00:00' AND event_end >= '2010-02-28
23:59:59') ORDER BY event_start ASC;
Database - indexing
'select id from stock where status = 2 order by qty'
→ aggregate index on (status, qty)
'select id from stock where status > 2 order by qty'
→ aggregate index on (status, qty) ?
→ Depends :
- Btree : yes
- Hash : range selection stops use of aggregate index
→ separate index on status and qty (since recent versions)
Database - indexing
Indexes make database faster
→ Let's index everything !
→ DON'T :
Insert/update/delete → Index modification
Each query → evaluation of all indexes
"Relational schema design is based on data
but index design is based on queries"
(Bill Karwin, Percona)

Recommended for you

Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)

Postgres has always had strong support for relational storage. However, there are many cases where relational storage is either inefficient or overly restrictive. This talk shows the many ways that Postgres has expanded to support non-relational storage, specifically the ability to store and index multiple values, even unrelated ones, in a single database field. Such storage allows for greater efficiency and access simplicity, and can also avoid the negatives of entity-attribute-value (eav) storage. The talk will cover many examples of multiple-value-per-field storage, including arrays, range types, geometry, full text search, xml, json, and records.

системы храненияhl++2016Базы данных
Cassandra for Python Developers
Cassandra for Python DevelopersCassandra for Python Developers
Cassandra for Python Developers

A high level introduction to Apache Cassandra followed by an introduction to pycassa, the Python client library for Cassandra. Presented at PyTexas 2011 by Tyler Hobbs.

pythonpytexascassandra
Working with databases in Perl
Working with databases in PerlWorking with databases in Perl
Working with databases in Perl

An overview of the main questions/design issues when starting to work with databases in Perl - choosing a database - matching DB datatypes to Perl datatypes - DBI architecture (handles, drivers, etc.) - steps of DBI interaction : prepare/execute/fetch - ORM principles and difficulties, ORMs on CPAN - a few examples with DBIx::DataModel - performance issues First given at YAPC::EU::2009 in Lisbon. Updated version given at FPW2011 in Paris and YAPC::EU::2011 in Riga

yapceu2009relational databasesql
Databases – detecting problematic queries
Slow query log
→ SET GLOBAL slow_query_log = ON;
Queries not using indexes
→ In my.cnf/my.ini : 'log_queries_not_using_indexes'
General query log
→ SET GLOBAL general_log = ON;
→ Turn it off quickly !
Percona Toolkit (Maatkit)
pt-query-digest
Databases - pt-query-digest
# Profile
# Rank Query ID Response time Calls R/Call Apdx V/M Item
# ==== ================== ================ ===== ======= ==== ===== ==========
# 1 0x543FB322AE4330FF 16526.2542 98.2% 1208 13.6806 1.00 0.00 SELECT output_option
# 2 0xE78FEA32E3AA3221 0.8312 0.0% 6412 0.0001 1.00 0.00 SELECT poller_output poller_item
# 3 0x211901BF2E1C351E 0.6811 0.0% 6416 0.0001 1.00 0.00 SELECT poller_time
# 4 0xA766EE8F7AB39063 0.2805 0.0% 149 0.0019 1.00 0.00 SELECT wp_terms wp_term_taxonomy wp_term_relationships
# 5 0xA3EEB63EFBA42E9B 0.1999 0.0% 51 0.0039 1.00 0.00 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary
# 6 0x94350EA2AB8AAC34 0.1956 0.0% 89 0.0022 1.00 0.01 UPDATE wp_options
# MISC 0xMISC 302.8137 1.8% 3853 0.0002 NS 0.0 <147 ITEMS>
Databases - pt-query-digest
# Query 2: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299
# This item is included in the report because it matches --limit.
# Scores: Apdex = 1.00 [1.0], V/M = 0.00
# Query_time sparkline: | _^ |
# Time range: 2011-12-28 18:42:47 to 19:03:10
# Attribute pct total min max avg 95% stddev median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count 1 312
# Exec time 50 4s 5ms 25ms 13ms 20ms 4ms 12ms
# Lock time 3 32ms 43us 163us 103us 131us 19us 98us
# Rows sent 59 62.41k 203 231 204.82 202.40 3.99 202.40
# Rows examine 13 73.63k 238 296 241.67 246.02 10.15 234.30
# Rows affecte 0 0 0 0 0 0 0 0
# Rows read 59 62.41k 203 231 204.82 202.40 3.99 202.40
# Bytes sent 53 24.85M 46.52k 84.36k 81.56k 83.83k 7.31k 79.83k
# Merge passes 0 0 0 0 0 0 0 0
# Tmp tables 0 0 0 0 0 0 0 0
# Tmp disk tbl 0 0 0 0 0 0 0 0
# Tmp tbl size 0 0 0 0 0 0 0 0
# Query size 0 21.63k 71 71 71 71 0 71
# InnoDB:
# IO r bytes 0 0 0 0 0 0 0 0
# IO r ops 0 0 0 0 0 0 0 0
# IO r wait 0 0 0 0 0 0 0 0
# pages distin 40 11.77k 34 44 38.62 38.53 1.87 38.53
# queue wait 0 0 0 0 0 0 0 0
# rec lock wai 0 0 0 0 0 0 0 0
# Boolean:
# Full scan 100% yes, 0% no
# String:
# Databases wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more
# Hosts
# InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more
# Last errno 0
# Users wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more
# Query_time distribution
# 1us
# 10us
# 100us
# 1ms
# 10ms ################################################################
# 100ms
# 1s
# 10s+
# Tables
# SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_options'G
# SHOW CREATE TABLE `wp_blog_one `.`wp_options`G
# EXPLAIN /*!50100 PARTITIONS*/
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'G
Databases – next step : explain
explain <query>
"How will MySQL execute the query"

Recommended for you

New Framework - ORM
New Framework - ORMNew Framework - ORM
New Framework - ORM

- Odoo 13 includes the biggest ORM refactoring since OpenERP 8, focusing on performance improvements by optimizing the in-memory cache, reducing SQL queries, and delaying computations. - Key changes include a single unified cache, preferring in-memory updates over SQL, optimizing dependency trees, and avoiding unnecessary format conversions to reduce overhead. - Onchange methods are being deprecated in favor of computed fields, which provide a cleaner separation of business logic and interface concerns. Computed fields work both in Python and JavaScript and have well-defined dependencies.

 
by Odoo
Best Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesBest Practices in Handling Performance Issues
Best Practices in Handling Performance Issues

The document provides best practices for handling performance issues in an Odoo deployment. It recommends gathering deployment information, such as hardware specs, number of machines, and integration with web services. It also suggests monitoring tools to analyze system performance and important log details like CPU time, memory limits, and request processing times. The document further discusses optimizing PostgreSQL settings, using tools like pg_activity, pg_stat_statements, and pgbadger to analyze database queries and performance. It emphasizes reproducing issues, profiling code with tools like the Odoo profiler, and fixing problems in an iterative process.

 
by Odoo
odoo experience 2020odooperformance
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code

Most PHP developers focus on writing code. But creating Web applications is about much more than just wrting PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.

databasesmaster-slavemysql
Databases – next step : explain
+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | employees | ALL | NULL | NULL | NULL | NULL | 299809 | USING WHERE |
+----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
+----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
| 1 | SIMPLE | itdevice | const | PRIMARY,fk_device_devicetype1 | PRIMARY | 4 | const | 1 | |
| 1 | SIMPLE | devicetype | const | PRIMARY | PRIMARY | 4 | const | 1 | |
+----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
Databases – next step : explain
Type of lookup
'system', 'const' and 'ref' = good
'ALL' = bad
Extra info
Using index = good
Using filesort = usually bad
For / foreach
$customers = CustomerQuery::create()
->filterByState('MN')
->find();
foreach ($customers as $customer) {
$contacts = ContactsQuery::create()
->filterByCustomerid($customer->getId())
->find();
foreach ($contacts as $contact) {
doSomestuffWith($contact);
}
}
Joins
$contacts = mysql_query("
select
contacts.*
from
customer
join contact
on contact.customerid = customer.id
where
state = 'MN'
");
while ($contact = mysql_fetch_array($contacts)) {
doSomeStuffWith($contact);
}
or the ORM equivalent

Recommended for you

Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code

Most PHP developers focus on writing code. But creating Web applications is about much more than just writing PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.

networkingcodingphp
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code

Most PHP developers focus on writing code. But creating Web applications is about much more than just wrting PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.

databasesmaster-slavemysql
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wycieków

Wszyscy zostaliśmy oszukani! Automatyczne zarządzanie pamięci rozwiąże wszystkie Wasze problemy, mówili. W zarządzanych środowiskach takich jak CLR JVM nie będzie wycieków pamięci, mówili! Właściwie pamięć jest tania i nie musisz się już nią nigdy więcej martwić. Wszyscy kłamali. Automatyczne zarządzanie pamięcią jest wygodną abstrakcją i bardzo często działa dobrze. Ale jak każda abstrakcja, wcześniej czy później "wycieka" ona. I to najczęściej w najmniej spodziewanym i przyjemnym momencie. W tej sesji spróbuję otworzyć oczy na fakt, że błoga nieświadomość nt. tej abstrakcji może być kosztowna. Pokażę jak może się objawić frywolne traktowanie pamięci i co możemy zyskać pisząc kod zdając sobie sprawę, że pamięć jednak nie jest nieskończona, tania i zawsze jednakowo szybka.

pamięć.netjava
Better...
10001 → 1 query
Sadly : people still produce code with query loops
Usually :
Growth not anticipated
Internal app → Public app
The origins of this talk
Customers :
Projects we built
Projects we didn't build, but got pulled into
Fixes
Changes
Infrastructure migration
15 years of 'how to cause mayhem with a few lines of code'
Client X
Jobs search site
Monitor job views :
Daily hits
Weekly hits
Monthly hits
Which user saw which job
Client X
Originally : when user viewed job details
Now : when job is in search result
Search for 'php' → 50 jobs = 50 jobs to be updated
→ 50 updates for shown_today
→ 50 updates for shown_week
→ 50 updates for shown_month
→ 50 inserts for shown_user
= 200 queries for 1 search !

Recommended for you

Clean Code Development
Clean Code DevelopmentClean Code Development
Clean Code Development

Good and Bad Code The Broken Window Theory The Grand Redesign in the Sky The Sushi Chef Rule The Hotel Room Rule The Boy Scout Rule OOP Patterns and Principles SOLID Principles How to measure clean code? Tools

Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook


Uma equipe de apenas 14 engenheiros (junho de 2017) cuida da Infraestrutura do principal banco de dados do Facebook. Toda ação no Instagram, Messenger, WhatsApp e claro, no FB, passa direta ou indiretamente pela infra de dezenas de milhares de servidores que rodam MySQL. A linguagem usada pela equipe e por trás de toda a automação é Python. Nessa palestra, vamos mostrar como a linguagem possibilitou que chegássemos nessa escala, passando pela evolução, desafios e futuro: Interfaces “tipadas” com Thrift; Empacotamento através do Buck; Type Checking com MyPy; Asyncio para novos serviços; Debugging com gdb 7 e pudb; Além disso, durante a palestra, serão discutidas algumas decisões relacionadas a DevOps que podem inspirar soluções em outros ambientes: gerenciamento de dezenas de milhares de servidores e banco de dados, backups e restores contínuos, schema migrations, entre outros.

pythonmysqldevops
Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)

This document discusses techniques for finding duplicate records in large datasets. It describes a machine learning framework with two steps: candidate selection and candidate scoring. The candidate selection step uses domain knowledge, information retrieval techniques like "more like this" queries, and approximate nearest neighbors to find candidate duplicates. The candidate scoring step then uses machine learning models trained on pairwise record comparisons to identify true duplicates among the candidates. Features for the models include differences in fields, text similarity measures, image hashes and embeddings. Approximate techniques like locality sensitive hashing allow scaling these methods to very large datasets.

duplicatesmachine learning
Client X : the code
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
Client X : the graph
Client X : the numbers
600-1000 inserts/sec (peaks up to 1600)
400-1000 updates/sec (peaks up to 2600)
16 core machine
Client X : panic !
Mail : "MySQL slave is more than 5 minutes behind master"
We set it up → who did they blame ?
Wait a second !

Recommended for you

Explain this!
Explain this!Explain this!
Explain this!

The document provides information about troubleshooting and monitoring PostgreSQL performance. It includes: - Top output showing PostgreSQL processes using significant CPU resources - Iotop output showing disk IO from PostgreSQL processes - Pg_stat_activity output listing active queries and sessions - Pg_stat_statements output showing top queries by execution time - Explanations and examples of using EXPLAIN, EXPLAIN ANALYZE, and ANALYZE to view query plans and statistics

postgresqldatabasesgdb
Apache Spark Side of Funnels
Apache Spark Side of FunnelsApache Spark Side of Funnels
Apache Spark Side of Funnels

Last year we decided to build an in-house solution for Funnel analysis which should be accessible to our business user through our BI tool. Backend part should run on Ap;ache Spark and since the BI tool can only run SQL queries that implies that the solution is a pure Spark SQL implementation of Funnel analysis. In this talk we will cover various Spark SQL features we have used to optimize query performance and implement various filters which enable end users to get actionable insights. KEY TAKEAWAYS: – single query approach to Funnel analysis (can be applied to any funnel-like problem) – using window functions to ensure ordering of the events in the funnel – examples of higher order functions to calculate funnel metrics

apache spark

 *big data

 *ai

 *
From logs to metrics
From logs to metricsFrom logs to metrics
From logs to metrics

The document discusses transforming Kubernetes logs into metrics using the TICK stack. It begins by describing how syslog logs from journald can be parsed using the go-syslog parser and sent as metrics to InfluxDB via the Telegraf syslog input plugin. It then shows the YAML configuration to deploy Chronograf and InfluxDB for visualization. Finally, it proposes writing a Kapacitor tick script with a UDF to detect and count OOM events from the logs and send as metrics.

#syslog #telegraf #logging #influx #ragel #tick #i
Client X : what's causing those peaks ?
Client X : possible cause ?
Code changes ?
→ According to developers : none
Action : turn on general log, analyze with pt-query-digest
→ 50+-fold increase in 4 queries
→ Developers : 'Oops we did make a change'
After 3 days : 2,5 days behind
Every hour : 50 min extra lag
Client X : But why is the slave lagging ?
Master Slave
File :
master-bin-xxxx.log
File :
master-bin-xxxx.logSlave I/O thread
Binlog dump
thread
Slave
SQL
thread
Client X : Master

Recommended for you

Windbg랑 친해지기
Windbg랑 친해지기Windbg랑 친해지기
Windbg랑 친해지기

The document discusses analyzing crashes using WinDbg. It provides tips on reconstructing crashed call stacks and investigating what thread or lock is causing a hang. The debugging commands discussed include !analyze, !locks, .cxr, kb to find the crashing function and stuck thread.

Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...

Debugging data processing logic in Data-Intensive Scalable Computing (DISC) systems is a difficult and time consuming effort. To aid this effort, we built Titian, a library that enables data provenance tracking data through transformations in Apache Spark.

apache sparkuclabig data day la
Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)

Alexey Grigorev discusses various machine learning techniques for detecting duplicate listings, including feature engineering, candidate selection, and candidate scoring. Candidate selection uses domain knowledge, information retrieval, and approximate k-nearest neighbors to find candidate duplicates. Candidate scoring then uses machine learning models to identify actual duplicates among the candidates. Image embeddings and locality-sensitive hashing are also discussed to enable approximate matching of images for duplicate detection.

duplicatesmachine learning
Client X : Slave
Client X : fix ?
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
Client X : the code change
insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_user values (5, 23, "2013-11-12 12:01:00"), (8, 23, "2013-11-12
12:01:00"), … ;
Client X : the code change
$todayQuery = "
insert into shown_today(
jobId,
number
) values ";
foreach ($jobs as $job) {
$todayQuery .= "(" . $job['id'] . ", 1),";
}
$todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1);
$todayQuery .= "
)
on duplicate key
update
number = number + 1
";
$db->query($todayQuery);

Recommended for you

5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama

"Netflix is actually a log generating application that just happens to stream movies" Building a service/Microservice is itself easy. Scaling it on the cloud is not that hard either but operating, maintaining and iterating a production large scale service is not just about linearisation. As Cockcroft points out, telemetry and monitoring is the most important aspect of building Microservices We discuss 5 patterns that any serious Microservice should have: - Canary (an endpoint reporting health of underlying dependencies) - IO monitor (measuring all calls from Microservice to external dependencies) - A circuit breaker - An ActivityId-Propagator - An exception and short timeout retry policy

patternsmicroservice
The Dynamic Language is not Enough
The Dynamic Language is not EnoughThe Dynamic Language is not Enough
The Dynamic Language is not Enough

Most today's software is highly static, even if it is written in a dynamic language like Smalltalk. Developers are not encouraged to extend the frameworks they are using; and end-users are unable to change the features of their software without initiating a new development effort. In contrast, extensible software is designed for change; and customizable software can be adapted to new needs without requiring an in-depth knowledge of the underlying implementation domain. In this presentation I will investigate on how to write truly dynamic software and I will distill common patterns of software customizability. As running examples I present tools that I worked on during my path of discovering Smalltalk. One of these examples is Magritte, a dynamic meta-model that gives end-users the possibility to customize their applications without the need of an additional development effort. Another example is Helvetia, an infrastructure enabling on-the-fly customization of the programming language and development environment.

petitparserdynamic languagemagritte
Data Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes backData Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes back

I would like to describe such cases when we create problems for "future us" just by an accident. I will show how different Java data types can ease or increase the pain in supporting the application later. Most common pitfals and tricky corner cases you probably have never thought about.

data2016datawars
Client X : the chosen solution
$db->autocommit(false);
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
$db->commit();
Client X : conclusion
For loops are bad (we already knew that)
Add master/slave and it gets much worse
Use transactions : it will provide huge performance increase
Result : slave caught up 5 days later
Database → Network
Customer Y
Top 10 site in Belgium
Growing rapidly
At peak traffic :
Unexplicable latency on database
Load on webservers : minimal
Load on database servers : acceptable
Client Y : the network

Recommended for you

Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumeration

2018 2nd International Conference on Management Engineering, Software Engineering and Service Sciences (ICMSS) Jan 13-15, 2018 in Wuhan, China

SOLID Ruby, SOLID Rails
SOLID Ruby, SOLID RailsSOLID Ruby, SOLID Rails
SOLID Ruby, SOLID Rails

Presentation I gave with MMahlberg at RailsWayCon 2010 about Uncle Bobs SOLID principles for software design

railswayconrailssolid
Inside Winnyp
Inside WinnypInside Winnyp
Inside Winnyp

Winnyp is an anonymous P2P filesharing software based on Winny. It uses its own encryption key generation algorithm that is more complex than Winny's algorithm, making it more difficult to analyze. The report details Winnyp's internal workings, including how it initializes and patches itself, generates encryption keys through multiple algorithms, and specifies the version of connected nodes by using different encryption keys. It also describes how Winnyp sends packets with dummy data and receives packets to communicate with both Winny and other Winnyp nodes.

winnypinformation security
Client Y : the network
60GB 700GB 700GB
Client Y : network overload
Cause : Drupal hooks → retrieving data that was not needed
Only load data you actually need
Don't know at the start ? → Use lazy loading
Caching :
Same story
Memcached/Redis are fast
But : data still needs to cross the network
Network trouble : more than just traffic
Customer Z
150.000 visits/day
News ticker :
XML feed from other site (owned by same customer)
Cached for 15 min
Customer Z – fetching the feed
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
unlink(APP_DIR . '/tmp/cacheFile.xml');
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
);
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
What's wrong with this code ?

Recommended for you

Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers

Jonathan is a MySQL consultant who specializes in SQL, indexing, and reporting for big data. This tutorial will cover strategies for resolving 80% of performance problems, including indexes, partitioning, intensive table optimization, and finding and addressing bottlenecks. The strategies discussed will be common, established approaches based on the presenter's experience working with MySQL since 2007.

mysql
Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life

Wim Godden discusses how to bring bright ideas to life by first determining if an idea is truly original and if there is market demand. He recommends building small initially to get fast feedback and adding features step-by-step. Talking to potential customers can provide feedback but be careful not to share every detail. Leverage existing services like APIs and consider scalability from the start. Financial projections are important, and success may lead to building on the idea through APIs, plugins or white labels. Keep iterating ideas and be willing to let others take over if it does not work out.

businessdevelopingdevelopment
The why and how of moving to php 8
The why and how of moving to php 8The why and how of moving to php 8
The why and how of moving to php 8

With PHP 8.0 recently released and PHP 5.x still accounting for over 40% of all production environments, it's time to paint a clear picture on not just why everyone should move to 8.x, but on how to get code ready for the latest version of PHP. In this talk, we'll look at some handy tools and techniques to ease the migration.

phpphp7php_codesniffer
Customer Z – no feed without the source
Feed source
Customer Z – no feed without the source
Feed source
Customer Z : timeout
default_socket_timeout : 60 sec by default
Each visitor : 60 sec wait time
People keep hitting refresh → more load
More active connections → more load
Apache hits maximum connections → entire site down
Customer Z : timeout fix
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
unlink(APP_DIR . '/tmp/cacheFile.xml');
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
);
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');

Recommended for you

The why and how of moving to php 7
The why and how of moving to php 7The why and how of moving to php 7
The why and how of moving to php 7

With PHP 7.2 recently released and PHP 5.3 and 5.4 still accounting for over 40% of all production environments, it's time to paint a clear picture on not just why everyone should move to 7.0 (or preferably 7.1), but on how to get code ready for the latest version of PHP. Using the version compatibility checker for PHP_CodeSniffer and a few simple step-by-step instructions, upgrading old code to make it compatible with the latest PHP versions becomes actually really easy. In this talk, we'll migrate an old piece of code and get rid of the demons of the past and ready for the present and future.

phpphp7php_codesniffer
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think

With more and more sites falling victim to data theft, you've probably read the list of things (not) to do to write secure code. But what else should you do to make sure your code and the rest of your web stack is secure ? In this tutorial we'll go through the basic and more advanced techniques of securing your web and database servers, securing your backend PHP code and your frontend javascript code. We'll also look at how you can build code that detects and blocks intrusion attempts and a bunch of other tips and tricks to make sure your customer data stays secure.

code securitydata theftgdpr
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think

With more and more sites falling victim to data theft, you've probably read the list of things (not) to do to write secure code. But what else should you do to make sure your code and the rest of your web stack is secure ? In this tutorial we'll go through the basic and more advanced techniques of securing your web and database servers, securing your backend PHP code and your frontend javascript code. We'll also look at how you can build code that detects and blocks intrusion attempts and a bunch of other tips and tricks to make sure your customer data stays secure.

code securitydata theftgdpr
Customer Z : don't delete from cache
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
unlink(APP_DIR . '/tmp/cacheFile.xml');
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
);
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : don't delete from cache
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
);
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : don't delete from cache
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
$feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
if ($feed !== false) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
$feed
);
}
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
Customer Z : process early
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
$feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
if ($feed !== false) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
ParseXmlFeed($feed)
);
}
}

Recommended for you

Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets

The time of static or dynamically generated sites is long gone. Non-stop interaction with users is the new normal. However, polling with Ajax requests is processor intensive and cumbersome. Websockets allow you to interact with users in real-time without increasing system load. We'll go through the basics and see all the different options, illustrated with live examples of how and when to use it, as well as when not to use it.

websocketsphpcrossbar
Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life

Who would have thought putting 140 charachter messages about one's life online or having a virtual farm game could ever be popular ? Then again, many of us have those weird (but sometimes brilliant) ideas. But no matter how incredible your ideas might be, getting them launched successfully takes more than writing lots of php code, smacking a sleek design on it and dropping it on a server. So what does it take ? Where do most ideas crashland and how can you avoid making the same mistakes and transform your ideas into reality ? We'll look at what steps are needed to make a service successful and sustainable.

businessdevelopingdevelopment
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers

Our job might be to build web applications, but we can't build apps that rely on networking if we don't know how these networks and the big network that connects them all (this thing called the Internet) actually work. I'll walk through the basics of networking, then dive a lot deeper (from TCP/UDP to IPv4/6, source/destination ports, sockets, DNS and even BGP). Prepare for an eye-opener when you realize how much a typical app relies on all of these (and many more) working flawlessly... and how you can prepare your app for failure in the chain.

bgpipipv6
Customer Z : file_[get|put]_contents atomicity
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
$feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context);
if ($feed !== false) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
ParseXmlFeed($feed)
);
}
}
Relying on user → concurrent request → possible corrupt data
Better : run every 15min through cronjob
Network resources
Use timeouts for all :
fopen
curl
SOAP
…
Data source trusted ?
→ setup a webservice
→ let them push updates when their feed changes
→ less load on data source
→ no timeout issues
Add logging → early detection
Logging
Logging = good
Logging in PHP using fopen
→ bad idea : locking issues
→ Use monolog : file, syslog, mail, Pushover, HipChat, Graylog,
Rollbar, ElasticSearch (and 50 more)
For Firefox : FirePHP (add-on for Firebug)
Debug logging = bad on production
Watch your logs !
Don't log on slow disks → I/O bottlenecks
File system : I/O bottlenecks
Causes :
Excessive writes (database updates, logfiles, swapping, …)
Excessive reads (non-indexed database queries, swapping, small file
system cache, …)
How to detect ?
top
iostat
See iowait ? Stop worrying about php, fix the I/O problem !
Cpu(s): 0.2%us, 3.0%sy, 0.0%ni, 61.4%id, 35.5%wa, 0.0%hi, 0.0%si, 0.0%st
avg-cpu: %user %nice %system %iowait %steal %idle
0.10 0.00 0.96 53.70 0.00 45.24
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 120.40 0.00 123289.60 0 616448
sdb 2.10 0.00 4378.10 0 18215
dm-0 4.20 0.00 36.80 0 184
dm-1 0.00 0.00 0.00 0 0

Recommended for you

The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x

With PHP 7.2 recently released and PHP 5.3 and 5.4 still accounting for over 40% of all production environments, it's time to paint a clear picture on not just why everyone should move to 7.0 (or preferably 7.1), but on how to get code ready for the latest version of PHP. Using the version compatibility checker for PHP_CodeSniffer and a few simple step-by-step instructions, upgrading old code to make it compatible with the latest PHP versions becomes actually really easy. In this talk, we'll migrate an old piece of code and get rid of the demons of the past and ready for the present and future.

phpphp7php_codesniffer
The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x

The document discusses upgrading from PHP 5.x to PHP 7.x. It begins by explaining why upgrading is important for security, performance and compatibility reasons, as PHP 5.x reaches end of life. It then discusses how to upgrade, including new features in PHP 7.x like scalar type declarations and null coalescing operators, and removed/deprecated functions. It emphasizes automating the process using tools like PHPCompatibility to analyze code for compatibility issues across PHP versions. Upgrading in a staged, tested manner is recommended over postponing upgrades.

phpphpcompatibilityphp_codesniffer
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think

With more and more sites falling victim to data theft, you've probably read the list of things (not) to do to write secure code. But what else should you do to make sure your code and the rest of your web stack is secure ? In this tutorial we'll go through the basic and more advanced techniques of securing your web and database servers, securing your backend PHP code and your frontend javascript code. We'll also look at how you can build code that detects and blocks intrusion attempts and a bunch of other tips and tricks to make sure your customer data stays secure.

data theftcode securitygdpr
Much more than code
DB
server
Webserver
User
Network
XML feed
Look beyond PHP (or Perl, Ruby, Python, ...) !
Questions ?
Questions ?

Recommended for you

Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets

The time of static or dynamically generated sites is long gone. Non-stop interaction with users is the new normal. However, polling with Ajax requests is processor intensive and cumbersome. Websockets allow you to interact with users in real-time without increasing system load. We'll go through the basics and see all the different options, illustrated with live examples of how and when to use it.

ajaxjavascriptphp
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers

Our job might be to build web applications, but we can't build apps that rely on networking if we don't know how these networks and the big network that connects them all (this thing called the Internet) actually work. I'll walk through the basics of networking, then dive a lot deeper (from TCP/UDP to IPv4/6, source/destination ports, sockets, DNS and even BGP). Prepare for an eye-opener when you realize how much a typical app relies on all of these (and many more) working flawlessly... and how you can prepare your app for failure in the chain.

web developmentroutingbgp
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think

With more and more sites falling victim to data theft, you've probably read the list of things (not) to do to write secure code. But what else should you do to make sure your code and the rest of your web stack is secure ? In this tutorial we'll go through the basic and more advanced techniques of securing your web and database servers, securing your backend PHP code and your frontend javascript code. We'll also look at how you can build code that detects and blocks intrusion attempts and a bunch of other tips and tricks to make sure your customer data stays secure.

phpleuvenrecoverysystem security
Contact
Twitter @wimgtr
Web http://techblog.wimgodden.be
Slides http://www.slideshare.net/wimg
E-mail wim.godden@cu.be
Please...
Rate my talk : http://joind.in/11676
Step-by-step : most common issues
Using NFS ? Get rid of it ;-)
iowait on database server
I/O reads (use iostat) ? → missing/wrong indexes
I/O writes ?
→ no transactions ?
→ too many queries ?
→ bad DB engine settings
iowait on webserver (logs ? static files ?)
CPU on database server (missing/wrong/too many indexes)
CPU on webserver (PHP)

More Related Content

What's hot

Caching and tuning fun for high scalability @ LOAD2012
Caching and tuning fun for high scalability @ LOAD2012Caching and tuning fun for high scalability @ LOAD2012
Caching and tuning fun for high scalability @ LOAD2012
Wim Godden
 
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
Wim Godden
 
When dynamic becomes static: the next step in web caching techniques
When dynamic becomes static: the next step in web caching techniquesWhen dynamic becomes static: the next step in web caching techniques
When dynamic becomes static: the next step in web caching techniques
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6
Wim Godden
 
Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !
Wim Godden
 
The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5
Wim Godden
 
Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011
Wim Godden
 
Load Data Fast!
Load Data Fast!Load Data Fast!
MySQL under the siege
MySQL under the siegeMySQL under the siege
MySQL under the siege
Source Ministry
 
Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...
Odoo
 
Varnish, the high performance valhalla?
Varnish, the high performance valhalla?Varnish, the high performance valhalla?
Varnish, the high performance valhalla?
Jeroen van Dijk
 
Survey of Percona Toolkit
Survey of Percona ToolkitSurvey of Percona Toolkit
Survey of Percona Toolkit
Karwin Software Solutions LLC
 
Percona toolkit
Percona toolkitPercona toolkit
MySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB StatusMySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB Status
Karwin Software Solutions LLC
 
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Ontico
 
Cassandra for Python Developers
Cassandra for Python DevelopersCassandra for Python Developers
Cassandra for Python Developers
Tyler Hobbs
 
Working with databases in Perl
Working with databases in PerlWorking with databases in Perl
Working with databases in Perl
Laurent Dami
 
New Framework - ORM
New Framework - ORMNew Framework - ORM
New Framework - ORM
Odoo
 
Best Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesBest Practices in Handling Performance Issues
Best Practices in Handling Performance Issues
Odoo
 

What's hot (20)

Caching and tuning fun for high scalability @ LOAD2012
Caching and tuning fun for high scalability @ LOAD2012Caching and tuning fun for high scalability @ LOAD2012
Caching and tuning fun for high scalability @ LOAD2012
 
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
 
When dynamic becomes static: the next step in web caching techniques
When dynamic becomes static: the next step in web caching techniquesWhen dynamic becomes static: the next step in web caching techniques
When dynamic becomes static: the next step in web caching techniques
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6
 
Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !Remove php calls and scale your site like crazy !
Remove php calls and scale your site like crazy !
 
The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5The why and how of moving to PHP 5.4/5.5
The why and how of moving to PHP 5.4/5.5
 
Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011Caching and tuning fun for high scalability @ FrOSCon 2011
Caching and tuning fun for high scalability @ FrOSCon 2011
 
Load Data Fast!
Load Data Fast!Load Data Fast!
Load Data Fast!
 
MySQL under the siege
MySQL under the siegeMySQL under the siege
MySQL under the siege
 
Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...Tips on how to improve the performance of your custom modules for high volume...
Tips on how to improve the performance of your custom modules for high volume...
 
Varnish, the high performance valhalla?
Varnish, the high performance valhalla?Varnish, the high performance valhalla?
Varnish, the high performance valhalla?
 
Survey of Percona Toolkit
Survey of Percona ToolkitSurvey of Percona Toolkit
Survey of Percona Toolkit
 
Percona toolkit
Percona toolkitPercona toolkit
Percona toolkit
 
MySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB StatusMySQL 5.5 Guide to InnoDB Status
MySQL 5.5 Guide to InnoDB Status
 
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
Non-Relational Postgres / Bruce Momjian (EnterpriseDB)
 
Cassandra for Python Developers
Cassandra for Python DevelopersCassandra for Python Developers
Cassandra for Python Developers
 
Working with databases in Perl
Working with databases in PerlWorking with databases in Perl
Working with databases in Perl
 
New Framework - ORM
New Framework - ORMNew Framework - ORM
New Framework - ORM
 
Best Practices in Handling Performance Issues
Best Practices in Handling Performance IssuesBest Practices in Handling Performance Issues
Best Practices in Handling Performance Issues
 

Similar to Beyond php - it's not (just) about the code

Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
Wim Godden
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
Wim Godden
 
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
Wim Godden
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wycieków
Konrad Kokosa
 
Clean Code Development
Clean Code DevelopmentClean Code Development
Clean Code Development
Peter Gfader
 
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook

Artur Rodrigues
 
Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)
Alexey Grigorev
 
Explain this!
Explain this!Explain this!
Explain this!
Fabio Telles Rodriguez
 
Apache Spark Side of Funnels
Apache Spark Side of FunnelsApache Spark Side of Funnels
Apache Spark Side of Funnels
Databricks
 
From logs to metrics
From logs to metricsFrom logs to metrics
From logs to metrics
Leonardo Di Donato
 
Windbg랑 친해지기
Windbg랑 친해지기Windbg랑 친해지기
Windbg랑 친해지기
Ji Hun Kim
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Data Con LA
 
Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)
Alexey Grigorev
 
5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama
Ali Kheyrollahi
 
The Dynamic Language is not Enough
The Dynamic Language is not EnoughThe Dynamic Language is not Enough
The Dynamic Language is not Enough
Lukas Renggli
 
Data Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes backData Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes back
Victor_Cr
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumeration
Ruo Ando
 
SOLID Ruby, SOLID Rails
SOLID Ruby, SOLID RailsSOLID Ruby, SOLID Rails
SOLID Ruby, SOLID Rails
Jens-Christian Fischer
 
Inside Winnyp
Inside WinnypInside Winnyp
Inside Winnyp
FFRI, Inc.
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
Jonathan Levin
 

Similar to Beyond php - it's not (just) about the code (20)

Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
 
Beyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the codeBeyond PHP - it's not (just) about the code
Beyond PHP - it's not (just) about the code
 
Beyond php - it's not (just) about the code
Beyond php - it's not (just) about the codeBeyond php - it's not (just) about the code
Beyond php - it's not (just) about the code
 
A miało być tak... bez wycieków
A miało być tak... bez wyciekówA miało być tak... bez wycieków
A miało być tak... bez wycieków
 
Clean Code Development
Clean Code DevelopmentClean Code Development
Clean Code Development
 
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook
Python na Infraestrutura 
MySQL do Facebook

Python na Infraestrutura 
MySQL do Facebook

 
Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)Duplicates everywhere (Kiev)
Duplicates everywhere (Kiev)
 
Explain this!
Explain this!Explain this!
Explain this!
 
Apache Spark Side of Funnels
Apache Spark Side of FunnelsApache Spark Side of Funnels
Apache Spark Side of Funnels
 
From logs to metrics
From logs to metricsFrom logs to metrics
From logs to metrics
 
Windbg랑 친해지기
Windbg랑 친해지기Windbg랑 친해지기
Windbg랑 친해지기
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Data Provenance Support in...
 
Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)Duplicates everywhere (Berlin)
Duplicates everywhere (Berlin)
 
5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama5 must have patterns for your microservice - techorama
5 must have patterns for your microservice - techorama
 
The Dynamic Language is not Enough
The Dynamic Language is not EnoughThe Dynamic Language is not Enough
The Dynamic Language is not Enough
 
Data Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes backData Wars: The Bloody Enterprise strikes back
Data Wars: The Bloody Enterprise strikes back
 
Automated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumerationAutomated reduction of attack surface using call graph enumeration
Automated reduction of attack surface using call graph enumeration
 
SOLID Ruby, SOLID Rails
SOLID Ruby, SOLID RailsSOLID Ruby, SOLID Rails
SOLID Ruby, SOLID Rails
 
Inside Winnyp
Inside WinnypInside Winnyp
Inside Winnyp
 
Scaling MySQL Strategies for Developers
Scaling MySQL Strategies for DevelopersScaling MySQL Strategies for Developers
Scaling MySQL Strategies for Developers
 

More from Wim Godden

Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life
Wim Godden
 
The why and how of moving to php 8
The why and how of moving to php 8The why and how of moving to php 8
The why and how of moving to php 8
Wim Godden
 
The why and how of moving to php 7
The why and how of moving to php 7The why and how of moving to php 7
The why and how of moving to php 7
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets
Wim Godden
 
Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life
Wim Godden
 
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers
Wim Godden
 
The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x
Wim Godden
 
The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets
Wim Godden
 
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
The promise of asynchronous php
The promise of asynchronous phpThe promise of asynchronous php
The promise of asynchronous php
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 
Practical git for developers
Practical git for developersPractical git for developers
Practical git for developers
Wim Godden
 
Is your code ready for PHP 7 ?
Is your code ready for PHP 7 ?Is your code ready for PHP 7 ?
Is your code ready for PHP 7 ?
Wim Godden
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
Wim Godden
 

More from Wim Godden (20)

Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life
 
The why and how of moving to php 8
The why and how of moving to php 8The why and how of moving to php 8
The why and how of moving to php 8
 
The why and how of moving to php 7
The why and how of moving to php 7The why and how of moving to php 7
The why and how of moving to php 7
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets
 
Bringing bright ideas to life
Bringing bright ideas to lifeBringing bright ideas to life
Bringing bright ideas to life
 
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers
 
The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x
 
The why and how of moving to php 7.x
The why and how of moving to php 7.xThe why and how of moving to php 7.x
The why and how of moving to php 7.x
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
Building interactivity with websockets
Building interactivity with websocketsBuilding interactivity with websockets
Building interactivity with websockets
 
Your app lives on the network - networking for web developers
Your app lives on the network - networking for web developersYour app lives on the network - networking for web developers
Your app lives on the network - networking for web developers
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
The promise of asynchronous php
The promise of asynchronous phpThe promise of asynchronous php
The promise of asynchronous php
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 
Practical git for developers
Practical git for developersPractical git for developers
Practical git for developers
 
Is your code ready for PHP 7 ?
Is your code ready for PHP 7 ?Is your code ready for PHP 7 ?
Is your code ready for PHP 7 ?
 
My app is secure... I think
My app is secure... I thinkMy app is secure... I think
My app is secure... I think
 

Recently uploaded

ScrumGathering New Orleans 2024 Catherine Louis.pdf
ScrumGathering New Orleans 2024  Catherine Louis.pdfScrumGathering New Orleans 2024  Catherine Louis.pdf
ScrumGathering New Orleans 2024 Catherine Louis.pdf
Global Agile Consulting- CLL-Group, LLC
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
Andrey Yasko
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
SynapseIndia
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
Priyanka Aash
 
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
Toru Tamaki
 
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdfARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
Inglês no Mundo Digital
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
Tatiana Al-Chueyr
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
aakash malhotra
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Bert Blevins
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
Yevgen Sysoyev
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
Emerging Tech
 
The Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdfThe Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdf
paysquare consultancy
 
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAIApplying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
ssuserd4e0d2
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
Lidia A.
 
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
MarceloMiranda38200
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
Bert Blevins
 

Recently uploaded (20)

ScrumGathering New Orleans 2024 Catherine Louis.pdf
ScrumGathering New Orleans 2024  Catherine Louis.pdfScrumGathering New Orleans 2024  Catherine Louis.pdf
ScrumGathering New Orleans 2024 Catherine Louis.pdf
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
Comparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdfComparison Table of DiskWarrior Alternatives.pdf
Comparison Table of DiskWarrior Alternatives.pdf
 
How RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptxHow RPA Help in the Transportation and Logistics Industry.pptx
How RPA Help in the Transportation and Logistics Industry.pptx
 
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
(CISOPlatform Summit & SACON 2024) Keynote _ Power Digital Identities With AI...
 
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
論文紹介:A Systematic Survey of Prompt Engineering on Vision-Language Foundation ...
 
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdfARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
ARTIFICIAL INTELLIGENCE (AI) IN MUSIC.pdf
 
Best Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdfBest Practices for Effectively Running dbt in Airflow.pdf
Best Practices for Effectively Running dbt in Airflow.pdf
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024Three New Criminal Laws in India 1 July 2024
Three New Criminal Laws in India 1 July 2024
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
Understanding Insider Security Threats: Types, Examples, Effects, and Mitigat...
 
DealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 editionDealBook of Ukraine: 2024 edition
DealBook of Ukraine: 2024 edition
 
Implementations of Fused Deposition Modeling in real world
Implementations of Fused Deposition Modeling  in real worldImplementations of Fused Deposition Modeling  in real world
Implementations of Fused Deposition Modeling in real world
 
The Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdfThe Role of Technology in Payroll Statutory Compliance (1).pdf
The Role of Technology in Payroll Statutory Compliance (1).pdf
 
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAIApplying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
Applying Retrieval-Augmented Generation (RAG) to Combat Hallucinations in GenAI
 
WPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide DeckWPRiders Company Presentation Slide Deck
WPRiders Company Presentation Slide Deck
 
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
Overview of Enterprise-scale landing zones using Cloud Adoption Framework Rea...
 
Password Rotation in 2024 is still Relevant
Password Rotation in 2024 is still RelevantPassword Rotation in 2024 is still Relevant
Password Rotation in 2024 is still Relevant
 

Beyond php - it's not (just) about the code

  • 1. Wim Godden Cu.be Solutions @wimgtr Beyond PHP : It's not (just) about the code
  • 2. Who am I ? Wim Godden (@wimgtr)
  • 12. Belgium – the traffic
  • 13. Who am I ? Wim Godden (@wimgtr) Founder of Cu.be Solutions (http://cu.be) Open Source developer since 1997 Developer of OpenX, PHPCompatibility, PHPConsistent, Nginx SLIC, ... Speaker at PHP and Open Source conferences
  • 14. Cu.be Solutions ? Open source consultancy PHP-centered Training courses High-speed redundant network (BGP, OSPF, VRRP) High scalability development Nginx + extensions MySQL Cluster Projects : mostly IT & Telecom companies lots of public-facing apps/sites
  • 15. Who are you ? Developers ? Anyone setup a MySQL master-slave ? Anyone setup a site/app on separate web and database server ? → How much traffic between them ?
  • 16. The topic Things we take for granted Famous last words : "It should work just fine" Works fine today → might fail tomorrow Most common mistakes PHP code ↔ PHP ecosystem
  • 17. It starts with... … code ! First up : database
  • 18. Database queries – complexity SELECT DISTINCT n.nid, n.uid, n.title, n.type, e.event_start, e.event_start AS event_start_orig, e.event_end, e.event_end AS event_end_orig, e.timezone, e.has_time, e.has_end_date, tz.offset AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_user, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site, e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site, tz.name as timezone_name FROM node n INNER JOIN event e ON n.nid = e.nid INNER JOIN event_timezones tz ON tz.timezone = e.timezone INNER JOIN node_access na ON na.nid = n.nid LEFT JOIN domain_access da ON n.nid = da.nid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'en' WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all'))) AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid = 0)) AND (n.language ='en' OR n.language ='' OR n.language IS NULL OR n.language = 'is' AND i18n.nid IS NULL) AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND e.event_start <= '2010-03-01 23:59:59') OR (e.event_end >= '2010-01-31 00:00:00' AND e.event_end <= '2010-03-01 23:59:59') OR (e.event_start <= '2010-01-31 00:00:00' AND e.event_end >= '2010-03-01 23:59:59')) ) GROUP BY n.nid HAVING (event_start >= '2010-02-01 00:00:00' AND event_start <= '2010-02-28 23:59:59') OR (event_end >= '2010-02-01 00:00:00' AND event_end <= '2010-02-28 23:59:59') OR (event_start <= '2010-02-01 00:00:00' AND event_end >= '2010-02-28 23:59:59') ORDER BY event_start ASC;
  • 19. Database - indexing 'select id from stock where status = 2 order by qty' → aggregate index on (status, qty) 'select id from stock where status > 2 order by qty' → aggregate index on (status, qty) ? → Depends : - Btree : yes - Hash : range selection stops use of aggregate index → separate index on status and qty (since recent versions)
  • 20. Database - indexing Indexes make database faster → Let's index everything ! → DON'T : Insert/update/delete → Index modification Each query → evaluation of all indexes "Relational schema design is based on data but index design is based on queries" (Bill Karwin, Percona)
  • 21. Databases – detecting problematic queries Slow query log → SET GLOBAL slow_query_log = ON; Queries not using indexes → In my.cnf/my.ini : 'log_queries_not_using_indexes' General query log → SET GLOBAL general_log = ON; → Turn it off quickly ! Percona Toolkit (Maatkit) pt-query-digest
  • 22. Databases - pt-query-digest # Profile # Rank Query ID Response time Calls R/Call Apdx V/M Item # ==== ================== ================ ===== ======= ==== ===== ========== # 1 0x543FB322AE4330FF 16526.2542 98.2% 1208 13.6806 1.00 0.00 SELECT output_option # 2 0xE78FEA32E3AA3221 0.8312 0.0% 6412 0.0001 1.00 0.00 SELECT poller_output poller_item # 3 0x211901BF2E1C351E 0.6811 0.0% 6416 0.0001 1.00 0.00 SELECT poller_time # 4 0xA766EE8F7AB39063 0.2805 0.0% 149 0.0019 1.00 0.00 SELECT wp_terms wp_term_taxonomy wp_term_relationships # 5 0xA3EEB63EFBA42E9B 0.1999 0.0% 51 0.0039 1.00 0.00 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary # 6 0x94350EA2AB8AAC34 0.1956 0.0% 89 0.0022 1.00 0.01 UPDATE wp_options # MISC 0xMISC 302.8137 1.8% 3853 0.0002 NS 0.0 <147 ITEMS>
  • 23. Databases - pt-query-digest # Query 2: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299 # This item is included in the report because it matches --limit. # Scores: Apdex = 1.00 [1.0], V/M = 0.00 # Query_time sparkline: | _^ | # Time range: 2011-12-28 18:42:47 to 19:03:10 # Attribute pct total min max avg 95% stddev median # ============ === ======= ======= ======= ======= ======= ======= ======= # Count 1 312 # Exec time 50 4s 5ms 25ms 13ms 20ms 4ms 12ms # Lock time 3 32ms 43us 163us 103us 131us 19us 98us # Rows sent 59 62.41k 203 231 204.82 202.40 3.99 202.40 # Rows examine 13 73.63k 238 296 241.67 246.02 10.15 234.30 # Rows affecte 0 0 0 0 0 0 0 0 # Rows read 59 62.41k 203 231 204.82 202.40 3.99 202.40 # Bytes sent 53 24.85M 46.52k 84.36k 81.56k 83.83k 7.31k 79.83k # Merge passes 0 0 0 0 0 0 0 0 # Tmp tables 0 0 0 0 0 0 0 0 # Tmp disk tbl 0 0 0 0 0 0 0 0 # Tmp tbl size 0 0 0 0 0 0 0 0 # Query size 0 21.63k 71 71 71 71 0 71 # InnoDB: # IO r bytes 0 0 0 0 0 0 0 0 # IO r ops 0 0 0 0 0 0 0 0 # IO r wait 0 0 0 0 0 0 0 0 # pages distin 40 11.77k 34 44 38.62 38.53 1.87 38.53 # queue wait 0 0 0 0 0 0 0 0 # rec lock wai 0 0 0 0 0 0 0 0 # Boolean: # Full scan 100% yes, 0% no # String: # Databases wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more # Hosts # InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more # Last errno 0 # Users wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more # Query_time distribution # 1us # 10us # 100us # 1ms # 10ms ################################################################ # 100ms # 1s # 10s+ # Tables # SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_options'G # SHOW CREATE TABLE `wp_blog_one `.`wp_options`G # EXPLAIN /*!50100 PARTITIONS*/ SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'G
  • 24. Databases – next step : explain explain <query> "How will MySQL execute the query"
  • 25. Databases – next step : explain +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | TABLE | TYPE | possible_keys | KEY | key_len | REF | ROWS | Extra | +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | employees | ALL | NULL | NULL | NULL | NULL | 299809 | USING WHERE | +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+ +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+ | 1 | SIMPLE | itdevice | const | PRIMARY,fk_device_devicetype1 | PRIMARY | 4 | const | 1 | | | 1 | SIMPLE | devicetype | const | PRIMARY | PRIMARY | 4 | const | 1 | | +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
  • 26. Databases – next step : explain Type of lookup 'system', 'const' and 'ref' = good 'ALL' = bad Extra info Using index = good Using filesort = usually bad
  • 27. For / foreach $customers = CustomerQuery::create() ->filterByState('MN') ->find(); foreach ($customers as $customer) { $contacts = ContactsQuery::create() ->filterByCustomerid($customer->getId()) ->find(); foreach ($contacts as $contact) { doSomestuffWith($contact); } }
  • 28. Joins $contacts = mysql_query(" select contacts.* from customer join contact on contact.customerid = customer.id where state = 'MN' "); while ($contact = mysql_fetch_array($contacts)) { doSomeStuffWith($contact); } or the ORM equivalent
  • 29. Better... 10001 → 1 query Sadly : people still produce code with query loops Usually : Growth not anticipated Internal app → Public app
  • 30. The origins of this talk Customers : Projects we built Projects we didn't build, but got pulled into Fixes Changes Infrastructure migration 15 years of 'how to cause mayhem with a few lines of code'
  • 31. Client X Jobs search site Monitor job views : Daily hits Weekly hits Monthly hits Which user saw which job
  • 32. Client X Originally : when user viewed job details Now : when job is in search result Search for 'php' → 50 jobs = 50 jobs to be updated → 50 updates for shown_today → 50 updates for shown_week → 50 updates for shown_month → 50 inserts for shown_user = 200 queries for 1 search !
  • 33. Client X : the code foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, when ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); }
  • 34. Client X : the graph
  • 35. Client X : the numbers 600-1000 inserts/sec (peaks up to 1600) 400-1000 updates/sec (peaks up to 2600) 16 core machine
  • 36. Client X : panic ! Mail : "MySQL slave is more than 5 minutes behind master" We set it up → who did they blame ? Wait a second !
  • 37. Client X : what's causing those peaks ?
  • 38. Client X : possible cause ? Code changes ? → According to developers : none Action : turn on general log, analyze with pt-query-digest → 50+-fold increase in 4 queries → Developers : 'Oops we did make a change' After 3 days : 2,5 days behind Every hour : 50 min extra lag
  • 39. Client X : But why is the slave lagging ? Master Slave File : master-bin-xxxx.log File : master-bin-xxxx.logSlave I/O thread Binlog dump thread Slave SQL thread
  • 40. Client X : Master
  • 41. Client X : Slave
  • 42. Client X : fix ? foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, when ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); }
  • 43. Client X : the code change insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ; insert into shown_user values (5, 23, "2013-11-12 12:01:00"), (8, 23, "2013-11-12 12:01:00"), … ;
  • 44. Client X : the code change $todayQuery = " insert into shown_today( jobId, number ) values "; foreach ($jobs as $job) { $todayQuery .= "(" . $job['id'] . ", 1),"; } $todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1); $todayQuery .= " ) on duplicate key update number = number + 1 "; $db->query($todayQuery);
  • 45. Client X : the chosen solution $db->autocommit(false); foreach ($jobs as $job) { $db->query(" insert into shown_today( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_week( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_month( jobId, number ) values( " . $job['id'] . ", 1 ) on duplicate key update number = number + 1 "); $db->query(" insert into shown_user( jobId, userId, when ) values ( " . $job['id'] . ", " . $user['id'] . ", now() ) "); } $db->commit();
  • 46. Client X : conclusion For loops are bad (we already knew that) Add master/slave and it gets much worse Use transactions : it will provide huge performance increase Result : slave caught up 5 days later
  • 47. Database → Network Customer Y Top 10 site in Belgium Growing rapidly At peak traffic : Unexplicable latency on database Load on webservers : minimal Load on database servers : acceptable
  • 48. Client Y : the network
  • 49. Client Y : the network 60GB 700GB 700GB
  • 50. Client Y : network overload Cause : Drupal hooks → retrieving data that was not needed Only load data you actually need Don't know at the start ? → Use lazy loading Caching : Same story Memcached/Redis are fast But : data still needs to cross the network
  • 51. Network trouble : more than just traffic Customer Z 150.000 visits/day News ticker : XML feed from other site (owned by same customer) Cached for 15 min
  • 52. Customer Z – fetching the feed if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml') ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml'); What's wrong with this code ?
  • 53. Customer Z – no feed without the source Feed source
  • 54. Customer Z – no feed without the source Feed source
  • 55. Customer Z : timeout default_socket_timeout : 60 sec by default Each visitor : 60 sec wait time People keep hitting refresh → more load More active connections → more load Apache hits maximum connections → entire site down
  • 56. Customer Z : timeout fix $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 57. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { unlink(APP_DIR . '/tmp/cacheFile.xml'); file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 58. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context) ); } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 59. Customer Z : don't delete from cache $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', $feed ); } } $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
  • 60. Customer Z : process early $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', ParseXmlFeed($feed) ); } }
  • 61. Customer Z : file_[get|put]_contents atomicity $context = stream_context_create( array( 'http' => array( 'timeout' => 5 ) ) ); if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) { $feed = file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context); if ($feed !== false) { file_put_contents( APP_DIR . '/tmp/cacheFile.xml', ParseXmlFeed($feed) ); } } Relying on user → concurrent request → possible corrupt data Better : run every 15min through cronjob
  • 62. Network resources Use timeouts for all : fopen curl SOAP … Data source trusted ? → setup a webservice → let them push updates when their feed changes → less load on data source → no timeout issues Add logging → early detection
  • 63. Logging Logging = good Logging in PHP using fopen → bad idea : locking issues → Use monolog : file, syslog, mail, Pushover, HipChat, Graylog, Rollbar, ElasticSearch (and 50 more) For Firefox : FirePHP (add-on for Firebug) Debug logging = bad on production Watch your logs ! Don't log on slow disks → I/O bottlenecks
  • 64. File system : I/O bottlenecks Causes : Excessive writes (database updates, logfiles, swapping, …) Excessive reads (non-indexed database queries, swapping, small file system cache, …) How to detect ? top iostat See iowait ? Stop worrying about php, fix the I/O problem ! Cpu(s): 0.2%us, 3.0%sy, 0.0%ni, 61.4%id, 35.5%wa, 0.0%hi, 0.0%si, 0.0%st avg-cpu: %user %nice %system %iowait %steal %idle 0.10 0.00 0.96 53.70 0.00 45.24 Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn sda 120.40 0.00 123289.60 0 616448 sdb 2.10 0.00 4378.10 0 18215 dm-0 4.20 0.00 36.80 0 184 dm-1 0.00 0.00 0.00 0 0
  • 65. Much more than code DB server Webserver User Network XML feed
  • 66. Look beyond PHP (or Perl, Ruby, Python, ...) !
  • 69. Contact Twitter @wimgtr Web http://techblog.wimgodden.be Slides http://www.slideshare.net/wimg E-mail wim.godden@cu.be Please... Rate my talk : http://joind.in/11676
  • 70. Step-by-step : most common issues Using NFS ? Get rid of it ;-) iowait on database server I/O reads (use iostat) ? → missing/wrong indexes I/O writes ? → no transactions ? → too many queries ? → bad DB engine settings iowait on webserver (logs ? static files ?) CPU on database server (missing/wrong/too many indexes) CPU on webserver (PHP)

Editor's Notes

  1. 5kbit/sec or 100Mbit/sec ?
  2. Let&amp;apos;s talk about code Without : we don&amp;apos;t exist What are most common mistakes in ecosystem Let&amp;apos;s start with the database
  3. time spent per query pattern how many queries of that query pattern
  4. Get back to what I said Lots of people use ORM - easier - don&amp;apos;t need to write queries - object-oriented but people start doing this Imagine 10000 customers → 10001 queries
  5. Not best code Uses deprecated mysql extension no error handling
  6. Master : 16 CPU cores 12 cores for SQL 1 core for binlog dump rest for system Slave : 16 CPU cores 1 core for slave I/O 1 core for slave SQL
  7. Grouping Works fine, but : maximum size of string ? PHP = no limit MySQL = max_allowed_packet
  8. Grouping Works fine, but : maximum size of string ? PHP = no limit MySQL = max_allowed_packet
  9. All in a single commit Note : transaction has max. size Possible : combination with previous solution
  10. took few moments to figure out No network monitoring → iptraf → 100Mbit/sec limit → packets dropped → connections dropped Customer : upgrade switch Us : why 100Mbit/sec ?
  11. Databases → network What other network related issues ?
  12. Server on which feed located : crashed Fine for few minutes (cache) 15 minutes : file_get_contents uses default_socket_timeout
  13. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  14. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  15. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  16. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  17. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  18. Better, not perfect. What else is wrong ? Multiple visitors hit expiring cache → file delete → xml feed hit a lot
  19. How do you treat your data : - where do you get it - how long did you have to wait to get it - how is it transported - how is it processed minimize the amount of data : retrieved transported processed, sent to db and users