Beyond PHP - it's not (just) about the code

Most PHP developers focus on writing code. But creating Web applications is about much more than just writing PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.

Published in: Technology
Notes
  • 5kbit/sec or 100Mbit/sec ?
  • Let's talk about code. Without it, we don't exist. What are the most common mistakes in the ecosystem ? Let's start with the database.
  • Time spent per query pattern; how many queries of that query pattern.
  • Get back to what I said. Lots of people use an ORM : it's easier, you don't need to write queries, it's object-oriented. But people start doing this. Imagine 10000 customers → 10001 queries.
  • Not the best code : uses the deprecated mysql extension, no error handling.
  • Master : 16 CPU cores; 12 cores for SQL, 1 core for binlog dump, rest for system. Slave : 16 CPU cores; 1 core for slave I/O, 1 core for slave SQL.
  • Grouping : works fine, but what is the maximum size of the string ? PHP = no limit, MySQL = max_allowed_packet.
  • All in a single commit. Note : a transaction has a maximum size. Possible : combination with the previous solution.
  • Took a few moments to figure out. No network monitoring → iptraf → 100Mbit/sec limit → packets dropped → connections dropped. Customer : upgrade the switch. Us : why 100Mbit/sec ?
  • Databases → network. What other network-related issues are there ?
  • Server on which the feed is located : crashed. Fine for a few minutes (cache). After 15 minutes : file_get_contents uses default_socket_timeout.
  • Better, not perfect. What else is wrong ? Multiple visitors hit the expiring cache → file delete → XML feed gets hit a lot.
  • How do you treat your data : where do you get it, how long did you have to wait to get it, how is it transported, how is it processed ? Minimize the amount of data that is retrieved, transported, processed, and sent to the db and the users.
  • Transcript

    • 1. Beyond PHP : It's not (just) about the code
        Wim Godden
        Cu.be Solutions
    • 2. Who am I ?
        Wim Godden (@wimgtr)
        Founder of Cu.be Solutions (http://cu.be)
        Open Source developer since 1997
        Developer of OpenX
        Zend Certified Engineer
        Zend Framework Certified Engineer
        MySQL Certified Developer
        Speaker at PHP and Open Source conferences
    • 3. Cu.be Solutions ?
        Open source consultancy
        PHP-centered
        High-speed redundant network (BGP, OSPF, VRRP)
        High scalability development
        Nginx + extensions
        MySQL Cluster
        Projects :
            mostly IT & Telecom companies
            lots of public-facing apps/sites
    • 4. Who are you ?
        Developers ?
        Anyone set up a MySQL master-slave ?
        Anyone set up a site/app on separate web and database servers ?
        → How much traffic between them ?
    • 5. The topic
        Things we take for granted
        Famous last words : "It should work just fine"
        Works fine today → might fail tomorrow
        Most common mistakes
        PHP code ↔ PHP ecosystem
        How-to & How-NOT-to
    • 6. It starts with...
        … code !
        First up : database
    • 7. Database queries – complexity
        SELECT DISTINCT n.nid, n.uid, n.title, n.type,
            e.event_start, e.event_start AS event_start_orig,
            e.event_end, e.event_end AS event_end_orig,
            e.timezone, e.has_time, e.has_end_date,
            tz.offset AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_user,
            e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site,
            e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site,
            tz.name as timezone_name
        FROM node n
        INNER JOIN event e ON n.nid = e.nid
        INNER JOIN event_timezones tz ON tz.timezone = e.timezone
        INNER JOIN node_access na ON na.nid = n.nid
        LEFT JOIN domain_access da ON n.nid = da.nid
        LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid = i18n.tnid AND i18n.language = 'en'
        WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm = 'all')))
            AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid = 0))
            AND (n.language = 'en' OR n.language = '' OR n.language IS NULL OR n.language = 'is' AND i18n.nid IS NULL)
            AND ( n.status = 1 AND (
                (e.event_start >= '2010-01-31 00:00:00' AND e.event_start <= '2010-03-01 23:59:59')
                OR (e.event_end >= '2010-01-31 00:00:00' AND e.event_end <= '2010-03-01 23:59:59')
                OR (e.event_start <= '2010-01-31 00:00:00' AND e.event_end >= '2010-03-01 23:59:59')) )
        GROUP BY n.nid
        HAVING (event_start >= '2010-02-01 00:00:00' AND event_start <= '2010-02-28 23:59:59')
            OR (event_end >= '2010-02-01 00:00:00' AND event_end <= '2010-02-28 23:59:59')
            OR (event_start <= '2010-02-01 00:00:00' AND event_end >= '2010-02-28 23:59:59')
        ORDER BY event_start ASC;
    • 8. Database - indexing
        select id from stock where status = 2 order by qty
        → aggregate index on (status, qty)
        select id from stock where status > 2 order by qty
        → aggregate index on (status, qty) ?
        → No : range selection stops use of aggregate index
        → separate index on status and qty
    • 9. Database - indexing
        Indexes make database faster
        → Let's index everything !
        → DON'T :
            Insert/update/delete → Index modification
            Each query → evaluation of all indexes
        "Relational schema design is based on data but index design is based on queries" (Bill Karwin, Percona)
    • 10. Databases – detecting problematic queries
        Slow query log
        → SET GLOBAL slow_query_log = ON;
        Queries not using indexes
        → In my.cnf/my.ini : log_queries_not_using_indexes
        General query log
        → SET GLOBAL general_log = ON;
        → Turn it off quickly !
        Percona Toolkit (Maatkit)
            pt-query-digest
    • 11. Databases - pt-query-digest
        # Profile
        # Rank Query ID           Response time    Calls R/Call  Apdx V/M   Item
        # ==== ================== ================ ===== ======= ==== ===== ==========
        #    1 0x543FB322AE4330FF 16526.2542 62.0%  1208 13.6806 1.00  0.00 SELECT output_option
        #    2 0xE78FEA32E3AA3221     0.8312 10.3%  6412  0.0001 1.00  0.00 SELECT poller_output poller_item
        #    3 0x211901BF2E1C351E     0.6811  8.4%  6416  0.0001 1.00  0.00 SELECT poller_time
        #    4 0xA766EE8F7AB39063     0.2805  3.5%   149  0.0019 1.00  0.00 SELECT wp_terms wp_term_taxonomy wp_term_relationships
        #    5 0xA3EEB63EFBA42E9B     0.1999  2.5%    51  0.0039 1.00  0.00 SELECT UNION wp_pp_daily_summary wp_pp_hourly_summary
        #    6 0x94350EA2AB8AAC34     0.1956  2.4%    89  0.0022 1.00  0.01 UPDATE wp_options
        # MISC 0xMISC                 0.8137 10.0%  3853  0.0002   NS   0.0 <147 ITEMS>
    • 12. Databases - pt-query-digest
        # Query 2: 0.26 QPS, 0.00x concurrency, ID 0x92F3B1B361FB0E5B at byte 14081299
        # This item is included in the report because it matches --limit.
        # Scores: Apdex = 1.00 [1.0], V/M = 0.00
        # Query_time sparkline: | _^ |
        # Time range: 2011-12-28 18:42:47 to 19:03:10
        # Attribute    pct   total     min     max     avg     95%  stddev  median
        # ============ === ======= ======= ======= ======= ======= ======= =======
        # Count          1     312
        # Exec time     50      4s     5ms    25ms    13ms    20ms     4ms    12ms
        # Lock time      3    32ms    43us   163us   103us   131us    19us    98us
        # Rows sent     59  62.41k     203     231  204.82  202.40    3.99  202.40
        # Rows examine  13  73.63k     238     296  241.67  246.02   10.15  234.30
        # Rows affecte   0       0       0       0       0       0       0       0
        # Rows read     59  62.41k     203     231  204.82  202.40    3.99  202.40
        # Bytes sent    53  24.85M  46.52k  84.36k  81.56k  83.83k   7.31k  79.83k
        # Merge passes   0       0       0       0       0       0       0       0
        # Tmp tables     0       0       0       0       0       0       0       0
        # Tmp disk tbl   0       0       0       0       0       0       0       0
        # Tmp tbl size   0       0       0       0       0       0       0       0
        # Query size     0  21.63k      71      71      71      71       0      71
        # InnoDB:
        # IO r bytes     0       0       0       0       0       0       0       0
        # IO r ops       0       0       0       0       0       0       0       0
        # IO r wait      0       0       0       0       0       0       0       0
        # pages distin  40  11.77k      34      44   38.62   38.53    1.87   38.53
        # queue wait     0       0       0       0       0       0       0       0
        # rec lock wai   0       0       0       0       0       0       0       0
        # Boolean:
        # Full scan    100% yes,   0% no
        # String:
        # Databases    wp_blog_one (264/84%), wp_blog_tw… (36/11%)... 1 more
        # Hosts
        # InnoDB trxID 86B40B (1/0%), 86B430 (1/0%), 86B44A (1/0%)... 309 more
        # Last errno   0
        # Users        wp_blog_one (264/84%), wp_blog_two (36/11%)... 1 more
        # Query_time distribution
        #   1us
        #  10us
        # 100us
        #   1ms
        #  10ms  ################################################################
        # 100ms
        #    1s
        #  10s+
        # Tables
        #    SHOW TABLE STATUS FROM `wp_blog_one ` LIKE 'wp_options'\G
        #    SHOW CREATE TABLE `wp_blog_one `.`wp_options`\G
        # EXPLAIN /*!50100 PARTITIONS*/
        SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'\G
    • 13. Databases – pt-query-digest – Digest UI
    • 14. Databases – next step : explain
        explain <query>
        "How will MySQL execute the query"
    • 15. Databases – next step : explain
        +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
        | id | select_type | TABLE     | TYPE | possible_keys | KEY  | key_len | REF  | ROWS   | Extra       |
        +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+
        |  1 | SIMPLE      | employees | ALL  | NULL          | NULL | NULL    | NULL | 299809 | USING WHERE |
        +----+-------------+-----------+------+---------------+------+---------+------+--------+-------------+

        +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
        | id | select_type | table      | type  | possible_keys                 | key     | key_len | ref   | rows | Extra |
        +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
        |  1 | SIMPLE      | itdevice   | const | PRIMARY,fk_device_devicetype1 | PRIMARY | 4       | const |    1 |       |
        |  1 | SIMPLE      | devicetype | const | PRIMARY                       | PRIMARY | 4       | const |    1 |       |
        +----+-------------+------------+-------+-------------------------------+---------+---------+-------+------+-------+
    • 16. Databases – next step : explain
        explain <query>
        "How will MySQL execute the query"
        Shows :
            Indexes available
            Indexes used (do you see one ?)
            Number of rows scanned
            Type of lookup
                system, const and ref = good
                ALL = bad
            Extra info
                Using index = good
                Using filesort = usually bad
                Using where = bad
    • 17. Databases – when to use / not to use
        Good at :
            Fetching data
            Storing data
            Searching through data
        Bad at :
            select `someField` from `bigTable` where crc32(`field`) = "something"
            → full table scan
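One common way around the full table scan on slide 17 is to stop wrapping the column in a function and compare an indexed column to a constant instead. The sketch below is not from the talk: it assumes a hypothetical extra column `field_crc` that stores the checksum of `field` and is filled in whenever a row is written, so the lookup value is computed once in PHP rather than per row in MySQL.

```php
<?php
/**
 * Build a lookup that compares an indexed column to a constant instead of
 * wrapping the column in crc32(). `field_crc` is a hypothetical extra column
 * holding the checksum of `field`; it is an assumption of this sketch.
 *
 * @return array{0: string, 1: array} [sql, params]
 */
function buildChecksumLookup(string $value): array
{
    // sprintf('%u') keeps the checksum unsigned on 32-bit builds as well.
    $checksum = sprintf('%u', crc32($value));
    // The second condition guards against two values sharing one checksum.
    $sql = 'select `someField` from `bigTable` where `field_crc` = ? and `field` = ?';

    return [$sql, [$checksum, $value]];
}

[$sql, $params] = buildChecksumLookup('something');
```

With an index on `field_crc`, MySQL can look up the constant directly. Whether that pays off depends on the data, so checking the plan with explain is still the safer route.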
    • 18. For / foreach
        $customers = CustomerQuery::create()
            ->filterByState('SC')
            ->find();
        foreach ($customers as $customer) {
            $contacts = ContactsQuery::create()
                ->filterByCustomerid($customer->getId())
                ->find();
            foreach ($contacts as $contact) {
                doSomeStuffWith($contact);
            }
        }
    • 19. Joins
        $contacts = mysql_query("
            select contact.*
            from customer
            join contact on contact.customerid = customer.id
            where state = 'SC'
        ");
        while ($contact = mysql_fetch_array($contacts)) {
            doSomeStuffWith($contact);
        }
        or the ORM equivalent
    • 20. Better...
        10001 → 1 query
        Sadly : people still produce code with query loops
        Usually :
            Growth not anticipated
            Internal app → Public app
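When the code still needs the contacts per customer, the single joined result set from slide 19 can be regrouped in PHP instead of going back to one query per customer. A minimal sketch, assuming each row carries a `customerid` column; the sample rows and the `name` column are made up for illustration.

```php
<?php
/**
 * Group the rows of one customer/contact join by customer id.
 * Column names (customerid, name) are assumptions for this sketch.
 */
function groupContactsByCustomer(iterable $rows): array
{
    $byCustomer = [];
    foreach ($rows as $row) {
        $byCustomer[$row['customerid']][] = $row;
    }

    return $byCustomer;
}

// Rows as a single joined query might return them (sample data, not from the talk).
$rows = [
    ['customerid' => 1, 'name' => 'Ann'],
    ['customerid' => 1, 'name' => 'Bob'],
    ['customerid' => 2, 'name' => 'Cid'],
];
$grouped = groupContactsByCustomer($rows);
```

The database is queried once, and the loop over customers only touches data that is already in memory.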
    • 21. The origins of this talk
        Customers :
            Projects we built
            Projects we didn't build, but got pulled into
                Fixes
                Changes
                Infrastructure migration
        15 years of "how to cause mayhem with a few lines of code"
    • 22. Client X
        Jobs search site
        Monitor job views :
            Daily hits
            Weekly hits
            Monthly hits
            Which user saw which job
    • 23. Client X
        Originally : when user viewed job details
        Now : when job is in search result
        Search for php → 50 jobs = 50 jobs to be updated
        → 50 updates for shown_today
        → 50 updates for shown_week
        → 50 updates for shown_month
        → 50 inserts for shown_user
    • 24. Client X : the code
        foreach ($jobs as $job) {
            $db->query("insert into shown_today(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_week(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_month(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_user(jobId,userId,`when`) values (" . $job['id'] . "," . $user['id'] . ",now())");
        }
    • 25. Client X : the graph
    • 26. Client X : the numbers
        600-1000 updates/sec (peaks up to 1600)
        400-1000 updates/sec (peaks up to 2600)
        16 core machine
    • 27. Client X : panic !
        Mail : "MySQL slave is more than 5 minutes behind master"
        We set it up → who did they blame ?
        Wait a second !
    • 28. Client X : what's causing those peaks ?
    • 29. Client X : possible cause ?
        Code changes ?
        → According to developers : none
        Action : turn on general log, analyze with pt-query-digest
        → 50+-fold increase in queries
        → Developers : "Oops, we did make a change"
        After 3 days : 2.5 days behind
        Every hour : 50 min extra lag
    • 30. Client X : But why is the slave lagging ?
        Master : File : master-bin-xxxx.log → Binlog dump thread
        Slave : Slave I/O thread → File : master-bin-xxxx.log → Slave SQL thread
    • 31. Client X : Master
    • 32. Client X : Slave
    • 33. Client X : fix ?
        (same code as slide 24)
    • 34. Client X : the code change
        $todayQuery = "insert into shown_today(jobId,number) values ";
        foreach ($jobs as $job) {
            $todayQuery .= "(" . $job['id'] . ", 1),";
        }
        // Strip the trailing comma before appending the rest of the statement.
        $todayQuery = substr($todayQuery, 0, -1);
        $todayQuery .= " on duplicate key update number = number + 1";
        $db->query($todayQuery);
        Careful : max_allowed_packet !
        Result : insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), ...
    • 35. Client X : the chosen solution
        $db->autocommit(false);
        foreach ($jobs as $job) {
            $db->query("insert into shown_today(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_week(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_month(jobId,number) values(" . $job['id'] . ",1)
                on duplicate key update number = number + 1");
            $db->query("insert into shown_user(jobId,userId,`when`) values (" . $job['id'] . "," . $user['id'] . ",now())");
        }
        $db->commit();
    • 36. Client X : conclusion
        For loops are bad (we already knew that)
        Add master/slave and it gets much worse
        Use transactions : it will provide huge performance increase
        Result : slave caught up 5 days later
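The grouping idea from slide 34 and its max_allowed_packet warning can be combined by building the multi-row statement in chunks. The sketch below is an illustration, not the code the customer ended up with: the function name and the `$chunkSize` parameter are hypothetical, and it uses placeholders instead of string concatenation so that the ids can be bound through a prepared statement.

```php
<?php
/**
 * Build multi-row "insert ... on duplicate key update" statements for one
 * counter table, split into chunks so each statement stays well below
 * max_allowed_packet. $chunkSize is a hypothetical tuning knob; $table must
 * come from a fixed list in the code, never from user input.
 *
 * @return array<int, array{0: string, 1: int[]}> list of [sql, params] pairs
 */
function buildCounterUpserts(string $table, array $jobIds, int $chunkSize = 500): array
{
    $statements = [];
    foreach (array_chunk(array_values($jobIds), max(1, $chunkSize)) as $chunk) {
        // One "(?, 1)" tuple per job id; implode() never leaves a trailing comma.
        $tuples = implode(', ', array_fill(0, count($chunk), '(?, 1)'));
        $sql = "insert into {$table} (jobId, number) values {$tuples} "
             . 'on duplicate key update number = number + 1';
        $statements[] = [$sql, array_map('intval', $chunk)];
    }

    return $statements;
}

$statements = buildCounterUpserts('shown_today', [5, 8, 12], 2);
```

Each pair can then be run with something like `$pdo->prepare($sql)->execute($params)`, ideally inside one transaction as on slide 35, so that both ideas from the talk are used together.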
    • 37. Database → Network
        Customer Y
            Top 10 site in Belgium
            Growing rapidly
        At peak traffic :
            Inexplicable latency on database
            Load on webservers : minimal
            Load on database servers : acceptable
    • 38. Client Y : the network
    • 39. Client Y : the network
        60GB 700GB 700GB
    • 40. Client Y : network overload
        Cause : Drupal hooks → retrieving data that was not needed
        Only load data you actually need
        Don't know at the start ? → Use lazy loading
        Caching :
            Same story
            Memcached/Redis are fast
            But : data still needs to cross the network
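Lazy loading, as mentioned on slide 40, can be as small as a wrapper that postpones the fetch until the value is first used. A minimal sketch: the `LazyValue` class and the loader callback are hypothetical names, and the callback here stands in for whatever database or Memcached/Redis call would otherwise run up front.

```php
<?php
/**
 * Lazy-loading wrapper: the loader callback runs only on first access,
 * and its result is reused afterwards.
 */
final class LazyValue
{
    /** @var callable */
    private $loader;
    /** @var mixed */
    private $value;
    /** @var bool */
    private $loaded = false;

    public function __construct(callable $loader)
    {
        $this->loader = $loader;
    }

    public function get()
    {
        if (!$this->loaded) {
            $this->value = ($this->loader)();
            $this->loaded = true;
        }

        return $this->value;
    }
}

// Hypothetical loader standing in for a database or cache fetch.
$loads = 0;
$comments = new LazyValue(function () use (&$loads) {
    $loads++;
    return ['first comment', 'second comment'];
});
```

If a page never calls `$comments->get()`, nothing crosses the network for it; if it calls it several times, the data is still fetched only once.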
    • 41. Network trouble : more than just traffic
        Customer Z
            150,000 visits/day
        News ticker :
            XML feed from other site (owned by same customer)
            Cached for 15 min
    • 42. Customer Z – fetching the feed
        if (filectime(APP_DIR . '/tmp/ScrambledSiteName.xml') < time() - 900) {
            unlink(APP_DIR . '/tmp/ScrambledSiteName.xml');
            file_put_contents(
                APP_DIR . '/tmp/ScrambledSiteName.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/ScrambledSiteName.xml');
        What's wrong with this code ?
    • 43-44. Customer Z – no feed without the source
        Feed source
    • 45. Customer Z : timeout
        default_socket_timeout : 60 sec by default
        Each visitor : 60 sec wait time
        People keep hitting refresh → more load
        More active connections → more load
        Apache hits maximum connections → entire site down
    • 46. Customer Z : timeout fix
        $context = stream_context_create(array('http' => array('timeout' => 5)));
        if (filectime(APP_DIR . '/tmp/ScrambledSiteName.xml') < time() - 900) {
            unlink(APP_DIR . '/tmp/ScrambledSiteName.xml');
            file_put_contents(
                APP_DIR . '/tmp/ScrambledSiteName.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/ScrambledSiteName.xml');
    • 47. Customer Z : don't delete from cache
        $context = stream_context_create(array('http' => array('timeout' => 5)));
        if (filectime(APP_DIR . '/tmp/ScrambledSiteName.xml') < time() - 900) {
            file_put_contents(
                APP_DIR . '/tmp/ScrambledSiteName.xml',
                file_get_contents('http://www.scrambledsitename.be/xml/feed.xml', false, $context)
            );
        }
        $xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/ScrambledSiteName.xml');
    • 48. Network resources
        Use timeouts for all :
            fopen
            curl
            SOAP
            …
        Data source trusted ?
        → set up a webservice
        → let them push updates when their feed changes
        → less load on data source
        → no timeout issues
        Add logging → early detection
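The Customer Z fixes (a short timeout, and never deleting the cached copy) can be pulled together in one helper. This is a sketch under stated assumptions rather than the code from the talk: the function name, the temp-file-plus-rename write, and the choice to mark a stale copy as fresh when the source is down are all additions made here for illustration.

```php
<?php
/**
 * Return the cached feed, refreshing it when it is older than $maxAge seconds.
 * The stale copy is kept whenever the source cannot be fetched.
 */
function getCachedFeed(string $cacheFile, string $url, int $maxAge = 900, int $timeout = 5): ?string
{
    clearstatcache(true, $cacheFile);
    $isFresh = is_file($cacheFile) && filemtime($cacheFile) >= time() - $maxAge;

    if (!$isFresh) {
        // Bound the wait instead of relying on default_socket_timeout.
        $context = stream_context_create(['http' => ['timeout' => $timeout]]);
        $data = @file_get_contents($url, false, $context);

        if ($data !== false && $data !== '') {
            // Write to a temp file, then rename, so readers never see a partial file.
            $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
            if (file_put_contents($tmp, $data) !== false) {
                rename($tmp, $cacheFile);
            }
        } elseif (is_file($cacheFile)) {
            // Source is down: mark the stale copy as fresh so that the next
            // visitors do not all retry the fetch at once.
            touch($cacheFile);
        }
    }

    $cached = is_file($cacheFile) ? @file_get_contents($cacheFile) : false;

    return $cached === false ? null : $cached;
}
```

A call such as `getCachedFeed(APP_DIR . '/tmp/ScrambledSiteName.xml', 'http://www.scrambledsitename.be/xml/feed.xml')` would then replace the if-block on slide 47. Only the first visitor after expiry pays for a failed fetch, and the site keeps serving the last known feed.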
    • 49. Logging
        Logging = good
        Logging in PHP using fopen
        → bad idea : locking issues
        → Use file_put_contents($filename, $data, FILE_APPEND)
        For Firefox : FirePHP (add-on for Firebug)
        Debug logging = bad on production
        Watch your logs !
        Don't log on slow disks → I/O bottlenecks
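The file_put_contents() call from slide 49 fits in a small helper. A minimal sketch: the function name and the line format are made up here, and adding LOCK_EX on top of FILE_APPEND is an extra precaution that is not on the slide.

```php
<?php
/**
 * Append one line to a log file. FILE_APPEND adds to the end of the file;
 * LOCK_EX (an addition in this sketch) holds an exclusive lock while writing.
 */
function appendLog(string $filename, string $message): bool
{
    // Keep every entry on a single line so the log stays easy to grep.
    $line = date('c') . ' ' . str_replace(["\r", "\n"], ' ', $message) . PHP_EOL;

    return file_put_contents($filename, $line, FILE_APPEND | LOCK_EX) !== false;
}
```

Calling `appendLog('/path/to/app.log', 'feed refresh failed')` (the path is a placeholder) writes one timestamped line per call without any manual fopen/flock/fclose handling.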
    • 50. File system : I/O bottlenecks
        Causes :
            Excessive writes (database updates, logfiles, swapping, …)
            Excessive reads (non-indexed database queries, swapping, small filesystem cache, …)
        How to detect ?
            top
            iostat
        See iowait ? Stop worrying about PHP, fix the I/O problem !
    • 51. File system
        Worst of all : NFS
            PHP files → lstat calls
            Templates → same
            Sessions
            → locking issues
            → corrupt data
            → store sessions in database, Memcached, Redis, ...
    • 52. Much more than code
        DB server, Webserver, User, Network, XML feed
    • 53. Look beyond PHP !
    • 54-55. Questions ?
    • 56. Contact
        Twitter : @wimgtr
        Web : http://techblog.wimgodden.be
        Slides : http://www.slideshare.net/wimg
        E-mail : wim.godden@cu.be
        Please... Rate my talk : http://joind.in/8186
    • 57. Thanks !
        Please... Rate my talk : http://joind.in/8186
