Web Performance & Operations Meetup
2 March 2012
Ahmed Omar – firstname.lastname@example.org
Nico Klasens – email@example.com
What is Nimbuzz?
Nimbuzz is a communication platform that provides services to
• make calls (audio and video) to Internet and normal numbers
• send instant messages
• share files
on any mobile device, desktop computer or Internet browser, where possible.
It connects with popular instant messaging and social networks (Facebook, Windows Live Messenger (MSN), GoogleTalk, Yahoo!) and with SIP/VoIP accounts.
Architecture
Guidelines
• Implement business features in external components, not as modules inside the XMPP router
• Bundle functionality in small services
• Implement services as stateless as possible
• Communicate with users over XMPP
• Communicate internal data over other connections
This makes it possible to do multiple deployments to Nimbuzz every day.
What is XMPP?
Extensible Messaging and Presence Protocol, formerly known as Jabber.
A full XMPP session is one XML document: the client opens a <stream>, exchanges XML packets, and closes </stream> at the end. This requires long-running TCP connections.
Packets always have a
• From JID (JabberID: firstname.lastname@example.org/resource)
• To JID (JabberID: email@example.com/resource)
Three packet types (Message, Presence and IQ)
• Message (type=normal/chat, subject, body)
• Presence (type=unavailable/subscribe/probe, show=chat/away/dnd)
• IQ (id, type=get/set/result/error, one child element with an extension namespace)
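The JID layout above (local@domain/resource) can be illustrated with a small parser. This is a minimal sketch in Python, not code from the talk: the names `JID` and `parse_jid` are made up for illustration, and it only splits the string, without the character validation the XMPP specifications require.

```python
from typing import NamedTuple, Optional


class JID(NamedTuple):
    """The parts of a JabberID: local@domain/resource."""
    local: Optional[str]
    domain: str
    resource: Optional[str]


def parse_jid(text: str) -> JID:
    """Split a JabberID string into local part, domain and resource."""
    # The resource is everything after the first "/"; it may itself contain "/" or "@".
    bare, slash, resource = text.partition("/")
    # The local part is optional: a server JID is just a domain.
    local, at, domain = bare.rpartition("@")
    if not domain:
        raise ValueError(f"JID has no domain: {text!r}")
    return JID(local if at else None, domain, resource if slash else None)
```

A full JID such as user1@server-x/pc yields all three parts; a bare server name yields only the domain, with the other two fields set to None.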
What is XMPP?
Instant Messaging and Presence extension
• Roster - the central point of focus is a list of one's contacts or "buddies"
  • Local Nimbuzz friends
  • Transports (gateways to external IM systems)
  • Transport friends
• Presence information - network availability of particular contacts
• Presence subscription – authorize contacts to receive "presence"
• PrivacyList
Ejabberd / Erlang
Ejabberd is an XMPP instant messaging server, written in Erlang/OTP. Nimbuzz runs a modified version with its own extensions.
Erlang is a programming language. Erlang's runtime system has built-in support for concurrency, distribution and fault tolerance.
OTP is a set of Erlang libraries. It includes its own distributed database, debugging and release handling tools.
Erlang/OTP
- Quick History
- Why Erlang?
  o Concurrency
  o Fault tolerance
  o Distribution
  o Hot code swapping / High Availability
- Who uses Erlang?
XMPP in action
• XML stanzas (presence, iq, message)
    <presence to='user3@server-x' type='subscribe'/>
    <presence>
      <show>chat</show>
      <status>Just talk</status>
    </presence>
    <iq from='user2@server-x/pc' type='get' id='roster_req1'>
      <query xmlns='jabber:iq:roster'/>
    </iq>
    <message to='user3@server-x' from='user1@server-x/pc' type='chat'>
      <body>Hey</body>
    </message>
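To show how the three stanza kinds differ when read by a program, here is a rough sketch using Python's standard `xml.etree.ElementTree`. The function name `describe_stanza` is hypothetical. It treats each stanza as a standalone document, which is a simplification: in a real stream the stanzas inherit a default namespace from the enclosing <stream> element.

```python
import xml.etree.ElementTree as ET


def describe_stanza(xml_text: str) -> dict:
    """Summarise one standalone presence, iq or message stanza."""
    stanza = ET.fromstring(xml_text)
    info = {
        "kind": stanza.tag,
        "type": stanza.get("type"),
        "to": stanza.get("to"),
        "from": stanza.get("from"),
    }
    if stanza.tag == "iq":
        # An iq carries one child element; its namespace names the extension.
        child = next(iter(stanza), None)
        if child is not None and child.tag.startswith("{"):
            # ElementTree renders a namespaced tag as "{namespace}name".
            info["namespace"] = child.tag[1:].split("}")[0]
        else:
            info["namespace"] = None
    elif stanza.tag == "message":
        info["body"] = stanza.findtext("body")
    elif stanza.tag == "presence":
        info["show"] = stanza.findtext("show")
    return info
```

Applied to the roster request above, the sketch reports kind "iq", type "get" and namespace "jabber:iq:roster"; a router or external component would typically dispatch on exactly those fields.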
Ejabberd
Why XMPP?
  o Real time communication
  o Extensibility
Why Ejabberd?
  o Flexible: easy to set up a cluster, easy to configure, easy to extend, support for external services
  o Powerful
  o Scalable... with caution.
Persistence Bridge
An application introduced to migrate to ejabberd while still using the database schema of the old XMPP server.
Data requests: 1,839,333,853 per day = 21,288 per second
• A REST service written in Java
• Applies validation and business rules, and enhances data
• Caches the most accessed MySQL data in memcached
• Caches community gateway rosters in Redis
• Migrates data to more efficient storage backends or database tables
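The per-second figure follows from dividing the daily request count by the number of seconds in a day. A quick check in Python, assuming the slide rounds down and reports an average rather than a peak:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(requests_per_day: int) -> int:
    """Average request rate over a day, rounded down to whole requests."""
    return requests_per_day // SECONDS_PER_DAY


rate = average_per_second(1_839_333_853)
print(f"{rate:,} requests per second on average")
```

Since this is a daily average, peak load during busy hours would be higher.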
Cache server Practices
• Use a naming convention for your keys: namespace:sequence@identifier. The sequence has to be configurable.
• Choose a good balance between memory size, expiry and number of servers. Memcached is/was single-threaded.
• Decide on connection, read and data retrieval timeouts.
• Use command pipelining on every connection.
• Use different connections for different namespaced keys.
• Only write to MySQL; try to read only from the cache server. Update the cache server on write.
• Use the Check And Set (CAS) command for partial updates, and back out after retries.
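The key convention and the CAS retry loop can be sketched as follows. Everything here is illustrative: `FakeCache` is an in-memory stand-in with memcached-like gets/cas semantics, not a real client library, and `make_key` and `partial_update` are hypothetical names. Two interpretations are assumptions: that the configurable sequence exists so that bumping it invalidates a whole namespace at once, and that "back out" means dropping the cached entry so the next read reloads it from MySQL.

```python
import itertools


class FakeCache:
    """In-memory stand-in for a cache server with memcached-like gets/cas."""

    def __init__(self):
        self._data = {}  # key -> (value, version token)
        self._versions = itertools.count(1)

    def set(self, key, value):
        self._data[key] = (value, next(self._versions))

    def gets(self, key):
        """Return (value, cas_token), or (None, None) if the key is absent."""
        return self._data.get(key, (None, None))

    def cas(self, key, value, token):
        """Store value only if the key is unchanged since token was read."""
        current = self._data.get(key)
        if current is None or current[1] != token:
            return False
        self._data[key] = (value, next(self._versions))
        return True

    def delete(self, key):
        self._data.pop(key, None)


def make_key(namespace, sequence, identifier):
    """Build a key following the namespace:sequence@identifier convention."""
    return f"{namespace}:{sequence}@{identifier}"


def partial_update(cache, key, update, max_retries=3):
    """Apply update(old) -> new with check-and-set; back out after retries."""
    for _ in range(max_retries):
        value, token = cache.gets(key)
        if token is None:
            return False  # nothing cached, so there is nothing to update
        if cache.cas(key, update(value), token):
            return True
    # Assumption: backing out means removing the possibly stale entry.
    cache.delete(key)
    return False
```

The `update` callable should return a new value instead of mutating the old one, so that a failed attempt leaves nothing half-changed.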
MySQL Practices
• Use connection pooling (the MySQL driver for Java is slow to create connections).
• Test connections with a MySQL ping, not with a "SELECT 1" statement.
• Make every transaction a single SQL statement and turn autocommit on by default.
• Split read and write statements over different connections.
• Add the requesting host and application as an SQL comment to each statement.
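Tagging statements with their origin can be as simple as prefixing a comment, which then shows up wherever the server exposes the statement text. A minimal sketch, assuming a comment format of our own choosing; `tag_statement` and the application name are hypothetical:

```python
import socket


def tag_statement(sql: str, application: str, host: str = "") -> str:
    """Prefix an SQL statement with a comment naming the host and application."""
    host = host or socket.gethostname()

    def clean(text: str) -> str:
        # Remove comment delimiters so the tag cannot end the comment early.
        return text.replace("*/", "").replace("/*", "")

    return f"/* host={clean(host)} app={clean(application)} */ {sql}"
```

For example, `tag_statement("SELECT id FROM users WHERE id = ?", "persistence-bridge", "app01")` yields the same statement preceded by `/* host=app01 app=persistence-bridge */`.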
MySQL Practices
• Put statements outside the programming code.
• Never use "SELECT *": the column order might change, and it is hard to check which columns are still used.
• Never use "INSERT INTO table_name VALUES ( ... )". Use "INSERT INTO table_name (column1, column2) VALUES ( ... )".
• Do not rely on database DEFAULT values. Provide all values on INSERT. A TIMESTAMP field is an exception to this.
• All columns in the database have to be NOT NULL and have a DEFAULT value.
• Use primary keys as much as possible.
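The reason for naming columns in INSERT statements shows up as soon as the schema changes. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL, purely so it runs without a server; the table and column names are invented. Reading the NOT NULL plus DEFAULT rule as a safety net for the period before the application supplies a new column is an interpretation, not something the slides state.

```python
import sqlite3


def demo():
    """Show which INSERT style survives an added column."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE contact ("
        "user_id INTEGER NOT NULL, name TEXT NOT NULL DEFAULT '')"
    )
    # Both styles work against the original two-column table.
    db.execute("INSERT INTO contact (user_id, name) VALUES (?, ?)", (1, "alice"))
    db.execute("INSERT INTO contact VALUES (?, ?)", (2, "bob"))

    # Schema change: a new NOT NULL column with a DEFAULT value.
    db.execute("ALTER TABLE contact ADD COLUMN phone TEXT NOT NULL DEFAULT ''")

    # The statement with explicit columns still works.
    db.execute("INSERT INTO contact (user_id, name) VALUES (?, ?)", (3, "carol"))
    try:
        # The positional statement now supplies too few values.
        db.execute("INSERT INTO contact VALUES (?, ?)", (4, "dave"))
        positional_ok = True
    except sqlite3.OperationalError:
        positional_ok = False

    rows = db.execute(
        "SELECT user_id, name, phone FROM contact ORDER BY user_id"
    ).fetchall()
    db.close()
    return positional_ok, rows
```

After the column is added, only the statement that lists its columns keeps working; the positional one is rejected.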
Data tweaks
• Users send a lot of junk. Validate and drop; do not try to correct.
• Do not store data which is implied, like the + of phone numbers.
• Bulk insert, update and delete.
    INSERT INTO table (data) VALUES (?), (?)
    INSERT INTO table (id, data) VALUES (?, ?), (?, ?) ON DUPLICATE KEY UPDATE data = VALUES(data)
• Do not use REPLACE if you don't want to DELETE and INSERT. Very bad IO performance.
• Sort rows based on primary key before update and delete. Improves InnoDB page locks.
• Use a compound primary key to store the records of one user together on disk: (user_id, auto_increment_id).
• A MySQL index on a large table with text columns does not perform well. Use full-text search engines to have an index which is not fully in memory.
• Remove foreign keys to reduce storage. Trust the application to update and delete.
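The bulk upsert and the sort-by-primary-key advice combine naturally in a small statement builder. This is a sketch only: `build_bulk_upsert` is a hypothetical helper, the "?" placeholder follows the slide (Python MySQL drivers generally expect "%s"), and the table name is interpolated directly, so it must come from trusted code and never from user input.

```python
def build_bulk_upsert(table, rows, placeholder="?"):
    """Build one multi-row INSERT ... ON DUPLICATE KEY UPDATE statement.

    rows is a list of (id, data) tuples. Returns (sql, params).
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # Sort by primary key so concurrent writers touch rows in the same order.
    ordered = sorted(rows, key=lambda row: row[0])
    one_row = f"({placeholder}, {placeholder})"
    values = ", ".join([one_row] * len(ordered))
    sql = (
        f"INSERT INTO {table} (id, data) VALUES {values} "
        "ON DUPLICATE KEY UPDATE data = VALUES(data)"
    )
    # Flatten the row tuples into one parameter list matching the placeholders.
    params = [value for row in ordered for value in row]
    return sql, params
```

The `VALUES(data)` form is the one shown on the slide; MySQL 8.0.20 and later deprecate it in favour of row aliases, so newer deployments may want to adapt the template.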