
Fb mechanism

This slide deck covers the technologies Facebook uses to give its users a faster and safer social experience.

Published in: Technology

  1. Mechanism, presented by Dhruv Patel
  2. Introduction
     • 819 million monthly active users of Facebook mobile products as of June 30, 2013
     • 699 million daily active users on average in June 2013
     • 1.15 billion monthly active users as of June 2013
     • 2.5 billion content items shared per day (status updates, wall posts, photos, videos, and comments)
     • 2.7 billion Likes per day
     • 300 million photos uploaded per day
     • 100+ petabytes of disk space in one of FB's largest Hadoop (HDFS) clusters
     • 105 terabytes of data scanned via Hive, Facebook's Hadoop query language, every 30 minutes
     • 70,000 queries executed on these databases per day
     • 500+ terabytes of new data ingested into the databases every day
     Given these statistics, Facebook needs robust technology to handle this traffic and give its users a faster and safer social experience.
  3. Technologies
     For faster data transfer:
     • Cookies and caches
     • Gzip compression
     • AJAX and JSON
     • XMPP messaging
     For data storage:
     • HBase & Haystack
     • Zookeeper
     • Memcached
     • Scribe
  4. Cookies and Caches
     Cookies are small pieces of data that are stored on your computer, mobile phone, or other device. A cache is storage used by the web browser: when a page's CSS/JS rarely changes, the browser caches those files and reads them locally on later loads, which reduces data transfer. These technologies are used to provide and understand a range of products and services. Facebook uses them to do things like:
     • make Facebook easier or faster to use;
     • enable features and store information about you (including on your device or in your browser cache) and your use of Facebook;
     • deliver, understand, and improve advertising;
     • monitor and understand the use of FB products and services;
     • protect you, others, and Facebook.
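As a rough illustration of the cookie mechanics described above, the following sketch uses Python's standard `http.cookies` module to build a cookie a server might set and to parse the cookie header a browser might send back. The cookie names `locale` and `theme` and their values are made up for the example; they are not Facebook's actual cookies.

```python
from http.cookies import SimpleCookie

# Server side: build a cookie to send in a Set-Cookie header.
# "locale" is a hypothetical cookie name used only for illustration.
response_cookie = SimpleCookie()
response_cookie["locale"] = "en_US"
response_cookie["locale"]["max-age"] = 3600  # keep for one hour
header_value = response_cookie["locale"].OutputString()

# Later request: the browser sends stored cookies back in a Cookie header,
# and the server parses them instead of asking the user again.
request_cookie = SimpleCookie()
request_cookie.load("locale=en_US; theme=dark")
locale = request_cookie["locale"].value
theme = request_cookie["theme"].value
```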
  5. Gzip Compression
     Gzip is a software application used for file compression and decompression. The server compresses the images, CSS, and JS it sends, and the client machine decompresses them on load. The data and UI are unchanged, but the amount of data transferred is reduced. All Facebook servers therefore use Gzip compression to make the web faster.
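A minimal sketch of the round trip described above, using Python's standard `gzip` module. The stylesheet content is a made-up payload; real savings depend on how repetitive the data is.

```python
import gzip

# Hypothetical stylesheet payload; repetitive text compresses well.
css = b".button { color: #3b5998; padding: 4px; }\n" * 200

compressed = gzip.compress(css)         # what the server would send
restored = gzip.decompress(compressed)  # what the client reconstructs

print(len(css), "bytes before compression,", len(compressed), "bytes after")
```

The restored bytes are identical to the original, so the page looks the same; only the number of bytes on the wire changes.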
  6. AJAX and JSON
     AJAX is a group of interrelated web development techniques used on the client side to create asynchronous web applications; JSON is the data format commonly exchanged in those requests. With AJAX, web applications can send data to, and retrieve data from, a server asynchronously (in the background) without interfering with the display and behavior of the existing page. Data can be retrieved using the XMLHttpRequest object. Facebook mainly uses AJAX and JSON for:
     • Like, Comment, Share
     • Post story
     • Send message
     • Load feed
     • Dialog boxes (likes, mutual friends, etc.)
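The JSON half of such an exchange can be sketched as follows. This is a toy model in Python rather than browser code: the request and response shapes (`action`, `post_id`, `like_count`) and the `handle_like` function are hypothetical, not Facebook's API.

```python
import json

# Toy server-side state: like counts per post (hypothetical data).
likes = {"post-1": 41}

def handle_like(request_body: str) -> str:
    """Parse a JSON request, update the counter, and reply with JSON."""
    request = json.loads(request_body)
    post_id = request["post_id"]
    likes[post_id] = likes.get(post_id, 0) + 1
    return json.dumps({"post_id": post_id, "like_count": likes[post_id]})

# The client would send this body in a background request and update only
# the like counter from the reply, without reloading the page.
request_body = json.dumps({"action": "like", "post_id": "post-1"})
response = json.loads(handle_like(request_body))
```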
  7. XMPP Messaging
     XMPP stands for Extensible Messaging and Presence Protocol; it is also called the Jabber protocol. Facebook chat and messages work on this platform. Every Facebook user has a unique ID and a personal chat address such as 100000874067290@chat.facebook.com. When someone sends a message to that user, the core script converts it to XML and sends it to the Jabber server. The recipient then receives the message almost instantly, thanks to highly reliable servers.
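To make the "convert it to XML" step concrete, here is a small sketch that builds an XMPP-style chat message stanza with Python's standard `xml.etree.ElementTree`. The sender address and the helper name `build_message_stanza` are assumptions for illustration; the recipient address is the example from the slide.

```python
import xml.etree.ElementTree as ET

def build_message_stanza(sender: str, recipient: str, text: str) -> str:
    """Return a <message> stanza carrying the text in a <body> element."""
    message = ET.Element(
        "message", {"from": sender, "to": recipient, "type": "chat"}
    )
    body = ET.SubElement(message, "body")
    body.text = text  # ElementTree escapes special characters on output
    return ET.tostring(message, encoding="unicode")

stanza = build_message_stanza(
    "100000000000001@chat.facebook.com",  # hypothetical sender address
    "100000874067290@chat.facebook.com",  # address from the slide
    "Hello!",
)
print(stanza)
```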
  8. Manage data in large clusters
  9. HBase and Haystack: horizontal scalability
     • HBase & HDFS are elastic by design
     • Multiple table shards (regions) per physical server
     • On node additions:
       • The load balancer automatically reassigns shards from overloaded nodes to new nodes
       • Because the file system underneath is itself distributed, data for reassigned regions is instantly servable from the new nodes
     • Regions can be dynamically split into smaller regions
     • Pre-sharding is not necessary
     • Splits are near instantaneous
  10. HBase and Haystack: automatic failover
     • Node failures are automatically detected by the HBase Master
     • Regions on a failed node are distributed evenly among surviving nodes
     • The multiple-regions-per-server model avoids the need for substantial overprovisioning
     • HBase Master failover:
       • 1 active, rest standby
       • When the active master fails, a standby automatically takes over
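The "distributed evenly among surviving nodes" idea can be sketched with a toy model. This is not HBase's actual balancer; the function name `reassign_regions`, the node names, and the least-loaded-first rule are assumptions chosen to keep the example short.

```python
def reassign_regions(
    assignments: dict[str, list[str]], failed: str
) -> dict[str, list[str]]:
    """Return a new assignment with the failed node's regions spread out."""
    survivors = {
        node: list(regions)
        for node, regions in assignments.items()
        if node != failed
    }
    if not survivors:
        raise RuntimeError("no surviving nodes to take over regions")
    for region in assignments.get(failed, []):
        # Hand each orphaned region to the currently least-loaded survivor.
        target = min(survivors, key=lambda node: len(survivors[node]))
        survivors[target].append(region)
    return survivors

cluster = {
    "node-a": ["r1", "r2"],
    "node-b": ["r3", "r4"],
    "node-c": ["r5", "r6"],
}
after_failure = reassign_regions(cluster, "node-c")
```

Because each server holds several small regions, a failure adds only a little load to each survivor, which is why the slide notes that heavy overprovisioning is unnecessary.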
  11. HBase and Haystack: HDFS (Hadoop Distributed File System)
     • Fault tolerance (block-level replication for redundancy)
     • Scalability
     • End-to-end checksums to detect and recover from corruption
     • MapReduce for large-scale data processing
     • HDFS is already battle-tested inside Facebook:
       • running petabyte-scale clusters
       • a lot of in-house development and operational experience
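How replication and checksums work together can be illustrated with a simplified sketch. It models replicas as in-memory dictionaries and uses `zlib.crc32` as the checksum; the function names and the replica count of 3 are assumptions for the example, not HDFS internals.

```python
import zlib

def store_block(data: bytes, replicas: int = 3) -> list[dict]:
    """Store several copies of a block, each with its checksum."""
    checksum = zlib.crc32(data)
    return [{"data": data, "checksum": checksum} for _ in range(replicas)]

def read_block(copies: list[dict]) -> bytes:
    """Return the first copy whose data still matches its checksum."""
    for copy in copies:
        if zlib.crc32(copy["data"]) == copy["checksum"]:
            return copy["data"]
    raise IOError("all replicas are corrupt")

copies = store_block(b"photo-bytes")
copies[0]["data"] = b"phot0-bytes"  # simulate corruption of one replica
recovered = read_block(copies)      # falls back to a healthy replica
```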
  12. Zookeeper
     Zookeeper is open source software that FB uses mainly for two purposes:
     • As the controller for implementing sharding and failover of application servers
     • As a store for its discovery service
     Because Zookeeper provides FB with a highly available repository and notification mechanism, it goes a long way towards helping FB build a highly available service.
  13. Memcached
     If you've read anything about scaling large websites, you've probably heard about memcached. Memcached is a high-performance, distributed memory object caching system. Facebook is the world's largest user of memcached and uses it to alleviate database load. Memcached is already fast, but Facebook needs it to be faster and more efficient than most installations. FB uses more than 800 servers supplying over 28 terabytes of memory to its users. Over the past year, as Facebook's popularity has skyrocketed, it has run into a number of scaling issues. This ever-increasing demand has required modifications to both the operating system and memcached to achieve the performance that provides the best possible experience for its users.
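One common way a cache alleviates database load is the cache-aside pattern, sketched below. A plain dictionary stands in for memcached, and `query_database`, `get_profile`, and the key format are hypothetical names used only for illustration.

```python
cache = {}    # stand-in for a memcached cluster
db_reads = 0  # counts how often the "database" is actually hit

def query_database(user_id: int) -> dict:
    """Pretend to run an expensive database query."""
    global db_reads
    db_reads += 1
    return {"id": user_id, "name": f"user-{user_id}"}

def get_profile(user_id: int) -> dict:
    """Serve from the cache when possible; fall back to the database."""
    key = f"profile:{user_id}"
    if key in cache:
        return cache[key]
    profile = query_database(user_id)
    cache[key] = profile  # populate the cache for later requests
    return profile

first = get_profile(7)   # cache miss: reads the database
second = get_profile(7)  # cache hit: database is not touched
```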
  14. Scribe – Log server
     Scribe is a server for aggregating log data streamed in real time from a large number of servers. It is designed to be scalable, extensible without client-side modification, and robust to failure of the network or any specific machine. Scribe was developed at Facebook using Apache Thrift and released as open source in 2008. Scribe servers are arranged in a directed graph, with each server knowing only about the next server in the graph. This topology allows extra layers of fan-in to be added as a system grows, and messages to be batched before they are sent between datacenters, with only a simple configuration and no code that explicitly needs to understand datacenter topology.
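The directed-graph arrangement can be sketched as a toy model in which each node knows only its next hop and forwards log entries in batches. The class `ScribeNode`, its methods, and the batch sizes are assumptions for illustration, not Scribe's real interface.

```python
class ScribeNode:
    """Toy log server that only knows the next server in the graph."""

    def __init__(self, name, next_node=None, batch_size=2):
        self.name = name
        self.next_node = next_node  # None marks the final store
        self.batch_size = batch_size
        self.buffer = []            # entries waiting to be forwarded
        self.stored = []            # entries kept at the final store

    def log(self, category, message):
        self.receive([(category, message)])

    def receive(self, entries):
        if self.next_node is None:
            self.stored.extend(entries)
            return
        self.buffer.extend(entries)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Forward everything buffered so far as one batch."""
        if self.buffer and self.next_node is not None:
            batch, self.buffer = self.buffer, []
            self.next_node.receive(batch)

# Two web servers fan in to one aggregator, which feeds a central store.
central = ScribeNode("central")
aggregator = ScribeNode("aggregator", central, batch_size=3)
web1 = ScribeNode("web1", aggregator, batch_size=2)
web2 = ScribeNode("web2", aggregator, batch_size=2)

web1.log("clicks", "a")
web1.log("clicks", "b")  # web1 reaches its batch size and forwards
web2.log("clicks", "c")
web2.flush()             # force out web2's partial batch
```

Adding another layer of fan-in only requires pointing existing nodes at a new next hop, which mirrors the configuration-only growth the slide describes.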
  15. References
     • Facebook.com/engineering
     • Facebook.com/data
     • Developers.facebook.com
     • Newsroom.fb.com
     • Hbase.apache.org
     • Zookeeper.apache.org
  16. Thank you!
