Social Network Technologies

An overview of social network technologies: are they "typical" websites, or do they work in a different way? Which technologies do Facebook and Instagram use, and how many?
Presentation made for the Multimedia Languages and Environments course at Politecnico di Torino (academic year 2013/2014).

Transcript

  • 1. Social Network Technologies: An overview. Luigi De Russis
  • 2. Prerequisite (07/03/2014, Social Networks Technologies). Do you know…
      • the definition of a data center?
      • the difference between a logical server and a physical server?
      • the difference between cache access and database access?
  • 3. Let’s think!
  • 4. Social network: a “typical” web site? Traditional web site
  • 5. Social network: a “typical” web site? Traditional web site. In the beginning, probably…
  • 6. Social network: a “typical” web site? Traditional web site… but sooner or later…
  • 7. Why?
  • 8. Web site: tools and technologies. What tools and technologies do we use to build a website?
  • 9. Web site: tools and technologies. For example…
      • Java/JSP: JVM + JDK + J2EE; Tomcat (or similar); MySQL (or similar)
      • PHP: PHP; Apache; MySQL (or similar)
  • 10. Web site: tools and technologies. Typically, a vertical stack (with one programming language)
      • Java/JSP: JVM + JDK + J2EE; MySQL (or similar); Tomcat (or similar)
      • PHP: PHP; MySQL (or similar); Apache server
  • 11. Now let’s try with these sites…
  • 12. Now let’s try with these sites…
      • PHP + MySQL + Apache
      • Rails + MySQL + Unicorn
      • Django + MySQL + Gunicorn
      • Django + PostgreSQL + Gunicorn
  • 13. Now let’s try with these sites… (same four stacks as the previous slide) Are you really, really sure?
  • 14. Not so sure?! Let’s have a look at some pages and projects…
      • https://github.com/twitter
      • https://github.com/facebook
      • https://engineering.twitter.com/opensource/projects
      • https://code.facebook.com/projects/
    What did you notice?
  • 15. What did you notice? A lot of different components
      • Twitter: tracing system, package manager, various servers, NoSQL database, caching system, etc.
      • Facebook: code-related tools, code transformer, various servers, distributed file system, caching system, NoSQL database, etc.
  • 16. What did you notice? A lot of different languages
      • Twitter: Java, Scala, Ruby, C++, C, Objective-C, Shell scripting, Python, JavaScript
      • Facebook: PHP, OCaml, C++, JavaScript, Python, Java, Objective-C, Processing, C, Ruby, Shell scripting, Haskell, Emacs Lisp, ActionScript
  • 17. Now, have a look again at this… Impressive?!
  • 18. This is ONLY the tip of the iceberg…
  • 19. Social Network Characteristics
      • Wildly popular over the last few years
          • Facebook has more than 1 billion (monthly active) users
          • Twitter has more than 600M users
      • Distributed across the planet
      • Changed how content is created and consumed
      • Explosion of smartphones
          • photos and video are now easy to shoot and share
          • e.g., more than 350M photos are uploaded to Facebook each day
  • 20. Facebook as an example…
  • 21. In December 2013…
      • More than 1 billion monthly active users
      • 757 million active users per day
      • More than 200 billion monthly page views
      • More than 3.9 trillion feed actions processed per day
      • 2.5 billion content items shared per day
      • 2.7 billion ‘Likes’ per day
      • 350 million photos uploaded per day
      • 100 million search queries per day
      • Over 500 TB of new data ingested per day
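To get a feel for these daily totals, it can help to convert them to average per-second rates. The calculation below is only a back-of-the-envelope sketch: it assumes the load is spread evenly over the day (real peaks are certainly higher) and takes 1 TB as 10^12 bytes. The helper name `per_second` is ours and does not come from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily figures quoted on the slide (December 2013).
daily = {
    "photos uploaded": 350e6,
    "likes": 2.7e9,
    "search queries": 100e6,
    "content items shared": 2.5e9,
}


def per_second(per_day: float) -> float:
    """Average rate per second, assuming load is spread evenly over the day."""
    return per_day / SECONDS_PER_DAY


for name, value in daily.items():
    print(f"{name}: about {per_second(value):,.0f} per second")

# 500 TB of new data per day, taking 1 TB = 10**12 bytes (an assumption).
ingest_gb_per_second = per_second(500 * 10**12) / 10**9
print(f"data ingested: about {ingest_gb_per_second:.1f} GB per second")
```

Even as averages, these rates are in the thousands of uploads and tens of thousands of likes per second, which is why a single vertical stack stops being enough.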
  • 22. Growth Rate (chart: millions of users over time, from December 2004 to January 2014; vertical axis from 0 to 1400)
  • 23. Reality Check
      • Each active user wants to write and upload something NOW
      • No perceptible delay is allowed, in any case
      • No matter how “big” the content is
      • Data and services have to be available 24/7, everywhere
  • 24. Reality Check
      • Each active user wants to see and share something NOW (she sees the recent status of her friends and pages, anyway)
      • No perceptible delay is allowed, in any case
      • No matter where the information is (geographically speaking)
      • Moreover, new features and applications are continuously added to Facebook
  • 25. Reality Check: “Rendering a single page of Facebook involves hundreds of machines examining tens of thousands of pieces of data from dozens of services - all in real time.” (from the Infrastructure page of the Facebook Newsroom)
  • 26. Reality Check
  • 27. The main problem: Scalability. The ability of a system, network or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.
  • 28. The main problem: Scalability. The ability of a system, network or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth. Affected by several factors
  • 29. How to handle such situations?
  • 30. Solution
      • No standard solutions…
      • … each social network made different choices
          • strongly dependent on the original core
      • We are in the Cloud Computing realm
      • We will briefly analyze the Facebook and Instagram cases
  • 31. Facebook architecture at 100 feet. How to handle such situations?
  • 32. Servers and Data Centers
      • Facebook has 4 data centers
          • Prineville, Oregon
          • Forest City, North Carolina
          • Luleå, Sweden
          • Altoona, Iowa (under construction)
      • They build their servers and data centers from the ground up (efficiently)
      • Server and data center designs are open source
          • see The Open Compute Project (http://opencompute.org)
  • 33. Complex Infrastructure
      • Large number of software components
      • Multiple storage systems
      • Multiple caching systems
      • Hundreds of specialized services
      • Failure is routine! Keep things as simple as possible!
  • 34. Software Architecture
  • 35. Web Tier
      • Gathers data from the other tiers
      • Runs PHP code
          • widely used for web development
      • One single source tree for the entire code base
      • Same “binary” on every web tier box
  • 36. Web Tier
      • At the beginning: Zend Interpreter for PHP
          • reasonably fast (for an interpreter)
          • rapid development
          • no recompiling
          • but, at scale, performance matters!
      • Then: HipHop compiler for PHP
          • 400% faster
          • but it slows down development
          • so they added a HipHop interpreter
          • but the compiler and the interpreter sometimes disagree
  • 37. Web Tier
      • Now: HipHop Virtual Machine
          • the best of both worlds
          • 9x increase in web request throughput
          • 5x reduction in memory consumption
      • All this is open source
          • e.g., you can find the HipHop Virtual Machine at http://hhvm.com
  • 38. Storage Tier
      • Multiple storage systems
          • MySQL
          • HBase (NoSQL) - Messaging and Insight
          • Haystack (BLOBs)
          • BLOB: Binary Large Object (photos, videos, email attachments, etc.)
          • large files, no updates/appends, sequential reads
  • 39. Cache Tier
      • Memcache
          • speaks only with the Web Tier
          • does one thing very well
          • improved performance by 10x
          • key-value store
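A common way for a web tier to use a key-value cache such as memcache is the cache-aside pattern: look in the cache first, go to storage only on a miss, then populate the cache. The slides do not say exactly how Facebook wires this up, so the sketch below is an illustration under that assumption. The class `CacheAside` and the sample data are hypothetical, and plain Python dictionaries stand in for both the cache cluster and the database.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Try the cache first; fall back to storage on a miss and remember the result."""

    def __init__(self, load_from_storage: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}   # stand-in for a memcache cluster
        self._load = load_from_storage     # stand-in for a database query
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._load(key)
        if value is not None:
            self._cache[key] = value       # populate the cache for later reads
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the stored value changes, so readers refetch it."""
        self._cache.pop(key, None)


database = {"user:1": "Alice", "user:2": "Bob"}   # hypothetical data
cache = CacheAside(database.get)

cache.get("user:1")                 # miss: goes to the database
cache.get("user:1")                 # hit: served from memory
database["user:1"] = "Alice B."     # the stored value changes...
cache.invalidate("user:1")          # ...so the cached copy is dropped
print(cache.get("user:1"), cache.hits, cache.misses)
```

Reads that hit the cache never touch the storage tier, which is where a large speed-up like the one quoted on the slide can come from. The price is that every write path has to remember to invalidate.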
  • 40. Cache Tier
      • TAO
          • abstracts the Storage Tier
          • in production for more than 3-4 years
          • higher CPU load than memcache
          • used for the social graph
  • 41. Service Tier
  • 42. Service Tier
      • Example: News Feed
          • one of the hundreds of services at Facebook
      • Characteristics
          • real-time distribution
          • writers can potentially broadcast to a very large audience
          • readers want (and have) different and dynamic ways to filter data
      • The service should maintain an index and rank the data (in multiple ways)
  • 43. News Feed Service - write one location
  • 44. News Feed Service
      • 1000s of machines
          • leaves are in multiple sets, and each set has the entire index
      • Dealing with (daily) failures
          • hardware/software, server/network, intermittent/permanent, etc.
          • if a leaf is inaccessible, fail the request over to a different set
          • if an aggregator is inaccessible, “just” pick another
      • More leaves than aggregators
      • Reads are more expensive than writes
      • High network load between aggregators and leaves
          • fundamental to keep a full leaf set within a single rack of machines
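The failover rules on this slide can be illustrated with a toy model. This is a minimal sketch under our own assumptions, not Facebook's implementation: the classes `Leaf` and `Aggregator` are hypothetical, each leaf holds one shard of the index, each leaf set together holds the whole index, and an aggregator retries the whole request on the next set as soon as any leaf in the current set is unreachable. Sorting stands in for real ranking.

```python
from typing import Dict, List


class LeafUnavailable(Exception):
    """Raised when a leaf cannot be reached."""


class Leaf:
    def __init__(self, stories: Dict[str, List[str]]) -> None:
        self.stories = stories   # author -> recent stories (one shard of the index)
        self.up = True

    def query(self, authors: List[str]) -> List[str]:
        if not self.up:
            raise LeafUnavailable()
        return [story for author in authors for story in self.stories.get(author, [])]


class Aggregator:
    """Stateless front end: fans a read out to every leaf of one set."""

    def __init__(self, leaf_sets: List[List[Leaf]]) -> None:
        self.leaf_sets = leaf_sets   # each set holds the entire index

    def feed(self, friends: List[str]) -> List[str]:
        for leaf_set in self.leaf_sets:
            try:
                results: List[str] = []
                for leaf in leaf_set:
                    results.extend(leaf.query(friends))
                return sorted(results)   # placeholder for real ranking
            except LeafUnavailable:
                continue                 # fail over to a different set
        raise RuntimeError("no complete leaf set is reachable")


def build_set() -> List[Leaf]:
    # Hypothetical data: two shards that together form the full index.
    return [Leaf({"ann": ["ann: photo"]}), Leaf({"bob": ["bob: status"]})]


set_a, set_b = build_set(), build_set()
# Aggregators keep no state of their own, so any of them can serve a request.
aggregators = [Aggregator([set_a, set_b]), Aggregator([set_a, set_b])]

set_a[1].up = False                            # one leaf in the first set fails
feed = aggregators[0].feed(["ann", "bob"])     # answered by the second set
print(feed)
```

Because every read touches all leaves of a set, reads cost more than writes and generate heavy aggregator-to-leaf traffic, which matches the slide's point about keeping a full leaf set within one rack.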
  • 45. Software architecture
  • 46. Instagram architecture at 100 feet. How to handle such situations?
  • 47. Infrastructure
      • Instagram follows some core principles when choosing a system:
          • keep it very simple
          • don’t re-invent the wheel
          • go with proven and solid technologies when you can
      • It runs Ubuntu on Amazon EC2
  • 48. Software Architecture
      • Web Tier
          • Django (Python) on more than 25 Amazon High-CPU Extra-Large machines
          • Gunicorn is the chosen web server
      • Storage Tier
          • PostgreSQL
          • Redis (key-value storage)
          • Amazon S3 for photos
      • Cache Tier
          • memcache
  • 49. Conclusions
  • 50. Conclusions
      • Most social networks start as “traditional” websites
      • They change and/or evolve their hardware and software infrastructure when:
          • the number of users grows
          • some functionalities are added/revised
      • Scalability is a relevant problem…
      • They are complex entities and, sometimes, require complex or innovative solutions
  • 51. References
      • Instagram Engineering Blog, http://instagram-engineering.tumblr.com
      • Facebook Newsroom, http://newsroom.fb.com
      • Facebook Investor Relations, http://investor.fb.com
      • HPCA 2012 Facebook Keynote, http://www.ece.lsu.edu/hpca-18/files/HPCA2012_Facebook_Keynote.pdf
  • 52. License
      • This work is licensed under the Creative Commons “Attribution-NonCommercial-ShareAlike Unported (CC BY-NC-SA 3.0)” License.
      • You are free:
          • to Share - to copy, distribute and transmit the work
          • to Remix - to adapt the work
      • Under the following conditions:
          • Attribution - You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
          • Noncommercial - You may not use this work for commercial purposes.
          • Share Alike - If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
      • To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/