3. Reference: https://www.feedough.com/facebook-business-model-makes-money/
• The social networking company employed
17,048 people full-time, up from 150
people in 2006.
• Revenue of 40.7 billion USD with
2.2 billion monthly active users.
• Every 60 seconds, 136,000 photos
are uploaded, 510,000 comments
are posted, and 293,000 status
updates are posted.
• Valued at $321 billion.
• Facebook has enough data to know
us better than our therapists!
Introduction
Technology
4. Reference: http://blog.intercom.io/wp-content/uploads/2012/08/NetworkEffects-2.png
• If you aren’t paying for it, you’re the
product.
• To survive in the market, you must
know everything about yourself and
your competitors.
• Facebook builds its business by
learning about its users and
packaging their data for advertisers.
• Heavy investment in research and
development.
• Always looking to enhance its user
experience.
Business Model
5. Reference: https://i.ytimg.com/vi/mK7_5Ag8Z2E/hqdefault.jpg
• Big Data is crucial to the company’s
very being.
• Facebook uses 98 personal data
points to target ads to you.
• Facebook relies on a massive
installation of Hadoop, a highly
scalable open-source framework
that uses clusters of low-cost
servers to solve problems.
Facebook even designs its own
hardware for this purpose. Hadoop
is just one of many Big Data
technologies employed at
Facebook.
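As a minimal sketch of the MapReduce programming model that Hadoop implements, the single-process Python example below counts status updates per user. The record layout, the sample data, and the function names (`map_phase`, `shuffle`, `reduce_phase`) are assumptions made for illustration and are not Facebook's code; on a real cluster the map and reduce steps would run in parallel across many servers.

```python
from collections import defaultdict

# Hypothetical input records: (user_id, status_text). Illustrative data only.
records = [
    ("alice", "hello world"),
    ("bob", "big data"),
    ("alice", "hadoop at scale"),
]


def map_phase(record):
    """Emit one (key, 1) pair per record, keyed by user."""
    user_id, _text = record
    return [(user_id, 1)]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine the values collected for one key into a single result."""
    return key, sum(values)


mapped = [pair for record in records for pair in map_phase(record)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, the work can be split across low-cost servers without the steps needing to share state.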
Why Data Science?
6. Reference: https://research.fb.com/prophet-forecasting-at-scale/
Impact on Decision Making
• Prophet is a forecasting tool available in
Python and R.
• It is a key part of Facebook’s ability to
create a large number of trustworthy
forecasts used for decision-making and
even in product features.
• Prophet is optimized for business
forecasting tasks.
• The Prophet procedure is an additive
regression model.
7. • Data source – Users worldwide.
• Tracking cookies enable Facebook to
track users across the web.
• Facial recognition and image
processing capabilities.
• Tag suggestions recommend who to tag in
user photos through image processing and
facial recognition.
• Analysing Likes can accurately predict
the traits of an individual.
Data Sources
Reference: https://www.statista.com/statistics/268136/top-15-countries-based-on-number-of-facebook-users/
8. Data Science Team
• 60 data scientists, most of them
proficient in Hadoop infrastructure.
• “Facebook runs the world’s largest Hadoop
cluster,” says Jay Parikh, Vice President of
Infrastructure Engineering, Facebook.
• Social network analysis and modelling,
complexity theory, etc. are the team’s
areas of interest.
Reference: https://www.slideshare.net/seanjtaylor/putting-the-magic-in-data-science/4-Its_not_a_trick_its
9. Data Science Techniques
• Facebook uses a tool called DeepText for
textual analysis.
• Facebook uses a deep learning application
called DeepFace for facial recognition.
• Flow and Scuba.
• Corona and Prism.
Reference: http://images.slideplayer.com/17/5366578/slides/slide_1.jpg
10. • Data science helps Facebook improve
social interactions on an everyday
basis.
• Studying the patterns hidden in the data
reveals how social connections work, not
only in the real world but also in the
virtual world.
• Data science is not limited to studying
social interactions; it also covers how a
user behaves with a business entity and
reacts to the advertisements so
specifically targeted at them.
Lessons learned
HTTPS://RESEARCH.FB.COM/CATEGORY/DATA-SCIENCE/
Facebook Payments Inc.: lets Facebook generate revenue through its payments business.
Atlas: ad-serving and measurement platform, offering services to advertisers and agencies.
Instagram: Media Sharing Platform.
Onavo: Mobile utility application.
Parse: back end infrastructure provider for mobile applications.
Moves: Exercise (steps) tracking application.
Oculus: Virtual reality technology.
LiveRail: Publisher Monetization Platform.
WhatsApp: Instant Messaging Client.
Masquerade: Visual Filters mobile application.
The Prophet procedure is an additive regression model with four main components:
A piecewise linear or logistic growth curve trend. Prophet automatically detects changes in trends by selecting changepoints from the data.
A yearly seasonal component modeled using Fourier series.
A weekly seasonal component using dummy variables.
A user-provided list of important holidays.
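The four components listed above can be sketched as plain functions whose outputs are added together. This is an illustrative forward evaluation only, not the Prophet library and not its fitting procedure: every parameter value (growth rates, changepoints, Fourier coefficients, weekday and holiday effects) is a made-up assumption, whereas Prophet estimates such values from data. Only the piecewise linear form of the trend is shown.

```python
import math
from datetime import date


def trend(t, offset, base_rate, changepoints, rate_deltas):
    """Piecewise linear trend: the slope changes by rate_deltas[i] after changepoints[i]."""
    value = offset + base_rate * t
    for changepoint, delta in zip(changepoints, rate_deltas):
        if t > changepoint:
            value += delta * (t - changepoint)
    return value


def yearly(day_of_year, fourier_coeffs, period=365.25):
    """Yearly seasonality as a truncated Fourier series of (cos, sin) coefficient pairs."""
    total = 0.0
    for n, (a, b) in enumerate(fourier_coeffs, start=1):
        angle = 2.0 * math.pi * n * day_of_year / period
        total += a * math.cos(angle) + b * math.sin(angle)
    return total


def weekly(day, weekday_effects):
    """Weekly seasonality: one coefficient per weekday (Monday=0), i.e. dummy variables."""
    return weekday_effects[day.weekday()]


def holiday(day, holiday_effects):
    """Effect for dates in a user-provided holiday mapping, 0.0 otherwise."""
    return holiday_effects.get(day, 0.0)


def predict(day, start, params):
    """Additive model: trend + yearly + weekly + holiday."""
    t = (day - start).days
    return (
        trend(t, params["offset"], params["base_rate"],
              params["changepoints"], params["rate_deltas"])
        + yearly(day.timetuple().tm_yday, params["fourier_coeffs"])
        + weekly(day, params["weekday_effects"])
        + holiday(day, params["holiday_effects"])
    )


# Hypothetical parameters, chosen only to make the example concrete.
params = {
    "offset": 100.0,
    "base_rate": 1.0,
    "changepoints": [10],
    "rate_deltas": [0.5],
    "fourier_coeffs": [(5.0, 2.0)],
    "weekday_effects": [0.0, 0.0, 0.0, 0.0, 0.0, 8.0, 12.0],
    "holiday_effects": {date(2018, 12, 25): 30.0},
}

print(predict(date(2018, 12, 25), date(2018, 1, 1), params))
```

Keeping each component as a separate term is what makes an additive model easy to inspect: any one of the four contributions can be examined or adjusted without touching the others.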
Facebook is a strong supporter of Open Source and makes most of the work of its AI labs Facebook Artificial Intelligence Research (FAIR) freely available for anyone to use or modify however they like. Most of Facebook’s Deep Learning is built on the Torch platform, a development environment focused on the development of deep learning technologies and neural networks.
https://code.facebook.com/posts/181565595577955/introducing-deeptext-facebook-s-text-understanding-engine/
DeepFace is a deep learning facial recognition system created by a research group at Facebook. It identifies human faces in digital images. It employs a nine-layer neural net with over 120 million connection weights, and was trained on four million images uploaded by Facebook users. The system is said to be 97% accurate, compared to 85% for the FBI's Next Generation Identification system. One of the creators of the software, Yaniv Taigman, came to Facebook via its 2012 acquisition of Face.com.
https://en.wikipedia.org/wiki/DeepFace
Flow is designed to help engineers build, test, and execute machine learning algorithms on a massive scale, and this includes practically any form of machine learning, a broad technology that covers all services capable of learning tasks largely on their own. With Flow, Mehanna says, Facebook trains and tests about 300,000 machine learning models each month. Whereas it once rolled a new AI model onto its social network every 60 days or so, it can now release several new models each week.
https://www.wired.com/2016/05/facebook-trying-create-ai-can-create-ai/
Corona
Developed by Avery Ching, formerly of Yahoo, and his team, Corona allows multiple jobs to be processed at a time on a single Hadoop cluster without crashing the system. The idea for Corona arose when developers started running into problems with Hadoop’s framework: it was getting harder to manage cluster resources and task trackers. MapReduce was designed around a pull-based scheduling model, which delayed the processing of small jobs. Hadoop was also limited by its slot-based resource management model, which wasted slots whenever the cluster size did not fit the configuration.
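The delay that pull-based scheduling adds to small jobs can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Hadoop's or Corona's actual scheduler: it assumes a worker in a pull-based system only receives work at fixed heartbeat ticks, that a push-based scheduler hands out work the moment a job is submitted, and that a free worker is always available. The function names, the job list, and the three-second heartbeat are hypothetical.

```python
def pull_start_time(submit_time, heartbeat_interval):
    """Start time when a worker only picks up work at its next heartbeat tick."""
    ticks = -(-submit_time // heartbeat_interval)  # ceiling division on integers
    return ticks * heartbeat_interval


def push_start_time(submit_time):
    """Start time when the scheduler pushes work to a worker immediately."""
    return submit_time


def completion_times(jobs, heartbeat_interval):
    """Return (pull_done, push_done) per job; jobs are (submit_time, duration) in seconds."""
    results = []
    for submit_time, duration in jobs:
        pull_done = pull_start_time(submit_time, heartbeat_interval) + duration
        push_done = push_start_time(submit_time) + duration
        results.append((pull_done, push_done))
    return results


# Hypothetical small jobs: (submit_time, duration), both in whole seconds.
small_jobs = [(1, 1), (4, 2), (6, 1)]

for (submit, duration), (pull_done, push_done) in zip(
    small_jobs, completion_times(small_jobs, heartbeat_interval=3)
):
    print(f"submitted at {submit}s, runs {duration}s: "
          f"pull finishes at {pull_done}s, push finishes at {push_done}s")
```

In this model a one-second job submitted just after a heartbeat spends longer waiting than running, which is why the wait matters most for small jobs.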
Prism
Hadoop wasn’t designed to run across multiple facilities. Typically, because it requires such heavy communication between servers, clusters are limited to a single data center.
When Facebook first implemented Hadoop, this single-data-center limit is what led the team to develop Prism. Prism is a platform that exposes many namespaces instead of the single one governed by Hadoop, which in turn makes it possible to create many logical clusters.
The system can now expand to as many servers as needed without worrying about increasing the number of data centers.
https://dzone.com/articles/how-is-facebook-deploying-big-data