SlideShare a Scribd company logo
Extract value from
Facebook Data
Abstract.................................................................................................................................................3
Introduction...........................................................................................................................................3
Building Blocks.....................................................................................................................................4
• Configuration files
Parameterized Map/Reduce Program..................................................................................................5
• Parameters
• Extraction Process
Conclusion............................................................................................................................................6
About the Author...................................................................................................................................6
Contents
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved2
Abstract
In present times any marketing or customer strategy is
incomplete without a social media presence. With custom-
ers depending all the more on social media channels to
access and disseminate information and reviews, it
becomes all the more important for organizations to tap
social media channels for actionable insights. For analytic
engines that churn out insights required for quick and
intelligent decisions, social media is a key channel that
needs to be explored on a consistent basis.
Organizations are increasingly looking towards accelera-
tors and frameworks that enable them to get the required
intelligence from social media channels. Having the right
accelerator enables the organization make intelligent
decisions regarding their customer behaviour.
Extraction Process and Flow
The process and cornerstones of the accelerator is based
on the understanding that Facebook exposes its data in
form of a structured Facebook schema which can be
accessed via GraphAPIs.
Introduction
Modern organisations lay a lot of emphasis on offering
customized services to their customers. In such a situa-
tion, the customer’s social profile and behaviour related
information play a crucial role. Most of the organisations
have an analytic pattern that is customer centric, descrip-
tive, predictive as well as prescriptive. Organisations have
been putting in huge chunks of investments to get the
required view from their customer data and expect a quick
return on their investments.
When the need of the hour is a deliverable system that is
astute as well as fast and reliable, organisations need to
look at quick plug and play accelerators that will allow
them to access the required information quickly in real
time. The main benefit of such data for an organisation is
that it provides the time it needs to concentrate on analytic
problem statements which gives more importance to data.
The accelerator should not only be quick but also be
effective and enterprising, be able to adapt to changing
conditions, as well as be able to make the best use of the
available resources.
Facebook Accelerator
With the amount of time the current generation spends on
social media, it is natural that most enterprises are now
trying to keep in touch with their customers through social
channels. It is no surprise that the top social media chan-
nels like Facebook, LinkedIn and Twitter serve as sources
of data in current times.
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved3
Hosted on an open source big data frame work
Leverages the power of disruptive technology and
ensures that data is available near real time
Since the accelerator is powered by metadata file, it
allows changes to be made as well as version up grada-
tion of face book schema without altering the code.
•
•
•
Facebook has a mechanism called Facebook Query Language (FQL) to allow data querying from the entire Facebook
schema. The complete schema can be found in the URL [https://developers.facebook.com/docs/reference/fql/ ]. A project
by the name “RestFB” - A subset of FQL schema, provides third party classes for the accelerator.
Building Blocks
Configuration files
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved4
Mandatory configuration file: The tables and columns in this file are imperative for other tables to gather data. While these
tables are independent, the tables in the optional configuration file are dependent.
Optional configuration file: The tables in this file and their corresponding columns are dependent on the tables in the man-
datory configuration file.
•
•
EVENT
eid
name
nid
pic
host
description
event_type
eveny_subtype
start_time
end_time
creator
update_time
location
venue
•
•
•
•
•
•
•
•
•
•
•
•
•
•
STREAM
post_id
app_id
source_id
updated_time
created_time
actor_id
target_id
message
action_links
attachment
comments
likes
privacy•
•
•
•
•
•
•
•
•
•
•
•
•
PAGE
page_id
pic
page_url
type
company_overview
location
bio
fan_count
•
•
•
•
•
•
•
•
LIKE
object_id
object_id_cursor
object_type
post_id
Li kecol
user_id•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
•
USER
uid
first_name
last_name
name
pic
birthday
sex
relationship_status
current_location
interests
about_me
profile_url
family•
•
•
•
•
•
•
•
•
•
•
•
•
COMMENT
xid
post_id
from id
time
text
id
username
reply_xid•
•
•
•
•
•
•
•
FQL Table Schema
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved5
This program is devised to distil data from Facebook and load it in HDFS.
Parameters:
Metadata files make use of linked hashed maps to make sure they retain the order of the existing tables. Optional configu-
ration files are given to all reducers through the distributed cache.
Parameterized Map/Reduce Program
Page id
App id + secure key
Configuration files
Desired database name in HIVE
Number of machines can be specified too (reducers to be launched by Hadoop)
Extraction process involves the following steps
Configuration files are subjected to changes if any
Job is launched with the correct Face book page id as argument
Inside the mapper :
The mandatory configuration file is processed and it collates data from stream and event table
The HDFS folder is used to write output files for the stream and event table
•
•
•
•
•
•
•
•
Data Access from
Analytics programs
Load
Configuration File
+
ID and Access Token
FB name or ID of the
brand page
Script calls a
Map Reduce job
to fetch data in parallel
Fetch Data
Access
HIVEHDFS
Using a plug and play accelerator, teams will get access to almost all the data in near real time and help them do the actual
work (of analytics) rather than data collection and data cleansing. This helps organizations obviate the excess time required
for mundane activities and focus on the more relevant analytics that drive customer insights and revenue growth.
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved6
Conclusion
Inside the reducer:
The reducer is used to hold the post IDs from the streams and event tables. The number of IDs processed by
the reducers is calculated as the Total number of post IDs/number of reducers fired up. In this way the reduc
ers will have an even distribution of load. While the key is the number of reducers, values are represented by
the post IDs.
The reducers will write their own files which correlate with the tables in the optional configuration file. The
number of reducers can be provided as a parameter while submitting the job.
Post Map-Reduce phase:
Hive script creates database and tables according to the corresponding names specified.
Data from HDFS is copied into appropriate tables in the database created in the above step.
Now data is available in a tabular format and teams requiring this data can connect to Hive database and work
on it.
At the end of the job, the particular program would have collated enough data which gives information on the post, likes
on the post, comments made on the post, number of likes, users who have engaged with the post and basic user informa-
tion. The data will be pushed to Hive Database specified by the user into appropriate tables. The table names and
columns are in accordance with those specified in the configuration files.
Bhawna Manchanda is a Big Data Architect. She plays a key role in conceptualizing and implement-
ing BIG Data Solutions/Framework and Strategies in Happiest minds. She has also worked exten-
sively with Leading Banks in BIDW space.
About the Author
Bhawna Manchanda
Big Data Architect
Sunny Malik has a Master’s Degree in Computer Science from University of Southern California
(USC). He has worked extensively on Application Development using open-source technologies
and currently focused on Big Data Technologies and Algorithm Development.
Sunny Malik
Big Data Technologies and Algorithm
Development
Skanda Bhargav is a Cloudera Certified Hadoop developer. He is a Computer Science graduate
from Viswesvaraya Technological University, Belgaum popularly known as VTU. He has contributed
to 3 books on Big Data subject which was published by http://www.packtpub.com/ .His interests are
Hadoop, Hive, Map Reduce and Sqoop.
Skanda Bhargav
Hadoop developer.
•
•
© Happiest Minds Technologies Pvt. Ltd. All Rights Reserved
Happiest Minds is focused on helping customers build Smart Secure and Connected experience by leveraging disruptive
technologies like mobility, analytics, security, cloud computing, social computing and unified communications. Enterprises
are embracing these technologies to implement Omni-channel strategies, manage structured & unstructured data and
make real time decisions based on actionable insights, while ensuring security for data and infrastructure. Happiest Minds
also offers high degree of skills, IPs and domain expertise across a set of focused areas that include IT Services, Product
Engineering Services, Infrastructure Management, Security, Testing and Consulting.
Headquartered in Bangalore, India, Happiest Minds has operations in the US, UK, Singapore and Australia. It secured a
$45 million Series-A funding led by Canaan Partners, Intel Capital and Ashok Soota.
Happiest Minds
© 2014 Happiest Minds. All Rights Reserved.
E-mail: Business@happiestminds.com
Visit us: www.happiestminds.com
Follow us on
7
This document is an exclusive property of Happiest Minds Technologies Pvt. Ltd.

More Related Content

What's hot

Hilo cppxaml
Hilo cppxamlHilo cppxaml
Hilo cppxaml
Steve Xu
 
Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server
Tableau Software
 
Deployment guide-for-share point-2013
Deployment guide-for-share point-2013Deployment guide-for-share point-2013
Deployment guide-for-share point-2013
prconcepcion
 
IRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search ResultsIRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search Results
IRJET Journal
 
11 Strategic Considerations & Davinci Demo
11 Strategic Considerations & Davinci Demo11 Strategic Considerations & Davinci Demo
11 Strategic Considerations & Davinci Demo
Christian Buckley
 
Data communications
Data communicationsData communications
Data communications
Nzb sirji
 
IRJET - Health Medicare Data using Tweets in Twitter
IRJET - Health Medicare Data using Tweets in TwitterIRJET - Health Medicare Data using Tweets in Twitter
IRJET - Health Medicare Data using Tweets in Twitter
IRJET Journal
 
Informatica log files
Informatica log filesInformatica log files
Informatica log files
Amit Sharma
 

What's hot (8)

Hilo cppxaml
Hilo cppxamlHilo cppxaml
Hilo cppxaml
 
Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server Enabling Governed Data Access with Tableau Data Server
Enabling Governed Data Access with Tableau Data Server
 
Deployment guide-for-share point-2013
Deployment guide-for-share point-2013Deployment guide-for-share point-2013
Deployment guide-for-share point-2013
 
IRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search ResultsIRJET - Re-Ranking of Google Search Results
IRJET - Re-Ranking of Google Search Results
 
11 Strategic Considerations & Davinci Demo
11 Strategic Considerations & Davinci Demo11 Strategic Considerations & Davinci Demo
11 Strategic Considerations & Davinci Demo
 
Data communications
Data communicationsData communications
Data communications
 
IRJET - Health Medicare Data using Tweets in Twitter
IRJET - Health Medicare Data using Tweets in TwitterIRJET - Health Medicare Data using Tweets in Twitter
IRJET - Health Medicare Data using Tweets in Twitter
 
Informatica log files
Informatica log filesInformatica log files
Informatica log files
 

Similar to Whitepaper: Extract value from Facebook Data - Happiest Minds

Big data
Big dataBig data
SAP BODS 4.2
SAP BODS 4.2 SAP BODS 4.2
Essay Database
Essay DatabaseEssay Database
Microsoft big data_solution_brief
Microsoft big data_solution_briefMicrosoft big data_solution_brief
Microsoft big data_solution_brief
Dr. Wilfred Lin (Ph.D.)
 
Record matching over query results
Record matching over query resultsRecord matching over query results
Record matching over query results
ambitlick
 
Real World End to End machine Learning Pipeline
Real World End to End machine Learning PipelineReal World End to End machine Learning Pipeline
Real World End to End machine Learning Pipeline
Srivatsan Srinivasan
 
Data Analytics | How it Works
Data Analytics | How it WorksData Analytics | How it Works
Data Analytics | How it Works
John P. Gough
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
MetroStar
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Cambridge Semantics
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data Lakes
Vasu S
 
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - ExcercisesAgile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Raphael Branger
 
Hadoop-based architecture approaches
Hadoop-based architecture approachesHadoop-based architecture approaches
Hadoop-based architecture approaches
Miraj Godha
 
Power BI Overview
Power BI OverviewPower BI Overview
Power BI Overview
Nikkia Carter
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
SpringPeople
 
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdffinal-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
XIAOZEJIN1
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
FredReynolds2
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
Lynn Langit
 
Latest trends in Business Analytics
Latest trends in Business AnalyticsLatest trends in Business Analytics
Latest trends in Business Analytics
Puneet Bhalla
 
Data science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptxData science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptx
NagarajanG35
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
sambiswal
 

Similar to Whitepaper: Extract value from Facebook Data - Happiest Minds (20)

Big data
Big dataBig data
Big data
 
SAP BODS 4.2
SAP BODS 4.2 SAP BODS 4.2
SAP BODS 4.2
 
Essay Database
Essay DatabaseEssay Database
Essay Database
 
Microsoft big data_solution_brief
Microsoft big data_solution_briefMicrosoft big data_solution_brief
Microsoft big data_solution_brief
 
Record matching over query results
Record matching over query resultsRecord matching over query results
Record matching over query results
 
Real World End to End machine Learning Pipeline
Real World End to End machine Learning PipelineReal World End to End machine Learning Pipeline
Real World End to End machine Learning Pipeline
 
Data Analytics | How it Works
Data Analytics | How it WorksData Analytics | How it Works
Data Analytics | How it Works
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data FabricUsing a Semantic and Graph-based Data Catalog in a Modern Data Fabric
Using a Semantic and Graph-based Data Catalog in a Modern Data Fabric
 
Enabling SQL Access to Data Lakes
Enabling SQL Access to Data LakesEnabling SQL Access to Data Lakes
Enabling SQL Access to Data Lakes
 
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - ExcercisesAgile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
Agile Testing Days 2017 Intoducing AgileBI Sustainably - Excercises
 
Hadoop-based architecture approaches
Hadoop-based architecture approachesHadoop-based architecture approaches
Hadoop-based architecture approaches
 
Power BI Overview
Power BI OverviewPower BI Overview
Power BI Overview
 
Top Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practicesTop Big data Analytics tools: Emerging trends and Best practices
Top Big data Analytics tools: Emerging trends and Best practices
 
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdffinal-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
final-the-data-teams-guide-to-the-db-lakehouse-platform-rd-6-14-22.pdf
 
Big Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential ToolsBig Data Tools: A Deep Dive into Essential Tools
Big Data Tools: A Deep Dive into Essential Tools
 
About Streaming Data Solutions for Hadoop
About Streaming Data Solutions for HadoopAbout Streaming Data Solutions for Hadoop
About Streaming Data Solutions for Hadoop
 
Latest trends in Business Analytics
Latest trends in Business AnalyticsLatest trends in Business Analytics
Latest trends in Business Analytics
 
Data science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptxData science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptx
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 

More from Happiest Minds Technologies

Largest Electricity provider in the US- Case Study
Largest Electricity provider in the US- Case StudyLargest Electricity provider in the US- Case Study
Largest Electricity provider in the US- Case Study
Happiest Minds Technologies
 
BFSI GLOBAL TRENDS FY 24
BFSI GLOBAL TRENDS FY 24BFSI GLOBAL TRENDS FY 24
BFSI GLOBAL TRENDS FY 24
Happiest Minds Technologies
 
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKINGARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
Happiest Minds Technologies
 
DIGITAL MANUFACTURING
DIGITAL MANUFACTURINGDIGITAL MANUFACTURING
DIGITAL MANUFACTURING
Happiest Minds Technologies
 
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & InsuranceExploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
Happiest Minds Technologies
 
AN OVERVIEW OF THE METAVERSE
AN OVERVIEW OF THE METAVERSEAN OVERVIEW OF THE METAVERSE
AN OVERVIEW OF THE METAVERSE
Happiest Minds Technologies
 
VMware to AWS Cloud Migration
VMware to AWS Cloud MigrationVMware to AWS Cloud Migration
VMware to AWS Cloud Migration
Happiest Minds Technologies
 
Digital-Content-Monetization-DCM-Platform-2.pdf
Digital-Content-Monetization-DCM-Platform-2.pdfDigital-Content-Monetization-DCM-Platform-2.pdf
Digital-Content-Monetization-DCM-Platform-2.pdf
Happiest Minds Technologies
 
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
Happiest Minds Technologies
 
Cloud Reshaping Banking
Cloud Reshaping BankingCloud Reshaping Banking
Cloud Reshaping Banking
Happiest Minds Technologies
 
Automating SOC1/2 Compliance- For a leading Software solution company in UK
Automating SOC1/2 Compliance- For a leading Software solution company in UKAutomating SOC1/2 Compliance- For a leading Software solution company in UK
Automating SOC1/2 Compliance- For a leading Software solution company in UK
Happiest Minds Technologies
 
PAMaaS- Powered by CyberArk
PAMaaS- Powered by CyberArkPAMaaS- Powered by CyberArk
PAMaaS- Powered by CyberArk
Happiest Minds Technologies
 
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
Happiest Minds Technologies
 
SECURING THE CLOUD DATA LAKES
SECURING THE CLOUD DATA LAKESSECURING THE CLOUD DATA LAKES
SECURING THE CLOUD DATA LAKES
Happiest Minds Technologies
 
Complete Guide to General Data Protection Regulation (GDPR)
Complete Guide to General Data Protection Regulation (GDPR)Complete Guide to General Data Protection Regulation (GDPR)
Complete Guide to General Data Protection Regulation (GDPR)
Happiest Minds Technologies
 
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
Happiest Minds Technologies
 
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
Happiest Minds Technologies
 
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
Happiest Minds Technologies
 
How to Approach Tool Integrations
How to Approach Tool IntegrationsHow to Approach Tool Integrations
How to Approach Tool Integrations
Happiest Minds Technologies
 
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
Happiest Minds Technologies
 

More from Happiest Minds Technologies (20)

Largest Electricity provider in the US- Case Study
Largest Electricity provider in the US- Case StudyLargest Electricity provider in the US- Case Study
Largest Electricity provider in the US- Case Study
 
BFSI GLOBAL TRENDS FY 24
BFSI GLOBAL TRENDS FY 24BFSI GLOBAL TRENDS FY 24
BFSI GLOBAL TRENDS FY 24
 
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKINGARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
ARTIFICIAL INTELLIGENCE IN DIGITAL BANKING
 
DIGITAL MANUFACTURING
DIGITAL MANUFACTURINGDIGITAL MANUFACTURING
DIGITAL MANUFACTURING
 
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & InsuranceExploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
Exploring the Potential of ChatGPT in Banking, Financial SERVICES & Insurance
 
AN OVERVIEW OF THE METAVERSE
AN OVERVIEW OF THE METAVERSEAN OVERVIEW OF THE METAVERSE
AN OVERVIEW OF THE METAVERSE
 
VMware to AWS Cloud Migration
VMware to AWS Cloud MigrationVMware to AWS Cloud Migration
VMware to AWS Cloud Migration
 
Digital-Content-Monetization-DCM-Platform-2.pdf
Digital-Content-Monetization-DCM-Platform-2.pdfDigital-Content-Monetization-DCM-Platform-2.pdf
Digital-Content-Monetization-DCM-Platform-2.pdf
 
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
AUTOMATING CYBER RISK DETECTION AND PROTECTION WITH SOC 2.0
 
Cloud Reshaping Banking
Cloud Reshaping BankingCloud Reshaping Banking
Cloud Reshaping Banking
 
Automating SOC1/2 Compliance- For a leading Software solution company in UK
Automating SOC1/2 Compliance- For a leading Software solution company in UKAutomating SOC1/2 Compliance- For a leading Software solution company in UK
Automating SOC1/2 Compliance- For a leading Software solution company in UK
 
PAMaaS- Powered by CyberArk
PAMaaS- Powered by CyberArkPAMaaS- Powered by CyberArk
PAMaaS- Powered by CyberArk
 
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
GUIDE TO KEEP YOUR END-USERS CONNECTED TO THE DIGITAL WORKPLACE DURING DISRUP...
 
SECURING THE CLOUD DATA LAKES
SECURING THE CLOUD DATA LAKESSECURING THE CLOUD DATA LAKES
SECURING THE CLOUD DATA LAKES
 
Complete Guide to General Data Protection Regulation (GDPR)
Complete Guide to General Data Protection Regulation (GDPR)Complete Guide to General Data Protection Regulation (GDPR)
Complete Guide to General Data Protection Regulation (GDPR)
 
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
Azure bastion- Remote desktop RDP/SSH in Azure using Bastion Service as (PaaS)
 
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
 
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN CPG THROUGH INTELLIGENT FREIGHT AUDIT
 
How to Approach Tool Integrations
How to Approach Tool IntegrationsHow to Approach Tool Integrations
How to Approach Tool Integrations
 
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDITREDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
REDUCING TRANSPORTATION COSTS IN RETAIL THROUGH INTELLIGENT FREIGHT AUDIT
 

Recently uploaded

Transportation_Channel_Investor_Presentation_April_2024_ Final .pdf
Transportation_Channel_Investor_Presentation_April_2024_ Final .pdfTransportation_Channel_Investor_Presentation_April_2024_ Final .pdf
Transportation_Channel_Investor_Presentation_April_2024_ Final .pdf
Matthewperry105
 
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
9u4xjk4w
 
Dominate Reddit Discussions.............
Dominate Reddit Discussions.............Dominate Reddit Discussions.............
Dominate Reddit Discussions.............
SocioCosmos
 
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
anubug
 
UR BHATTI ACADEMY AND ONLINE COURSES.pdf
UR BHATTI ACADEMY AND ONLINE COURSES.pdfUR BHATTI ACADEMY AND ONLINE COURSES.pdf
UR BHATTI ACADEMY AND ONLINE COURSES.pdf
urbhattiacademy
 
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
7lkkjxt
 
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISMSTUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
AJHSSR Journal
 
Maximize Your Twitch Potential!..........
Maximize Your Twitch Potential!..........Maximize Your Twitch Potential!..........
Maximize Your Twitch Potential!..........
SocioCosmos
 
HMS Facebook Stories All V1 06092024.docx
HMS Facebook Stories All V1 06092024.docxHMS Facebook Stories All V1 06092024.docx
HMS Facebook Stories All V1 06092024.docx
Charles Bayless
 

Recently uploaded (9)

Transportation_Channel_Investor_Presentation_April_2024_ Final .pdf
Transportation_Channel_Investor_Presentation_April_2024_ Final .pdfTransportation_Channel_Investor_Presentation_April_2024_ Final .pdf
Transportation_Channel_Investor_Presentation_April_2024_ Final .pdf
 
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
快速办理(worcester毕业证书)伍斯特大学毕业证PDF成绩单一模一样
 
Dominate Reddit Discussions.............
Dominate Reddit Discussions.............Dominate Reddit Discussions.............
Dominate Reddit Discussions.............
 
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
一比一原版(AU毕业证)英国阿伯丁大学毕业证如何办理
 
UR BHATTI ACADEMY AND ONLINE COURSES.pdf
UR BHATTI ACADEMY AND ONLINE COURSES.pdfUR BHATTI ACADEMY AND ONLINE COURSES.pdf
UR BHATTI ACADEMY AND ONLINE COURSES.pdf
 
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
原版制作(Hull毕业证书)赫尔大学毕业证Offer一模一样
 
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISMSTUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
STUDY ON THE DEVELOPMENT STRATEGY OF HUZHOU TOURISM
 
Maximize Your Twitch Potential!..........
Maximize Your Twitch Potential!..........Maximize Your Twitch Potential!..........
Maximize Your Twitch Potential!..........
 
HMS Facebook Stories All V1 06092024.docx
HMS Facebook Stories All V1 06092024.docxHMS Facebook Stories All V1 06092024.docx
HMS Facebook Stories All V1 06092024.docx
 

Whitepaper: Extract value from Facebook Data - Happiest Minds

  • 2. Abstract.................................................................................................................................................3 Introduction...........................................................................................................................................3 Building Blocks.....................................................................................................................................4 • Configuration files Parameterized Map/Reduce Program..................................................................................................5 • Parameters • Extraction Process Conclusion............................................................................................................................................6 About the Author...................................................................................................................................6 Contents © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved2
  • 3. Abstract In present times any marketing or customer strategy is incomplete without a social media presence. With custom- ers depending all the more on social media channels to access and disseminate information and reviews, it becomes all the more important for organizations to tap social media channels for actionable insights. For analytic engines that churn out insights required for quick and intelligent decisions, social media is a key channel that needs to be explored on a consistent basis. Organizations are increasingly looking towards accelera- tors and frameworks that enable them to get the required intelligence from social media channels. Having the right accelerator enables the organization make intelligent decisions regarding their customer behaviour. Extraction Process and Flow The process and cornerstones of the accelerator is based on the understanding that Facebook exposes its data in form of a structured Facebook schema which can be accessed via GraphAPIs. Introduction Modern organisations lay a lot of emphasis on offering customized services to their customers. In such a situa- tion, the customer’s social profile and behaviour related information play a crucial role. Most of the organisations have an analytic pattern that is customer centric, descrip- tive, predictive as well as prescriptive. Organisations have been putting in huge chunks of investments to get the required view from their customer data and expect a quick return on their investments. When the need of the hour is a deliverable system that is astute as well as fast and reliable, organisations need to look at quick plug and play accelerators that will allow them to access the required information quickly in real time. The main benefit of such data for an organisation is that it provides the time it needs to concentrate on analytic problem statements which gives more importance to data. The accelerator should not only be quick but also be effective and enterprising, be able to adapt to changing conditions, as well as be able to make the best use of the available resources. Facebook Accelerator With the amount of time the current generation spends on social media, it is natural that most enterprises are now trying to keep in touch with their customers through social channels. It is no surprise that the top social media chan- nels like Facebook, LinkedIn and Twitter serve as sources of data in current times. © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved3 Hosted on an open source big data frame work Leverages the power of disruptive technology and ensures that data is available near real time Since the accelerator is powered by metadata file, it allows changes to be made as well as version up grada- tion of face book schema without altering the code. • • •
  • 4. Facebook has a mechanism called Facebook Query Language (FQL) to allow data querying from the entire Facebook schema. The complete schema can be found in the URL [https://developers.facebook.com/docs/reference/fql/ ]. A project by the name “RestFB” - A subset of FQL schema, provides third party classes for the accelerator. Building Blocks Configuration files © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved4 Mandatory configuration file: The tables and columns in this file are imperative for other tables to gather data. While these tables are independent, the tables in the optional configuration file are dependent. Optional configuration file: The tables in this file and their corresponding columns are dependent on the tables in the man- datory configuration file. • • EVENT eid name nid pic host description event_type eveny_subtype start_time end_time creator update_time location venue • • • • • • • • • • • • • • STREAM post_id app_id source_id updated_time created_time actor_id target_id message action_links attachment comments likes privacy• • • • • • • • • • • • • PAGE page_id pic page_url type company_overview location bio fan_count • • • • • • • • LIKE object_id object_id_cursor object_type post_id Li kecol user_id• • • • • • • • • • • • • • • • • • • USER uid first_name last_name name pic birthday sex relationship_status current_location interests about_me profile_url family• • • • • • • • • • • • • COMMENT xid post_id from id time text id username reply_xid• • • • • • • • FQL Table Schema
  • 5. © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved5 This program is devised to distil data from Facebook and load it in HDFS. Parameters: Metadata files make use of linked hashed maps to make sure they retain the order of the existing tables. Optional configu- ration files are given to all reducers through the distributed cache. Parameterized Map/Reduce Program Page id App id + secure key Configuration files Desired database name in HIVE Number of machines can be specified too (reducers to be launched by Hadoop) Extraction process involves the following steps Configuration files are subjected to changes if any Job is launched with the correct Face book page id as argument Inside the mapper : The mandatory configuration file is processed and it collates data from stream and event table The HDFS folder is used to write output files for the stream and event table • • • • • • • • Data Access from Analytics programs Load Configuration File + ID and Access Token FB name or ID of the brand page Script calls a Map Reduce job to fetch data in parallel Fetch Data Access HIVEHDFS
  • 6. Using a plug and play accelerator, teams will get access to almost all the data in near real time and help them do the actual work (of analytics) rather than data collection and data cleansing. This helps organizations obviate the excess time required for mundane activities and focus on the more relevant analytics that drive customer insights and revenue growth. © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved6 Conclusion Inside the reducer: The reducer is used to hold the post IDs from the streams and event tables. The number of IDs processed by the reducers is calculated as the Total number of post IDs/number of reducers fired up. In this way the reduc ers will have an even distribution of load. While the key is the number of reducers, values are represented by the post IDs. The reducers will write their own files which correlate with the tables in the optional configuration file. The number of reducers can be provided as a parameter while submitting the job. Post Map-Reduce phase: Hive script creates database and tables according to the corresponding names specified. Data from HDFS is copied into appropriate tables in the database created in the above step. Now data is available in a tabular format and teams requiring this data can connect to Hive database and work on it. At the end of the job, the particular program would have collated enough data which gives information on the post, likes on the post, comments made on the post, number of likes, users who have engaged with the post and basic user informa- tion. The data will be pushed to Hive Database specified by the user into appropriate tables. The table names and columns are in accordance with those specified in the configuration files. Bhawna Manchanda is a Big Data Architect. She plays a key role in conceptualizing and implement- ing BIG Data Solutions/Framework and Strategies in Happiest minds. She has also worked exten- sively with Leading Banks in BIDW space. About the Author Bhawna Manchanda Big Data Architect Sunny Malik has a Master’s Degree in Computer Science from University of Southern California (USC). He has worked extensively on Application Development using open-source technologies and currently focused on Big Data Technologies and Algorithm Development. Sunny Malik Big Data Technologies and Algorithm Development Skanda Bhargav is a Cloudera Certified Hadoop developer. He is a Computer Science graduate from Viswesvaraya Technological University, Belgaum popularly known as VTU. He has contributed to 3 books on Big Data subject which was published by http://www.packtpub.com/ .His interests are Hadoop, Hive, Map Reduce and Sqoop. Skanda Bhargav Hadoop developer. • •
  • 7. © Happiest Minds Technologies Pvt. Ltd. All Rights Reserved Happiest Minds is focused on helping customers build Smart Secure and Connected experience by leveraging disruptive technologies like mobility, analytics, security, cloud computing, social computing and unified communications. Enterprises are embracing these technologies to implement Omni-channel strategies, manage structured & unstructured data and make real time decisions based on actionable insights, while ensuring security for data and infrastructure. Happiest Minds also offers high degree of skills, IPs and domain expertise across a set of focused areas that include IT Services, Product Engineering Services, Infrastructure Management, Security, Testing and Consulting. Headquartered in Bangalore, India, Happiest Minds has operations in the US, UK, Singapore and Australia. It secured a $45 million Series-A funding led by Canaan Partners, Intel Capital and Ashok Soota. Happiest Minds © 2014 Happiest Minds. All Rights Reserved. E-mail: Business@happiestminds.com Visit us: www.happiestminds.com Follow us on 7 This document is an exclusive property of Happiest Minds Technologies Pvt. Ltd.