SlideShare a Scribd company logo
1 of 22
Download to read offline
Reliable and Efficient
       Facebook data
         processing

Andres Buritica
Socal Piggies
2013-01-17
http://thelinuxkid.com
Python developer

Facebook Graph API experience

Ubernear

FounderDating
Meta
Draws on lessons from Ubernear

Based on traversal of public nodes in Graph
Use case
Pages or users (owners) with public events

Event discovery
Flow

                     Update owners
   Owner IDs         Update owner events



 Expire events
 Event details          Partial events




   Complete events
Why?
Separation of concerns

Parallel processing with separate
user/apps/servers

Less data load on batch requests

Want to store all data
Update owners
Check for migrated owners
  ○ (#21) Page ID <old_id> was migrated to page ID
    <new_id>. Please update your API calls to the new
    ID


Move migrated owners to separate table

Add new owners
Update owner events
Events for owners not checked since datetime

Stop at last event previously collected
Expire events
end_time has passed move to another table

False
  ○ Data might exist


Should have returned False?
  ○ [100] Unsupported get request
  ○ Data might exist
Expire events
Alias not found
  ○ (#803) Some of the aliases you requested do not
    exist...
Event details
Skip completed (no refreshing)

Transient errors (retry)
   ○   None
   ○   OAuthException...Error validating application
   ○    (#1) "Unknown error occured"
   ○   "(#2) Service temporarily unavailable"
   ○   "(#4) User request limit reached" (throttle)
   ○   "(#4) Application request limit reached" (throttle)
   ○   "(#17) User request limit reached" (throttle)
Datetimes
All in ISO-8601

Events
  ○   date_format modifier has no effect
  ○   timezones after "Events Timezone Migration"
  ○   is_date_only
  ○   legacy without timezone
Batch requests
POST

50 requests in one

Large or complex can time out

Nested calls count towards rate limiting

One top level access token, many nested
Batch request example
User's profile and 50 friends

batch=[{
  "method":"GET",
  "relative_url":"me"
  },
  {
  "method":"GET",
  "relative_url":"me/friends?limit=50"
}]
Batch dependencies
Reference results of a previous operation

JSONPath

Child operation executed after parent

Parent returns None unless forced
Batch dependencies example
Get details of 5 friends

batch=[{
  "method":"GET",
  "name":"get-friends",
  "relative_url":"me/friends?limit=5",
  },
  {
  "method":"GET",
  "relative_url":"?ids={result=get-friends:$.data.*.id}"
}]
Field expansion
GET

"join" multiple graph queries into a single call

Replacement for FQL

fields, connections, modifiers and identifiers

No JSONPath
Field expansion example
User's name and birthday plus id and picture of
the last 10 photos

/me?
 fields=
   name,
   birthday,
   photos.limit(10).fields(id,picture)
Batch request with field expansions
User's profile and picture link of 10 photos

batch=[{
  "method":"GET",
  "relative_url":"me"
  },
  {
  "method":"GET",
  "relative_url":"me?fields=photos.limit(10).fields(picture)"
}]
Process
Facebook is constantly improving

Testing crucial

Beta tier
Not covered
Refreshing data

Pagination

Throttling

Insights
Thanks
Sample code at http://ubernear.com

Questions

http://thelinuxkid.com

More Related Content

Similar to Reliable and Efficient Facebook data processing

From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)
From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)
From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)André Vala
 
MongoDB World 2019: Securing Application Data from Day One
MongoDB World 2019: Securing Application Data from Day OneMongoDB World 2019: Securing Application Data from Day One
MongoDB World 2019: Securing Application Data from Day OneMongoDB
 
From Event Receivers to SharePoint Webhooks
From Event Receivers to SharePoint WebhooksFrom Event Receivers to SharePoint Webhooks
From Event Receivers to SharePoint WebhooksAndré Vala
 
MongoDB World 2018: Evolving your Data Access with MongoDB Stitch
MongoDB World 2018: Evolving your Data Access with MongoDB StitchMongoDB World 2018: Evolving your Data Access with MongoDB Stitch
MongoDB World 2018: Evolving your Data Access with MongoDB StitchMongoDB
 
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...NCCOMMS
 
Building serverless applications with Microsoft Graph and Azure Functions
Building serverless applications with Microsoft Graph and Azure FunctionsBuilding serverless applications with Microsoft Graph and Azure Functions
Building serverless applications with Microsoft Graph and Azure FunctionsDragan Panjkov
 
HTML5 Server Sent Events/JSF JAX 2011 Conference
HTML5 Server Sent Events/JSF  JAX 2011 ConferenceHTML5 Server Sent Events/JSF  JAX 2011 Conference
HTML5 Server Sent Events/JSF JAX 2011 ConferenceRoger Kitain
 
An Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsAn Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsJohann Schleier-Smith
 
ATD 13 - Enhancing your applications using Microsoft Graph API
ATD 13 - Enhancing your applications using Microsoft Graph APIATD 13 - Enhancing your applications using Microsoft Graph API
ATD 13 - Enhancing your applications using Microsoft Graph APIDragan Panjkov
 
Evolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchEvolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchMongoDB
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security datamarkgrover
 
Server-Sent Events in Action
Server-Sent Events in ActionServer-Sent Events in Action
Server-Sent Events in ActionAndrei Rusu
 
GraphQL Bangkok Meetup 2.0
GraphQL Bangkok Meetup 2.0GraphQL Bangkok Meetup 2.0
GraphQL Bangkok Meetup 2.0Tobias Meixner
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd SeasonSATOSHI TAGOMORI
 
Microsoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Tech Community
 
Microsoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Tech Community
 
Coding against the Office Graph
Coding against the Office GraphCoding against the Office Graph
Coding against the Office GraphOliver Wirkus
 
Instrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and FlowsInstrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and FlowsGlobus
 
How Lyft Drives Data Discovery
How Lyft Drives Data DiscoveryHow Lyft Drives Data Discovery
How Lyft Drives Data DiscoveryNeo4j
 
Duke at SplunkLive! Charlotte
Duke at SplunkLive! CharlotteDuke at SplunkLive! Charlotte
Duke at SplunkLive! CharlotteSplunk
 

Similar to Reliable and Efficient Facebook data processing (20)

From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)
From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)
From Event Receivers to SharePoint Webhooks (SPS Lisbon 2017)
 
MongoDB World 2019: Securing Application Data from Day One
MongoDB World 2019: Securing Application Data from Day OneMongoDB World 2019: Securing Application Data from Day One
MongoDB World 2019: Securing Application Data from Day One
 
From Event Receivers to SharePoint Webhooks
From Event Receivers to SharePoint WebhooksFrom Event Receivers to SharePoint Webhooks
From Event Receivers to SharePoint Webhooks
 
MongoDB World 2018: Evolving your Data Access with MongoDB Stitch
MongoDB World 2018: Evolving your Data Access with MongoDB StitchMongoDB World 2018: Evolving your Data Access with MongoDB Stitch
MongoDB World 2018: Evolving your Data Access with MongoDB Stitch
 
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...
O365Con18 - Reach for the Cloud Build Solutions with the Power of Microsoft G...
 
Building serverless applications with Microsoft Graph and Azure Functions
Building serverless applications with Microsoft Graph and Azure FunctionsBuilding serverless applications with Microsoft Graph and Azure Functions
Building serverless applications with Microsoft Graph and Azure Functions
 
HTML5 Server Sent Events/JSF JAX 2011 Conference
HTML5 Server Sent Events/JSF  JAX 2011 ConferenceHTML5 Server Sent Events/JSF  JAX 2011 Conference
HTML5 Server Sent Events/JSF JAX 2011 Conference
 
An Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time ApplicationsAn Architecture for Agile Machine Learning in Real-Time Applications
An Architecture for Agile Machine Learning in Real-Time Applications
 
ATD 13 - Enhancing your applications using Microsoft Graph API
ATD 13 - Enhancing your applications using Microsoft Graph APIATD 13 - Enhancing your applications using Microsoft Graph API
ATD 13 - Enhancing your applications using Microsoft Graph API
 
Evolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB StitchEvolving your Data Access with MongoDB Stitch
Evolving your Data Access with MongoDB Stitch
 
Amundsen: From discovering to security data
Amundsen: From discovering to security dataAmundsen: From discovering to security data
Amundsen: From discovering to security data
 
Server-Sent Events in Action
Server-Sent Events in ActionServer-Sent Events in Action
Server-Sent Events in Action
 
GraphQL Bangkok Meetup 2.0
GraphQL Bangkok Meetup 2.0GraphQL Bangkok Meetup 2.0
GraphQL Bangkok Meetup 2.0
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
Microsoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needs
 
Microsoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needsMicrosoft Graph: Connect to essential data every app needs
Microsoft Graph: Connect to essential data every app needs
 
Coding against the Office Graph
Coding against the Office GraphCoding against the Office Graph
Coding against the Office Graph
 
Instrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and FlowsInstrument Data Orchestration with Globus Search and Flows
Instrument Data Orchestration with Globus Search and Flows
 
How Lyft Drives Data Discovery
How Lyft Drives Data DiscoveryHow Lyft Drives Data Discovery
How Lyft Drives Data Discovery
 
Duke at SplunkLive! Charlotte
Duke at SplunkLive! CharlotteDuke at SplunkLive! Charlotte
Duke at SplunkLive! Charlotte
 

Reliable and Efficient Facebook data processing