1 ©2013 Apigee. Confidential – All Rights Reserved.
Apps + Data + APIs
A New Data Architecture for the App Economy
Anant Jhingran, Apigee
Developer User
Digital Business Value Chain
API APP Backend Services
Internal
Partner
External
Customer
Employee
Partner
Existing
Partner
New
Digital Signals come in Three Forms in this value chain
Digital Assets
B&M
Web
Events
Entities
Context
Event Structure – generalization of “Facts” in Data Warehouse
•  /timestamp:
     {"timestamp": 134578901234,
      "payload": {
        "sending entity": UUID1,
        "receiving entity": UUID2,
        "data": {
          "field1": value1,
          …
        }
      }
     }
•  Outside the billionaire’s club, event volume is more typically 30 – 50 MM/day
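The event shape on this slide can be sketched in Python as follows. This is a minimal illustration, not Apigee's implementation: the `Event` class and its `to_json` method are hypothetical names, the UUIDs are generated only to stand in for UUID1 and UUID2, and the unit of `timestamp` is not stated on the slide, so it is treated as a plain integer.

```python
import json
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """Hypothetical event record mirroring the JSON shape on the slide."""
    timestamp: int                      # unit not given on the slide
    sending_entity: str                 # UUID of the entity that emitted the event
    receiving_entity: str               # UUID of the entity the event is directed at
    data: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        # Entity references and free-form fields are nested under "payload".
        return json.dumps({
            "timestamp": self.timestamp,
            "payload": {
                "sending entity": self.sending_entity,
                "receiving entity": self.receiving_entity,
                "data": self.data,
            },
        })


event = Event(
    timestamp=134578901234,
    sending_entity=str(uuid.uuid4()),
    receiving_entity=str(uuid.uuid4()),
    data={"field1": "value1"},
)
decoded = json.loads(event.to_json())
```

Keeping the two entity references outside `data` is what lets an event act like a warehouse "fact" that points at "dimension" rows.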
Entity Structure, generalization of “Dimensions” in Data Warehouse
•  POST/GET
•  /users
•  /developers
•  /buddies
•  /locations
•  /products
•  …
•  Typical environments, ~100,000 – 1MM entities
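As a rough illustration of entities living in collections that are written with POST and read with GET, here is a small in-memory sketch. `EntityStore` and its methods are hypothetical; no real HTTP server or Apigee API is involved, and the `uuid` key added to each entity is an assumption.

```python
import uuid
from collections import defaultdict
from typing import Any, Dict, List


class EntityStore:
    """Hypothetical in-memory stand-in for collection endpoints such as /users."""

    def __init__(self) -> None:
        # collection path -> entity id -> entity document
        self._collections: Dict[str, Dict[str, Dict[str, Any]]] = defaultdict(dict)

    def post(self, collection: str, entity: Dict[str, Any]) -> str:
        """Create an entity in a collection and return its generated id."""
        entity_id = str(uuid.uuid4())
        self._collections[collection][entity_id] = dict(entity, uuid=entity_id)
        return entity_id

    def get(self, collection: str) -> List[Dict[str, Any]]:
        """Return every entity stored in a collection."""
        return list(self._collections[collection].values())


store = EntityStore()
user_id = store.post("/users", {"name": "alice"})
store.post("/products", {"name": "coat"})
users = store.get("/users")
```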
Context
= “Secondary Entities + Events”
Time as Context
★ Time of Event
Context = Other nearby relevant and interesting events
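One way to read "time as context" is a window query around an anchor event. The sketch below assumes events are dictionaries with a numeric `timestamp`; the function name `events_near`, the window size, and the sample events are all made up for illustration. It covers only "nearby"; deciding which events are relevant or interesting is left out.

```python
from typing import Dict, List


def events_near(events: List[Dict], anchor: Dict, window: int) -> List[Dict]:
    """Return the other events whose timestamp lies within `window` of the anchor's."""
    return [
        e for e in events
        if e is not anchor and abs(e["timestamp"] - anchor["timestamp"]) <= window
    ]


# Made-up sample data; timestamps are in arbitrary units.
events = [
    {"name": "match_kickoff", "timestamp": 1000},
    {"name": "beer_purchase", "timestamp": 1030},
    {"name": "beer_purchase", "timestamp": 5000},
]
context = events_near(events, events[1], window=60)
```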
The Rugby World Cup’s Effect on Beer Consumption in AU
Context
Analysis
Location as Context
Context = Nearby, interesting, relevant locations
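"Nearby" can be made concrete with a great-circle distance filter. The following is a sketch under assumptions: locations are (latitude, longitude) pairs in degrees, the haversine formula with a mean Earth radius of 6371 km is a swapped-in technique not named in the talk, and the store names and coordinates are invented.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, using a 6371 km Earth radius."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearby(locations: Dict[str, Coord], origin: Coord, radius_km: float) -> List[str]:
    """Names of locations within `radius_km` of the origin."""
    return [name for name, coord in locations.items()
            if haversine_km(origin, coord) <= radius_km]


# Invented sample data.
origin = (37.7749, -122.4194)
stores = {"store_a": (37.7790, -122.4150), "store_b": (37.3382, -121.8863)}
close_by = nearby(stores, origin, radius_km=5.0)
```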
Where does a User fulfill her needs?
/storelocator
/product
/search
/buy
/findinstore
< 3 days
< 1 day
Context
Analysis
Related Entities as Context
Context = Complementary, supplementary and substitute entities (products, services, data)
Determining effectiveness of promotions
•  /addtocart/product/12345
•  /addtocart/product/34577
•  Context is
–  Product Categories
–  /addtocart/product/12345?category=menscoats
–  /addtocart/product/34577?category=menscoats
•  Analysis is
–  Promotion Effectiveness (within a 1 week window) grouped by product category (not product)
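The grouping described on this slide can be sketched as a count of add-to-cart calls per category inside a one-week window. Everything beyond the URL shapes is an assumption: timestamps are taken to be in seconds, `adds_by_category` is a hypothetical helper, and a raw count stands in for whatever effectiveness measure the analysis would really use.

```python
from collections import Counter
from typing import Dict, Iterable
from urllib.parse import parse_qs, urlparse

WEEK_SECONDS = 7 * 24 * 3600  # assumes timestamps are in seconds


def adds_by_category(events: Iterable[Dict], promo_start: int) -> Dict[str, int]:
    """Count add-to-cart events per product category in the week after promo_start."""
    counts: Counter = Counter()
    for e in events:
        if promo_start <= e["timestamp"] < promo_start + WEEK_SECONDS:
            query = parse_qs(urlparse(e["path"]).query)
            # The category rides along as context on the API call itself.
            category = query.get("category", ["unknown"])[0]
            counts[category] += 1
    return dict(counts)


events = [
    {"timestamp": 100, "path": "/addtocart/product/12345?category=menscoats"},
    {"timestamp": 200, "path": "/addtocart/product/34577?category=menscoats"},
    {"timestamp": WEEK_SECONDS + 500, "path": "/addtocart/product/12345?category=menscoats"},
]
result = adds_by_category(events, promo_start=0)
```

Grouping on the `category` parameter rather than the product id is what lets two different products contribute to the same promotion signal.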
Developer Activity as Context
•  Developer Activity
–  Checkins, Repos, Follows
•  Developer Profile
–  Skills, Languages, Platforms
•  Developer Network
–  Follows, Followers, Watchers
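The developer network lends itself to a graph traversal. Below is a minimal breadth-first sketch that finds which developers are reachable from one developer within a given number of "follows" hops; the adjacency-list representation, the function name `within_hops`, and the sample names are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def within_hops(follows: Dict[str, List[str]], start: str, max_hops: int) -> Set[str]:
    """Developers reachable from `start` by following at most `max_hops` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        dev, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for followed in follows.get(dev, []):
            if followed not in seen:
                seen.add(followed)
                queue.append((followed, hops + 1))
    return seen - {start}


# Invented sample network: ann follows bob, bob follows cat, cat follows dan.
follows = {"ann": ["bob"], "bob": ["cat"], "cat": ["dan"]}
reach = within_hops(follows, "ann", max_hops=2)
```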
Building the right APIs, Hackathons, SDKs for developers
Context
Analysis
Information and Use as Context
Reviews
Description
Category
Demand
User Action
(e.g. Purchase)
Context = Information leading to decisions in end user use cases
Behavior Patterns as Context (Habits)
•  User Activity on Apps establishes patterns of Behavior and Actions
•  Deviations from the behavior profile are also interesting
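A simple way to flag deviations from a behavior profile is to compare a new observation against the mean and spread of past activity. The sketch below uses a z-score test, which is an assumed technique rather than one named in the talk; the function name, the threshold of 3, and the sample counts are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def is_deviation(history: List[float], observed: float, threshold: float = 3.0) -> bool:
    """True if `observed` is more than `threshold` standard deviations from the mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A perfectly constant habit: any change at all counts as a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# Invented sample: app sessions per day over one week.
daily_sessions = [4, 5, 6, 5, 4, 6, 5]
usual_day = is_deviation(daily_sessions, 5)
unusual_day = is_deviation(daily_sessions, 20)
```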
Public Profiles and Social Activity as Context
•  Social Profile, Network and Activity describe users
•  Features like the Facebook Timeline for users’ preferences
Critical Technical Features
The Big Data System for the App Economy must understand…
DATA: Events, Entities, Context
ANALYSIS: Both “Batch” and “Real-Time”
Many things are Different
•  Half Life of Data
•  ETL
•  Data Modeling
•  Real-Time Complement
Half Life of Data
[Figure: Volume and Value plotted from NOW – 1 YEAR to NOW, comparing the App Economy with the “Old” Economy]
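The half-life idea can be made concrete with a decay function. The sketch below assumes the value of a data item decays exponentially with age, which the slide does not state; the function name and the 30-day half-life are hypothetical and chosen only for illustration.

```python
def remaining_value(initial_value: float, age_days: float, half_life_days: float) -> float:
    """Value left after `age_days`, assuming exponential decay with the given half-life."""
    return initial_value * 0.5 ** (age_days / half_life_days)


# Hypothetical numbers: a data item worth 100 units with a 30-day half-life.
fresh = remaining_value(100.0, 0, 30)
after_one_half_life = remaining_value(100.0, 30, 30)
after_two_half_lives = remaining_value(100.0, 60, 30)
```

Under this assumption, a shorter half-life means older data contributes little value even as its volume keeps growing.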
APIs displace ETL
ETL | APIs
Fed by handful of core apps | Myriad apps and services
Concise data | Verbose data
Data optimized for storage | Data optimized for consumption
Well-modeled business systems and data owned by enterprise | Disparate, dynamic data in fast-paced mobile, social apps ecosystems
Works as self-contained ‘cubes’ | Works by mixing with other APIs
The new Broad Data Platform needs some new constructs
Data sources: Enterprise Systems; External Online Data; API Data; App Data
Layer | Warehousing | Big Data | Broad Data
APIs | SQL | Map Reduce, Pig, Hive | REST, OData?
Entity and Event Model | Dimensions and Facts | Key Value | Collections, Time Series
Data Processing | Joins and Aggregations | Aggregations | Entity Resolution, Signal Amplification,…
Data Collection | ETL | Bulk Loads, Flume… | API based access
Batch must also Affect Real-Time traffic, and vice-versa
Big Data “Batch” Analysis
?
Real-Time “Gateway”
Computer Science is about Abstractions
[Figure: Abstractions vs. Flexibility, showing File System, Map/Reduce, RDBMS, and Entities, Events and Context]
Abstractions Reduce the Number of Problems that can be solved
But Significantly Improve Time to Value
One Possible Architectural Block Diagram
RDBMS, Cassandra, Hadoop
Entities and Events in the App Economy
Data Import and Access
APIs
CRUD and Analytical Libraries
•  Tailored for “data” and use cases in the App Economy
•  Built around fundamental transformations of ETL, Warehousing and Big Data
And also requires a different approach given that context can be overwhelming
Insights
Data
API Traffic
Developer Activity
Mobile App Activity
Summary
•  New Big Data Abstractions of
–  Entities
–  Events
–  Context (secondary entities and events)
•  New Data Processing Techniques
–  Determining “value” of the data
–  Data Stitching for enhancing signal to noise
•  New Analytical Techniques
–  Time Series Analysis
–  Graph Traversals
–  Real-Time Complement to Batch Analysis
•  New Approach to Data Science
Thank you.

A New Data Architecture for the App Economy - StampedeCon 2013
