Small Data Sets to Scale:
  Planning for the Evolution of Data

      Poornima Vijayashanker
     CEO & Founder BizeeBee

P. Vijayashanker From Small Datasets to Scale: Planning for the Evolution of Data Social Developer Summit


Published on

Social Developer Summit

Published in: Technology, Business

    1. Small Data Sets to Scale: Planning for the Evolution of Data
       Poornima Vijayashanker
       CEO & Founder BizeeBee
       poornima@bizeebee.com
       @poornima
       www.femgineer.com
    2. AGENDA
       I. Stealth Mode - “pre-data” phase
       II. Launch
       III. Compute Growth Rate
       IV. Optimizations
       V. Data Storage
    3. Pre-Data
       Stealth Mode - “pre-data” phase
       Small initial data set
         Easy storage
         Storage solutions like Heroku, RackSpace
       Design features around it
         Simplicity of Storage v. Complexity of Design
         e.g. Mint - 3 months of financial data, FB - social graph is limited to universities
    4. 0 to 100k to 1M
       0 - 100k easiest schema design
         Single DB - with user & static data
         Single instance of app accessing the db
       100k - 1M+ time to re-design db and app
         Break up databases - user & static
         Multiple instances of the app
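
The "break up databases - user & static" step on slide 4 could be sketched roughly as below. This is only an illustration under stated assumptions: the table names, the `TABLE_TO_DB` mapping, and the `connection_for` helper are hypothetical, and in-memory SQLite stands in for whatever database servers an app would really use.

```python
import sqlite3

# Assumption: two separate databases, one for user data and one for static data.
# In-memory SQLite connections stand in for real database servers.
DATABASES = {
    "user": sqlite3.connect(":memory:"),
    "static": sqlite3.connect(":memory:"),
}

# Hypothetical mapping from table name to the database that owns it.
TABLE_TO_DB = {
    "users": "user",
    "transactions": "user",
    "categories": "static",
}


def connection_for(table: str) -> sqlite3.Connection:
    """Return the connection to the database that owns `table`."""
    return DATABASES[TABLE_TO_DB[table]]


# Each table is created in, and queried from, its own database.
connection_for("users").execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"
)
connection_for("categories").execute(
    "CREATE TABLE categories (id INTEGER PRIMARY KEY, label TEXT)"
)
connection_for("users").execute("INSERT INTO users (name) VALUES ('alice')")
connection_for("categories").execute(
    "INSERT INTO categories (label) VALUES ('groceries')"
)
```

Because every app instance goes through the same lookup, adding more instances of the app does not change where the data lives.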
    5. Growth Rate
       What is your user growth rate?
         Basic unit e.g. Mint - transaction
         User generated content
         Size of unit e.g. FB - photo
       Storage capacity v. Seek v. Size
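
One way to turn the questions on slide 5 into a number is a simple projection: users, times units per user, times size per unit, with the user count compounding each month. The function and all of its parameters are assumptions made for illustration; the slides do not give a formula.

```python
def project_storage_bytes(
    users: float,
    monthly_growth: float,
    units_per_user_per_month: float,
    unit_size_bytes: float,
    months: int,
) -> float:
    """Project cumulative storage, assuming user count compounds monthly.

    `monthly_growth` is a fraction, e.g. 0.10 for 10% more users each month.
    Every user is assumed to add the same number of equally sized units
    (transactions, photos, ...) per month, which is a deliberate simplification.
    """
    total = 0.0
    for _ in range(months):
        # Data added this month by the current user base.
        total += users * units_per_user_per_month * unit_size_bytes
        # User base grows before the next month starts.
        users *= 1 + monthly_growth
    return total


# Hypothetical inputs: 1,000 users, 10% monthly growth,
# 30 units per user per month at 200 bytes each, over one year.
one_year = project_storage_bytes(1_000, 0.10, 30, 200, 12)
```

Rerunning the projection with measured values each month shows how far the real growth rate has drifted from the plan.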
    6. Optimizations
       Capacity - throw hardware
       Seek - throw software
         Cache data
       Size - design around it
         Limit usage size e.g. 4MB picture
    7. Optimizations Cont’d
       Code Level
         Processes - Computation v. Retrieval
         DB Techniques - Index, De-Normalize
       Data Level
         Partitioning: Siloed v. Interconnected
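
As a rough illustration of two items on slide 7, the sketch below adds an index to a lookup column and assigns siloed per-user data to a partition. The schema, the index name, and the `shard_for` helper are hypothetical. Reading "siloed" as "each user's data is independent of other users' data" is an interpretation, not something the slide spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER, amount REAL)")

# Index: lets lookups by user_id avoid scanning the whole table.
conn.execute("CREATE INDEX idx_transactions_user ON transactions (user_id)")

# Ask SQLite how it would run the query; the last column of each row
# is a human-readable description of the plan step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT amount FROM transactions WHERE user_id = ?",
    (1,),
).fetchall()
uses_index = any("idx_transactions_user" in row[-1] for row in plan)


def shard_for(user_id: int, shard_count: int) -> int:
    """Partitioning for siloed data: pick a partition from the user id alone.

    This works when one user's rows never need to be joined with another
    user's rows. Interconnected data (a social graph, say) does not split
    this cleanly, because related rows can land on different partitions.
    """
    return user_id % shard_count
```

Modulo partitioning is the simplest possible scheme; changing `shard_count` later moves most users to a different partition, so real systems often pick something more stable.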
    8. Data Storage
       Single User’s Data v. Aggregated Data
         Single user’s data v. data aggregated across users
         e.g. Mint - Spending Trends
         Scheme to compute, store, and retrieve aggregated data
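
A "scheme to compute, store, and retrieve aggregated data" could take many forms. The sketch below shows one simple possibility: a batch step that computes cross-user aggregates and stores them in their own table, so reads never touch the per-user rows. The schema and the `refresh_trends` and `trend_for` functions are hypothetical and are not taken from Mint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transactions (user_id INTEGER, category TEXT, amount REAL);
    CREATE TABLE spending_trends (
        category   TEXT PRIMARY KEY,
        avg_amount REAL,
        user_count INTEGER
    );
    """
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, "groceries", 40.0), (2, "groceries", 60.0), (1, "travel", 200.0)],
)


def refresh_trends(conn: sqlite3.Connection) -> None:
    """Compute and store: rebuild the aggregate table from per-user rows."""
    conn.execute("DELETE FROM spending_trends")
    conn.execute(
        """
        INSERT INTO spending_trends (category, avg_amount, user_count)
        SELECT category, AVG(amount), COUNT(DISTINCT user_id)
        FROM transactions
        GROUP BY category
        """
    )
    conn.commit()


def trend_for(conn: sqlite3.Connection, category: str):
    """Retrieve: read the precomputed aggregate, or None if there is none."""
    return conn.execute(
        "SELECT avg_amount, user_count FROM spending_trends WHERE category = ?",
        (category,),
    ).fetchone()


refresh_trends(conn)
```

Running `refresh_trends` on a schedule trades freshness for cheap reads, which tends to suit trend-style data that does not need to be exact to the second.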
    9. Conclusion
         Start small - provide enough value to user
         Monitor & project growth rate of data
         Break data apart
         Simple optimizations - indexing, de-normalizing, caching
         Large data sets - warehousing, partitioning db
         Hiring designer & engineer for BizeeBee :)