
P. Vijayashanker From Small Datasets to Scale: Planning for the Evolution of Data Social Developer Summit

431 views

Published on

Social Developer Summit

Published in: Technology, Business

  1. Small Data Sets to Scale: Planning for the Evolution of Data
     Poornima Vijayashanker
     CEO & Founder BizeeBee
     poornima@bizeebee.com
     @poornima
     www.femgineer.com
  2. AGENDA
     I. Stealth Mode - “pre-data” phase
     II. Launch
     III. Compute Growth Rate
     IV. Optimizations
     V. Data Storage
  3. Pre-Data
     Stealth Mode - “pre-data” phase
     Small initial data set
       Easy storage
       Storage solutions like Heroku, RackSpace
     Design features around it
       Simplicity of Storage v. Complexity of Design
       e.g. Mint - 3 months of financial data, FB - social graph is limited to universities
  4. 0 to 100k to 1M
     0 - 100k easiest schema design
       Single DB - with user & static data
       Single instance of app accessing the db
     100k - 1M+ time to re-design db and app
       Break up databases - user & static
       Multiple instances of the app
  5. Growth Rate
     What is your user growth rate?
       Basic unit e.g. Mint - transaction
       User generated content
       Size of unit e.g. FB - photo
     Storage capacity v. Seek v. Size
  6. Optimizations
     Capacity - throw hardware
     Seek - throw software
       Cache data
     Size - design around it
       Limit usage size e.g. 4MB picture
  7. Optimizations Cont’d
     Code Level
       Processes - Computation v. Retrieval
       DB Techniques - Index, De-Normalize
     Data Level
       Partitioning: Siloed v. Interconnected
  8. Data Storage
     Single User’s Data v. Aggregated Data
       Single user’s data v. data aggregated across users
       e.g. Mint - Spending Trends
     Scheme to compute, store, and retrieve aggregated data
  9. Conclusion
       Start small - provide enough value to user
       Monitor & project growth rate of data
       Break data apart
       Simple optimizations - indexing, de-normalizing, caching
       Large data sets - warehousing, partitioning db
       Hiring designer & engineer for BizeeBee :)
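
The Growth Rate slide asks for the user growth rate, the basic unit, and the size of that unit, and the Conclusion advises monitoring and projecting the growth rate of data. The Python sketch below is one rough way to combine those three inputs into a storage projection. It is not from the talk: the class and function names, the compound monthly growth model, and every number in the usage note are hypothetical assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GrowthAssumptions:
    """Hypothetical inputs; none of these values come from the slides."""
    initial_users: int           # users at launch
    monthly_growth: float        # fractional user growth per month (0.20 = 20%)
    units_per_user_month: float  # basic units (e.g. transactions) added per user per month
    bytes_per_unit: int          # average stored size of one unit


def project_storage(a: GrowthAssumptions, months: int) -> list[tuple[int, int, int]]:
    """Return (month, users, cumulative_bytes) for months 1..months.

    Assumes simple compound user growth and that every user adds the
    same number of units each month.
    """
    rows = []
    users = float(a.initial_users)
    total_bytes = 0.0
    for month in range(1, months + 1):
        # Every current user contributes this month's units.
        total_bytes += users * a.units_per_user_month * a.bytes_per_unit
        rows.append((month, round(users), round(total_bytes)))
        # Compound the user base for the following month.
        users *= 1 + a.monthly_growth
    return rows


def months_until(a: GrowthAssumptions, user_threshold: int, max_months: int = 120):
    """First month in which the user count reaches user_threshold, else None."""
    for month, users, _ in project_storage(a, max_months):
        if users >= user_threshold:
            return month
    return None
```

For example, `months_until(GrowthAssumptions(1000, 1.0, 10, 100), 100_000)` estimates when a hypothetical user base that doubles monthly would cross the 100k mark that slide 4 gives as the point to re-design the db and app. Real growth is rarely this regular, so the output is a planning aid rather than a forecast.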
