A Behind the Scenes Look at the Force.com Platform


  1. A Behind the Scenes Look at the Force.com Platform. Walter Macklem, salesforce.com, CTO of Platform
  2. Safe Harbor. Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of intellectual property and other litigation, risks associated with possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our quarterly report on Form 10-Q for the most recent fiscal quarter ended July 31, 2012.
     These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.
  3. Key Takeaways: Internal & External Data Constructs; Multitenancy; Data Infrastructure
  4. Key Takeaways
     • Data Infrastructure: Building blocks of the Force.com service. Relational database, Distributed File System, Search. High Availability, Backups, and Disaster Recovery.
     • Multitenant Data Management: Platformize the raw data infrastructure to make it work for the cloud. Enable multiple customers to utilize a shared resource pool.
     • Internal Development with Data: Dogfooding. How do internal Salesforce engineers build on top of this multitenant data platform?
  5. Pod == Hardware Topology
  6. Pod
     • Self-contained set of hardware*
     • Each customer is in one pod
     • Each pod services many customers
     • Data persistence and System of Record
     • Data processing
     • Hardware mirroring
     * Exceptions being: Edge router and a few other services
  7. Pod (diagram): Salesforce users connect to Pod #1, which contains Application Servers, a Relational Database, a Distributed File System, and Search.
  8. Pod Horizontal Scalability (diagram): Pod #1, Pod #2, Pod #3, Pod #4, Pod #5, ... Pod #N. Instance labels: NA1, NA7, EU1, CS8, AP0.
  9. Data Infrastructure Building Blocks
     • Relational Database
     • Distributed File System
     • Search
  10. Relational Database
      • Sharding / Partitioning
        • 32-way
        • Shard based on customer
      • High availability
        • 8 machine database cluster
        • Automatic failover
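One way to picture customer-based sharding is a stable function from an organization ID to one of the 32 partitions. The sketch below is only an illustration under that assumption: the slide does not say how a customer is assigned to a shard, and `shard_for_org`, `group_rows_by_shard`, and the sample IDs are hypothetical names.

```python
import hashlib

NUM_SHARDS = 32  # "32-way" from the slide


def shard_for_org(org_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an organization ID to a shard number in [0, num_shards).

    Assumption: a stable hash of the org ID picks the shard. The real
    assignment scheme is not described in the talk.
    """
    digest = hashlib.sha256(org_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def group_rows_by_shard(rows):
    """Group (org_id, payload) rows so that each org's rows share one shard."""
    shards = {}
    for org_id, payload in rows:
        shards.setdefault(shard_for_org(org_id), []).append((org_id, payload))
    return shards
```

Because the shard depends only on the customer, every query for one organization can be answered from a single partition.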
  11. Relational Database
      • Backups
        • 3 lag databases
          • Near Realtime
          • 2 Hour
          • 48 Hour
        • Tape / Disk
      • Disaster Recovery
        • Hardware block-level replication
      • 6 logical copies of all bits
        • >> 6 physical copies of all bits
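A lag database is commonly understood as a copy that applies changes a fixed delay behind the primary, so a bad change can be caught before it reaches the copy. The following is a rough sketch of that idea, not a description of the actual recovery procedure: `pick_recovery_copy` and the copy names are hypothetical, and "near realtime" is modeled as zero lag.

```python
from datetime import datetime, timedelta

# Delays taken from the slide; zero lag for "near realtime" is an assumption.
LAG_COPIES = {
    "near_realtime": timedelta(0),
    "two_hour": timedelta(hours=2),
    "forty_eight_hour": timedelta(hours=48),
}


def pick_recovery_copy(bad_change_at, now):
    """Return the freshest lag copy that has not yet applied the bad change.

    A copy with lag L has applied changes up to (now - L), so it is still
    clean when now - L < bad_change_at. Returns None when every copy has
    already applied the change (fall back to tape/disk backups).
    """
    clean = [
        (lag, name)
        for name, lag in LAG_COPIES.items()
        if now - lag < bad_change_at
    ]
    if not clean:
        return None
    return min(clean)[1]  # smallest lag means the freshest clean data
```

For example, a bad change made 30 minutes ago is already in the near-realtime copy but not yet in the 2-hour copy, so the 2-hour copy is the one to recover from.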
  12. Distributed File System
      • Binary Object Store
      • Homegrown Technology called FileForce
      • Optimized for High Availability
  13. Distributed File System
      • File Handles are stored in a HA relational database
      • Block stores:
        • High density cheap machines
        • Dumb
        • RAID10
        • Deployed in “buddy” pairs
      • Buddy
        • Leader election
        • Backup
        • DR
  14. Distributed File System (diagram): File API; Coordination Service; File Handles; Small File Block Store; Block Store 1, Block Store 2, ... Block Store N.
  15. Distributed File System
      • Small files are a problem
      • Examples
        • #1. One 10MB file = 10MB
        • #2. 10 million one-byte files = 10MB
      • Stored initially in a HA database
      • Bundled with other small files into a big file
      • File handles reference an offset into the big file
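The bundling step above can be sketched as follows: many small files are concatenated into one large blob, and each file handle records where its bytes start and how long they are. This is a minimal in-memory illustration, not FileForce itself; `FileHandle`, `SmallFileBundler`, and their fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    bundle_id: int  # which big file holds the bytes
    offset: int     # where the small file starts inside the big file
    length: int     # how many bytes belong to the small file


class SmallFileBundler:
    def __init__(self):
        self._bundles = {}        # bundle_id -> bytes of the big file
        self._next_bundle_id = 0

    def bundle(self, small_files):
        """Concatenate small files into one big file; return name -> FileHandle."""
        bundle_id = self._next_bundle_id
        self._next_bundle_id += 1
        blob = bytearray()
        handles = {}
        for name, data in small_files.items():
            handles[name] = FileHandle(bundle_id, len(blob), len(data))
            blob.extend(data)
        self._bundles[bundle_id] = bytes(blob)
        return handles

    def read(self, handle):
        """Read one small file back by slicing the big file at the handle's offset."""
        blob = self._bundles[handle.bundle_id]
        return blob[handle.offset:handle.offset + handle.length]
```

With this layout the block stores only ever see a few large files, while the per-file bookkeeping stays in the handle records.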
  16. Search
      • Full-text search capability
      • Wide variety of data to support:
        • Structured data: id, email, phone number
        • Unstructured data: long documents, short chatter posts
      • Real-time indexing and querying
        • 90% of events indexed in < 3 mins
      • Lucene & Solr
  17. Next-gen Querying: Original architecture (diagram). Query tier: Java application servers, query hosts, search hosts, NFS. Indexing tier: primary indexer, secondary indexer. Storage: DB, SAN, backups.
  18. Next-gen Querying: Current architecture (diagram). Query tier: Java application servers, query hosts, search hosts, NFS. Indexing tier: primary indexer, secondary indexer. Storage: DB, SAN, backups.
  19. Query Performance: Enabled In-Memory Querying
  20. Solr Next Generation Architecture (diagram). Production and DR sites linked by replication, each with Java application servers and search/query hosts. Other labels: Backup, DB, FFX.
  21. Concludes Data Infrastructure. On to Multitenancy.
  22. Multitenancy
      • Condominium Complex = Data Infrastructure
      • Tenant = Organization (aka Company)
      • Each Organization has many sub-tenants (aka Users)
  23. How do we take a plain old relational database and make it multitenant?
  24. Multitenant Database
      • Customize standard schema: create columns
      • Add new schema: create new tables & columns
      • Scale
        • Create indexes and materialized views
        • Statistics gathering
        • Ad hoc querying with optimized query plans
  25. Multitenant Database
      • Customers have created 2 million database tables
      • Tens of millions of columns on those tables
      • Tens of billions of rows in those tables
  26. Sharing Relational Data Structures is Hard (diagram)
      • Your Definitions
      • Your Data: Dell’s Product Data; Burberry’s Clothing Data; Your Payroll Data
      • Your Optimizations
        • Indexes: pivot table for non-unique indexes
        • UniqueFields: pivot table for unique indexes
        • Relationships: pivot table for foreign keys
        • MRUIndex: pivot table for most-recently-used
        • …others…
  27. Flex Schema on Steroids: Everyone’s Data. Flex Column: Multiple Data Types
      Columns: ID, Tenant, Data 1, Data 2, ... Data N
      1000001  You       $190
      1000002  You       $250
      1000003  You       $680
      1000004  Burberry  True
      1000005  Burberry  False
      1000006  Burberry  True
      1000007  Dell      Monitor
      1000008  Dell      Laptop
      1000009  Dell      Server
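A flex column can hold currency for one tenant, booleans for another, and text for a third only if something outside the table remembers what each tenant stored there. The sketch below illustrates that idea with per-tenant metadata and untyped string storage. It is an assumption-laden toy, not the Force.com schema: `FIELD_METADATA`, `FlexTable`, the field names (`Amount`, `InStock`, `Product`), and the choice of `data_1` are all hypothetical.

```python
from decimal import Decimal

# Hypothetical per-tenant metadata: field name -> (flex column, type name).
FIELD_METADATA = {
    "You": {"Amount": ("data_1", "currency")},
    "Burberry": {"InStock": ("data_1", "boolean")},
    "Dell": {"Product": ("data_1", "text")},
}

# Converters between typed values and the untyped string kept in the flex column.
ENCODERS = {
    "currency": lambda value: str(value),
    "boolean": lambda value: "True" if value else "False",
    "text": lambda value: value,
}
DECODERS = {
    "currency": Decimal,
    "boolean": lambda stored: stored == "True",
    "text": lambda stored: stored,
}


class FlexTable:
    def __init__(self):
        self.rows = []            # one shared list of rows for every tenant
        self._next_id = 1000001

    def insert(self, tenant, record):
        """Store a typed record as strings in the tenant's assigned flex columns."""
        row = {"id": self._next_id, "tenant": tenant}
        self._next_id += 1
        for field, value in record.items():
            column, type_name = FIELD_METADATA[tenant][field]
            row[column] = ENCODERS[type_name](value)
        self.rows.append(row)
        return row["id"]

    def select(self, tenant):
        """Return only this tenant's rows, decoded back into typed records."""
        fields = FIELD_METADATA[tenant]
        result = []
        for row in self.rows:
            if row["tenant"] != tenant:
                continue
            result.append({
                field: DECODERS[type_name](row[column])
                for field, (column, type_name) in fields.items()
                if column in row
            })
        return result
```

Adding a custom field in this model is a metadata change only; no column is added to the shared table.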
  28. Flex Schema: Everyone’s Optimizations (diagram). A Multi-Tenant Table (ID, Tenant, Data) is shown beside a Multi-tenant Index with Tenant, Text, Number, and Boolean columns. The index repeats each row's tenant and value (You: $190, $250, $680; Burberry: True, False, True; Dell: Monitor, Laptop, Server) and is labeled "Redundant Storage".
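Since a flex column mixes data types across tenants, a native database index on it would not be useful, which is presumably why the slide shows values copied into a separate, typed index table. Here is a minimal sketch of such a side index, assuming a sorted list stands in for the index table; `TenantIndex` and the field names are hypothetical.

```python
import bisect


class TenantIndex:
    """A sorted side table of (tenant, field, typed value, row_id) entries.

    Values are stored redundantly here in typed form (int, str, bool),
    separate from the untyped flex column they were copied from.
    """

    def __init__(self):
        self._entries = []

    def add(self, tenant, field, value, row_id):
        # Entries for one (tenant, field) must share a value type so they sort.
        bisect.insort(self._entries, (tenant, field, value, row_id))

    def lookup(self, tenant, field, value):
        """Return row IDs whose indexed field equals value, for one tenant only."""
        start = bisect.bisect_left(self._entries, (tenant, field, value))
        row_ids = []
        for entry_tenant, entry_field, entry_value, row_id in self._entries[start:]:
            if (entry_tenant, entry_field, entry_value) != (tenant, field, value):
                break
            row_ids.append(row_id)
        return row_ids
```

Leading every key with the tenant keeps each customer's entries contiguous, so a lookup never scans another tenant's values.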
  29. Multitenant Database
      • To support Custom Objects, we use:
        • Arbitrary Transaction Support
        • Locking
        • Row caching
      • To support Custom Objects, we don’t use:
        • Native data typing
        • Native indexing
        • Foreign Key Constraints
        • Query Optimization
        • Stats Collection
  30. A Real World Question. Michael Dell wants to know if Servers are selling well in the West:
        SELECT SUM(Amount) FROM Opportunities WHERE Product = 'Servers' AND Region = 'West'
      How will we answer this question quickly?
  31. Multi-tenant Query Optimizer (diagram). Millions of Sales Line Items can be reached through Indexes (Servers, West) or through Visibility (M. Dell). The optimizer picks the fastest path to the answer.
  32. Multi-tenant Query Optimizer (flowchart)
      • Go
      • Run pre-queries
      • Check user visibility (= # of rows that the user can access)
      • Check filter selectivity (= How specific is this filter?)
      • Write query based on results of pre-queries
      • Execute query
      • Stop
      Data sources shown: Shared Visibility Indexes; User Visibility; Filter Selectivity; Multi-tenant Optimizer Statistics.
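The flowchart can be read as a cost comparison: the pre-queries estimate how many rows each access path would touch, and the final query is driven from the smallest set. The sketch below is one hedged interpretation of that decision, with hypothetical names (`choose_driving_path`, the `index:` prefix, `tenant_scan`); the real optimizer's rules are not given in the slides. It assumes `total_rows` is greater than zero.

```python
def choose_driving_path(total_rows, visible_rows, filter_counts):
    """Pick the access path expected to touch the fewest rows.

    total_rows:    rows the tenant has in the object (assumed > 0)
    visible_rows:  pre-query result, rows this user is allowed to see
    filter_counts: pre-query results, {filter name: rows matching that filter}

    Returns (path name, selectivity), where selectivity is the fraction of
    the tenant's rows the path touches; lower means more selective.
    """
    candidates = {"user_visibility": visible_rows}
    candidates.update(
        {f"index:{name}": count for name, count in filter_counts.items()}
    )
    # Fall back to scanning all of the tenant's rows if nothing narrows the search.
    candidates["tenant_scan"] = total_rows
    path = min(candidates, key=candidates.get)
    return path, candidates[path] / total_rows
```

In the Michael Dell example, a user who can see nearly every row gains nothing from the visibility path, so a filter such as Product or Region would drive the query instead.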
  33. The Machine is Alive!!!!
      • Automatic creation of indexes
        • Watches queries, logs certain behaviors, selects potential candidates, tests and ranks best candidates, and then builds indexes for candidates
      • Runtime predictor for long-running queries
        • Factors in selectivity, cardinality, # of joins, presence of indexes, current server conditions
        • Machine Learning via Decision Forest
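The "watch, select, rank" part of automatic index creation might look roughly like the following. This is a simplified sketch under stated assumptions: the log format, the `min_hits` and `top_n` thresholds, and ranking by total query time are all hypothetical, and the testing and building steps from the slide are left out.

```python
from collections import Counter


def rank_index_candidates(query_log, existing_indexes, min_hits=3, top_n=2):
    """Rank unindexed (tenant, field) pairs by the query time they account for.

    query_log: iterable of dicts with keys "tenant", "filter_fields", "seconds"
    existing_indexes: set of (tenant, field) pairs that are already indexed
    """
    hits = Counter()
    cost = Counter()
    for query in query_log:
        for field in query["filter_fields"]:
            key = (query["tenant"], field)
            if key in existing_indexes:
                continue  # already indexed, not a candidate
            hits[key] += 1
            cost[key] += query["seconds"]
    # Keep fields that are filtered on often enough to be worth an index.
    candidates = [key for key, count in hits.items() if count >= min_hits]
    # Most expensive candidates first.
    candidates.sort(key=lambda key: cost[key], reverse=True)
    return candidates[:top_n]
```

A field filtered on once in a slow query is ignored here, while a field that shows up repeatedly rises to the top of the list.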
  34. How can we make internal salesforce developers more efficient?
  35. We want to create the Quote standard object
      • DDL scripts
      • Hand-coded SQL
      • ORM
      • Sharing
      • Workflow
      • Apex
      • Visualforce
      • Validation Rules
      • API
  36. Why can Force.com developers create a Custom Object in about 30 secs, but it takes me 30 days?
  37. Our Solution for the Quote Standard Object: Base Platform Objects (BPOs). Exactly the same as Custom Objects, but exposed to internal salesforce.com developers. Schema defined in XML.
  38. Our Solution for the Quote Standard Object. Customers are getting the benefit of it! Zero downtime for major releases and the shrinkage of maintenance windows.
  39. Conclusion - Key Takeaways
      • Data Infrastructure: Building blocks of the Force.com service. Relational database, Distributed File System, Search. High Availability, Backups, and Disaster Recovery.
      • Multitenant Data Management: Platformize the raw data infrastructure to make it work for the cloud. Enable multiple customers to utilize a shared resource pool.
      • Internal Development with Data: Dogfooding. How do internal Salesforce engineers build on top of this multitenant data platform?
