2. What are we building with MongoDB?
SecureDocs
What is it?
Secure e-briefcase for GS employees
Access from mobile and traditional clients
What tech backs it?
MongoDB 2.2 and Apache Tomcat 7
Hardware load balancing
Why Mongo?
Completely user-driven tagging structure
Out-of-the-box HA
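A user-driven tagging structure maps naturally onto a document that carries an array of tags. The sketch below uses plain Python dictionaries to stand in for stored documents. The field names (`owner`, `filename`, `tags`) and the helper `find_by_tags` are hypothetical; the slides do not describe the actual SecureDocs schema.

```python
# Hypothetical document shapes; the real SecureDocs schema is not given in the slides.
documents = [
    {"owner": "alice", "filename": "q2-plan.pdf", "tags": ["planning", "2013", "draft"]},
    {"owner": "alice", "filename": "offsite.docx", "tags": ["travel"]},
    {"owner": "bob", "filename": "q2-review.pdf", "tags": ["planning", "final"]},
]


def find_by_tags(docs, owner, required_tags):
    """Return filenames owned by `owner` that carry every tag in `required_tags`."""
    wanted = set(required_tags)
    return [
        d["filename"]
        for d in docs
        if d["owner"] == owner and wanted.issubset(d["tags"])
    ]


# Each user invents tags freely; adding a new tag needs no schema change.
documents[1]["tags"].append("planning")
print(find_by_tags(documents, "alice", ["planning"]))
```

In MongoDB itself, a comparable filter would be a query on the array field, for example with the `$all` operator.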
June 21, 2013 MongoNYC
3. What are we building with MongoDB?
Social PipeLine
What is it?
Internal social platform for quick information sharing
Real time analytics platform for external social trends
What tech backs it?
MongoDB 2.2, Apache Kafka, Solr and Apache Tomcat 7
Commodity hardware on all layers
Why Mongo?
Highly unstructured data across all possible social sources
Sharding and performance
4. Why MongoDB?
Scale out
For Performance and Size
Global Availability and Resiliency
Statement-Level Transaction and Consistency Semantics
Strong Consistency Where Needed
Relaxed Consistency Where Possible
Easy to Use
Powerful APIs
No ORM required
10gen
5. Why MongoDB?
“Sweet-spot” Between Filesystems and Relational Databases
Security Model
Primary Keys and Secondary Indexes
Replication and Sharding
Highly Structured – but Not Enforced
RDBMS
7. DaaS in a Private Cloud: Motivations
Facilitate Scale out
For Performance and Size
Global Availability and Resiliency
Rapid Deployment + Development
Efficiencies and Economies of Scale
“Late Affinity” of purpose
Platform
Version
Infrastructure Agility
Spare hardware
On-boarding pipeline
Supply-side Inventory Management
Keep the platform “easy to use”
8. DaaS in a Private Cloud: Challenges
Building for unknown use cases
Defining “shapes”
Database platform specific hardware pools
Virtualization + Shared tenancy
Performance and scale considerations
SSD Storage
Security and Controls
Integrated into on-boarding pipeline
Audit
Backups and Archive
Off-host / large footprint
Sensitive Data and Masking
Inventory Management
Location aware for geographic resiliency
9. DaaS in a Private Cloud: Challenges
#1 Challenge
CPU :: Memory :: Storage :: Price
Moving to “cloud” means limiting choice on these ratios
Scale out for storage
May over-allocate compute
Scale out for compute
May over-allocate storage
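The over-allocation problem can be worked through with simple arithmetic. The sketch below assumes a single fixed node shape; the numbers in `NODE` and in the example demand are hypothetical and do not come from the slides.

```python
import math

# Hypothetical fixed node shape offered by the private cloud (not from the slides).
NODE = {"cpu_cores": 4, "memory_gb": 32, "storage_gb": 500}


def nodes_needed(demand, node=NODE):
    """Smallest node count that satisfies every resource dimension."""
    return max(math.ceil(demand[k] / node[k]) for k in node)


def over_allocation(demand, node=NODE):
    """Resources provisioned beyond demand once the cluster is sized."""
    n = nodes_needed(demand, node)
    return {k: n * node[k] - demand[k] for k in node}


# A storage-heavy application: storage alone drives the node count.
storage_heavy = {"cpu_cores": 4, "memory_gb": 32, "storage_gb": 4000}
print(nodes_needed(storage_heavy))     # 8 nodes
print(over_allocation(storage_heavy))  # 28 cores and 224 GB of memory are surplus
```

Scaling out for storage here buys seven nodes' worth of compute that the application never asked for; a compute-heavy demand would show the mirror-image waste in storage.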
10. Onboarding MongoDB @ Goldman Sachs
Before: MongoDB Cluster Topologies Not Standardized
DevOps Model, Informal User Groups
Informal 10gen Engagement
Various Versions of MongoDB
After: Private Cloud Service w/ Standardized Topologies
Fully Onboarded and Supported Database Platform
Formalized 10gen Relationship (via Database Group)
Standardize on MongoDB Enterprise Edition
11. Engineering MongoDB for Private Cloud
Supply Flow
Provision Virtual Machine
Register Node as Available
Nodes are NOT configured for specific cluster
Demand Flow
User Orders Cluster Based on Primary & Resiliency Region
Reserve Nodes from Available Inventory
Perform operations to give node “Personality”
Configuration
Seed First Node, Expand With Others
Build Based on Inventory
Delivery of Cluster to Requestor
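The supply and demand flows above can be sketched as a small inventory model. This is an illustration under stated assumptions, not the presenters' tooling: the class and function names, the region labels, and the split of two nodes in the primary region plus one in the resiliency region are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    region: str
    cluster: Optional[str] = None  # None until a "personality" is assigned


@dataclass
class Inventory:
    available: List[Node] = field(default_factory=list)

    def register(self, node: Node) -> None:
        """Supply flow: a freshly provisioned VM joins the pool unconfigured."""
        self.available.append(node)

    def reserve(self, region: str, count: int) -> List[Node]:
        """Demand flow: take `count` available nodes from `region`."""
        matches = [n for n in self.available if n.region == region][:count]
        if len(matches) < count:
            raise LookupError(f"not enough available nodes in {region}")
        for n in matches:
            self.available.remove(n)
        return matches


def build_cluster(inv, name, primary_region, resiliency_region):
    """Reserve nodes, assign the cluster personality, and describe the result."""
    primary_nodes = inv.reserve(primary_region, 2)  # assumed 2 + 1 layout
    try:
        remote_nodes = inv.reserve(resiliency_region, 1)
    except LookupError:
        # Return the nodes already taken so the pool stays consistent.
        for n in primary_nodes:
            inv.register(n)
        raise
    members = primary_nodes + remote_nodes
    for n in members:
        n.cluster = name  # the node's "personality"
    # Seed with the first node, then expand with the others.
    return {"name": name, "seed": members[0].name, "members": [n.name for n in members]}


inv = Inventory()
for i, region in enumerate(["nyc", "nyc", "nyc", "ldn"]):
    inv.register(Node(name=f"vm{i}", region=region))
cluster = build_cluster(inv, "orders", "nyc", "ldn")
print(cluster)
```

Keeping nodes generic until they are reserved is what lets one pool of virtual machines serve any cluster order.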
12. MongoDB for Private Cloud
Topology
Required global topology for out of region resiliency (min = 3 nodes)
Each cluster is considered a “building block” for larger sharded clusters
MongoC and MongoS co-located with MongoD
Sharding
Teams encouraged to consider Shard Key even if no sharding plans
Sharding is the only supported way to grow (fixed internal storage)
Provided with a single Shard by default
Monitoring & Self Service
Custom Monitoring Stack
Ordering Automation and Developer Self-Service
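One reason to think about a shard key early is that its cardinality limits how evenly data can spread. The simulation below is a simplified stand-in: it assigns documents to shards with a stable hash, whereas MongoDB distributes documents by shard-key ranges (chunks). The field names and the document set are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key_value, shard_count):
    """Map a shard-key value to a shard index with a stable hash."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribution(docs, key_field, shard_count):
    """Count how many documents land on each shard for a candidate key."""
    return Counter(shard_for(d[key_field], shard_count) for d in docs)


# 10,000 hypothetical documents: user_id is unique, status has two values.
docs = [{"user_id": i, "status": "open" if i % 10 else "closed"} for i in range(10_000)]

by_user = distribution(docs, "user_id", 4)
by_status = distribution(docs, "status", 4)
print(sorted(by_user.items()))    # high-cardinality key: all four shards used
print(sorted(by_status.items()))  # two-value key: at most two shards used
```

A key with only two distinct values can never occupy more than two shards, no matter how many shards are added later.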
13. MongoDB for Private Cloud
Backup
Periodic backups to object storage
Working toward Point in Time Recovery (PITR)
Security
Kerberos ticket based authentication required
Authorization policies will continue to mature
Support
Database team supports the service offering only
Use cases that utilize “Sensitive Data” are not yet supported
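For the Kerberos requirement, MongoDB drivers select Kerberos through the `authMechanism=GSSAPI` connection-string option. The sketch below only builds such a connection string; the principal, host names, and helper name are hypothetical, and whether the deployment described in the slides used this exact form is an assumption. A valid Kerberos ticket must already exist on the client for a driver to use it.

```python
from urllib.parse import quote


def kerberos_uri(principal, hosts, replica_set=None):
    """Build a MongoDB connection URI that requests GSSAPI (Kerberos) authentication."""
    user = quote(principal, safe="")  # '@' in the principal must be percent-encoded
    options = ["authMechanism=GSSAPI"]
    if replica_set:
        options.append(f"replicaSet={replica_set}")
    return f"mongodb://{user}@{','.join(hosts)}/?{'&'.join(options)}"


# Hypothetical principal and host.
uri = kerberos_uri("appuser@EXAMPLE.COM", ["db1.example.com:27017"], "rs0")
print(uri)
```

No password appears in the connection string, because the ticket cache supplies the credentials.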
14. Private Cloud Challenges
Unlike Public Cloud
• We don’t profit from under utilization
• Our incentives are different
• This dramatically affects our scale out approach
One Size Fits All
• Sharding is primary strategy for growing
• Both small and large apps waste resources
• CPU/Memory vs. Storage
MongoDB Cloud Goals
• Utilize available resources efficiently
• Maintain customization expected by users
• Maintain ease of use
Ideal Shape Differs by App
• Small impedance mismatch preferred if it enables scale
• Evaluate more shapes if mismatch is egregious
Take a < $10,000 Machine, Split it 1, 2, 4, or 8 Ways, and Build a MongoDB Service...
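Splitting one machine 1, 2, 4, or 8 ways yields a small menu of shapes, and an application takes the smallest shape that covers all of its needs. The sketch below illustrates the resulting mismatch; the machine specification in `MACHINE` and the example demand are hypothetical, since the slide states only a price ceiling.

```python
# Hypothetical whole-machine specification; the slide gives only a price ceiling.
MACHINE = {"cpu_cores": 16, "memory_gb": 128, "storage_gb": 2000}


def shapes(machine=MACHINE, splits=(1, 2, 4, 8)):
    """Divide one machine evenly by each split count to get the offered shapes."""
    return {s: {k: v // s for k, v in machine.items()} for s in splits}


def best_shape(demand, offered):
    """Pick the largest split (smallest shape) that still covers the demand."""
    fitting = [
        s for s, shape in offered.items()
        if all(shape[k] >= demand[k] for k in demand)
    ]
    return max(fitting) if fitting else None


offered = shapes()
small_app = {"cpu_cores": 2, "memory_gb": 8, "storage_gb": 400}
print(best_shape(small_app, offered))  # 4: storage rules out the 8-way split
```

Here the application's CPU and memory would fit the 8-way shape, but its storage forces the 4-way shape, so half of that shape's cores and most of its memory go unused.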
15. Looking Forward: Cloud-Oriented Feature Requests
• Better Security Models: Multi-tenancy on shared data repositories
• Enhanced Multi Master: Addresses a broader array of use cases
• Compression: Increase utilization of fixed storage
• Off-host Backups: Object Storage
• Better Shard Sizing: Address Shape Mismatches?
• Named Clusters: Introduce more “Named Resource” concepts