How Government Agencies are Using MongoDB to Build Data as a Service Solutions
1. #MongoDB
Data as a Service: How Government Agencies are Using MongoDB to Build DaaS Solutions
Dave Erickson
david.erickson@mongodb.com
Senior Solutions Architect, MongoDB
2. MongoDB
The Modern Operational Database
General Purpose
Open-Source
Real-time
Document Oriented
3. The MongoDB Company
400+ employees
1000+ customers
13 offices around the world
Over $230 million in funding
6. MongoDB Use Cases
Big Data
Product & Asset Catalogs
Security & Fraud
Internet of Things
Database-as-a-Service
Mobile Apps
Customer Data Management
Top Investment and Retail Banks
Single View
Social & Collaboration
Content Management
Intelligence Agencies
Top Global Shipping Company
Top Industrial Equipment Manufacturer
Top Media Company
Top Investment and Retail Banks
8. Fully Featured
MongoDB
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: [45.123, 47.232],
  cars: [
    { model: 'Bentley',
      year: 1973,
      value: 100000, … },
    { model: 'Rolls Royce',
      year: 1965,
      value: 330000, … }
  ]
}
Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of Paul’s car collection
Native Indexes
• Secondary
• Compound
• Geospatial
• Full Text
• Hash
• Covering
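The queries listed on this slide can be sketched as plain Python dictionaries of the kind a driver such as pymongo accepts for `find`, `aggregate`, and `create_index`. This is a minimal sketch, not the presenter's code: the Trafalgar Square coordinates, the `cars.description` field, and all variable names are assumptions made for illustration, and the geospatial and text filters only work against a server once the matching indexes exist. No database connection is made here; two small pure-Python helpers check the sample document instead.

```python
# Sample document from the slide (the elided "…" fields are left out).
owner = {
    "first_name": "Paul",
    "surname": "Miller",
    "city": "London",
    "location": [45.123, 47.232],
    "cars": [
        {"model": "Bentley", "year": 1973, "value": 100000},
        {"model": "Rolls Royce", "year": 1965, "value": 330000},
    ],
}

# Rich queries: find Paul's cars (filter plus projection).
pauls_cars_filter = {"first_name": "Paul"}
pauls_cars_projection = {"cars": 1, "_id": 0}

# Rich queries: everybody in London with a car built between 1970 and 1980.
# $elemMatch requires a single array element to satisfy both bounds.
london_1970s_filter = {
    "city": "London",
    "cars": {"$elemMatch": {"year": {"$gte": 1970, "$lte": 1980}}},
}

# Geospatial: owners within 5 km of Trafalgar Sq.
# Assumed coordinates, in GeoJSON [longitude, latitude] order.
TRAFALGAR_SQ = [-0.128, 51.508]
near_filter = {
    "location": {
        "$nearSphere": {
            "$geometry": {"type": "Point", "coordinates": TRAFALGAR_SQ},
            "$maxDistance": 5000,  # metres
        }
    }
}

# Text search: the exact phrase "leather seats" (needs a text index).
leather_filter = {"$text": {"$search": "\"leather seats\""}}

# Aggregation: average value of Paul's car collection.
avg_value_pipeline = [
    {"$match": {"first_name": "Paul"}},
    {"$unwind": "$cars"},
    {"$group": {"_id": "$first_name", "avg_value": {"$avg": "$cars.value"}}},
]

# Index key specifications that would support the filters above.
# "cars.description" is a hypothetical field; the slide does not show it.
geo_index_keys = [("location", "2dsphere")]
text_index_keys = [("cars.description", "text")]
compound_index_keys = [("city", 1), ("cars.year", 1)]


def average_car_value(doc):
    """Pure-Python equivalent of the aggregation for a single document."""
    values = [car["value"] for car in doc["cars"]]
    return sum(values) / len(values)


def has_car_built_between(doc, start, end):
    """Pure-Python equivalent of the $elemMatch year-range test."""
    return any(start <= car["year"] <= end for car in doc["cars"])
```

With a live collection, the filters would be passed along the lines of `collection.find(london_1970s_filter)` or `collection.aggregate(avg_value_pipeline)`. For the sample document, `average_car_value(owner)` gives 215000.0.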
14. “On the one hand information wants to be
expensive, because it's so valuable. The right
information in the right place just changes your
life.
On the other hand, information wants to be free,
because the cost of getting it out is getting
lower and lower all the time. So you have these
two fighting against each other.”
Commonly attributed to a comment by Stewart Brand to Steve Wozniak at the first Hackers Conference in 1984
20. Innovation means EVOLUTION
• Applications coupled to Data Islands
• Batch Data Warehouses
• Service Oriented
– Exposing remote procedure calls via Web technologies
• Resource Oriented
– Web addressable content and data
• Data Lakes
• X-as-a-Service
THESE ARE TOOLS FOR EXECUTING MISSION!
Not necessarily aligned with objectives
21. Data as a Service
• Data Centric Approach
– Data as the focus and product of your business
– Data at the center of your architecture
– Decouple infrastructure from schema and format
– No Apps?, Multiple Apps?, 1 Killer App?
22. Data as a Service
• Combines strengths of recent technology trends
– Enterprise controls of Service Oriented
– Ease of consumption of Resource Oriented
– Elastic Scale of Cloud / Virtual Computing
– Platform Independence, Non Proprietary Data Format
• Reusable / Transferable Infrastructure enables Repeatable Success
24. Use Cases
• VA VLER
– Data as a central service
• CFPB
– Open Data Initiatives
• FCC
– Crowd Sourced Mobile Speed Tests
• City of Chicago
– Predictive analytics improve city services
25. Veterans Affairs VLER
• Virtual Lifetime Electronic Record
• Challenge
– Growing and evolving cyber threats
– Transformation of the healthcare industry
– Increasing pressure on federal budgets
– Greater number of Veterans receiving and using Benefits
• Multidirectional data sharing
– Within the VA & with other Agencies (DoD)
– Healthcare and Benefit providers
– Veterans and other beneficiaries
http://www.va.gov/vler/
27. Veterans Affairs VLER
• eCRUD Service
– Efficiency, Security, Agility
– Abstract data from platform
– Schema Agnostic
– Enable Rapid Application Development
– No Server Side Coding
• Extensive Security Engineering
– Best practices in use of MongoDB security features
– Hardening guides
– http://www.mongodb.com/lp/contact/stig-requests
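The slide gives no implementation detail for the eCRUD service, so the following is only a hypothetical illustration of what "schema agnostic" CRUD with no per-application server code can mean: one generic store that accepts any document shape in any named collection. The class name `DocumentStore`, its methods, and the in-memory storage are assumptions made for this sketch and do not describe the VA system.

```python
import copy
import uuid


class DocumentStore:
    """Hypothetical in-memory, schema-agnostic CRUD store (illustration only)."""

    def __init__(self):
        # collection name -> {document id -> document}
        self._collections = {}

    def create(self, collection, document):
        """Store any dict-shaped document and return its generated id."""
        doc_id = str(uuid.uuid4())
        self._collections.setdefault(collection, {})[doc_id] = copy.deepcopy(document)
        return doc_id

    def read(self, collection, doc_id):
        """Return a copy of the document, or None if it does not exist."""
        doc = self._collections.get(collection, {}).get(doc_id)
        return copy.deepcopy(doc)

    def update(self, collection, doc_id, changes):
        """Merge new or changed top-level fields; raises KeyError if missing."""
        self._collections[collection][doc_id].update(copy.deepcopy(changes))

    def delete(self, collection, doc_id):
        """Remove the document; return True if something was deleted."""
        return self._collections.get(collection, {}).pop(doc_id, None) is not None


store = DocumentStore()
record_id = store.create("benefits", {"veteran": "example", "status": "open"})
# A later client can add fields the first client never declared.
store.update("benefits", record_id, {"reviewer": {"office": "regional"}})
```

Because no schema is declared up front, a new application can start writing differently shaped documents without a change to the service itself, which is the property the slide's "Enable Rapid Application Development" bullet points at.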
28. Consumer Financial Protection Bureau
• Protect Consumers
– Making data about consumer banking, credit cards, consumer financial products available
• CFPB open data platform: qu
– Written in Clojure
– MongoDB backend
– https://github.com/cfpb/qu
31. FCC - Mobile Broadband Speed Test
• Challenge
– Government is used to releasing data by paper report.
– Spectrum Sensor data is massive and complicated
– Consumers want to know their mobile and wired broadband options
• Mobile Broadband Speed Test
– Crowd source data collection
– Make data returned relevant to consumer
32. We can do better than this (2011)
http://www.wired.com/2011/01/verizon-or-att-iphone
35. MongoDB DC
Washington DC, October 14
http://www.mongodb.com/events/mongodb-dc-2014
#MongoDBWorld
A day of workshops, sessions, hands-on tips, Commercial and Government speakers, and great ideas.
36. MongoDB DaaS Quickstart
• DaaS platform built on MongoDB
– Open data for the public
– Serve data to the enterprise
• MongoDB Enterprise on up to 3 nodes for HA
• Quickstart solution deployed by our consultants
• Solution & architecture documentation
• 1 day of solution review and training
• http://bit.ly/1qXcYNA
Today we want to talk through a trend we are seeing in how government agencies are starting to treat data sets within their own organizations. It is a slight departure from past years, but it is really an evolution of several concepts and trends that have been developing in enterprise IT for a while now.
MongoDB is the Modern Operational Database
First, the open source model provides transparency and flexibility while driving cost out.
Like the traditional relational database, MongoDB is general purpose and has seen widespread adoption across domains and problem spaces.
Unlike the traditional database, MongoDB uses a document-oriented data model, from which we have seen numerous advantages shake out.
Our future is secure. We aren’t going anywhere.
#5 most popular DB, measured by combination of use, awareness, and activity on the internet
Passed DB2 in Feb.
On track to pass PostgreSQL in a month or so.
From there quite a jump to the next tier but still a very good showing – and the only document / rich shape product on the radar.
Here’s another reason for the popularity and strength of the platform: we have 400 partners, growing by about 10 monthly, which is many more than others in the NoSQL space.
We have strategic partnerships with progressive companies like Pentaho in BI and AppDynamics for system health and performance monitoring.
And we have certification programs for systems integrators too so you can outsource with confidence.
IBM: Standardizing on BSON, MongoDB query language, and MongoDB wire protocol for DB2 integration, and that sends a very strong signal about our position in this space. Just google for IBM DB2 JSON and you’ll see.
Historically, MongoDB has been very cloud friendly. Although financial services tend not to use public clouds as much because of personal information and data secrecy concerns, the tools and techniques developed in public clouds for provisioning, monitoring, multitenancy, etc. can be reproduced in private clouds inside your firewall, so financial services can get a leg up there.
Customer Data Management (e.g., Customer Relationship Management, Biometrics, User Profile Management)
Product and Asset Catalogs (e.g., eCommerce, Inventory Management)
Social and Collaboration Apps: (e.g., Social Networks and Feeds, Document and Project Collaboration Tools)
Mobile Apps (e.g., for Smartphones and Tablets)
Content Management (e.g., Web CMS, Document Management, Digital Asset and Metadata Management)
Internet of Things / Machine to Machine (e.g., mHealth, Connected Home, Smart Meters)
Security and Fraud Apps (e.g., Fraud Detection, Cyberthreat Analysis)
DBaaS (Cloud Database-as-a-Service)
Data Hub (Aggregating Data from Multiple Sources for Operational or Analytical Purposes)
Big Data (e.g., Genomics, Clickstream Analysis, Customer Sentiment Analysis)
Here we have greatly reduced the relational data model for this application to two tables. In reality, no database has just two tables; it is much more common to have hundreds or thousands. As a developer, where do you begin with such a complex data model? If you’re building an app, you’re really thinking about just a handful of common things, like products, and these can be represented in a document much more easily than in a complex relational model, where the data is broken up in a way that doesn’t reflect how you think about the data or write an application.
We enable new apps, personalized apps and data, through flexible and dynamic data capture, which leads to a better customer experience.
Technically, the product and the way it interacts with the software stack around it lead to faster time to market and lower TCO.
Government has managed massive amounts of information for a while now.
This is a picture of the FBI’s fingerprint and identity database back in World War II.
It’s actually a card catalog that took up the whole DC Armory.
One of the first images of the “Whole Earth,” taken by NASA in 1967. No one thought the public would be interested. It took a grassroots petition led by Stewart Brand to get NASA to release the image directly to citizens and not through the press.
Stewart Brand coined the phrase “information wants to be free,” which I think inspires a lot of what is happening now.
In 2013 the White House got behind a trend in government, requiring agencies to make their datasets transparent and free by default.
References past release of GPS systems for free
May 09, 2013 -- "MAKING OPEN AND MACHINE READABLE THE NEW DEFAULT FOR GOVERNMENT INFORMATION" Weather data ... geospatial information
Civic Hacking … This is a visualization, created using open government data, of property tax changes in Philadelphia from 2013 to 2014. Hack conferences are being set up.
One of the neat parts of open data is that solutions are repeatable. This was a winner at a “hack for change” event in Baltimore that was inspired by, or “forked” from, the Philly project.
While the data may be distributed and span a number of repositories, logically it’s in the center rather than on the edge
I find this one interesting because what we may have found is a consumer open data set more useful than Twitter.
The FCC collects data in an unbiased way based on where consumers live.
The FCC uses hex grids and MongoDB’s GIS capabilities to compute visualizations and data sets at multiple resolutions.
The FCC unveiled its project and made the results available to consumers.
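The hex-grid binning mentioned for the FCC speed test data can be sketched as follows. This is a minimal sketch under stated assumptions, not the FCC's implementation: it treats longitude and latitude as flat x/y coordinates (a real system would project them first), uses pointy-top hexagons in axial coordinates, and runs in plain Python rather than inside MongoDB. The function names and the sample readings are hypothetical.

```python
import math
from collections import defaultdict


def _hex_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    # Convert to cube coordinates, where x + y + z == 0.
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the component with the largest rounding error so the
    # three components still sum to zero.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (rx, rz)


def hex_cell(x, y, size):
    """Axial (q, r) of the pointy-top hexagon of the given size containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _hex_round(q, r)


def mean_speed_by_cell(samples, size):
    """Average the speed of (x, y, mbps) samples per hexagon at one resolution."""
    totals = defaultdict(lambda: [0.0, 0])
    for x, y, mbps in samples:
        cell = hex_cell(x, y, size)
        totals[cell][0] += mbps
        totals[cell][1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


# Hypothetical readings: (x, y, download speed in Mbps).
readings = [(0.0, 0.0, 10.0), (0.1, 0.1, 20.0), (5.0, 5.0, 30.0)]
coarse = mean_speed_by_cell(readings, size=1.0)   # larger hexagons
fine = mean_speed_by_cell(readings, size=0.05)    # smaller hexagons
```

Running the same samples through several `size` values gives the multiple resolutions: at the coarse size the first two readings share a cell and are averaged together, while at the fine size each reading falls in its own cell.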