Why EDP Chose MongoDB

Artyom Diky
William Biesty
Mark Velez
Agenda
• Who are we?
• Evolution of Document Management
  • File system to relational DB
  • Relational to document-oriented DB
  • Paper to electronic
• Advantages and Challenges
• Questions?
Who Are We?
•   New York City Department of Health and Mental Hygiene
    •   Environmental Health Services (EHS)
         •   Environmental Disease Prevention (EDP)
               •   Lead Poisoning Prevention Program (LPPP)
                     •   MIS Unit (we are here)
• We support many programs within EDP
• Who are our stakeholders?
  • Inspectors
  • Researchers
  • Clinical Staff
  • Lawyers (FOIL)
Evolution of Document Management: Paper

• A lot of legal documents on paper
  • Historic - from the '70s and up
  • Current (ongoing)
• Problems with Paper
  • Time and Labor Intensive
     • Locate, Copy, Redact, Copy, Mail (Repeat….)
  • Storage Space
  • Disaster Recovery
Evolution of Document Management: eFiles

• VB6
• Scanning utilities
• File-system based storage
  • Millions of files
  • Identifiers based on child ID
Evolution of Document Management: eFiles Issues
•   Technical
    •   VB6 phased out
    •   Outdated 3rd party tools changed API
    •   License expired
•   Security
    •   Documents have been redacted permanently
    •   No access control to private information
•   Scalability
    •   New document types
    •   New indexing (tagging) mechanisms for search
Evolution of Document Management

• Need for better document management
  • Paperless offices mandate
  • Expand searchable attributes and document text
  • Update technology
  • Improved security
  • HIPAA compliance
  • Platform for future applications
File System to Relational DB

• Challenges:
  • 1M+ historical documents as image files
  • Need for document metadata
     • Various and evolving schemas
  • Security
  • Updates and migration
  • Fail-safe storage
Technologies

• We use Microsoft technologies
  • SQL Server
  • .NET
• We are a small team that develops and supports dozens of data collection apps (forms)
  • Risk assessments
  • Case Management
  • Inspection Reports
  • Research
Example Documents
[Figure: example documents annotated with the fields event_date, document_type, child ID, and me_num]
File System to Relational DB: FileStream
• MSSQL 2008
  • Data storage with FileStream
  • Metadata with Entity-Attribute-Value
    • sql_variant
  • Data-driven application design
• Rich service-oriented API through WCF
  • Search engine
• Added features
  • Versioning
    • Change and revert
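
The Entity-Attribute-Value approach above can be illustrated with a minimal sketch. It is not the DocSpace schema: it uses Python's built-in sqlite3 as a stand-in for SQL Server, an untyped column as a loose analogue of sql_variant, and hypothetical table, column, and field names.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY, content BLOB);
    CREATE TABLE attribute (
        document_id INTEGER REFERENCES document(id),
        name TEXT,
        value            -- untyped column, loosely analogous to sql_variant
    );
""")

def add_document(content, metadata):
    """Store the file bytes once and each metadata field as its own EAV row."""
    cur = conn.execute("INSERT INTO document (content) VALUES (?)", (content,))
    doc_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO attribute (document_id, name, value) VALUES (?, ?, ?)",
        [(doc_id, name, value) for name, value in metadata.items()],
    )
    return doc_id

def find_documents(name, value):
    """Return ids of documents whose attribute `name` equals `value`."""
    rows = conn.execute(
        "SELECT document_id FROM attribute "
        "WHERE name = ? AND value = ? ORDER BY document_id",
        (name, value),
    )
    return [row[0] for row in rows]

def load_metadata(doc_id):
    """Reassemble one document's metadata from its attribute rows."""
    rows = conn.execute(
        "SELECT name, value FROM attribute WHERE document_id = ?", (doc_id,)
    )
    return dict(rows.fetchall())
```

New metadata fields need no schema change in this layout, but every read has to reassemble a document from many rows, and every value must be a single primitive.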
DocSpace SQL Architecture
Limitations of Relational Model

• Need faster development cycle
  • Double effort for development and maintenance, at both the application and database levels
• Document definition (metadata) first, content later
  • Changing schema
• Rigid document structure
  • Not amenable to change
  • No support for non-primitive values
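
To make the "no support for non-primitive values" point concrete, here is a small sketch of what a nested record has to go through before it fits a store that only accepts primitive values. The `flatten` helper and the sample field names (`findings`, `room`, `hazard`) are hypothetical and only illustrate the translation effort.

```python
def flatten(document, prefix=""):
    """Flatten nested dicts and lists into (path, primitive value) pairs,
    the shape a primitive-only attribute table would require."""
    if isinstance(document, dict):
        items = document.items()
    elif isinstance(document, list):
        items = enumerate(document)
    else:
        # A primitive value: emit it under the path built so far.
        return [(prefix, document)]
    rows = []
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        rows.extend(flatten(value, path))
    return rows


report = {
    "document_type": "inspection_report",
    "child_id": 1001,
    "findings": [
        {"room": "kitchen", "hazard": True},
        {"room": "bedroom", "hazard": False},
    ],
}
rows = flatten(report)
```

A record with three top-level fields becomes six rows here, and the application also needs the reverse mapping to rebuild the original structure. A document-oriented store keeps `report` as it is.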
Effects on Development Cycle

• SQL: Waterfall-like approach
  • Fully develop requirements before implementation
    • Gotta get the schema right to avoid hassle
  • Change discouraged
• MongoDB: Rapid Application Development
  • Prototyping
  • Change accommodated
Document Management System Done Right

• Faster development cycles
  • No translation of complex document structure into relational model
  • Application-driven schema
• Document content first, metadata later
• Flexible document structure driven by user requirements
• GridFS for large documents
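
For readers unfamiliar with GridFS, the following plain-Python sketch imitates how it stores a large file: one descriptor document plus a sequence of fixed-size chunk documents. It is an illustration only. A real application would use the GridFS API of its MongoDB driver, and the 255 KiB chunk size is assumed from the default documented for current drivers.

```python
CHUNK_SIZE = 255 * 1024  # assumed default chunk size; configurable in real drivers

def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return (files_doc, chunk_docs), shaped like GridFS's files and chunks collections."""
    files_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    chunk_docs = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunk_docs

def reassemble(chunk_docs):
    """Concatenate the chunks in order of their sequence number `n`."""
    ordered = sorted(chunk_docs, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


scan = bytes(600 * 1024)  # a 600 KiB placeholder standing in for a scanned image
files_doc, chunk_docs = split_into_chunks("scan-0001", scan)
```

Because each chunk is an ordinary document, a large scanned file never has to fit within the size limit of a single document, and the descriptor can sit alongside the application's own metadata.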
DocSpace MongoDB Architecture
Case Study - Traffic Fatalities

• A study of traffic-related fatalities in NYC
  • Injury Surveillance and Prevention
• Offline data collection
• 330+ data points
• Multiple weekly changes to schema
  • Add/remove fields
  • Value types
• Developed in 500 hrs (3 months)
  • 1 intermediate developer, 1 novice
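
The weekly schema changes listed above (added fields, removed fields, changed value types) can be absorbed at read time when records are stored as schemaless documents. The sketch below shows one way to do that in application code. It is not taken from the study's codebase, and every field name in it is hypothetical.

```python
def normalize(record):
    """Read a record written under any earlier version of the form."""
    normalized = dict(record)  # leave the stored record untouched
    # Field added in a later week: older records simply lack it.
    normalized.setdefault("helmet_worn", None)
    # Value type changed from free text to an integer.
    age = normalized.get("victim_age")
    if isinstance(age, str):
        normalized["victim_age"] = int(age) if age.strip().isdigit() else None
    # Field removed from the form: drop it when present.
    normalized.pop("legacy_notes", None)
    return normalized


week_1 = {"case_id": 1, "victim_age": "34", "legacy_notes": "n/a"}
week_5 = {"case_id": 2, "victim_age": 61, "helmet_worn": False}
records = [normalize(week_1), normalize(week_5)]
```

Records from both weeks can live in the same collection without a migration, and the reconciliation logic stays in one place in the application.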
Evolving Use of MongoDB

• Single Node with Database Security
  • Nightly Dump for Backup Archiving
• Master – Slave Nodes
• Replica Sets – 3 Nodes
  • Distributed across Metropolitan Area Network
    • Bare Iron Primary, VMware ESX and Hyper-V VM Secondaries
• Hurricane Sandy – No downtime, one node failed
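
The Hurricane Sandy outcome is consistent with how replica set elections work: a primary can be elected as long as a majority of voting members can reach each other. The arithmetic is sketched below, assuming all three nodes were voting members. The helper names are illustrative and are not part of any MongoDB API.

```python
def majority(voting_members):
    """Votes needed to elect a primary: more than half of the voting members."""
    return voting_members // 2 + 1

def can_elect_primary(voting_members, reachable):
    """True when enough voting members are reachable to form a majority."""
    return reachable >= majority(voting_members)

def fault_tolerance(voting_members):
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - majority(voting_members)


three_node_majority = majority(3)            # 2
survives_one_failure = can_elect_primary(3, reachable=2)
survives_two_failures = can_elect_primary(3, reachable=1)
```

With three members the set tolerates exactly one failure. Losing a second node would leave the survivor unable to elect a primary on its own.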
Thank you!
Questions
Contact Us
William Biesty, Database Administrator, wbiesty@health.nyc.gov
Art Diky, Software Engineer, adiky@health.nyc.gov
Mark Velez, Software Engineer, mvalez@health.nyc.gov



                      nyc.gov/health
