Colorado State Address Dataset 
Automated Processing 
Nathan Lowry, GIS Outreach Coordinator 
State of Colorado 
September 24, 2014
Common Data Model 
● Allows local and state-wide querying, analysis, and integration … 
● Accommodates information exchanges 
▪ Hierarchical - City to County, County to Region, Region to State 
▪ Among neighboring jurisdictions (e.g., County to County) 
● Allows profiles to provide data in standard forms for specific objectives 
▪ NENA CLDXF for NG-911 
▪ USPS Pub-28 for CASS 
▪ ArcGIS Geocoding (for quality comparisons, etc.) 
● It’s more efficient (less work) and ensures higher quality (less loss)
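As a rough illustration of what a "profile" could look like, the Python sketch below renders one common-model record in two output forms. It is not the State's implementation: the record layout, the field names, the tiny abbreviation table, and the functions `postal_profile` and `single_line_profile` are all assumptions made for this example, and the postal form only imitates the uppercase, abbreviated style associated with USPS Pub-28.

```python
# Hypothetical common-model record with fully spelled-out elements.
record = {
    "number": "200", "pre_directional": "East", "name": "Colfax",
    "post_type": "Avenue", "city": "Denver", "state": "CO", "zip": "80203",
}

# Deliberately tiny, illustrative abbreviation table (not a complete list).
POSTAL_ABBREVIATIONS = {
    "East": "E", "West": "W", "North": "N", "South": "S",
    "Avenue": "AVE", "Street": "ST", "Road": "RD",
}


def abbreviate(value):
    """Return the postal abbreviation for a directional or street type, if known."""
    return POSTAL_ABBREVIATIONS.get(value, value)


def postal_profile(rec):
    """Render the record as uppercase delivery and last lines (assumed postal style)."""
    parts = [rec["number"], abbreviate(rec["pre_directional"]),
             rec["name"], abbreviate(rec["post_type"])]
    delivery = " ".join(p for p in parts if p).upper()
    last = f'{rec["city"]} {rec["state"]} {rec["zip"]}'.upper()
    return {"delivery_line": delivery, "last_line": last}


def single_line_profile(rec):
    """Render the record as one spelled-out line, e.g. as geocoder input."""
    parts = (rec["number"], rec["pre_directional"], rec["name"], rec["post_type"])
    street = " ".join(p for p in parts if p)
    return f'{street}, {rec["city"]}, {rec["state"]} {rec["zip"]}'
```

Only the directional and street-type fields are abbreviated, so a street whose name happens to be "North" or "Street" is left alone. Both outputs come from the same stored record, which is the point of keeping one common model and deriving each exchange format from it.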
FGDC-STD-016-2011 
United States Thoroughfare, Landmark, and Postal Address Data Standard 
Of Greatest Significance: 
1. Everything* is “fully explicit” (fully spelled out) 
   No abbreviations allowed; no ambiguity 
   *The only exception is two-letter state postal codes (e.g., “CO” = Colorado) 
2. You will express exactly how each address will be parsed 
   Parsing is no longer subject to interpretation 
   The breakdown is stored in the data for each record 
3. Each address must be assigned a Unique Identifier (UID) 
   Multiple representations of the same address can be “tied together” if and only if the addresses are assigned UIDs 
These are big changes that few have yet implemented 
● Our common data model is designed to accommodate both: 
▪ your current state and 
▪ this “to be” state
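The three requirements above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the State's SSIS workflow: the dictionary keys, the small lookup tables, and the helpers `spell_out` and `make_record` are hypothetical names chosen for the example.

```python
import uuid

# Hypothetical lookup tables; a real implementation would cover far more cases.
STREET_TYPES = {"ST": "Street", "AVE": "Avenue", "RD": "Road", "BLVD": "Boulevard"}
DIRECTIONALS = {"N": "North", "S": "South", "E": "East", "W": "West"}


def spell_out(token, table):
    """Return the fully spelled-out form of an abbreviation, if it is in the table."""
    return table.get(token.rstrip(".").upper(), token)


def make_record(number, pre_directional, name, street_type, state):
    """Build a record that stores its own parse and carries a unique identifier."""
    return {
        # Requirement 3: every address gets a UID.
        "address_uid": str(uuid.uuid4()),
        # Requirement 2: each parsed element is stored in its own field.
        "address_number": number,
        # Requirement 1: directionals and street types are spelled out.
        "street_name_pre_directional": spell_out(pre_directional, DIRECTIONALS),
        "street_name": name,
        "street_name_post_type": spell_out(street_type, STREET_TYPES),
        # The two-letter state postal code is the one permitted abbreviation.
        "state_name": state.upper(),
    }


record = make_record("200", "E.", "Colfax", "Ave", "co")

# A second representation of the same address is tied to the first by sharing its UID.
alias = {"address_uid": record["address_uid"], "full_address": "200 E COLFAX AVE"}
```

Because the parse is stored field by field, nobody downstream has to guess whether "E" was a directional or part of the street name, and the shared UID is what lets the abbreviated alias be matched back to the spelled-out record.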
Presuppositions (each followed by how it turned out): 
● SQL Server Integration Services (SSIS) 
o Parallel processing, fast translations - True 
o Most compatible with SQL Server - Irrelevant* 
o Developed by DBAs for DBAs - No, developed by app developers for app developers 
▪ (i.e., normalization tools) - Hah, hah, hah, hah, hah! 
o No additional cost - This one was borne out 
o I learned French instead of Spanish (SSIS instead of Python) 
● No Parsing 
o I will translate, but it’ll be the locals’ responsibility to pre-parse... - No parsing, no geocoding* 
o In addition: no last lines, no geocoding* 
● 6-8 weeks of processing - 6-8 months of processing
Automating Processes
Colorado State Address Dataset 
Automated and Manual Processes
Automating Processes
Observations 
● SQL Server Integration Services (SSIS) 
○ SSIS is quirky 
○ SSIS Expression Language is Swahili 
○ A modeling canvas may be more effective for design 
○ SSIS can integrate with many other server processes (FTP) 
● Parsing and “Last Lining” will give CO jurisdictions a leg up 
○ The level of effort can be significant 
○ CLDXF Street Naming and Address Numbering Conventions 
● Standards 
○ Jurisdictional pretypes, sequencers - minor tweaks 
○ Subaddress conventions need ... something
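To make "parsing" and "last lining" concrete, here is a small Python sketch. It assumes "last line" is meant in the usual postal sense of the city, state, and ZIP Code line, and it handles only the simplest address shape (number, optional one-letter directional, name, one-word street type). The pattern and the functions `parse_delivery_line` and `last_line` are assumptions for illustration, not the tooling described in the talk.

```python
import re

# Assumed shape: number, optional one-letter directional, street name, one-word type.
ADDRESS_PATTERN = re.compile(
    r"^(?P<number>\d+)\s+"
    r"(?:(?P<pre_directional>[NSEW])\.?\s+)?"
    r"(?P<name>.+?)\s+"
    r"(?P<post_type>[A-Za-z]+)\.?$"
)


def parse_delivery_line(text):
    """Split a simple street address into parts; return None if it does not fit."""
    match = ADDRESS_PATTERN.match(text.strip())
    return match.groupdict() if match else None


def last_line(city, state, zip_code):
    """Compose the city/state/ZIP line that accompanies the street line."""
    return f"{city}, {state} {zip_code}"


parts = parse_delivery_line("200 E Colfax Ave")
full = f"200 E Colfax Ave, {last_line('Denver', 'CO', '80203')}"
```

Real address data is much messier than this pattern allows (subaddresses, pretypes, fractional numbers), which is consistent with the observation that the level of effort can be significant. Returning `None` for anything that does not fit keeps unparsed records visible for manual review instead of silently mis-parsing them.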
Opportunities 
● Standards 
○ Improvement via implementation 
○ Coalescence on Subaddresses 
● Common implementations of data models 
○ Reduce the cost of development 
○ Make sharing of code useful and possible 
● Common code 
○ Shared parsing tools 
○ Shared applications
Questions? 
Thank You!
