2. Quick FOSS4G Update
● Replaced ArcSDE with PostGIS
– Dev, QA (Prod TBA...)
● 'Non-GIS' users using QGIS
● Editable PostGIS Views!
● Data-driven cartography
– OSM data with styles saved in PostGIS
● Manager now using QGIS
3. Database Structure
[Diagram labels: SQL Server; Dev, QA, Prod; PostGIS; ArcGIS; Enterprise; maps.dpsk12.org]
5. ETL Process Needs
● PostGIS read/write
● SQL Server Spatial read/write
● Daily Updating of Tables
● Daily Building of Datasets
● Daily Delivery to DPS Enterprise
6. Goals of ETL Development
● Break dependency on GUI-based tools
● Overcome limitations of other FOSS ETL tools
– Geokettle
– GDAL
● Avoid commercial ETL tools
– SSIS
– FME
7. Creating New Tables
● Import Shapefiles to Dev
– QGIS DB Manager
● Import Non-Spatial Tables
– csvkit: Python tools run from the command line
– Read CSV Schema
– Generate SQL 'CREATE TABLE'
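The "read CSV schema, generate CREATE TABLE" step above can be sketched in plain Python. This is a minimal illustration using only the standard library, not csvkit's own implementation (csvkit provides the `csvsql` command for this). The function names `infer_type` and `create_table_sql`, the three-type inference rule, and the sample data are all assumptions made for the example.

```python
import csv
import io


def infer_type(values):
    """Pick a PostgreSQL column type that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "text"
    # Try the narrowest type first, then widen.
    for caster, sql_type in ((int, "integer"), (float, "double precision")):
        try:
            for v in non_empty:
                caster(v)
            return sql_type
        except ValueError:
            continue
    return "text"


def create_table_sql(csv_file, table_name):
    """Read a CSV (header row first) and return a CREATE TABLE statement."""
    reader = csv.reader(csv_file)
    header = next(reader)
    rows = list(reader)
    columns = []
    for i, name in enumerate(header):
        col_type = infer_type([row[i] for row in rows])
        columns.append('    "{}" {}'.format(name, col_type))
    return 'CREATE TABLE "{}" (\n{}\n);'.format(table_name, ",\n".join(columns))


# Hypothetical sample data, for illustration only.
sample = io.StringIO("schoolid,name,enrollment\n101,East High,2400\n102,North MS,\n")
print(create_table_sql(sample, "schools"))
```

The sketch assumes every row has as many fields as the header; a real import would also need to handle dates, nulls, and ragged rows.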
13. Python for Databases
● pypyodbc
– Pure Python implementation of pyodbc
– Connects to databases using ODBC (e.g. MS SQL Server)
● psycopg2
– PostgreSQL adapter (library) for Python
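As a rough sketch of how the two libraries above are used to open connections: both expose a `connect()` function that takes a connection string. The helper names, the ODBC driver name `SQL Server`, and all host, database, and credential values below are assumptions for illustration; the libraries are imported lazily so that the string-building helpers work even where a driver is not installed.

```python
def mssql_connection_string(server, database, user, password, driver="SQL Server"):
    """Build an ODBC connection string (driver name is an assumption)."""
    return "DRIVER={{{}}};SERVER={};DATABASE={};UID={};PWD={}".format(
        driver, server, database, user, password
    )


def postgres_dsn(host, dbname, user, password):
    """Build a libpq-style DSN; assumes the values contain no spaces."""
    return "host={} dbname={} user={} password={}".format(host, dbname, user, password)


def connect_mssql(server, database, user, password):
    """Open a SQL Server connection through ODBC with pypyodbc."""
    import pypyodbc  # imported here so the helpers above work without it

    return pypyodbc.connect(mssql_connection_string(server, database, user, password))


def connect_postgres(host, dbname, user, password):
    """Open a PostgreSQL/PostGIS connection with psycopg2."""
    import psycopg2  # imported here so the helpers above work without it

    return psycopg2.connect(postgres_dsn(host, dbname, user, password))
```

Both functions return DB-API connection objects, so the same cursor, execute, and commit calls apply to either database.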
17. Python ETL Pattern
● Connect to databases
– Source
– Destination
● Set Up Cursors
● Select from Source
– Use SQL Expression (with spatial function)
– Assign data to rows (in memory)
● Insert into Destination
– Create insert statement with parameters
– Iterate through rows (data)
– Assign row values to variables
– Commit data with Insert
● Truncate Destination
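The pattern above can be sketched as one reusable function over any two DB-API connections. This is an illustration under stated assumptions, not the presenter's code: the name `copy_rows` is hypothetical, truncation is assumed to run before the insert so the destination is fully replaced, and in-memory SQLite stands in for PostGIS or SQL Server so the sketch runs anywhere (SQLite uses `?` placeholders where psycopg2 uses `%s`).

```python
import sqlite3


def copy_rows(src_conn, dest_conn, select_sql, insert_sql, truncate_sql=None):
    """Copy the result of select_sql into the destination; return the row count."""
    src_cur = src_conn.cursor()
    dest_cur = dest_conn.cursor()
    if truncate_sql:
        # Empty the destination first (assumed ordering).
        dest_cur.execute(truncate_sql)
    src_cur.execute(select_sql)
    rows = src_cur.fetchall()  # all rows are held in memory
    for row in rows:
        dest_cur.execute(insert_sql, tuple(row))
    dest_conn.commit()  # one commit covers the truncate and all inserts
    return len(rows)


# Stand-in databases with hypothetical data; geometry is carried as text.
src = sqlite3.connect(":memory:")
dest = sqlite3.connect(":memory:")
src.execute("create table address (addressid integer, geom text)")
src.executemany(
    "insert into address values (?, ?)",
    [(1, "POINT(0 0)"), (2, "POINT(1 1)")],
)
dest.execute("create table address (addressid integer, geom text)")
dest.execute("insert into address values (99, 'stale')")

copied = copy_rows(
    src,
    dest,
    "select addressid, geom from address",
    "insert into address (addressid, geom) values (?, ?)",
    truncate_sql="delete from address",
)
print(copied)
```

Against PostGIS, the select would typically apply a spatial function or cast to the geometry column, and the truncate statement would be a `TRUNCATE TABLE`.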
18. Example: PostGIS to PostGIS
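import psycopg2

# Connect to the source (Dev) and destination (QA) databases
connSource = psycopg2.connect("host=arcgisdev01 dbname=dpspgisdev user=dpsdata password=***")
curSource = connSource.cursor()
connDest = psycopg2.connect("host=FOSS4GLin01 dbname=dpspgisqa user=dpsdata password=***")
curDest = connDest.cursor()

# Select from the source, casting the geometry to text
curSource.execute('''
    select addressid, cast(geom as varchar) from public."Address_Master"
''')

# Parameterized insert statement for the destination
# (the table name is unquoted here, so PostgreSQL folds it to lowercase)
sql = '''
    insert into dqmt.Address_Master (addressid, geom) values (%s, %s)
'''

# Iterate through the rows held in memory and insert each one
rows = curSource.fetchall()
for row in rows:
    data = [row[0], row[1]]
    curDest.execute(sql, data)

# Commit the inserts, then close both connections
connDest.commit()
connSource.close()
connDest.close()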
19. Deployed Processes
● Daily Active Students
– Extract from MSSQL View joining geometry to students
– Deliver to PostGIS and MSSQL
● Refresh Boundaries
– PostGIS Materialized Views
● Geocoding
● Enterprise Delivery
– Schools and Boundaries
– Shared Enrollment Zone Info
– Current Addresses and Boundary Information (spatial join)
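For the Refresh Boundaries process listed above, PostgreSQL refreshes a materialized view with the `REFRESH MATERIALIZED VIEW` statement. A minimal sketch of driving that from Python follows; the helper names and the view name `school_boundaries` are hypothetical, and `conn` is assumed to be an open DB-API connection such as one from psycopg2.

```python
def quote_ident(name):
    """Double-quote an identifier, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def refresh_statements(views):
    """Build one REFRESH statement per (schema, view_name) pair."""
    return [
        "REFRESH MATERIALIZED VIEW {}.{}".format(quote_ident(schema), quote_ident(view))
        for schema, view in views
    ]


def refresh_views(conn, views):
    """Run the refresh statements on an open connection and commit."""
    cur = conn.cursor()
    for stmt in refresh_statements(views):
        cur.execute(stmt)
    conn.commit()


# Hypothetical view name, for illustration only.
for stmt in refresh_statements([("public", "school_boundaries")]):
    print(stmt)
```

Quoting the identifiers preserves mixed-case names, which matters for tables such as `"Address_Master"` that were created with quoted names.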