Oh, you're from Jersey? What exit?

Lessons learned bringing a large dataset in-house and where we are going from here.

Published in: Technology, Business

  • New Jersey’s Office of GIS is in the Office of Information Technology under the Treasurer. We are 10 people and we: support agency data and GIS application initiatives; develop applications internally; manage the State’s GIS infrastructure; coordinate the state’s geospatial activities with government agencies and commercial
  • We are primarily an Esri shop, but we are using other technologies to support our services.
  • We are the data steward for a number of the State’s framework datasets.
  • The TomTom dataset, which supported geocoding and routing, had approximately 50.5k miles of roads. The NJDOT dataset, which supported linear referencing integration with their straight line diagram, had approximately 41k miles of roads.
  • As we built our own feature class, nearly 6,200 miles were added. The major data sources in the State’s dataset are: county-developed centerlines; TIGER 2010; 2007 & 2010 imagery; the State’s Parcel dataset. No commercial datasets are used as source data.

    1. Oh, you’re from Jersey? What exit? Lessons learned bringing the roads dataset in-house and where we are headed from here
    2. A little background…
    3. A little bit more background…
    4. Last little bit…
    5. Road Centerlines – in the past &
    6. Road Centerlines – now
    7. But getting it out to folks…
    8. A bit of a change
        • Feature Class: SEG_ID, FULLNAME, ADDR_L_FR, ADDR_L_TO, ADDR_R_FR, ADDR_R_TO, ZIP_L, ZIP_R, MUNI_L, MUNI_R, PRE_DIR, PRE_TYPE, PRE_MOD, NAME, SUF_TYPE, SUF_DIR, SUF_MOD, ROUTE_TYPE, STATUS, SURFACE, ZLEV_FR, ZLEV_TO, ACCESS, JURIS
        • Alternate Names Table: SEG_ID, PRE_DIR, PRE_TYPE, PRE_MOD, NAME, SUF_TYPE, SUF_DIR, SUF_MOD
        • Labeling Table: SEG_ID, L1_NAME, L2_NAME, H1_NAME, H1_SHLD, H1_SUBSHLD*, H1_NUM, H2_NAME, H2_SHLD, H2_SUBSHLD*, H2_NUM, H3_NAME, H3_SHLD, H3_SUBSHLD*, H3_NUM
        • LRS Values Table: SEG_ID, SRI, SLD_MP_ST, SLD_MP_END, PAR_MP_ST, PAR_MP_END, FLP_MP_ST, FLP_MP_END, MMS_MP_ST, MMS_MP_END, TPK_MP_ST, TPK_MP_END
        • PLUS: Teleatlas/TomTom Flat Format, Teleatlas/TomTom Coincident Geometry, Tiburon Format, Address Locator
    9. The goal… Repeatable Extract, Repeatable Transform, Repeatable Load
    10. The solution…
    11. The uber-language
    12. Maybe not
        • 3 seconds per transaction
        • 1.1 million records
        • 3.3 million seconds = ~38 days
    13. I’m not dead yet
    14. Parallel Python
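    15.
        import sys
        import pp  # Parallel Python
        # No remote servers listed: run on the local machine only
        ppservers = ()
        if len(sys.argv) > 1:
            # Optional first argument sets the number of local worker processes
            ncpus = int(sys.argv[1])
            job_server = pp.Server(ncpus, ppservers=ppservers)
        else:
            # Otherwise pp uses the number of processors it detects
            job_server = pp.Server(ppservers=ppservers)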
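    16.
        # Submit one job per segment: function, its arguments, dependent functions, modules to import
        jobs = [(segment,
                 job_server.submit(
                     segment_handling,
                     (segment.SEG_ID,
                      segment.ADDR_L_FR, segment.ADDR_L_TO,
                      segment.ADDR_R_FR, segment.ADDR_R_TO,
                      segment.ZIP_L, segment.ZIP_R,
                      # MUNIC_ID_L appears twice on the slide; the second is presumably meant to be the right-side ID
                      segment.MUNIC_ID_L, segment.MUNIC_ID_L),
                     (log, Road_Segment),
                     ("arcpy",)))
                for segment in segments]
        # Calling a job blocks until its result is ready
        for segment, job in jobs:
            result = job()
            log("Segment ID: " + str(result.segment_id) + "\n")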
    17. We did our best
        • 1.1 seconds/transaction
        • 1.1 million records
        • ~1.2 million seconds = ~14 days
    18. What I learned…
        • Python is cool as all get out and powerful
        • Not a good transactional tool for that many records
        • I have a ton to learn
        • But we have code that does what we need it to in order to stand up the geocoding services for future use
    19. Going back to basics
        • SDE spatial views
            sdetable -o create_view -T SEG_NAME_ETL -t SEG_NAME,L_STREET_ABBR
              -c SEG_NAME.SEG_ID, SEG_NAME.NAME_TYPE_ID, SEG_NAME.NAME_PRE_DIR,
                 SEG_NAME.NAME_PRE_TYPE, SEG_NAME.NAME_PRE_MOD, SEG_NAME.NAME,
                 SEG_NAME.NAME_SUF_TYPE, SEG_NAME.NAME_SUF_DIR, SEG_NAME.NAME_SUF_MOD
              -a SEG_ID, NAME_TYPE_ID, PRE_DIR, PRE_TYPE, PRE_MOD, NAME, SUF_TYPE, SUF_DIR, SUF_MOD
              -w {whereclause} -i 5157 -s ****.state.nj.us -u **** -p ****
        • Oracle table views
            SELECT (a whole bunch of columns)
            FROM ROAD.SEG_NAME_V, ROAD.SEGMENT_NJ
            WHERE SEG_NAME_V.SEG_ID = SEGMENT_NJ.SEG_ID
    20. Or so we thought… FULL OUTER JOINs not creating the m:1 relationship in SDE views
    21. Query Tables and Virtual IDs
        • ArcPy, Python and Geoprocessing models
        • Virtual IDs
    22. So what’s next?
        Get it out there!
        Keep it up to date!
    23. Sean McGinnis
        e: sean.mcginnis@oit.state.nj.us
        e: seankmcginnis@gmail.com
        t: @seankmcginnis
        w: www.georamblings.com
        w: www.seankmcginnis.com
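
The run-time estimates on slides 12 and 17 can be checked with a few lines of arithmetic. This is only a sketch: the per-record timings and the record count come from the slides, while the function and variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def total_runtime_days(records: int, seconds_per_record: float) -> float:
    """Days needed to process `records` rows one after another."""
    return records * seconds_per_record / SECONDS_PER_DAY


RECORDS = 1_100_000  # 1.1 million records, per the slides

serial_days = total_runtime_days(RECORDS, 3.0)    # slide 12: 3 s per transaction
parallel_days = total_runtime_days(RECORDS, 1.1)  # slide 17: 1.1 s per transaction

print(f"{serial_days:.1f} days at 3 s per record")
print(f"{parallel_days:.1f} days at 1.1 s per record")
```

Even at the faster rate the load still takes about two weeks, which is the point behind "not a good transactional tool for that many records".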

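The submit-then-collect pattern from slides 15 and 16 can be tried without ArcGIS or the third-party pp module. The sketch below swaps in the standard library's `concurrent.futures`, which is not what the talk used. The `Segment` class and the body of `segment_handling` are hypothetical stand-ins. Threads are used to keep the example simple; `ProcessPoolExecutor` offers the same `submit`/`result` interface and is probably the closer match to pp's worker processes for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical stand-in for one road segment row
    seg_id: int
    addr_l_fr: int
    addr_l_to: int


def segment_handling(seg_id, addr_l_fr, addr_l_to):
    """Placeholder per-segment work: return the ID and the left-side address span."""
    return seg_id, addr_l_to - addr_l_fr


segments = [Segment(1, 100, 198), Segment(2, 200, 298), Segment(3, 300, 350)]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submit one job per segment and keep each future next to its segment
    jobs = [
        (s, pool.submit(segment_handling, s.seg_id, s.addr_l_fr, s.addr_l_to))
        for s in segments
    ]
    # result() blocks until that job is done; iterating jobs keeps submission order
    results = [job.result() for _, job in jobs]

for seg_id, span in results:
    print(f"Segment ID: {seg_id}, left address span: {span}")
```

Collecting results in submission order, as the slide code does, keeps the log deterministic at the cost of waiting on slow jobs near the front of the list.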