6. Goal: To automate and enhance the
content of monthly loads of the new version
(DTF 8.1) of the NSG into a fixed MS SQL
Server schema.
7. Problem: Source Data Mayhem
● The NSG is a conflated dataset formed from the LSG supplied by Local Authorities,
Network Rail and Highways England.
● Not all of the elements of the ‘standard’ are mandatory.
● Data quality varies from one supplier to another.
● It’s a hierarchical format, with streets being formed of ‘child’ segments.
● Much of the metadata about the geometry e.g. ‘One Way Exemptions’ needs to be
referenced to a geometry segment with a primary key.
● The NSG is supplied as CSV files.
8. Problem: It’s ALL about Attributes!
● LG = Street file
● AD = Additional street data
● Every row in each CSV has its own ‘Record Type’, which identifies the content, so
each row potentially has a different schema from the next row.
Type 11 – Street Record
Type 12 – Street XREF (1:M - child of 11)
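A minimal sketch of the record-type idea, in Python rather than FME: rows are routed into one stream per record type, much as the AttributeFilter does in the workspace. It assumes the record type is the first field of each row; the sample rows and their values are invented for illustration and are not real NSG data.

```python
import csv
import io
from collections import defaultdict


def split_by_record_type(lines):
    """Group CSV rows into one list per record type.

    Assumption: the record type is the first field of every row.
    """
    streams = defaultdict(list)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        streams[row[0]].append(row[1:])
    return dict(streams)


# Hypothetical sample: one Type 11 street record with two Type 12 children.
SAMPLE = io.StringIO(
    '11,"I",1,12345678,1\n'
    '12,"I",2,12345678,"XREF-A"\n'
    '12,"I",3,12345678,"XREF-B"\n'
)

streams = split_by_record_type(SAMPLE)
for record_type, rows in sorted(streams.items()):
    print(record_type, len(rows))
```

Each stream can then be given its own schema before loading, which is the point of separating them.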
9. Achieving Automation
• Define schema and map to target SQL Server database (17
record types, all different schemas)
• AttributeFilter: Separate data streams for each record type
• VertexCreator and PointConnector: Build the geometry
• AttributeManager: Enhance the schema specific to YW
• Supplied a batch file to run the workspaces in sequence. Can
be scheduled to run monthly when new supply arrives
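The project used a batch file for the last step. As a rough Python equivalent, the sketch below runs workspaces one after another and stops at the first failure. The workspace file names are hypothetical, and it assumes the FME command-line executable is on the PATH as `fme` and accepts a workspace path as its argument.

```python
import subprocess
from typing import List, Sequence


def build_commands(workspaces: Sequence[str], fme_exe: str = "fme") -> List[List[str]]:
    """Return one command line per workspace, in run order."""
    return [[fme_exe, workspace] for workspace in workspaces]


def run_in_sequence(workspaces: Sequence[str], fme_exe: str = "fme") -> None:
    """Run each workspace in turn; check=True raises on the first failure."""
    for command in build_commands(workspaces, fme_exe):
        subprocess.run(command, check=True)


# Hypothetical workspace names, for illustration only.
WORKSPACES = ["01_load_streets.fmw", "02_load_additional_data.fmw"]

# Dry run: print the commands rather than executing them.
for command in build_commands(WORKSPACES):
    print(" ".join(command))
```

A scheduler such as Windows Task Scheduler could call `run_in_sequence(WORKSPACES)` when the monthly supply arrives.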
10. Achieving Quality
1Spatial built a custom transformer, the MessageLogger, to alert Yorkshire
Water to the specific data anomalies that we could trap:
• Zero length roads: Road segment start and end coordinates in same
location.
• Missing start or end: Street section resolves to a point geometry.
• No coordinates: Street section results in NULL geometry
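The three checks above can be sketched in Python as a simple classifier. This is not the MessageLogger itself, whose internals are not shown here; the function name and the returned labels are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def classify_segment(coords: Optional[Sequence[Point]]) -> str:
    """Label a street section with the anomaly it shows, or 'ok'."""
    if not coords:
        # None or an empty list: the section would load as NULL geometry.
        return "no coordinates"
    if len(coords) == 1:
        # Only one end is present, so the section resolves to a point.
        return "missing start or end"
    if coords[0] == coords[-1]:
        # Start and end in the same location, as described on the slide.
        # Note: a genuine closed loop would also be flagged by this test.
        return "zero length road"
    return "ok"


for sample in (None, [(1.0, 2.0)], [(1.0, 2.0), (1.0, 2.0)], [(0.0, 0.0), (3.0, 4.0)]):
    print(sample, "->", classify_segment(sample))
```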
11. Achieving a Common Schema
Transform the DTF schema
to conform to Yorkshire
Water’s database:
AttributeManager
Many AttributeManagers to
adjust attributes in each
flow.
14. Royal Canadian Mounted Police
E Division
Heidi Lee | Robert Shultz
Goal: Load GPS records into ArcGIS
Different hardware meant problems:
➔ Inconsistent date formats
➔ Time zones
➔ Daylight saving
Solution: TCLCaller & some TCL date functions
Dates and times are
complicated.
15. Formatting?
● YYMMDD, HHMMSS, UTC
● Jun 2016
● ‘on Saturday, Jan 9th 2016, 01:00 am’ & ‘+0530’
● 2016-12-07 12:20:07.785403-05
● 20160313020000.000 (March 13 - Daylight Saving)
● <d v="2016-12-13T00:00:00"/> (Excel)
● YYYY-MM-DD hh:mm:ss[.nnnnnnn] (SQL Server
‘datetime2’ value)
Calculations?
● Date2-Date1 = How many days?
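To make the formatting and calculation questions concrete, here is a small Python sketch (the RCMP solution used TCL, so this is an illustration of the same idea, not their code). It parses three of the formats listed above and answers the "how many days?" question. The helper names are hypothetical, and the compact stamp is treated as having no zone information, which is an assumption.

```python
import re
from datetime import datetime, timezone


def parse_compact(value: str) -> datetime:
    """Parse a compact stamp such as 20160313020000.000 (no zone attached)."""
    return datetime.strptime(value, "%Y%m%d%H%M%S.%f")


def parse_iso_like(value: str) -> datetime:
    """Parse a value such as 2016-12-13T00:00:00."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


def parse_verbose(text: str, offset: str) -> datetime:
    """Parse 'on Saturday, Jan 9th 2016, 01:00 am' plus an offset like '+0530'."""
    cleaned = re.sub(r"^on\s+", "", text)                      # drop the leading 'on'
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)\b", r"\1", cleaned)  # 9th -> 9
    return datetime.strptime(f"{cleaned} {offset}", "%A, %b %d %Y, %I:%M %p %z")


def days_between(date1: datetime, date2: datetime) -> int:
    """Date2 - Date1 in whole calendar days (time of day is ignored)."""
    return (date2.date() - date1.date()).days


march = parse_compact("20160313020000.000")
december = parse_iso_like("2016-12-13T00:00:00")
january_utc = parse_verbose("on Saturday, Jan 9th 2016, 01:00 am", "+0530").astimezone(timezone.utc)

print(days_between(march, december))
print(january_utc.isoformat())
```

Comparing calendar dates rather than raw timestamps is a design choice: subtracting the full timestamps would give one day fewer here, because the March value falls at 02:00.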
21. Southern Company
Jeff DeWitt
HOK Inc.
David Baldacchino
Goals:
➔ Test for patterns in attribute values
➔ Extract substrings from attribute values
➔ Validate strings
2. Finding patterns
Fisher German
Seb Kingsley
RCMP E Div.
Heidi Lee
22. Southern Company
Problem: Attribute value cleanup
- MONTANA * or Sales/Other (1)
HOK Inc.
Problem: Extract Sheet numbers from file names
- MyProject - Sheet - A512 - PARTITION TYPES & …
- G001 - GENERAL NOTES, ABBREVIATIONS, SYMBOLS, ...
Fisher German
Problem: Validate address strings and find postcodes
- 14 High Street, Ashby, LE65 2UZ, UK
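As an illustration of the HOK problem, the sketch below pulls a sheet number out of the two file names shown above. The pattern (one capital letter followed by three digits) is an assumption inferred from those two examples only; it is not HOK's actual expression, and real sheet numbering may be more varied.

```python
import re
from typing import Optional

# Assumed pattern: a capital letter followed by exactly three digits, as a whole word.
SHEET_NUMBER = re.compile(r"\b[A-Z][0-9]{3}\b")


def extract_sheet_number(file_name: str) -> Optional[str]:
    """Return the first sheet number found in the file name, or None."""
    match = SHEET_NUMBER.search(file_name)
    return match.group(0) if match else None


# File names as they appear on the slide, truncation included.
for name in (
    "MyProject - Sheet - A512 - PARTITION TYPES & …",
    "G001 - GENERAL NOTES, ABBREVIATIONS, SYMBOLS, ...",
):
    print(extract_sheet_number(name))
```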
23. Postcode Hunting
14 High Street, Ashby, LE652UZ, UK
(GIR 0AA|[A-PR-UWYZ]([0-9][0-9A-HJKS-UW]?|[A-HK-Y][0-9][0-9ABEHMNPRV-Y]?)[0-9][ABD-HJLNP-UW-Z]{2})
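The postcode expression can be tried outside FME too. The Python sketch below uses the expression as shown on the slide, joined onto one line. As written it expects no space inside the postcode, so a second variant with an optional space is included; that variant is an assumption added here, not part of the slide.

```python
import re
from typing import Optional

# The expression from the slide, split across two string literals for readability.
POSTCODE = re.compile(
    r"(GIR 0AA|[A-PR-UWYZ]([0-9][0-9A-HJKS-UW]?|[A-HK-Y][0-9][0-9ABEHMNPRV-Y]?)"
    r"[0-9][ABD-HJLNP-UW-Z]{2})"
)

# Assumption (not on the slide): allow an optional space before the final three characters.
POSTCODE_OPTIONAL_SPACE = re.compile(
    r"(GIR 0AA|[A-PR-UWYZ]([0-9][0-9A-HJKS-UW]?|[A-HK-Y][0-9][0-9ABEHMNPRV-Y]?)"
    r" ?[0-9][ABD-HJLNP-UW-Z]{2})"
)


def find_postcode(address: str, pattern: re.Pattern = POSTCODE) -> Optional[str]:
    """Return the first postcode found in the address, or None."""
    match = pattern.search(address)
    return match.group(1) if match else None


print(find_postcode("14 High Street, Ashby, LE652UZ, UK"))
print(find_postcode("14 High Street, Ashby, LE65 2UZ, UK"))
print(find_postcode("14 High Street, Ashby, LE65 2UZ, UK", POSTCODE_OPTIONAL_SPACE))
```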
Address validation
Rue Achille Masset 52A
^((([a-zA-Z]+) )+)([0-9]+)([a-zA-Z]*)$
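The address expression splits a string into street name, house number and suffix through its capture groups. A short Python sketch of how the groups line up follows; the dictionary keys are names chosen here for illustration.

```python
import re
from typing import Dict, Optional

ADDRESS = re.compile(r"^((([a-zA-Z]+) )+)([0-9]+)([a-zA-Z]*)$")


def split_address(text: str) -> Optional[Dict[str, str]]:
    """Return street, number and suffix, or None if the string does not validate."""
    match = ADDRESS.match(text)
    if match is None:
        return None
    return {
        "street": match.group(1).strip(),  # group 1: all the words before the number
        "number": match.group(4),          # group 4: the digits
        "suffix": match.group(5),          # group 5: optional trailing letters
    }


print(split_address("Rue Achille Masset 52A"))
print(split_address("14 High Street"))
```

The expression only accepts unaccented letters and a number at the end, so a number-first address such as "14 High Street" fails validation.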
Wave the wand of regex
Regex is
awesome!
26. Regex vs. String Functions Example
Code ABD3705337067
Regular Expression: ([A-Z]{3})([0-9]+)
String Functions:
Attribute   String Function
alpha       @Left(@Value(Code),3)
beta        @Substring(@Value(Code),3,-1)
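The same comparison can be sketched in Python, with `re` standing in for the regular expression and slicing standing in for the FME string functions. The slices are rough analogues of @Left and @Substring, not the FME functions themselves.

```python
import re

code = "ABD3705337067"

# Regex approach: two capture groups, three letters then the digits.
match = re.fullmatch(r"([A-Z]{3})([0-9]+)", code)
alpha_regex, beta_regex = match.groups()

# String-function approach: fixed positions, no pattern checking.
alpha_slice = code[:3]   # analogue of @Left(@Value(Code),3)
beta_slice = code[3:]    # analogue of @Substring(@Value(Code),3,-1)

print(alpha_regex, beta_regex)
print(alpha_slice, beta_slice)
```

Both give the same result for this value. The regex also validates the shape of the code, while the slices split any string at position 3 whether or not it fits the pattern.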
27. Summary
1. Attribute transformation is a major part of ETL
work.
- Data migration relies on it.
- Use it to extract value.
2. DateTime transformers and Text Editor functions
help with:
○ Date/time formatting
○ Calculations
3. Regex and string functions help with pattern
matching.
28. Questions?
Learn more:
Tutorial: Date & Time Attributes
AttributeManager
Documentation
http://rubular.com/
David Eagle
Managing Consultant
FME Certified Professional & Trainer
david.eagle@1spatial.com
Twitter @david_eagle