Rapid Assembly of Geo-Centred Linked Data Applications
Lucy Diamond, Research Scientist, Ordnance Survey
18/04/2012
About RAGLD
• A collaborative project between Ordnance Survey, the University of Southampton and Seme4
• Part-funded by the Technology Strategy Board’s “Harnessing Large and Diverse Sources of Data” programme
• An 18-month project: started October 2011, due to complete March 2013
• Building tools to enable developers to make greater use of linked data
RAGLD builds on See UK (http://apps.seme4.com/see-uk/) and the sameAs service (www.sameas.org)
Feedback wanted!
• The designs of the RAGLD components and services draw on the user requirements gathered from our questionnaire responses. Summaries of those requirements can be seen on www.ragld.com
• The purpose of this presentation is to run through basic descriptions of the components and services that will be built for RAGLD, and to get feedback on their potential usefulness or applicability for linked data projects and activities
Project Milestones
Text in pink for Work Packages completed or well under way (08/05/2012)
• WP 1 - User requirements survey, development of design principles, identification of data sources, high-level architecture designs
• WP 2 - Data integration components and services
• WP 3 - Data-enabled components and services
• WP 4 - Development of a technical demonstrator based on UK crime data analysis
• WP 5 - Engagement (stakeholder interviews and design feedback), dissemination and exploitation
RAGLD High Level Component Architecture
[Architecture diagram. Labels shown: Accessing Data; SPARQL Endpoint; Resolvable URIs; Linked Data API; Normalisation Tools/Services; Identity Management Tools/Services; Infrastructure Services; Data Enhancement Tools/Services; Visualisation Tools/Services; Relationship Tools/Services; Data Sources; Aggregation; Interpolation; Spatial Operations; Orchestration; Mediation; Metrics]
All of the components interface with each other through a common interface specification, so the infrastructure service can orchestrate them into workflows that fulfil the use cases identified in the user requirements analysis.
Work Package 2: Data Integration Components and Services
Reconciliation Service
• Reconciliation service for spreadsheets/Google Refine to recognise common codes/identifiers and translate them to appropriate Linked Data URIs
• To get into the Linked Data world, it is necessary to get from strings that identify things roughly to URIs that identify things properly.
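As a rough illustration of the string-to-URI step, the sketch below recognises postcode-like cells in a spreadsheet column and proposes a URI for each. It is not the RAGLD implementation. The function names, the simplified postcode pattern and the URI template (modelled on the Ordnance Survey linked data postcode-unit pattern) are all assumptions made for this example.

```python
import re

# Simplified UK postcode pattern (an assumption: it does not cover every valid form).
POSTCODE_RE = re.compile(r"^([A-Z]{1,2}[0-9][0-9A-Z]?)\s*([0-9][A-Z]{2})$")

# Assumed URI template, modelled on Ordnance Survey linked data postcode units.
POSTCODE_URI = "http://data.ordnancesurvey.co.uk/id/postcodeunit/{code}"


def reconcile_postcode(cell):
    """Return a candidate URI for a postcode-like string, or None if unrecognised."""
    match = POSTCODE_RE.match(cell.strip().upper())
    if match is None:
        return None
    # Join outward and inward parts without the space to form the identifier.
    return POSTCODE_URI.format(code=match.group(1) + match.group(2))


def reconcile_column(cells):
    """Pair every cell with its candidate URI (None where nothing was recognised)."""
    return [(cell, reconcile_postcode(cell)) for cell in cells]
```

Calling `reconcile_column(["so16 0as", "Southampton"])` pairs the first cell with a postcode-unit URI and the second with `None`. A fuller service would hand unrecognised strings to other matchers (place names, ward codes and so on) rather than giving up.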
48M URIs, 17M distinct
The Web of Data has many equivalent URIs. This sameAs service helps you manage co-references between different data sets.
Relationship management services
• sameAs - Enter a known URI, get back a list of equivalent URIs
• differentFrom - A partner service to sameAs. When anything is retracted from the sameAs service, it should be asserted into the differentFrom service. Before anything is asserted into the sameAs service, the differentFrom service should be consulted, and a warning or rejection given.
• Co-reference identification services - Link-finding and co-reference, discovering relationships between datasets
• More generic relationships - Generalisation of the sameAs type of service to store other kinds of relationship. Examples may include Contains, Within, Touches, Part Of, Overlaps, Near
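The interplay between sameAs and differentFrom described above can be sketched as a small in-memory store. This is a simplified illustration under stated assumptions, not the sameAs service's actual design. The class and method names are hypothetical, and a retraction here records only the single pair that was retracted.

```python
class RelationshipStore:
    """Toy sameAs / differentFrom store (hypothetical, in-memory only)."""

    def __init__(self):
        self._bundle_of = {}     # URI -> set of URIs asserted equivalent to it
        self._different = set()  # frozenset({a, b}) pairs known to be different

    def bundle(self, uri):
        """Return a copy of the set of URIs equivalent to `uri` (including itself)."""
        return set(self._bundle_of.get(uri, {uri}))

    def assert_different(self, a, b):
        self._different.add(frozenset((a, b)))

    def assert_same(self, a, b):
        """Merge the bundles of a and b, consulting differentFrom first."""
        left, right = self.bundle(a), self.bundle(b)
        for x in left:
            for y in right:
                if frozenset((x, y)) in self._different:
                    raise ValueError(f"rejected: {x} differentFrom {y}")
        merged = left | right
        for uri in merged:
            self._bundle_of[uri] = merged

    def retract_same(self, a, b):
        """Remove b from a's bundle and assert the pair into differentFrom."""
        bundle = self._bundle_of.get(a)
        if bundle is None or b not in bundle or a == b:
            return
        remaining = bundle - {b}
        for uri in remaining:
            self._bundle_of[uri] = remaining
        self._bundle_of[b] = {b}
        self.assert_different(a, b)
```

After `retract_same(a, c)`, a later `assert_same(c, b)` is rejected if `b` is still in the same bundle as `a`, which is the "warning or rejection" behaviour the slide asks for. The more generic relationships (Contains, Within and so on) would need a typed store rather than equivalence bundles.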
Work Package 3: Data-Enabled Components and Services
Spatial Query Services
• Bounding box containment - An index service that allows efficient queries identifying the coordinate points that lie within a given bounding box. A more sophisticated version could understand types of entity, e.g. “Find me the postcodes in this area, or the wards in this area”.
• Geometric queries - E.g. coordinates to wards. Queries relating one type of geometry (e.g. point) to another (e.g. polygon): nearest-feature services, coordinate data points that lie within a given polygon, intersection of two polygons, transect lines.
  “Which areas are contained within / touch / overlap with another given area?”
  “Is this point within any other spatial area?”
  “What is within this area of interest that I have defined?”
  “Can I generalise a larger area from these smaller ones?”
• Free text search - An auxiliary service that indexes the store and then enables pure text searches to be run over the external service.
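The two simplest queries above, bounding box containment and point-in-polygon, can be sketched with plain coordinate tuples. This is a minimal illustration, not the RAGLD service. The function names are hypothetical. The polygon test uses the standard ray-casting technique, and the box query is a linear scan where a real index service would presumably use a spatial index.

```python
def in_bounding_box(point, box):
    """True if point (x, y) lies within box (min_x, min_y, max_x, max_y), edges included."""
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x <= x <= max_x and min_y <= y <= max_y


def points_in_box(points, box):
    """Return the names of all points inside the box (linear scan, no index)."""
    return [name for name, point in points.items() if in_bounding_box(point, box)]


def in_polygon(point, polygon):
    """Ray casting: cast a ray towards +x and count how many polygon edges it crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside
```

A "coordinates to wards" query could then test a point against each ward polygon, using the bounding box test as a cheap first filter. Points exactly on a polygon boundary are not handled consistently by this simple version.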
Dataset transformation services
• Co-ordinate transformation - E.g. convert latitude/longitude to National Grid
• Statistical transformations - Transformations such as changing units, and normalisations, e.g. by population or area for regions. Aggregation and interpolation of region-based statistics.
  “I have a dataset expressed as X per Y but want it as X per Z”
  “I have population for wards but want to find population by school catchment area”
• Geography to geography (one set of abstract areas into another) - Convert between different levels of geography, e.g. “Give me all the deaths in Hampshire if I know the deaths in all of the settlement regions”. This is an enactment of statistical transformation operations.
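The statistical and geography-to-geography transformations above can be illustrated with three small functions. This is a sketch under stated assumptions, not the RAGLD design. All names are hypothetical, and the reapportioning step assumes values are spread uniformly within each source area, which real data rarely satisfies.

```python
from collections import defaultdict


def per_capita(counts, population, per=1000):
    """Normalise raw counts to a rate per `per` people for each area."""
    return {area: per * counts[area] / population[area] for area in counts}


def aggregate(counts, parent_of):
    """Sum counts from small areas into the larger area that contains each one."""
    totals = defaultdict(float)
    for area, value in counts.items():
        totals[parent_of[area]] += value
    return dict(totals)


def reapportion(counts, overlap_share):
    """Move counts between geographies that do not nest.

    overlap_share[(source, target)] is the fraction of `source` lying in `target`.
    Assumes a uniform spread within each source area.
    """
    totals = defaultdict(float)
    for (source, target), share in overlap_share.items():
        totals[target] += counts[source] * share
    return dict(totals)
```

`aggregate` covers the nested case ("deaths in Hampshire from deaths in its settlement regions"), while `reapportion` covers the non-nested case ("population by school catchment area from population by ward"), given overlap fractions that a spatial query service would have to supply.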
Visualisation
• Map showing regions of a specific type (e.g. Ward, LSOA)
• Map showing coordinate points from a dataset as pins
• Map showing region-based statistics with appropriate colouring
• Area selection widget (in conjunction with geometric query)
• Graph-based visualisations of dataset features
• Pop-up “Info box” - gives generic information about a point/area
• Display custom regions / boundaries on map
Workflow management
• Enactment engine - Workflow-style activity, using scripts that can be managed, edited and run at will. Provides a means to extract data for the RAGLD services from other services.
• Dashboard (displays the components and services available, invokes sequences of services) - Gives the user a web view from which to observe and manage the RAGLD installation: configuring components, where the data comes from and goes to, which services should be used, trouble spots...
• File/resource management (interacts with the Dashboard) - RAGLD will move data between services, stores and files, transforming it as it goes. The user of a RAGLD installation therefore needs the system to keep track of where the data is, data versions, and data relationships.
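One way to picture the enactment engine is as a runner that invokes named services in sequence, passing each one's output to the next and keeping a record of every intermediate result. The sketch below is hypothetical throughout. The `Workflow` class, the registry of callables and the log format are assumptions for illustration, not the RAGLD interface specification.

```python
class Workflow:
    """Toy enactment engine: runs registered services in a given order."""

    def __init__(self, registry):
        # registry maps a service name to a callable taking and returning data.
        self.registry = registry

    def run(self, step_names, data):
        """Apply each named step in turn; return the final data and a step log."""
        log = []
        for name in step_names:
            data = self.registry[name](data)
            # Recording each intermediate result gives a crude form of data versioning.
            log.append((name, data))
        return data, log


# Hypothetical services sharing one interface: a list of strings in, a list out.
registry = {
    "strip": lambda rows: [row.strip() for row in rows],
    "upper": lambda rows: [row.upper() for row in rows],
}

result, log = Workflow(registry).run(["strip", "upper"], [" so16 0as "])
```

Because every step shares the same call shape, a script is just an editable list of step names, and the log is the kind of record a file/resource manager or dashboard could display.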
RAGLD contact for further information
Mark Pendlington, Project Leader, RAGLD
mark.email@example.com
Research
Ordnance Survey
Adanac Drive
SOUTHAMPTON
United Kingdom
SO16 0AS
Phone: +44 (0)2380055771