Geospatial Big Data
Business Cases from proDataMarket
Dumitru Roman
dumitru.roman@sintef.no
Geospatial Big Data
Property Data and the proDataMarket project
Example Business Cases
Data Marketplace
http://www.millennium-project.org/
Geospatial Big Data are societal opportunities
Geospatial Big Data
Raster Vector Sensors Mobile
It’s easier than ever to collect geospatial data,
but how can we exploit these geospatial big data?
Example: Property data
One of the most valuable datasets managed by
governments worldwide
Extensively used in various domains by private and
public organizations
Challenges in working with property data
• Difficult to access
• Cross-sectors
• Data is highly heterogeneous and possibly large
• Data quality
• Time-consuming integration
• Lack of innovation
• …
How can we innovate (and make money)
with property-related (Open) Data?
proDataMarket project goals
• To make property data more accessible, more usable and easier to understand
• To make it easier for:
  • Property data providers to publish and distribute their data
  • Data consumers to find and access the property data needed for their businesses
2.5 years (2015-2017)
€4.5M
20+ datasets
proDataMarket deliveries
7 data-driven business products and services
1 data marketplace
Example business case #1
Objective evaluation of the real estate properties
• Business Intelligence companies (e.g. Cerved): automation and cost reduction in property valuations, new services
• Public administration: fact-driven social policy
• Real estate agencies: faster and more objective evaluation of properties
• Property buyers/sellers: eliminate intermediaries
Example business case #1 (cont’)
Objective evaluation of the real estate properties
in Italy, by
Istat Census: snapshot of Italy, with socio-demographic data about the house (its characteristics), the members of the family (personal data, education, profession, work/study place) and other people living in the house (guests)
+ OpenStreetMap: points of interest in the city (transport, downtown, environment)
+ Cadastral report: property details (surface, cadastral category, quality status, age, ownership details)
~10M buildings
= Market price (€)
The evaluation of real estate property: an up-to-date, objective evaluation of real estate properties across territories in Italy
SYNTHETIC INDEX ISTAT = -0.23 - (0.12 * UNEMPLOYED) + (0.2 * HISTORIC_BUILDINGS) + (0.58 * GRADUATES_ON_RESIDENTS) + (0.6 * STUDENTS_ON_RESIDENTS)
SYNTHETIC INDEX POI = -0.5 + (0.15 * closest_metro_station) + (0.14 * closest_railway_station) + (0.24 * n_bus_stops_within_800m) + (0.6 * n_small_green_areas_pois_within_800m) + (0.02 * n_pedestrian_paths_within_1000m) - (0.05 * closest_airport)
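Both indexes are plain linear combinations, so they can be evaluated in a few lines. The following is a minimal Python sketch, not the project's implementation: the coefficients are copied from the slide, while the function names, the dictionary layout and the assumption that every indicator arrives as an already-scaled number are illustrative only (the slide does not state units or normalisation).

```python
# Coefficients copied from the slide. Units and scaling of the inputs are
# NOT given in the source, so callers must supply suitably prepared numbers.
ISTAT_INTERCEPT = -0.23
ISTAT_WEIGHTS = {
    "UNEMPLOYED": -0.12,
    "HISTORIC_BUILDINGS": 0.2,
    "GRADUATES_ON_RESIDENTS": 0.58,
    "STUDENTS_ON_RESIDENTS": 0.6,
}

POI_INTERCEPT = -0.5
POI_WEIGHTS = {
    "closest_metro_station": 0.15,
    "closest_railway_station": 0.14,
    "n_bus_stops_within_800m": 0.24,
    "n_small_green_areas_pois_within_800m": 0.6,
    "n_pedestrian_paths_within_1000m": 0.02,
    "closest_airport": -0.05,
}


def linear_index(intercept, weights, indicators):
    """Return intercept + sum(weight * indicator) over all weighted names."""
    missing = sorted(set(weights) - set(indicators))
    if missing:
        raise KeyError(f"missing indicators: {missing}")
    return intercept + sum(w * indicators[name] for name, w in weights.items())


def synthetic_index_istat(indicators):
    """Hypothetical helper: census-based index for one territory."""
    return linear_index(ISTAT_INTERCEPT, ISTAT_WEIGHTS, indicators)


def synthetic_index_poi(indicators):
    """Hypothetical helper: points-of-interest index for one property."""
    return linear_index(POI_INTERCEPT, POI_WEIGHTS, indicators)
```

Keeping the weights in dictionaries keyed by the slide's variable names makes it easy to check the code against the formulas term by term.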
Example business case #1 (cont’)
Objective evaluation of the real estate properties
in Italy, by
Sample technical challenges
• Semantic data heterogeneity: How to translate a point of interest into an OSM query? How to retrieve data for the whole of Italy?
• Structural data heterogeneity: How to compute indicators on different data structures?
• Messy data: How to exclude duplicated annotations of the same real-world entities from the computation?
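One way to picture the first challenge is a lookup table from a point-of-interest concept to OpenStreetMap tags, which is then rendered as an Overpass QL query. The sketch below is an assumption-laden illustration, not the approach used in the project: the tag choices in `POI_TAGS`, the function name and the sample coordinates are all hypothetical, and the code only builds the query string without sending it anywhere.

```python
# Hypothetical mapping from POI concepts to OpenStreetMap key/value tags.
# Real-world tagging varies, so such a table needs review per concept.
POI_TAGS = {
    "bus_stop": [("highway", "bus_stop")],
    "railway_station": [("railway", "station")],
    "metro_station": [("railway", "station"), ("station", "subway")],
    "airport": [("aeroway", "aerodrome")],
}


def overpass_around_query(poi, lat, lon, radius_m):
    """Build an Overpass QL query for `poi` objects within radius_m of a point."""
    filters = "".join(f'["{key}"="{value}"]' for key, value in POI_TAGS[poi])
    return (
        "[out:json][timeout:60];"
        f"nwr{filters}(around:{radius_m},{lat},{lon});"
        "out center;"
    )


# Example: bus stops within 800 m of an arbitrary point in Milan.
query = overpass_around_query("bus_stop", 45.4642, 9.19, 800)
```

For country-wide coverage, issuing one such query per building may be impractical; working from a bulk OSM extract is a commonly used alternative.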
Example business case #2
Common Agricultural Policy (CAP) fund assignment
in Spain, by
• Stakeholders:
  • Public administration (e.g. FEGA in Spain)
  • Farmers and land owners
  • Intermediaries (e.g. service providers)
• Problems:
  • Unfair grant assignment and expenditure on audits
  • Incorrect grant assignments
  • Features defined subjectively
Cadaster information: parcels and their features (surfaces, limits, slope, …)
+ EFAs & LEs: Ecological Focus Areas and Landscape Elements accurately defined using LIDAR
+ Satellite: kind of crops, health status, set-aside zones, nitrogen-fixing crops, CO2-fixing crops, …
= Accurately defined CAP funds
CAP parameters objectively defined, automated process to create new datasets related to CAP funds, fewer errors, fewer audits and field visits, …
Fund assignment rules examples
• Crop Diversification
• Kind, density and surface of Ecological Focus Areas
• Conditionality
Example business case #2 (cont’)
Common Agricultural Policy (CAP) fund assignment
in Spain, by
1) Raw datasets: just points
2) Points classified by their height
3) Points are grouped: yellow (soil), green (trees), orange (bushes)
4) There are patterns: groups, lines, isolated trees, etc.
5) Trees in line, hedges; non-aligned groups, copses
6) A viewer
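Steps 2 and 3 (classifying points by height and grouping them into soil, bushes and trees) can be sketched as a simple thresholding pass. This is a minimal illustration under stated assumptions: the two thresholds, the function names and the premise that each point already carries a height above ground are hypothetical, since the slides do not give the actual rules.

```python
from collections import Counter

# Hypothetical thresholds in metres above ground; the slides give no values.
BUSH_MIN_M = 0.3
TREE_MIN_M = 2.0


def classify_point(height_above_ground_m):
    """Map one height to the slide's three groups: soil, bushes, trees."""
    if height_above_ground_m < BUSH_MIN_M:
        return "soil"
    if height_above_ground_m < TREE_MIN_M:
        return "bushes"
    return "trees"


def classify_points(points):
    """points: iterable of (x, y, height_above_ground_m) tuples."""
    return [(x, y, classify_point(h)) for x, y, h in points]


def count_by_class(points):
    """Count how many points fall into each group."""
    return Counter(label for _, _, label in classify_points(points))


sample = [(0.0, 0.0, 0.1), (1.0, 0.0, 1.2), (2.0, 0.0, 7.5), (3.0, 0.0, 6.0)]
labelled = classify_points(sample)
```

Detecting the patterns of steps 4 and 5 (lines versus non-aligned groups) would need a further spatial clustering step that is not covered here.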
Example business case #3 (cont’)
Augmented Reality (AR) for Property-related Data
in Norway, by
• AR for buildings: What’s the impact of a new building on its surroundings?
• AR for underground infrastructure: Where are the underground pipes?
Example business case #4
Reporting state-owned real estate properties
in Norway, by
Report
• A hard copy of 314 pages, also available as a PDF file
• 6 person-months
• Data collection with spreadsheets
• Quality assurance through e-mails and phone correspondence
Pains: time-consuming, poor data quality, static report without live updating
Reporting Service
• Live service
• Efficient sharing of data
• Simplified integration with external datasets
• Live updating
• Reliable access
• …
3rd party services
• Risk and vulnerability analysis, e.g. buildings affected by flooding
• Analysis of leasing prices
Linked Data Approach: DataGraft
https://datagraft.net
DataGraft: Data Transformation and Knowledge Graph Publication Process
• Interactive design of transformations
• Repeatable transformations
• Reuse/share transformations (user-based access)
• Cloud-based deployment of transformations
• Self-service process
• Data and Transformation as-a-Service
[Figure: Raw Data → Transform → Prepared Data → Map (ontology mapping) → Generate RDF → RDF Graph → Semantic graph database]
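The transform, map and generate-RDF stages can be illustrated with a small standard-library sketch that cleans tabular rows and emits N-Triples lines. This is not DataGraft's API: the namespaces under `example.org`, the function names and the column-to-predicate mapping are hypothetical, and all values are emitted as plain string literals for simplicity.

```python
import csv
import io

# Hypothetical namespaces; a real mapping would point at an actual ontology.
BASE = "http://example.org/property/"
VOCAB = "http://example.org/vocab#"


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )


def rows_to_ntriples(csv_text, id_column, mapping):
    """Transform CSV rows (trim, drop empties) and map columns to predicates."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row[id_column].strip()}>"
        for column, predicate in mapping.items():
            value = row[column].strip()  # "Transform": basic cleaning
            if value:  # skip empty cells instead of emitting empty literals
                triples.append(
                    f'{subject} <{VOCAB}{predicate}> "{escape_literal(value)}" .'
                )
    return triples


raw = "id,municipality,area_m2\n1, Oslo ,120\n2,Bergen,\n"
triples = rows_to_ntriples(
    raw, "id", {"municipality": "municipality", "area_m2": "area"}
)
```

Because the mapping is plain data, the same transformation can be rerun on a new file, which mirrors the "repeatable transformations" point above.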
Geospatial Data is a BIG thing
Innovation with property-related
data in proDataMarket
Thank you!
Contact: dumitru.roman@sintef.no
http://prodatamarket.eu
https://datagraft.net
@prodatamarket
05.05.2016
