11:20 Beyond the Query: Transforming Air Quality Data Discovery with AI (D Topping)
1.
Beyond the Query: Transforming Air
Quality Data Discovery with AI
David Topping
david.topping@manchester.ac.uk
2.
• Build a set of user
stories describing the needs of
users throughout their journey of
using environmental data.
• Build a set of user archetypes.
These describe what different
users are setting out to achieve,
what steps they go through to
achieve these, and important
challenges faced by these groups.
User needs mapping approach
Air quality within wider environmental science
https://www.digital-solutions.uk/wp-content/uploads/2023/09/NERC-DSH-Report-ODM-FINAL.pdf
Stakeholders - Who are we talking to?
3.
Several common challenges emerged, including:
• a lack of suitable data to satisfy the task in hand, with
many datasets being inaccessible due to lack of
discoverability or paywalls.
• the proliferation and confusing nature of different
platforms that required bespoke access approaches,
inchoate data formats and methods of retrieval and
unclear provenance.
• having to work with locked-down systems due to security or
data protection requirements.
• a lack of a coherent and centralised data management
infrastructure, prevalence of legacy datasets and difficulty
in getting people to openly share data.
A Unique Property Reference Number (UPRN) exists
for every addressable location across the
UK. The location may be any kind of building, or
an object that might not have a 'normal'
address, such as a bus shelter.
UPRN tagging for geospatial data
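As a rough illustration of why a shared UPRN key matters for integration, the sketch below joins sensor readings to location records purely on the UPRN field. Every UPRN, description, field name, and concentration here is invented for illustration; it is a minimal sketch under those assumptions, not a description of any real dataset or service.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: all UPRNs, descriptions and values are invented.
sensor_readings = [
    {"uprn": "100000000001", "pollutant": "NO2", "ugm3": 41.0},
    {"uprn": "100000000001", "pollutant": "NO2", "ugm3": 39.0},
    {"uprn": "100000000002", "pollutant": "NO2", "ugm3": 18.5},
]
locations = [
    {"uprn": "100000000001", "description": "bus shelter"},
    {"uprn": "100000000002", "description": "residential building"},
    {"uprn": "100000000003", "description": "school"},
]


def mean_by_uprn(readings, pollutant):
    """Average the readings for one pollutant, grouped by UPRN."""
    grouped = defaultdict(list)
    for reading in readings:
        if reading["pollutant"] == pollutant:
            grouped[reading["uprn"]].append(reading["ugm3"])
    return {uprn: mean(values) for uprn, values in grouped.items()}


def tag_locations(location_records, exposure):
    """Attach the mean concentration to each location; None if no sensor data."""
    return [
        {**loc, "mean_no2_ugm3": exposure.get(loc["uprn"])}
        for loc in location_records
    ]


exposure = mean_by_uprn(sensor_readings, "NO2")
tagged = tag_locations(locations, exposure)
for row in tagged:
    print(row)
```

Locations with no matching reading keep a `None` value rather than being dropped, so gaps in coverage stay visible after the join.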
https://satre-specification.readthedocs.io/en/stable/
Data data everywhere - Technologies and standards to enable health and air-quality data integration
4.
• Multiple, uncoordinated sources: AURN, LA monitoring, research networks, private
sensors.
• No single point of discovery, no shared metadata standards.
• Responsibility for coordination is unclear — DEFRA, devolved administrations,
UKRI, LAs, research centres all play roles.
• Some datasets are now geotagged (e.g. in BioBank) and used 'as standard'.
The Current AQ data Landscape — “Rich Data, Poor Access”?
We’re between stages 2 and 3 — rich data, some shared systems, but not yet discoverable or interoperable.
1. Filing Cabinet Era 2. Shared Drive Era 3. Cloud Era 4. Federated Era
“Information is often held in people’s heads, not systems.” – NERC DSH user research
5.
The biggest friction
isn’t just technical
— it’s cultural and
organisational.
Key Challenges
Theme | Issue | Possible Solutions
6.
Data data everywhere - Technologies and standards to enable health and air-quality data integration
AI placing renewed emphasis on data access and potential solutions
7.
Taxonomy of AI regulatory approaches
(Spectrum: from flexibility to stricter controls)
• DSIT's “A pro-innovation approach to AI regulation”
• EU’s AI Act
• Principles based
• Standards based
• Agile and experimentalist
• Facilitating and enabling
• Adapting existing laws
• Access to information and
transparency mandates
• Risk based
• Rights based
• Liability
"....due to digital divides within countries,
the development and use of specific AI
systems may likely produce enormous
returns for a few powerful people and
simultaneously generate significant
adverse effects for the general population
and marginalized
populations....legislators should discuss
and explore regulatory
instruments..including:
....
i. Access to data.
....
"
Data data everywhere - focus is on ensuring access to data
8.
Data data everywhere - Leveraging AI to improve search and discovery
metadata
catalogue
Opportunities - Towards a Federated, Discoverable System
We don’t need a single owner — we need a consolidated way to find and use
air quality data
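One way to picture "a consolidated way to find" data without a single owner is a search that fans out over independently held metadata catalogues. The sketch below is only an illustration under stated assumptions: the catalogue names, the `DatasetRecord` fields, and the `federated_search` helper are all hypothetical, and the records are invented rather than taken from any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical minimal shared metadata record."""
    title: str
    provider: str
    pollutants: tuple
    access: str  # "open" or "restricted"


# Hypothetical catalogues, each maintained by a different owner.
CATALOGUES = {
    "national_network": [
        DatasetRecord("Hourly roadside NO2", "national_network", ("NO2",), "open"),
    ],
    "local_authority": [
        DatasetRecord("Diffusion tube survey", "local_authority", ("NO2",), "restricted"),
        DatasetRecord("Construction dust monitoring", "local_authority", ("PM10",), "open"),
    ],
    "research_network": [
        DatasetRecord("Urban supersite", "research_network", ("PM2.5", "NO2"), "open"),
    ],
}


def federated_search(catalogues, pollutant, open_only=False):
    """Query every catalogue in turn and merge the matching records."""
    hits = []
    for records in catalogues.values():
        for record in records:
            if pollutant not in record.pollutants:
                continue
            if open_only and record.access != "open":
                continue
            hits.append(record)
    # Sort so results are stable regardless of catalogue order.
    return sorted(hits, key=lambda r: (r.provider, r.title))


results = federated_search(CATALOGUES, "NO2")
open_results = federated_search(CATALOGUES, "NO2", open_only=True)
for record in results:
    print(record.provider, "|", record.title, "|", record.access)
```

The data stays with each provider; only the metadata fields need to agree, which is why shared metadata standards matter more here than shared ownership.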
11.
Summary
Summary - lots of positive work, but do we need to do better in a number of areas?
Continuing demonstration of data science technologies in enabling data discovery.
Likely to continue and move to automated workflows.
There are existing barriers to data access - a fundamental problem
• Cultural change needed
• AI will help with search and discovery, but it does not negate the need for data provenance
with respect to regulations and standards.
• Are roles and responsibilities clear? No
No longer an isolated academic area of work. Service provision evolving
• Partnerships with technology providers will be essential