3. The problem
• Researchers focus on a particular time frame and scope for testing their hypotheses.
• But the conclusions of the research are projected into the future.
• Paradox: the work that predicts tomorrow becomes a snapshot of what happened up to today.
4. Proposed approach
• New data relevant to some hypotheses is continuously aggregated as time passes.
• With common semantics, it can be combined with or related to other datasets.
• Represent hypotheses as programs that are executed repeatedly (a minimal sketch in R follows).
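Since the deck later mentions an R script, a minimal R sketch of the idea: the hypothesis is an ordinary function that can be re-run as the data grows. `fetch_data` and `test_hypothesis` are hypothetical placeholders, not the authors' actual code.

    # A hypothesis as a re-executable program. Both arguments are
    # hypothetical plug-ins: one fetches everything aggregated so far,
    # the other re-evaluates the claim on it.
    run_hypothesis <- function(fetch_data, test_hypothesis) {
      data   <- fetch_data()
      result <- test_hypothesis(data)
      list(checked_at = Sys.time(), result = result)
    }

    # Re-running run_hypothesis() later, after more data has been
    # ingested, yields a new verdict without touching the hypothesis code.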
5. The method
• The case study
– Lenten, L. J., & Moosa, I. A. (2003). An empirical investigation into long-term climate change in Australia. Environmental Modelling & Software, 18(1), 59-70.
• The authors claim that the temperature series exhibits a trend (see the sketch below).
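The claim can be phrased as a testable statement about the slope of the series. A minimal R illustration on synthetic data; the original paper's time-series modelling is more careful than this plain regression:

    # Illustrative only: test for a deterministic linear trend.
    # Synthetic data stands in for the real temperature observations.
    set.seed(42)
    years <- 1910:2000
    temps <- 21 + 0.008 * (years - 1910) + rnorm(length(years), sd = 0.3)

    fit <- lm(temps ~ years)
    summary(fit)$coefficients["years", ]  # slope estimate and its p-value

A small p-value for the slope is what "the series has a trend" cashes out to in this simplified setting.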
6. The method (II)
• Let’s find some data sources.
– ACORN-SAT, from the Australian Bureau of Meteorology. This is published as Linked Data (LD)!
– NOAA weather data, not in LD but easy to parse…
• Periodically ingest data (e.g., into a relational database).
• An R script checks whether the trend in the data has changed (see the sketch after this list)…
• Ingested data is semantically tagged…
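One way to realise the ingest-then-check loop, sketched in R with DBI/RSQLite. The file names, table schema, and station identifier are assumptions for illustration, not the project's actual setup:

    library(DBI)
    library(RSQLite)

    # Hypothetical schema: one row per (station, date) observation.
    con <- dbConnect(RSQLite::SQLite(), "observations.sqlite")
    dbExecute(con, "CREATE TABLE IF NOT EXISTS temperature (
                      station TEXT, obs_date TEXT, temp_c REAL)")

    # Ingest a newly downloaded batch; CSV parsing stands in for the
    # actual ACORN-SAT / NOAA fetching, whose formats differ.
    batch <- read.csv("latest_batch.csv")  # columns: station, obs_date, temp_c
    dbWriteTable(con, "temperature", batch, append = TRUE)

    # The periodic check: has the trend changed for one station?
    series <- dbGetQuery(con,
      "SELECT obs_date, temp_c FROM temperature
       WHERE station = 'SYDNEY' ORDER BY obs_date")
    fit <- lm(temp_c ~ seq_along(temp_c), data = series)
    slope_p <- summary(fit)$coefficients[2, "Pr(>|t|)"]

    dbDisconnect(con)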
7. Results
• We re-check Lenten & Moosa's hypothesis every week.
– A more extensive time scope.
– A wider geographical scope, covering all data available for Australia.
• The snapshot becomes a movie (sketched below).
• An executable paper.
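The "movie" effect comes from logging each weekly verdict instead of overwriting it. A hedged sketch, continuing the hypothetical SQLite schema above:

    library(DBI)
    library(RSQLite)

    con <- dbConnect(RSQLite::SQLite(), "observations.sqlite")
    dbExecute(con, "CREATE TABLE IF NOT EXISTS hypothesis_log (
                      checked_at TEXT, station TEXT,
                      slope REAL, p_value REAL)")

    # Run once a week (e.g., from a scheduler); appends, never
    # overwrites, so the hypothesis's status over time is itself data.
    log_check <- function(con, station, slope, p_value) {
      dbExecute(con,
        "INSERT INTO hypothesis_log VALUES (?, ?, ?, ?)",
        params = list(as.character(Sys.time()), station, slope, p_value))
    }

    # Replaying the log is the movie: one frame per weekly check.
    history <- dbGetQuery(con,
      "SELECT * FROM hypothesis_log ORDER BY checked_at")
    dbDisconnect(con)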
8. Conclusions
• The tools we already have allow us to easily use large-scale computing infrastructures to support science.
– The agINFRA project
• Massive data ingestion.
• Data integration and interlinking.
• User-tailored service execution.
9. Strengths
• Data availability
– The data is ingested (from LD sources, among others) and published.
• Data interoperability
– The data is not stored in isolation; common semantics link it to other datasets.
• Actionable data
– Ready to be addressed, used, and to generate new actionable data.
11. Future work
• Further dataset interlinking
– More plural values for physical parameters (the same parameter from several sources).
– Dataset value error detection.
• Advances in hypothesis representation
– Machine-readable research processes.