Science the hell out of it
Kostas Stergiou
Institute of Marine Biological Resources and Inland Waters, Hellenic Centre for Marine Research, Athens, Greece
School of Biology, Aristotle University of Thessaloniki, Thessaloniki, Greece
Shiny new insights
Big OLd (and new) Data
for what is not directly observed
Anchovy
Spawning grounds
Climate models
Niche models
IBM
GLMs, GAMs
Classification Tree Analysis
Random Forests
Boosted Regression Trees
Multivariate Adaptive
Regression Splines
Surface Range Envelope
to information
and eventually to knowledge
One important issue is that Big OLd (and new)
Data (BOLD) allow us to construct cheap time
series…
A time series is a series of measurements of a variable at equal time intervals.
The usual notation is Y_t, for t = 1, 2, 3, …, K
[Figure: zooplankton biomass by year, 1900–2000]
Why time series?
Because one measurement in time is not enough …
More points … better …
… but still not enough …
[Figure: zooplankton biomass by year, 1900–2000; annotation: estimate of mean and variance]
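A minimal sketch of why one point is not enough: with a single measurement there is a mean but no variance estimate, and both estimates only settle as points are added. The biomass values below are made up for illustration, and `summarize` is a hypothetical helper, not something from the talk.

```python
from statistics import mean, variance

# Hypothetical zooplankton biomass values Y_t at equal (yearly) intervals.
# These numbers are invented for illustration only.
series = {1990: 410.0, 1991: 520.0, 1992: 380.0, 1993: 640.0, 1994: 450.0}


def summarize(y):
    """Return (n, sample mean, sample variance) for a sequence of measurements."""
    values = list(y)
    n = len(values)
    if n == 0:
        raise ValueError("no measurements")
    # The sample variance needs at least two points.
    var = variance(values) if n > 1 else None
    return n, mean(values), var


# Estimates as the series grows from 1 point to all K points.
ordered = [series[t] for t in sorted(series)]
for k in range(1, len(ordered) + 1):
    print(summarize(ordered[:k]))
```

With one point the variance comes back as `None`; each extra point changes both estimates, which is the "more points, better, but still not enough" idea in numbers.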
Increasing the time horizon reveals
many interesting features …
[Figure: zooplankton biomass by year, 1900–2000; annotations: missing point, gap, unusual event (or something went wrong), cycles]
… and reveal the invisible present
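The features annotated on the plot (missing points, gaps, unusual events) can be screened for automatically. The sketch below is one simple way to do it, assuming yearly data; the function names, the example values and the threshold `k` are all hypothetical choices, and the median-based rule is just one of many possible outlier screens.

```python
from statistics import median


def find_gaps(years):
    """Return (first_missing, last_missing) pairs between consecutive observed years."""
    ys = sorted(years)
    return [(a + 1, b - 1) for a, b in zip(ys, ys[1:]) if b - a > 1]


def flag_unusual(series, k=5.0):
    """Flag years whose value is more than k median absolute deviations from the median."""
    values = list(series.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread, nothing can be flagged with this rule
    return sorted(t for t, v in series.items() if abs(v - med) > k * mad)


# Invented yearly series with one missing point, one longer gap and one spike.
observed = {1950: 400, 1951: 420, 1952: 390, 1954: 410, 1955: 1150, 1959: 405, 1960: 415}

print(find_gaps(observed))     # missing point and gap
print(flag_unusual(observed))  # unusual event, or something went wrong
```

A flagged year is only a candidate: whether it is an unusual event or a data error still needs a look at the original records.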
Ecology should
predict
Simple models
Explanations at two levels:
Proximate level (what factors
are responsible …)
Ultimate level (evolutionary
advantage)
Scientists should:
• end up with simple, predictive models
• that have explanations at both levels
(proximate, ultimate)
Thus, BOLD lead to shiny new insights that hopefully…
Scientists will be able to, e.g., prove that what people think is wrong (see Davis 1971)
… and allow us to handle the ‘so what’ question…
Let's see several examples from various disciplines, from fish, fisheries and marine ecosystems to academia
1
FishBase
Research/Information tool
www.fishbase.org
1989
1995
1998
First publication citing
FishBase
Froese (1990) Fishbyte
FishBase is the modern tool …
Widened the scope of fisheries science
Global studies
Knowledge, information
Framework for answering high-order questions
[Figure: frequency distribution of length (log; cm)]
Upscaled fisheries science
Stergiou (2003)
Parameters related to body length (both axes log or not):
• Biological: oxygen consumption, gill surface, brain size, body girth, tail surface, mouth area, swimming speed, acceleration, manoeuvrability (i.e. turning rate, radius, angle)
• Demographic: age, weight, length at maturity, age at maturity, fecundity, von Bertalanffy K
• Ecological: mortality, trophic level, prey length
• Fisheries: length at capture, length at optimum exploitation
• Management: economic value, sensitization of public
[Figure: parameter (log or not) versus body length (log or not)]
FishBase - Quantification of the
biology and ecology of fishes
This is true
Length is the most important parameter, related to a variety of other parameters
Predict the value of other parameters for rare or unstudied species from existing relationships with length
[Figure: other parameter versus length]
FishBase
Patterns and
propensities in fish
biology and ecology
Quantification
Results of BSc, MSc or PhD theses
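Predicting another parameter from length can be sketched as a straight-line fit on log-log axes, then reading off the value for a species known only by its length. This is an illustration under stated assumptions: the data are synthetic (generated from an exact power law so the fit is easy to check), and `fit_loglog` and `predict` are hypothetical helpers, not FishBase functions.

```python
import math


def fit_loglog(lengths, values):
    """Least-squares fit of log10(value) = intercept + slope * log10(length)."""
    xs = [math.log10(length) for length in lengths]
    ys = [math.log10(value) for value in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(length, intercept, slope):
    """Back-transform the fitted line to predict the parameter at a given length."""
    return 10 ** (intercept + slope * math.log10(length))


# Synthetic example: parameter = 0.01 * length**3 (made-up coefficients).
lengths = [10.0, 20.0, 40.0, 80.0]
values = [0.01 * length ** 3 for length in lengths]

intercept, slope = fit_loglog(lengths, values)
print(slope, intercept)
print(predict(30.0, intercept, slope))  # value for a species known only by its length
```

With real data the points scatter around the line, so a prediction for a rare species should carry the uncertainty of the fitted relationship.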
2
Climate change impacts on
Mediterranean
food-web structure
Use of the coupling of species distribution and trophic models for predicting
climate change impacts on food-web structure across the Mediterranean.
The authors
propose a new
framework
They used data from the Mediterranean continental shelf:
• published food webs
• stomach contents from the literature and FishBase
• actual geographical distributions (occurrence maps) for 256 endemic and native coastal fish species
• sea surface temperature (SST) as the main forcing variable
• the Mediterranean regional marine model (NEMOMED8), which predicts observed and future SST based on a variety of drivers
• projected SST values for 2080–2099, extracted from NEMOMED8 outputs and based on the IPCC SRES A2 scenario
• this scenario is conservative but not the most pessimistic.
Differences in various attributes between 1961–1980 (baseline scenario) and 2080–2099:
• Number of feeding links between fish species would decrease on 73.4%
• Trophic level
• Vulnerability, the mean number of consumer species per prey species, would decrease
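The two food-web attributes named above can be computed from a list of feeding links. The sketch below assumes a food web stored as a set of (consumer, prey) pairs and takes the mean only over species that are eaten by at least one consumer; the species labels and both webs are invented, and `food_web_metrics` is a hypothetical helper rather than the authors' method.

```python
def food_web_metrics(links):
    """Return (number of feeding links, vulnerability) for a set of (consumer, prey) pairs.

    Vulnerability is taken here as the mean number of consumer species per prey
    species, averaged over species that have at least one consumer.
    """
    unique_links = set(links)
    prey_species = {prey for _, prey in unique_links}
    n_links = len(unique_links)
    vulnerability = n_links / len(prey_species) if prey_species else 0.0
    return n_links, vulnerability


# Invented baseline web: two predators share two prey, which share one resource.
baseline = {
    ("predator_a", "prey_x"), ("predator_a", "prey_y"),
    ("predator_b", "prey_x"), ("predator_b", "prey_y"),
    ("prey_x", "resource"), ("prey_y", "resource"),
}
# Invented future web: predator_b no longer overlaps with its prey.
future = {link for link in baseline if link[0] != "predator_b"}

print(food_web_metrics(baseline))
print(food_web_metrics(future))
```

In this toy case losing one predator's overlap removes two links and lowers vulnerability, the same direction of change as reported on the slide.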
3
Annual probability maps and
habitat allocation of species
A series of solid studies carried out within the framework of several IMBRIW projects, with the participation of the different “small pelagic” research groups in the Mediterranean
EU-funded project, FP6 Framework
MARIFISH WP7: Collaborative Research Programmes
Regional Scale Study-The Mediterranean
MEDISEH
DG MARE Contract Service
REPROdUCE: MARIFISH Framework
IEO
IOF
• Mediterranean Acoustic Survey since 2008
• Internationally coordinated
• Simultaneous hydrographic sampling
Annual probability maps and habitat
allocation of the main small pelagic species
Data used to develop habitat maps:
Biological data (at different seasons/years/areas):
• Acoustic surveys (MEDIAS)
• Ichthyoplankton surveys
• Pelagic trawl data
Environmental data (satellites):
• Sea level anomaly
• SST, depth
• Chl-a
• Salinity
• Photosynthetically active radiation
Biological & environmental data + statistical models → habitat maps at the Mediterranean scale
[Maps, 2004: SST in °C; probability classes 0.25 to 0.5, 0.51 to 0.75, 0.751 to 1]
This has been done (in different publications) for:
• Anchovy
• Sardine
• Mediterranean horse mackerel
• Mackerel
Practical uses of habitat suitability maps
• Evaluate existing FRAs/MPAs and define new FRAs/MPAs
• Provide input to IBMs and examine climate change scenarios
• Set a framework to minimize discards (H2020 MINOW)
• Provide input to ecosystem models with a spatial perspective
• Covariates in other habitat suitability models
4
Consumers and conservation
3858 recipes
Not vulnerable, low popularity in recipes
Vulnerable, popular in recipes
[Figure: number of recipes versus vulnerability]
5
Academic tenure
Tenure leads to decreased productivity…. True?
Number of publications is an index of productivity
Mean number of publications (N) of 2136 professors from 123 universities in 15 countries
[Figure: mean number of publications (N) by year, 1996–2014; fitted trends with r² = 0.95, 0.97, 0.92 and 0.81, each p < 0.0001]
Wrong!
6
Fame …
Fame is what is known about a name.
Fame can be estimated ...
How? From the frequency with which a name appears, e.g., in books…
What is the most famous fish?
http://books.google.com/ngrams
Culturomics
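The idea of estimating fame from how often a name appears in books can be sketched on a toy corpus. This does not query the Google Books Ngram Viewer: it assumes the texts are already available per year, the two example sentences are invented, and `name_frequency` is a hypothetical helper.

```python
import re


def name_frequency(texts_by_year, name):
    """Return {year: occurrences of `name` divided by the total number of words}."""
    pattern = re.compile(r"\b" + re.escape(name.lower()) + r"\b")
    result = {}
    for year, text in sorted(texts_by_year.items()):
        lowered = text.lower()
        total_words = len(re.findall(r"\w+", lowered))
        hits = len(pattern.findall(lowered))
        result[year] = hits / total_words if total_words else 0.0
    return result


# Invented two-"book" corpus, one text per year.
corpus = {
    1900: "The goldfish swam and the goldfish slept",
    1950: "A swordfish and a goldfish",
}

for fish in ("goldfish", "swordfish"):
    print(fish, name_frequency(corpus, fish))
```

Dividing by the total word count matters because the number of books printed grows over time, so raw counts alone would make almost every name look more famous in later years.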
A goldfish inside its bowl
and a kitten, a gorgeous little female,
green-eyed and red-haired,
began a little game of love ...
From the film
Το Κοροϊδάκι της δεσποινίδος (1960)
Frequency of the common names goldfish, rainbow trout, swordfish, Atlantic salmon and guppy in English books, 1800–2000.
[Figure: fame (in famons) by year, 1800–2000; series: swordfish, salmon, guppy, rainbow trout]
The goldfish has penetrated all aspects of cultural life, and Darwin has a chapter dedicated to it…
(Darwin 1868, p. 296-297)
7
Fisheries and the cheapest
information available …
[Figure: annual landings (in t) of cephalopods, 1928–1939 (data from NSSH)]
[Figure: number of trawlers by year, 1928–1939]
Although the NSSH officials told me that there were no data prior to 1964 …
… I soon found NSSH data for effort and landings during 1928–1939!
[Figure: cephalopod landings (in t) by year; two panels, 1964–2000 and 1928–1939]
Stergiou & Laskaratos 1993
And the list goes on …
Climate models
Niche models
IBM
GLMs, GAMs
Classification Tree
Analysis
Random Forests
Boosted Regression Trees
Multivariate Adaptive
Regression Splines
Surface Range Envelope
our ecoscope
Thank you for your time
