The document discusses the history and development of hypertext and modular document structures. It describes Douglas Engelbart's 1968 demonstration of hypertext, which allowed linking content in a non-linear way. It then outlines several efforts over the years to develop modular or structured approaches to representing content, breaking documents into independent cognitive units. These include XML-based formats as well as ontologies that represent meaningful relationships between content modules.
To date, most digitisation of taxonomic literature has led to a more or less simple digital copy of a paper original – the output has effectively been an electronic copy of a traditional library. While this has increased accessibility of publications through internet access, for many scientific papers the means of indexing and locating them is much the same as with traditional libraries. OCR and born-digital papers allow use of web search engines to locate instances of taxon names and other terms, but OCR efficiency in recognising names is still relatively poor, people's ability to use search engines effectively is mixed, and many papers cannot be directly searched. Instead of building digital analogues of traditional publications, we should consider what properties we require of future taxonomic information access. Ideally the content of each new digital publication should be accessible in the context of all previously published data, and the user able to retrieve nomenclatural, taxonomic and other data/information in the form required without having to scan all of the original paper and extract target content manually. This opens the door to dynamic linking of new content with extant systems – automatic population and updating of taxonomic catalogues, ZooBank and faunal lists, all descriptions of a taxon and its children instantly accessible with a single search, comparison of classifications used in different publications, and so on. The current means to do this is marking up content in XML; the more atomised the mark-up, the greater the possibilities for data retrieval and integration. Mark-up requires XML that accommodates the required content elements and is interoperable with other XML schemas, and several schemas have now been written to do this, particularly TaxPub, taxonX and taXMLit, the last of these being the most atomised.
Building on earlier systems for mark-up of legacy literature, ViBRANT is developing a new workflow and seeking to increase the automated component of the process. Manual and automatic data and information retrieval is demonstrated by projects such as INOTAXA and Plazi. As we move to creating and using taxonomic products through the power of the internet, we need to ensure the output, while satisfying the requirements of the Code, is fit for purpose in the future.
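A minimal sketch of what atomised mark-up buys: once content elements are tagged, target data (here a taxon name) can be retrieved without scanning the whole paper. The element names below are illustrative only, loosely in the spirit of TaxPub/taXMLit rather than copied from any real schema.

```python
# Parse an atomised taxonomic treatment and pull out the binomial name
# directly from the tagged elements, no free-text scanning required.
import xml.etree.ElementTree as ET

treatment = """
<treatment>
  <nomenclature>
    <taxon-name>
      <genus>Agathidium</genus>
      <species>vaderi</species>
      <authority>Miller &amp; Wheeler, 2005</authority>
    </taxon-name>
  </nomenclature>
  <description>Head with a broad, shiny clypeus ...</description>
</treatment>
"""

root = ET.fromstring(treatment)
name = root.find("./nomenclature/taxon-name")
binomial = f"{name.findtext('genus')} {name.findtext('species')}"
print(binomial)  # Agathidium vaderi
```

The more finely the schema atomises content (genus, species, authority as separate elements), the more such queries become possible without manual extraction.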
Introduction: The Past - Future of Research Communications
1. The Future of Research Communications: The Past
Anita de Waard
Elsevier Labs/UUtrecht
http://elsatglabs.com/labs/anita
2. New Formats: Hypertext
Engelbart, 1968, first demo:
http://sloan.stanford.edu/MouseSite/1968Demo.html#player2
'If, in your office, you, as an intellectual worker, were supplied with a computer display backed up with a computer that was alive for you all day, and was instantly responsible, - responsive, hehe - how much value would you derive from that?'
...and first demonstration of hypertext:
http://sloan.stanford.edu/MouseSite/1968Demo.html#player11
'Content represents concepts, but there is also a relation between the content of concepts, their structure, and the structure of other domains of human thought, that is too complex to investigate in linear text'
3. New Formats: Hypertext
Three parts:
1. Modular content components
2. Meaningful links
3. Claim -> evidence networks
4. Hypertext, 1: Modular Content Components
• Kircz, '98: "a much more radical approach would be to [break] apart the linear text into independent modules, each with its own unique cognitive character."
• Harmsze, '00: modular model for physics papers
• XPharm, 2001: modular textbook in pharmacology
• ABCDE Format: modular computer science proceedings paper
• LiquidPub, 2010: Structured Knowledge Objects
• HCLS Rhet Doc: medium-grained structure: core narrative components
• DoCo: core Document Components
11. Hypertext, 2: Meaningful links
• Harmsze (1999): ontology of content relationships
• IBIS, ClaiMaker: linking argumentational components
• Diligent argumentation ontology
• RDF does allow for these functionalities, but most ontologies are still based on SKOS?!
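The "meaningful links" idea above can be sketched as typed relations between content modules, queryable by relation type. The relation names here are invented for illustration; they are not drawn from Harmsze's ontology or any other named vocabulary.

```python
# Content modules connected by typed (RDF-style) relations, stored as
# simple subject-relation-object triples.
triples = [
    ("results-sec",    "providesEvidenceFor", "claim-1"),
    ("methods-sec",    "elaborates",          "results-sec"),
    ("claim-1",        "contradicts",         "claim-2"),
    ("discussion-sec", "interprets",          "results-sec"),
]

def linked(subject, relation):
    """Follow links of one type outward from a content module."""
    return [o for s, r, o in triples if s == subject and r == relation]

print(linked("results-sec", "providesEvidenceFor"))  # ['claim-1']
```

The point of typing the links, rather than using bare hyperlinks, is that a reader (or machine) can ask "what does this section support?" instead of merely "what does it point to?".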
16. Hypertext, 3: Claim-Evidence Networks
• Special case of modules of content and meaningful relationships
• Buckingham Shum, 1999
• SWAN: Clark, Ciccarese et al., 2005
• HypER: 6 groups developing prototypes on this basis (Harvard, Oxford, DERI, KMI, Utrecht, SIOC)
• Nanopublications: research data + bit of knowledge (see also: the Present and the Future)
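A rough sketch of the nanopublication idea mentioned above (research data plus a small bit of knowledge): a single machine-readable assertion packaged with its provenance, from which claim-evidence networks can be assembled. The field names and the DOI below are illustrative, not the actual nanopublication model.

```python
# One assertion plus the provenance that backs it up - the minimal unit
# from which a claim-evidence network could be built.
nanopub = {
    "assertion": ("aspirin", "inhibits", "COX-2"),
    "provenance": {
        "supportedBy": ["doi:10.1000/example-trial"],  # hypothetical source
        "statedBy": "Example et al., 2005",            # illustrative citation
    },
}

def evidence_for(pub):
    """Return the sources cited as evidence for the assertion."""
    return pub["provenance"]["supportedBy"]

print(evidence_for(nanopub))  # ['doi:10.1000/example-trial']
```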
23. So...
• The basic idea has been around since the 1960s
• The standards, technologies and tools have been around since the nineties
• But (almost) no content has been created this way - why?
• Let's look at the history of the other breakout topics first:
– Tools and standards
– Business models
– Research data
– Attribution and credit
24. Four periods:
• 1960s - 1980s, Pre-Web: online databases, main concepts of hypertext
• 1990-2000, Web: preprint servers, web ubiquitous; 'era of standards'
• 2000 - 2005, Semantic Web: separate content from presentation; Open Access
• 2005 - 2011, Social Web: crowdsourcing, cloud computing, handhelds
1. What happened?
2. What stuck?
26. Tools and standards
• 1960s - 1980s: (La)TeX, SGML, Word, WP
• 1990 - 2000: XML, SMIL, XLink, SVG, CSS, PDF, MathML
• 2000 - 2005: RDF; Annotea, Haystack, Semantic Desktop
• 2005 - 2011: LOD, Provenance; Twitter, Skype, Google Docs, Github; Utopia...
What stuck, and why? Some thoughts:
• LaTeX, MathML: fierce community of adopters who like UI
• Word, PDF: commercial interest to maintain front end
• XML, html: shallower learning curve than SGML
• RDF over XLink: 'Semantic' message: world was ready?
• Social media: simple tools to express basic human urge?
28. Business models
• 1960s - 1980s: Publishing, including distribution, is in hands of publishers and societies, selling to libraries. DIALOG computers allow access to abstracts.
• 1990-2000: arXiv, preprint servers: content direct to end-users.
• 2000 - 2005: BioMed Central, Faculty 1000, PLoS, Creative Commons - development of 'author-pays', 'peer-review after'
• 2005 - 2011: Content sharing/creation is ubiquitous. Open Data movement.
What stuck, and why?
• Commercial business model engrained in budgeting etc.
• Societies and 'author-pays' models also become publishers
• Indignation drives Open Access - but also have a day job
30. Research Data
• 1960s - 1980s: locally stored, except for CERN/DARPA
• 1990-2000: collaboratories: CAST, UARC, Sloan DSS, DOE; digital repositories: ADS, DBLP, JSTOR, Citeseer
• 2000 - 2005: workflows & grids: Taverna, MyGrid, GriPhyn
• 2005 - 2011: MyExperiment, Vistrails, Dataverse, Datacite, 'The Data Journal'
What stuck, and why?
• Local data stores are centrally (and long-term) funded
• ADS/DBLP/JSTOR fulfill a need for domain-specific access, funded by 'invisible' sources
• Workflow tools not yet ubiquitous - need not great enough?
32. Attribution and credit
• 1960s - 1980s: Impact factor
• 1990-2000: Citeseer, DBLP
• 2000 - 2005: H-Index, Google Scholar
• 2005 - 2011: blogs, downloads, 'alt-metrics'
What stuck, and why?
• Impact factor: direct connection to author's fame
• Google Scholar: easy UI, 'Open' image
• All other metric measurements are not yet engrained in assessment tradition
38. Summary: some factors driving support
• Commercial support:
– Commercial publishing: great financial interest
– Word, PDF: investment to maintain format
• Community support:
– LaTeX: fierce community of adopters
– Open Access: social indignation
• Ease of use, domain relevance - user friendliness:
– Google Scholar: model known, perceived objectivity
– DBLP, ADS, JSTOR: 'invisible' funding, domain-specificity
• Academic credit depends on it:
– Impact factor
– Grant proposals - complex, not logical, but life depends on it...
Exercise: Which of these could apply to hypertext models?