Data linking with kblog

The document discusses using WordPress as a platform for data linking with knowledge blogs. It gives two example knowledge blogs, ontogenesis.knowledgeblog.org and taverna.knowledgeblog.org, which contain scientific articles. These blogs have received thousands of page views so far and have room for growth. The document outlines the features needed for bi-directional data linking and for integrating data into research articles. It argues that WordPress and similar technologies can form part of the scientific publishing landscape by allowing primary scientific content in different forms from multiple authors.

1. Data linking with kblog. Phillip Lord, Newcastle University

2. The Long Tail http://en.wikipedia.org/wiki/File:La_Palmyre_041-crop.jpg

3-6. Example Data

ID_REF       VALUE
1007_s_at    2.867330709
1053_at      10.50302152
117_at       2.702517066
121_at       3.052316166
1255_g_at    2.278998026
1294_at      5.360226024
1316_at      5.496447322
1320_at      4.475412175
1405_i_at    2.301359647

7. The paper

8. The problem? http://en.wikipedia.org/wiki/File:Clock_in_Kings_Cross.jpg

9-10. The problem? http://en.wikipedia.org/wiki/File:Clock_in_Kings_Cross.jpg http://en.wikipedia.org/wiki/File:New_British_Coinage_2008.jpg

11. Coach Building. Elsevier: 250,000 articles per year; 240 million downloads; cost: 1.5 billion euro. Wikipedia: 17 million articles; > 20 languages; 365 million readers; total cost: 10 million dollars. http://commons.wikimedia.org/wiki/File:Hackney-coach,_about_1680.png

12-18. The process

19. Our Solution

20. WordPress: Has one critical feature. It has an edit dialog. Word, LaTeX, Open Office, AsciiDoc, Textile, Markdown, by email.

21. Features: Reviewing; Metadata (COinS, metatags) *; Crawlability *; Multiple authors; Archiving (UKWA); Searchability.

22. Features: Bi-directional links; Permalinks (PURLs to follow); DOIs (DataCite!); Versioning; Extensibility; Nice maths * (and MathJax); Syntax highlighting; Bibliographic support (with DOIs, and incompletely CiTO) *; ePUB and PDF (!?) export.

23. Data Linking: Bi-directional links require support at both ends; Adding this generically; Adding this for specific data sets (microarray); Data linking into papers.

24. Old technology: Most of this technology pre-exists. So why don’t people use it? There is a good reason... TECHNOLOGY IS BORING

25. Content. http://ontogenesis.knowledgeblog.org: Now has 15k page views (not hits!); 25 articles, multiple authors; Seeking PubMed inclusion; Advertising: two blog articles about ontogenesis happened within 1 day of the first article. http://taverna.knowledgeblog.org: 10 articles; About scientific workflows; Supplement to myExperiment.

26. Well... These stats are not going to scare either Elsevier or Wikipedia. But they are not bad either. And it allows primary scientific content of many different forms. We believe it can form part of the scientific landscape.

27. Acknowledgements: Phillip Lord (me!), Dan Swan, Simon Cockell, Robert Stevens (Manchester), Georgina Moulton (Manchester). Thanks also to JISC, David Shotton, BL, DataCite, and WordPress.

Editor's Notes

So, today I am going to talk about data linking with knowledge blog. Normally, talks start at the beginning. I thought I would buck this trend and instead...
...start at the end. The long tail was mentioned yesterday. Much research data comes from individual research labs and individual researchers, each producing relatively small amounts of data but collectively producing a lot. So, long tail or big science? My field, bioinformatics, does both.
But the data from the long tail and the data from big science are different. Big science generally produces sequence data, which is generally all of the same type. The long tail doesn’t. For example, we start with microarray expression data. Then we have MIAME-compliant metadata, an RNA degradation plot and finally a paper, in this case a random one that I found on PLoS yesterday. We have data standards for many of these parts: the second part, often called “metadata” even though it isn’t, uses MIAME, which is one of the older information content standards in bioinformatics. To me, all of this is data. Without the latter three, the “raw data” is just junk.
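To make the point about "raw data" concrete, here is a minimal Python sketch that reads a few rows of the two-column table from the Example Data slide into a lookup from probe ID to value. The function name `parse_expression_table` and the whitespace-separated layout are assumptions made for illustration; they are not part of kblog or of any microarray standard.

```python
from typing import Dict

# A few rows copied from the "Example Data" slide. The whitespace-separated
# two-column layout is an assumption about how such a file might be stored.
EXAMPLE_DATA = """\
ID_REF VALUE
1007_s_at 2.867330709
1053_at 10.50302152
117_at 2.702517066
121_at 3.052316166
"""


def parse_expression_table(text: str) -> Dict[str, float]:
    """Map each probe ID (ID_REF) to its expression value (VALUE)."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    if header != ["ID_REF", "VALUE"]:
        raise ValueError(f"unexpected header: {header}")
    return {probe_id: float(value) for probe_id, value in rows}


values = parse_expression_table(EXAMPLE_DATA)
# Probe with the highest expression value among the rows above.
highest = max(values, key=values.get)
print(highest, values[highest])
```

Even once parsed, the numbers say nothing about the experiment that produced them, which is why the metadata, the plot and the paper matter.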
The paper is the richest form in terms of expressivity: it carries the most complex ideas and uses the largest vocabulary. It is also the least open to reuse, although in general it gives meaning to all the rest. And it is the form of scientific data storage which has changed the least.
So, what is the problem? Well, first, the process of publishing is very time-consuming. Secondly, it’s very expensive. And finally, it’s a process which, to misquote Douglas Adams, is so amazingly primitive that we still think PDFs are a pretty neat idea. But in general, this form of data capture only happens for the most cherry-picked data: the positive data, the significant data, the data where the experiment worked. What about the negative data, the insignificant data? What about the standard operating procedure, the tutorial information and so on? This is not a small issue: the massive publication bias in biology hampers our understanding of the way that organisms function. In medicine, people die not through lack of knowledge, but because we cannot collate information that exists.
So, why is this the case? Well, scientific publishing is basically still at the stage of coach building. Consider these stats: the second biggest STM publisher in the world looks like this, and costs 1.5 billion euros per annum. This is Elsevier. The biggest looks like this. It only costs 10 million dollars per annum. This is Wikipedia. Is this comparison fair? Are the two equivalent? No, probably not, but they are not two orders of magnitude different either.
Consider, for example, this process from one of the major publishers that I have published with. I wrote my article in LaTeX. I converted it to PDF. The website converted it to another PDF (which I had to check). The publishers then (and this is true) converted it to a Word doc. From there, they turned it into XML, which was finally converted to HTML and, yes, you guessed it, another PDF. Now, not only is this a waste of time, but it’s inaccurate. Errors happen. And as for trying to get structured or data-linked publications through this process, you might as well give up.
My solution: WordPress. Actually, more importantly, commodity software. And by commodity, I mean commodity, and not research. There are some excellent, widely used tools from academia. Open Journal Systems, for example, powers 6000 journals. WordPress is behind 10% of ALL websites.
Why WordPress? Well, it has an edit dialog. But it’s not very good. But you can blog from Word; I don’t think that is very good either. But it is the way that it is, and it’s what people use. So WordPress fits in with people’s workflows. It supports everything. Nothing would ever convince me to add this level of support to a tool.
What other features are suitable for academic publishing? Well, here, we borrowed, stole and occasionally wrote our own. Reviewing, courtesy of EditFlow. Metadata and crawlability features we added. Multiple authors we borrowed. These allow archiving; this comes from the UK Web Archive. Also searchability (Google Scholar).
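As an illustration of the COinS metadata mentioned on the Features slide, the following Python sketch builds a COinS `<span>`, which carries a URL-encoded OpenURL ContextObject in its `title` attribute. The function name `coins_span`, the choice of fields and the example values are assumptions for illustration; this is not the output of the kblog plugins.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title: str, author: str, date: str, url: str) -> str:
    """Build a COinS <span> describing one article (illustrative field set)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", title),
        ("rft.au", author),
        ("rft.date", date),
        ("rft_id", url),
    ]
    # The key/value pairs are URL-encoded, then HTML-escaped so that the
    # "&" separators are valid inside the title attribute.
    context_object = urlencode(fields)
    return f'<span class="Z3988" title="{escape(context_object)}"></span>'


# Hypothetical article details, used only to show the shape of the output.
span = coins_span(
    title="An example article",
    author="Lord, Phillip",
    date="2010",
    url="http://example.org/article",
)
print(span)
```

Reference managers that understand COinS can pick up such a span from the page without any publisher-specific export step.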
Bi-directional links. As well as permalinks, it also supports legacy identifiers in the shape of DOIs, thanks to DataCite. And it’s extensible. So I added nice-looking maths (scalable, thanks to MathJax) and syntax highlighting. Bibliographic support exists. We can do typed linking with CiTO (thanks to David Shotton), although clunkily at the moment. This will be improved; we also want to make citations client-renderable, so that the user chooses the citation format. And finally, ePUB and even PDF export.
We also want to extend bi-directional linking; blogs do this out of the box, but support is required at both ends. And finally we want to be able to embed the data directly into the paper.
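One mechanism that blogs offer out of the box for this is the Pingback protocol, where the linking site notifies the linked site through an XML-RPC call named `pingback.ping`. The talk does not say which mechanism kblog relies on, so treating it as pingback is an assumption, and the URLs below are placeholders. The sketch only builds the request body; it does not send anything.

```python
import xmlrpc.client


def pingback_request(source_url: str, target_url: str) -> str:
    """Return the XML-RPC body announcing that source_url links to target_url."""
    return xmlrpc.client.dumps(
        (source_url, target_url), methodname="pingback.ping"
    )


# Placeholder URLs: an article that cites a data set, and the data set itself.
body = pingback_request(
    "http://example.org/article-citing-data",
    "http://example.org/dataset/42",
)
print(body)
```

The "support at both ends" requirement shows up here directly: the data repository has to advertise a pingback endpoint and record the incoming link, otherwise the link stays one-directional.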
So, why are people not doing this already? I’ve now spent a fair bit of time learning PHP and JavaScript. And while poking around in the innards of WordPress I have discovered something that I now reveal to you.
Short articles, single author, example-based articles.
