RDA Fourth Plenary Keynote - Prof. Barend Mons, Biosemantics Group at Leiden University Medical Center and Head of Node of ELIXIR-NL - Keynote "Bringing Data to Broadway" - Monday 22nd Sept 2014, Amsterdam, the Netherlands
https://rd-alliance.org/plenary-meetings/fourth-plenary/plenary4-programme.html
II-SDV 2017: The Next Era: Deep Learning for Biomedical Research - Dr. Haxel Consult
Deep learning is hot, making waves, delivering results, and is somewhat of a buzzword today. There is a desire to apply deep learning to anything that is digital. Unlike the brain, these artificial neural networks have a very strict, predefined structure. The brain is made up of neurons that talk to each other via electrical and chemical signals; artificial neural networks do not differentiate between these two types of signals. They are essentially a series of advanced, statistics-based exercises that review the past to indicate the likely future. Another buzzword used over the last few years across all industries is "big data". In biomedical and health sciences, both unstructured and structured information constitute "big data". Deep learning needs a lot of data, while "big data" has value only when it generates actionable insight. Given this, the two areas are destined to be married. The time is ripe for a synergistic association that will benefit pharmaceutical companies. It may be only a short time before we have vice presidents of machine learning or deep learning in pharmaceutical and biotechnology companies. This presentation will review the prominent deep learning methods and discuss their usefulness in biomedical and health informatics.
IBC FAIR Data Prototype Implementation slideshow - Mark Wilkinson
A discussion of ways of achieving FAIRness of both metadata and data. Both brute-force approaches and more elegant "projection" approaches are shown.
Relevant papers are at:
doi: 10.7717/peerj-cs.110 (https://peerj.com/articles/cs-110/)
doi: 10.3389/fpls.2016.00641 (https://doi.org/10.3389/fpls.2016.00641)
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
smartAPIs: EUDAT Semantic Working Group Presentation @ RDA 9th Plenary - Mark Wilkinson
smartAPIs are an approach to the incremental, machine-aided, semantic annotation of Web APIs. Starting from existing, popular standards, we will provide enhanced tools for authoring ever-richer metadata, guided by global community knowledge encapsulated in ontologies, and aided by "smart suggestions" based on mining the metadata from previous API specifications.
The project is led by Michel Dumontier (Maastricht University). This presentation was given on his behalf by Mark Wilkinson (UPM, Madrid; Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R)
Data analysis software for upper atmospheric research. The software was written in JavaFX and can handle many kinds of upper atmospheric data collected by ground-based observation.
A Template-Based Approach for Annotating Long-Tailed Datasets - dgarijo
An increasing amount of data is shared on the Web through heterogeneous spreadsheets and CSV files. In order to homogenize and query these data, the scientific community has developed Extract, Transform and Load (ETL) tools and services that help make these files machine-readable in Knowledge Graphs (KGs). However, tabular data may be complex, and the level of expertise required by existing ETL tools makes it difficult for users to describe their own data. In this paper we propose a simple annotation schema to guide users when transforming complex tables into KGs. We have implemented our approach by extending T2WML, a table annotation tool designed to help users annotate their data and upload the results to a public KG. We have evaluated our effort with six non-expert users, obtaining promising preliminary results.
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles - dgarijo
Slides presented at DBpedia Day, at the SEMANTiCS conference in 2021. FOOPS! (available at https://w3id.org/foops) is a validator based on the FAIR principles that guides users in making their ontologies conform to them. For each principle, FOOPS! runs a series of tests and reports errors, suggestions, and ways to follow best practices.
Towards Knowledge Graphs of Reusable Research Software Metadata - dgarijo
Research software is a key asset for understanding, reusing and reproducing results in computational sciences. An increasing amount of software is stored in code repositories, which usually contain human-readable instructions indicating how to use it and set it up. However, developers and researchers often need to spend a significant amount of time to understand how to invoke a software component, prepare data in the required format, and use it in combination with other software. In addition, this time investment makes it challenging to discover and compare software with similar functionality. In this talk I will describe our efforts to address these issues by creating and using Open Knowledge Graphs that describe research software in a machine-readable manner. Our work includes: 1) an ontology that extends schema.org and codemeta, designed to describe software and the specific data formats it uses; 2) an approach to publish software metadata as an open knowledge graph, linked to other Web of Data objects; 3) a framework for automatically extracting metadata from software repositories; and 4) a framework to curate, query, explore and compare research software metadata in a collaborative manner. The talk will illustrate our approach with real-world examples, including a domain application for inspecting and discovering hydrology, agriculture, and economic software models, and the results of our framework when enriching the research software entries in Zenodo.org.
A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan - andrea huang
The linked data paradigm gives any data the potential to link, or be linked, with structural information, internally and externally. To improve the current cultural service of the Union Catalog of Digital Archives Taiwan (catalog.digitalarchives.tw), a linked data prototype was developed that benefits from extending the Art & Architecture Thesaurus (AAT) to provide a machine-understandable catalog service.
However, knowledge engineering is time- and labor-intensive, especially for an archive that is non-Western in culture and multidisciplinary in nature. This makes mapping the data semantics of the UCdaT to international standards and vocabularies extremely challenging.
At this stage, the triple store is an experimental addition to the existing Union Catalog of Digital Archives Taiwan architecture, providing semantic links to target collections for related suggestions. This will guide us in creating a future technical architecture that scales to the whole archive, follows learning-by-doing guidelines, and preserves data that cannot be fully understood at present but can at least be linked by others, whose third-party understandings may support their own reuse.
Eureka Research Workbench: A Semantic Approach to an Open Source Electroni... - Stuart Chalk
Scientists are looking for ways to leverage Web 2.0 technologies in the research laboratory, and as a consequence a number of approaches to web-based electronic notebooks are being evaluated. In this presentation I discuss the Eureka Research Workbench, an electronic laboratory notebook built on semantic technology and XML. Using this approach, the context of the information recorded in the laboratory can be captured and searched along with the data itself. A discussion of the current system is presented, along with the next planned development of the framework and long-term plans relative to linked open data. Presented at the 246th American Chemical Society Meeting in Indianapolis, IN, USA on September 12th, 2013.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article:
Carole Goble, "Better Software, Better Research", IEEE Internet Computing, vol. 18, no. 5, pp. 4-8, Sept.-Oct. 2014. IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
II-PIC 2017: Artificial Intelligence, Machine Learning, And Deep Neural Netwo... - Dr. Haxel Consult
Parthiban Srinivasan (VINGYANI, India)
When new technologies become easier to use, they transform industries. That's what's happening with artificial intelligence (AI) and big data. Machine learning is often described as a type of AI where computers learn to do something without being programmed to do it. Deep learning, a subset of machine learning, is proving to work especially well on classification. Big breakthroughs happen when what is suddenly possible meets what is desperately needed. For years, patent analysts have been searching and reviewing terabytes of information, not only patents but also non-patent literature, not only to find prior art but also to identify patents of interest, rate their quality, assess the potential value of patent clusters, and identify potential business partners or infringers. With the rapid increase in the number of patent documents worldwide, demand for their automatic clustering/categorization has grown significantly. Many information science researchers have started to experiment with machine learning tools, but adoption in the patent information space has been sporadic. In this talk, we aim to review the prevailing machine learning techniques and present several sample implementations by various research groups. We will also discuss how data science compares with machine learning, deep learning, AI, statistics and applied mathematics.
FAIR Workflows: A step closer to the Scientific Paper of the Future - dgarijo
Keynote presented at the Computational and Autonomous Workflows workshop (CAW-2021) at Oak Ridge National Laboratory. The keynote gives an overview of the different aspects to take into account when aiming to create FAIR workflows and associated resources.
ICIC 2017: The Next Era: Deep Learning for Biomedical Research - Dr. Haxel Consult
Srinivasan Parthiban (VINGYANI, India)
Deep learning is hot, making waves, delivering results, and is somewhat of a buzzword today. There is a desire to apply deep learning to anything that is digital. Unlike the brain, these artificial neural networks have a very strict, predefined structure. The brain is made up of neurons that talk to each other via electrical and chemical signals; artificial neural networks do not differentiate between these two types of signals. They are essentially a series of advanced, statistics-based exercises that review the past to indicate the likely future. Another buzzword used over the last few years across all industries is "big data". In biomedical and health sciences, both unstructured and structured information constitute "big data". Deep learning needs a lot of data, while "big data" has value only when it generates actionable insight. Given this, the two areas are destined to be married. The time is ripe for a synergistic association that will benefit pharmaceutical companies. It may be only a short time before we have vice presidents of machine learning or deep learning in pharmaceutical and biotechnology companies. This presentation will review the prominent deep learning methods and discuss their usefulness in biomedical and health informatics.
Presentation of the "Coming to terms to FAIR semantics" paper at the 22nd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2020).
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs - dgarijo
In this presentation we describe the Ontology-Based APIs framework (OBA), our approach to automatically creating REST APIs from ontologies while following RESTful API best practices. Given an ontology (or ontology network), OBA uses standard technologies familiar to web developers (the OpenAPI Specification, JSON) and combines them with W3C standards (OWL, JSON-LD frames, and SPARQL) to create maintainable APIs with documentation, unit tests, automated validation of resources, and clients (in Python, JavaScript, etc.) that let users who are not Semantic Web experts access the contents of a target knowledge graph. We showcase OBA with three examples that illustrate the capabilities of the framework for different ontologies.
An increasing number of researchers rely on computational methods to generate the results described in their publications. Research software created to this end is heterogeneous (e.g., scripts, libraries, packages, notebooks, etc.) and usually difficult to find, reuse, compare and understand due to its disconnected documentation (dispersed in manuals, readme files, web sites, and code comments) and a lack of structured metadata to describe it. In this talk I will describe the main challenges in finding, comparing and reusing research software; how structured metadata can help address some of them; the best practices being proposed by the community; and current initiatives to aid their adoption by researchers within EOSC.
Impact: The talk addresses an important aspect of the EOSC infrastructure for quality research software by ensuring that software contributed to the EOSC ecosystem can be found, compared and reused by researchers. The talk also aims to address metadata quality of current research products, which is critical for successful adoption.
Presented at the EOSC symposium
Barend Mons slides from #ISMB 2014: Trends in data publishing. Talk 3 in the "What Bioinformaticians need to know about digital publishing beyond the PDF2" workshop at ISMB 2014, Boston, 16th July 2014
A number of recent milestones in AI have rekindled the faith that human-grade computer intelligence can fuel the next technological revolution. In parallel, and almost independently, the job role of Data Scientist rose to become one of the hottest tickets in the technology sector. Despite the obvious overlap between the domains of Data Science and Artificial Intelligence, the two approaches are sufficiently distinct that choosing the wrong one can cause a product to fail or a hiring process to go wrong. This presentation will offer some clarity and best practices with regard to understanding what data analysis requirements you really have, as opposed to what you think you have.
The Research Object Initiative: Frameworks and Use Cases
Opening talk at the "Interdisciplinary Data Resources to Address the Challenges of Urban Living" Workshop at the Urban Big Data Centre, University of Glasgow, 4 April 2016
Big Data [sorry] & Data Science: What Does a Data Scientist Do? - Data Science London
What 'kind of things' does a data scientist do? What are the foundations and principles of data science? What is a Data Product? What does the data science process look like? Learning from data: Data Modeling or Algorithmic Modeling? - talk by Carlos Somohano @ds_ldn at The Cloud and Big Data: HDInsight on Azure, London, 25/01/13
How to Feed a Data Hungry Organization – by Traveloka Data Team - Traveloka
In Traveloka's inaugural Data Meetup, held in April 2017, Ainun Najib (Head of Data), Dr. Philip Thomas (Lead Data Scientist), and Rendy B. Junior (Lead Data Engineer) shared the journey Traveloka's Data Team has taken so far, so that the audience could learn from the struggles and triumphs of managing Traveloka's burgeoning data.
You will learn more about:
1) Data culture in Traveloka
2) Data engineering in Traveloka
3) Data science in Traveloka
MAKING SENSE OF IOT DATA W/ BIG DATA + DATA SCIENCE - CHARLES CAI - Big Data Week
Charles Cai has more than two decades of experience and a track record of delivering global transformational programmes, from vision and evangelism to end-to-end execution, in global investment banks and energy trading companies, where he has excelled at designing and building innovative, large-scale Big Data systems for high-volume, low-latency trading, global Energy Trading & Risk Management, and advanced temporal and geospatial predictive analytics, as Chief Front Office Technical Architect and Head of Data Science. He is also a frequent speaker at Google Campus, the Big Data Innovation Summit, Cloud World Forum, Data Science London, QCon London, the MoD CIO Symposium, etc., promoting knowledge and best-practice sharing with audiences ranging from developers and data scientists to CXO-level senior executives from both IT and business backgrounds. He has in-depth knowledge and experience of the Scala, Python, C# / F#, C++, Node.js, Java, R and Haskell programming languages across Mobile, Desktop, Hadoop/Spark, Cloud, IoT/MCU and BlockChain, and holds TOGAF9, EMC-DS, AWS CNE4 and other certifications.
Publishing of Scientific Data - Science Foundation Ireland Summit 2010jodischneider
Slides prepared for the Publishing of Scientific Data workshop at the Science Foundation Ireland Summit 2010. I was one of three panelists. We had a lively discussion!
Similar to Prof. Barend Mons, Biosemantics Group at Leiden University Medical Center and Head of Node of ELIXIR-NL - Keynote "Bringing Data to Broadway" (20)
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
"Impact of front-end architecture on development cost", Viktor TurskyiFwdays
I have heard many times that architecture is not important for the front-end. Also, many times I have seen how developers implement features on the front-end just following the standard rules for a framework and think that this is enough to successfully launch the project, and then the project fails. How to prevent this and what approach to choose? I have launched dozens of complex projects and during the talk we will analyze which approaches have worked for me and which have not.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
7. 2005: Text Mining?
Why bury it first and then mine it again!
8.
9.
10. Part II: The Explicitome and the Elusive Part (our own fault)
The Explicitome: everything we already asserted
11. The Elusive Explicitome Phenomenon
(example from: Yepes & Verspoor, 2013)
# of assertions per article section: abstract 5; narrative 500*; tables/figures 1,000; supplementary data 50K-1M+
# of SNP-Phen associations found: 2% / 4% / 50%*
The Elusive Explicitome: what escapes us (95%)
Hurdle 1: Paywalls
Hurdle 2: 'TIF'walls
Hurdle 3: The Wall of Broken Links
12. Data loss is real and significant, while data growth is staggering
(Nature news, 19 December 2013)
• Computer speed and storage capacity is doubling every 18 months, and this rate is steady
• DNA sequence data has been doubling every 6-8 months over the last 3 years, and this looks to continue for this decade
'Oops, that link was the laptop of my PhD student'
13. The trends in e-Science
• Computer analytics (takes charge)
• Enormity of datasets (beyond narrative)
• Collaborative intelligence (calls for a million minds)
• Irreversible movement (towards OA)
→ FAIR Data Publishing & Stewardship?
22. FAIR for computers vs FAIR for people
• Aerial survey: pattern recognition in 'ridiculograms'
• Human excavation: rationalisation and 'confirmational reading'
'Why would I believe this association'???
23. For knowledge discovery (KD) we need each association only once
Cardinal Assertion (<10^11): n identical assertions, with 'n' different provenances
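The cardinal-assertion idea can be sketched in a few lines: collapse every group of identical assertions into one cardinal assertion that keeps all of its provenances. This is a minimal illustration, not the Biosemantics implementation; the triples and PMIDs are hypothetical examples.

```python
from collections import defaultdict

def to_cardinal(assertions):
    """Collapse identical (subject, predicate, object) triples into one
    cardinal assertion each, keeping every provenance it was seen with."""
    cardinal = defaultdict(list)  # triple -> list of provenances
    for subj, pred, obj, provenance in assertions:
        cardinal[(subj, pred, obj)].append(provenance)
    return dict(cardinal)

# Hypothetical example: the same assertion published twice
assertions = [
    ("BRCA1", "associated_with", "breast cancer", "PMID:0001"),
    ("BRCA1", "associated_with", "breast cancer", "PMID:0002"),
    ("TP53", "associated_with", "Li-Fraumeni syndrome", "PMID:0003"),
]
cardinal = to_cardinal(assertions)
# 2 cardinal assertions; the BRCA1 triple carries 2 provenances
```

The deduplicated store is what makes the <10^11 bound on cardinal assertions plausible, while no provenance is lost.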
24. We publish on fewer than a million LS Concepts!
~10^6 concept clusters (Knowlets)
25. BioSemantics Knowledge Discovery Pipeline (www.biosemantics.org, LUMC - LIACS)
[Diagram] Data sources → 'coordinated' data → nanopub cache → cardinal assertion store → semantic data indexing and modelling → reasoning algorithms. Outputs: trends, phase transitions, 'new' data alerts, differentials, and funding priorities, reached via semantic queries (e.g. by gene or disease).
27. Part IV: Towards Solutions
Bigger is not Better: Zipping the Explicitome
(unavoidable: some science of 'our own', but... as examples, sorry)
28. The Rescued Explicitome
[Diagram] Abstracts, narrative, tables/figures, and supplementary data, plus Electronic Health Databases and Value-Added Databases, feed (with provenance) into the Total Explicitome: an estimated 10^14 asserted associations in 2,500 data sources. Two routes: 'ETL to FAIR' and 'FAIR to read'.
29. Zipping the Explicitome
• 10^14 assertions → 10^11 cardinal assertions → 10^6 concepts
• Semantic MedLine: U+C+CT+EG+GO = 36 M (80% / 20%)
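The compression the slide describes is just orders-of-magnitude arithmetic; a quick sketch makes the 'zipping' ratio explicit (figures taken directly from the slide):

```python
# Orders of magnitude from the slide:
raw_assertions = 10**14       # total asserted associations in the Explicitome
cardinal_assertions = 10**11  # after deduplication to cardinal assertions
concepts = 10**6              # concept clusters (Knowlets)

dedup_factor = raw_assertions // cardinal_assertions
per_concept = cardinal_assertions // concepts
print(f"deduplication factor: {dedup_factor}x")            # → 1000x
print(f"cardinal assertions per concept: ~{per_concept}")  # → ~100000
```

So deduplication alone shrinks the store a thousand-fold before any semantic compression over the ~10^6 concepts.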
30. Part V: (FAIR) data should take CENTER STAGE
(unavoidable: some science of 'our own', but... as examples, sorry)
32. A simplified diagram of a Digital (data) Object, irrespective of technological choices and naming:
• PID
• Metadata (intrinsic)
• 'Provenance' (user defined)
• Data (elements)
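The four-part object on the slide can be sketched as a data structure. This is a hypothetical minimal model (field names and the example PID are mine, not a Digital Object Architecture API):

```python
from dataclasses import dataclass, field

@dataclass
class DigitalObject:
    """The slide's simplified Digital (data) Object: a PID plus intrinsic
    metadata, user-defined provenance, and data elements."""
    pid: str                                        # persistent identifier
    metadata: dict = field(default_factory=dict)    # intrinsic
    provenance: dict = field(default_factory=dict)  # user defined
    data: list = field(default_factory=list)        # elements

# Hypothetical handle-style example
nanopub = DigitalObject(
    pid="hdl:11.T123/example-nanopub",
    metadata={"type": "nanopublication", "created": "2014-09-22"},
    provenance={"asserted_by": "Biosemantics Group"},
    data=["BRCA1 associated_with breast-cancer"],
)
```

The point of the structure is that metadata and provenance travel with the data under one resolvable PID, whatever technology implements it.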
33. Digital Object Architecture
Same structure: PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements).
Nanopublications are Research Objects; some Research Objects are Digital Objects.
34. Data as increasingly FAIR Digital Objects
(each stage shows the same object: PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements))
• Totally UNFAIR
• Usable for Humans
• Findable (PID)
• FAIR metadata
• FAIR data - restricted access
• FAIR data - Open Access
• FAIR data - Open Access / Functionally Linked
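The ladder above is ordinal: each stage strictly includes the guarantees of the one before it. That ordering can be sketched as an enumerated scale (the stage names are my own labels for the slide's stages, not a standard vocabulary):

```python
from enum import IntEnum

class FairStage(IntEnum):
    """The slide's ladder of increasing FAIRness as an ordered scale."""
    TOTALLY_UNFAIR = 0
    USABLE_FOR_HUMANS = 1
    FINDABLE = 2              # object has a resolvable PID
    FAIR_METADATA = 3
    FAIR_DATA_RESTRICTED = 4
    FAIR_DATA_OPEN = 5
    OPEN_FUNCTIONALLY_LINKED = 6

def at_least(stage, required):
    """True when a dataset meets or exceeds a required FAIRness stage."""
    return stage >= required

# Open FAIR data satisfies a 'FAIR metadata' requirement, not vice versa
print(at_least(FairStage.FAIR_DATA_OPEN, FairStage.FAIR_METADATA))  # → True
```

An ordered scale like this is what lets a search index (or a funder) filter datasets by minimum FAIRness rather than a yes/no label.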
37. FAIRport proof of concept: ELIXIR FAIR Data Search Index
[Diagram] Data owners ((supplementary) data, databases, repositories) publish through a FAIRport (FAIR L1/L2) into ELIXIR federated data and an ELIXIR semantic data repository spanning the national nodes (Elixir UK, SWE, NL, Fin., Esp., Nor., ...). End-users, ASPs, in-house IT, bioinformatics tools & applications search for datasets and download data (sub)sets in many formats (XML, RDF, JSON, etc.) via FAIR L3/L4. www.nanopubmed.org
38. Parties needed / typical candidates / NL-example
1. Trusted party: usually public sector, with a 'data stewardship' mandate
2. Executive party / coordinator: usually public or private sector, with expert knowledge on the project and relation management
Technology providers:
3-4. PID/ARTA stewards: DTL/ELIXIR-nl, others
5. DOA architecture / IMS: CNRI + EURETOS
6. Publishing pipeline: EURETOS
7. Repository software
8. eInfrastructure
39. Malpractices.......
• Journal Impact Factor
• Ignore Altmetrics
• No data stewardship plan
• Obstruct tenure of data experts
• 'Supplementary data'
→ Knowledge sharing impaired
40. Interoperability landscape (4/10/14)
[Word cloud] NITRD, FORCE11, ORCID, VIVO, EUDAT, DATAVERSE, BD2K, DANS, ELIXIR, NIH Commons, H2020, DRYAD, RDA, FigShare, Nanopub, Biosharing, Elsevier, Science, Nature, SageBio, HVP, DataCite, EGA, Research Objects, Nebulus, Embassy, SADI, EURETOS, YARCdata, IMI, ISA, Open PHACTS, Data Fabric
41. Good practices (apart from collaborating)
• 'Professional data publishing'
• RO Impact Factor
• Award Altmetrics
• 5% for data stewardship plan
• Train & tenure data experts
→ FAIR play
43. Endorsed by 82 organisations and [y] individuals
1. FAIR guiding principles with public discussion forum:
https://www.force11.org/group/fairgroup/fairprinciples
2. Notes and Annexes: https://www.force11.org/node/6062/
3. Group home page https://www.force11.org/group/fairgroup
COMMENT: until October 1st
ENDORSE: after October 1st