ViBRANT is a European project that aims to connect people, data, and science related to biodiversity. As part of this project, researchers developed IKey+, a new web service for automatically generating single-access identification keys. IKey+ allows users to submit taxonomic data in the standard SDD format and generates keys with various parameters. It was designed as a freely available open-source tool to help biologists identify specimens. Benchmark tests showed IKey+ can generate a key for 144 taxa in about 1.8 seconds on average.
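Single-access key generators of this kind typically work greedily: at each step, pick the character whose states split the remaining taxa most evenly, then recurse on each branch. The sketch below illustrates that heuristic on invented toy data; it is not IKey+'s actual algorithm (which consumes SDD input and exposes many tuning parameters), just a minimal illustration of the technique.

```python
# Hypothetical sketch of greedy single-access key construction.
# Taxa are described as character -> state mappings; at each node we pick
# the character that minimizes the size of the largest state group.
from collections import defaultdict

def best_character(taxa, characters):
    """Pick the character minimizing the largest resulting state group."""
    best, best_score = None, float("inf")
    for ch in characters:
        groups = defaultdict(list)
        for name, states in taxa.items():
            groups[states[ch]].append(name)
        if len(groups) < 2:
            continue  # this character does not discriminate at all
        score = max(len(g) for g in groups.values())
        if score < best_score:
            best, best_score = ch, score
    return best

def build_key(taxa, characters):
    """Recursively build a nested single-access key as a dict."""
    if len(taxa) <= 1:
        return next(iter(taxa))  # a single taxon: identified
    ch = best_character(taxa, characters)
    if ch is None:
        return sorted(taxa)  # indistinguishable with the given characters
    groups = defaultdict(dict)
    for name, states in taxa.items():
        groups[states[ch]][name] = states
    return {(ch, state): build_key(group, [c for c in characters if c != ch])
            for state, group in groups.items()}

# Toy data: three taxa, two characters.
taxa = {
    "A": {"wings": "present", "legs": "6"},
    "B": {"wings": "absent",  "legs": "6"},
    "C": {"wings": "absent",  "legs": "8"},
}
key = build_key(taxa, ["wings", "legs"])
```

On this toy input the key first asks about wings (it isolates taxon A immediately) and then about legs to separate B from C.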
User Identity Linkage: Data Collection, DataSet Biases, Method, Control and A... (IIIT Hyderabad)
Online Social Networks (OSNs) are popular platforms for online users. Users typically register and maintain accounts (user identities) across different OSNs to share a variety of content and stay connected with their friends. Consequently, linking user identities across OSN platforms, referred to as user identity linkage (UIL), becomes a critical problem. Solving it enables a more comprehensive view of users' activities across OSNs, which is highly beneficial for targeted advertising, recommendations, and many other applications. In this thesis, we propose approaches for analyzing data collection methods, investigating biases in identity linkage datasets, linking user identities across social networks, controlling user identity linkage, and applying user identity linkage solutions to related problems.
Studying user footprints in different online social networks (IIIT Hyderabad)
With the growing popularity and usage of online social media services, people now have accounts (sometimes several) on multiple, diverse services such as Facebook, LinkedIn, Twitter and YouTube. Publicly available information can be used to create a digital footprint of any user of these services. Such digital footprints are useful for personalization, profile management, and detecting malicious user behavior. One important application of analyzing users' online digital footprints is protecting users from the privacy and security risks that arise from the sheer volume of publicly available user information. We extracted information about user identities on different social networks through the Social Graph API, FriendFeed, and Profilactic, and collated our own dataset to create users' digital footprints. We used username, display name, description, location, profile image, and number of connections to generate each footprint, and applied context-specific techniques (e.g. Jaro-Winkler similarity and WordNet-based ontologies) to measure the similarity of user profiles across social networks, focusing on Twitter and LinkedIn. In this paper, we present the analysis and results of applying automated classifiers to disambiguate profiles belonging to the same user on different social networks. UserID and Name were found to be the most discriminative features for disambiguating user profiles. Using the most promising set of features and similarity metrics, we achieved accuracy, precision and recall of 98%, 99%, and 96%, respectively.
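The per-field matching idea above can be sketched in a few lines: compare each profile field with a string-similarity measure and combine the scores. The abstract's actual pipeline uses Jaro-Winkler and WordNet-based measures plus trained classifiers; here, the stdlib's `difflib` ratio stands in for the similarity metric, and the field weights are invented for illustration.

```python
# Hedged sketch of pairwise profile matching across two social networks.
# difflib's SequenceMatcher ratio is a stdlib stand-in for Jaro-Winkler;
# the weights below are illustrative, not the paper's learned model.
from difflib import SequenceMatcher

FIELDS = ["username", "display_name", "location"]
WEIGHTS = {"username": 0.5, "display_name": 0.3, "location": 0.2}

def field_similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def profile_similarity(p1: dict, p2: dict) -> float:
    """Weighted average of per-field string similarities."""
    return sum(WEIGHTS[f] * field_similarity(p1.get(f, ""), p2.get(f, ""))
               for f in FIELDS)

# Invented example profiles.
twitter  = {"username": "jdoe42", "display_name": "John Doe",
            "location": "Hyderabad"}
linkedin = {"username": "jdoe42", "display_name": "John Doe",
            "location": "Hyderabad, India"}
other    = {"username": "alice_w", "display_name": "Alice W.",
            "location": "Berlin"}
```

Same-user profile pairs should score higher than different-user pairs; a threshold or classifier on top of such scores yields the linkage decision.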
Everything you wanted to know about Pinterest.
Including:
How the idea came about,
What’s the big deal?
Stats, facts and demographics,
Getting started
What are people pinning?
It’s not just imagery
Some of the more unusual uses
What else can I do?
Which brands are using it?
How are they using it?
Pinning on the move
Extending the experience off platform
Key take-outs
FAIR Computational Workflows
Computational workflows capture precise descriptions of the steps and data dependencies needed to carry out computational data pipelines, analysis and simulations in many areas of Science, including the Life Sciences. The use of computational workflows to manage these multi-step computational processes has accelerated in the past few years driven by the need for scalable data processing, the exchange of processing know-how, and the desire for more reproducible (or at least transparent) and quality assured processing methods. The SARS-CoV-2 pandemic has significantly highlighted the value of workflows.
This increased interest in workflows has been matched by the number of workflow management systems available to scientists (Galaxy, Snakemake, Nextflow and 270+ more) and by the number of workflow services such as registries and monitors. There is also recognition that workflows are first-class, publishable Research Objects, just as data are. They deserve their own FAIR (Findable, Accessible, Interoperable, Reusable) principles and services that cater for their dual roles as explicit method description and software method execution [1]. To promote long-term usability and uptake by the scientific community, workflows (as well as the tools that integrate them) should become FAIR+R(eproducible) and citable, so that authors' credit is attributed fairly and accurately.
Work on improving the FAIRness of workflows has already started, and a whole ecosystem of tools, guidelines and best practices is under development to reduce the time needed to adapt, reuse and extend existing scientific workflows. An example is the EOSC-Life Cluster of 13 European Biomedical Research Infrastructures, which is developing a FAIR Workflow Collaboratory based on the tools ecosystem of the ELIXIR Research Infrastructure for Life Science Data. While there are many tools addressing different aspects of FAIR workflows, many challenges remain in describing, annotating, and exposing scientific workflows so that they can be found, understood and reused by other scientists.
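Findability in practice means attaching machine-readable metadata to a workflow so a registry can index it. The fragment below is an illustrative sketch only, loosely following the schema.org-style conventions used by workflow registries; every field value is invented, and a real description (e.g. an RO-Crate) carries considerably more context.

```python
# Illustrative sketch: a minimal JSON-LD-style record describing a workflow,
# of the kind a registry could index. All values below are invented.
import json

workflow_record = {
    "@type": "ComputationalWorkflow",
    "name": "variant-calling-demo",        # hypothetical workflow name
    "programmingLanguage": "Nextflow",
    "license": "https://spdx.org/licenses/MIT",
    "author": [{"@type": "Person", "name": "Jane Researcher"}],  # placeholder
    "input": ["reads.fastq"],
    "output": ["variants.vcf"],
}

# Findable/Accessible: the record serializes to JSON that a registry can
# harvest, and the license/author fields support reuse and credit.
serialized = json.dumps(workflow_record, indent=2)
```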
This keynote will explore the FAIR principles for computational workflows in the Life Sciences, using the EOSC-Life Workflow Collaboratory as an example.
[1] Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, and Daniel Schober. FAIR Computational Workflows. Data Intelligence 2020, 2:1-2, 108-121. https://doi.org/10.1162/dint_a_00033
Marios Chatziangelou presents the EGI applications database | OSFair2017 Workshop
Workshop overview:
This collaborative workshop comes in the context of coordinating EOSC related activities across large European infrastructures at European and national level. The workshop will offer an opportunity for cross-pollination on issues ranging from open scholarship to technical service provision, training, community engagement and support. OpenAIRE NOADs, EGI NGIs, GEANT NRENs and other national e-Infrastructure representatives will discuss gaps, synergies, coordination and service integration opportunities.
DAY 3 - PARALLEL SESSION 6 & 7
German Conference on Bioinformatics 2021
https://gcb2021.de/
Biology, medicine, physics, astrophysics, chemistry: all these scientific domains need to process large amounts of data with increasingly complex software systems. Achieving reproducible science poses several challenges involving multidisciplinary collaboration and socio-technical innovation, with software at the center of the problem. Despite the availability of data and code, several studies report that the same data analyzed with different software can lead to different results. I see this problem as a manifestation of deep software variability: many factors (operating system, third-party libraries, versions, workloads, compile-time options and flags, etc.), themselves subject to variability, can alter the results, up to the point that they can dramatically change the conclusions of some scientific studies. In this keynote, I argue that deep software variability is both a threat and an opportunity for reproducible science. I first outline some work on (deep) software variability, reporting preliminary evidence of complex interactions between variability layers. I then link ongoing work on variability modelling to deep software variability in the quest for reproducible science.
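A tiny, concrete instance of the phenomenon: the same data summed in a different order yields different floating-point results. Library versions, compiler flags and parallel reduction orders can all silently change such orderings, which is one mechanism by which deep software variability alters outputs.

```python
# The *same* three numbers, summed left-to-right vs right-to-left,
# disagree in the last bits because floating-point addition is not
# associative. Reduction order is exactly the kind of detail that
# varies across libraries, compilers and parallel schedules.
import math

data = [0.1, 0.2, 0.3]

forward  = (data[0] + data[1]) + data[2]  # left-to-right
backward = (data[2] + data[1]) + data[0]  # right-to-left
exact    = math.fsum(data)                # correctly rounded sum
```

Here `forward` and `backward` differ by one unit in the last place; in long pipelines such discrepancies can accumulate and, as the abstract notes, sometimes flip study conclusions.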
Semantic domain ontologies are increasingly seen as the key to enabling interoperability across heterogeneous systems and sensor-based applications. The ontologies deployed in these systems and applications are developed by restricted groups of domain experts rather than by semantic web experts. Lately, folksonomies are increasingly being exploited in developing ontologies. The “collective intelligence” that emerges from collaborative tagging can be seen as an alternative to the current effort on semantic web ontologies. However, the uncontrolled nature of social tagging systems leads to many kinds of noisy annotations, such as misspellings, imprecision and ambiguity. Thus, constructing formal ontologies from social tagging data remains a real challenge. Most research has focused on discovering relatedness between tags rather than on producing ontologies, much less domain ontologies. This paper proposes an algorithm that utilises tags in social tagging systems to automatically generate up-to-date, domain-specific ontologies. An evaluation of the algorithm, using a dataset extracted from BibSonomy, demonstrated that it can effectively learn a domain terminology and identify meaningful semantic information for that terminology. Furthermore, the proposed algorithm introduces a simple and effective method for disambiguating tags.
PATHS state of the art monitoring report (pathsproject)
This document provides an update to the Initial State of the Art Monitoring report delivered by the project. The report covers the areas of Educational Informatics, Information Retrieval, and Semantic Similarity and Relatedness.
A consistent and efficient graphical User Interface Design and Querying Organ... (CSCJournals)
We propose a software layer called GUEDOS-DB, built on top of an Object-Relational Database Management System (ORDBMS). In this work we apply it to Molecular Biology, more precisely to complete organelle genomes. We aim to offer biologists a unified way to access information spread among heterogeneous genome databanks. The goal of this paper is, firstly, to present the visual schema graph through a number of illustrative examples. The human-computer interaction technique adopted for this visual design and querying makes it very easy for biologists to formulate database queries, compared with a linear textual query representation.
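The core idea of visual querying is that a selection on the schema graph is compiled into a textual query the biologist never writes by hand. The sketch below shows that compilation step in miniature; the table and column names are invented for illustration, not GUEDOS-DB's actual schema or API.

```python
# Illustrative sketch: compile a "visual" selection (chosen tables, join
# edges, projected columns) into a SQL string. Names are hypothetical.
def compile_join_query(tables, join_conditions, columns):
    """Turn a schema-graph selection into a SQL SELECT statement."""
    sql = "SELECT " + ", ".join(columns)
    sql += " FROM " + ", ".join(tables)
    if join_conditions:
        sql += " WHERE " + " AND ".join(join_conditions)
    return sql

# A user clicks two connected nodes of the schema graph and two columns:
query = compile_join_query(
    tables=["organelle", "gene"],
    join_conditions=["gene.organelle_id = organelle.id"],
    columns=["organelle.name", "gene.symbol"],
)
```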
Scientific Workflows: what do we have, what do we miss? (Paolo Romano)
Presentation given on June 22, 2013, in Nice, at the CIBB 2013 International Workshop.
In collaboration with Paolo Missier, University of Newcastle upon Tyne, UK
https://bigscience.huggingface.co/
EN: Presentation of the BigScience project: a research initiative launched by HuggingFace that aims to build a large language model (inspired by OpenAI's GPT models) covering multiple languages, on a very large processing cluster. The participants plan to investigate the dataset and the model from all angles: bias, social impact, capabilities, limitations, ethics, potential improvements, domain-specific performance, carbon impact, and the general AI/cognitive research landscape.
FR (translated): Presentation of the BigScience project: an open research project launched by HuggingFace whose objective is to build a language model (somewhat like OpenAI and GPT-3), while exploring the problems tied to the datasets and the model from the angles of cognitive biases, social and environmental impact, ethical limits, possible performance gains, and the general impact of this kind of approach when the goal is not merely "to have a bigger model".
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series, part 4. In this session, we will cover an overview of Test Manager along with the SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...) (Jeffrey Haguewood)
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... (Ramesh Iyer)
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. Fostering a culture of innovation, however, takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
SYSTEMATIC BIOLOGY
FIGURE 1. IKey+ use cases.
protocols), which is particularly suitable for these two use cases (cf. Fig. 1) and allows the client to build complex workflows with a high level of automation.
Architecture
Because the main purpose of ViBRANT is to provide
a platform that is both open source and easily reusable,
we organized IKey+ in two parts:
• an Application Programming Interface (API) consisting of three distinct modules: the Data Model, that is, the computer representation of the descriptive data; the Input-Output (IO) module, which parses the SDD input files, loads the data into the Model, and generates the output files; and the Algorithm, which uses the data contained in the Model to generate a key and returns it through the IO module.
• a Simple Object Access Protocol (SOAP) or a
REpresentational State Transfer (REST) Service
Layer encapsulating the API, which manages the
communication between the client and the API.
This allows potential developers to integrate the service
into their workflows and adapt it to their specific needs
(e.g., integrating the API in a standalone software). The
whole application was developed using the J2EE (Java 2,
Enterprise Edition) programming environment.
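Because the REST layer exposes the API over plain HTTP, invoking the service amounts to building a query URL. The following Python sketch illustrates that shape; the endpoint path and the parameter names (sddURL, format, pruning) are illustrative assumptions, not the documented interface (the authoritative parameter list is in the user guide).

```python
from urllib.parse import urlencode

# Hypothetical REST endpoint; the actual path depends on the IKey+ deployment.
BASE_URL = "http://www.identificationkey.fr/rest/identificationKey"

def build_key_request(sdd_url, fmt="html", pruning=False):
    """Build a key-generation request URL (parameter names are illustrative)."""
    params = {"sddURL": sdd_url, "format": fmt, "pruning": str(pruning).lower()}
    return BASE_URL + "?" + urlencode(params)

# The returned URL can be fetched with any HTTP client to retrieve the key.
print(build_key_request("http://example.org/Cichorieae-fullSDD.xml", fmt="wiki"))
```

A SOAP client would instead wrap the same parameters in an XML envelope; the choice of protocol does not change the underlying API calls.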
Web Service Input Parameters
IKey+ has several parameters available to the end-user. These parameters allow the end-user to change the topology of the key (e.g., pruning the key, promoting some characters), its visual aspect, or the output format. A complete list of these parameters
can be found in the user documentation, available at http://www.identificationkey.fr/resources/docs/identificationKeyGeneratorWS_UserGuide.pdf (last accessed 13 August 2012).
Algorithm
The single-access key generation algorithm is a recursive depth-first graph construction algorithm. Its arguments are a list of taxa, taxaList, and a list of the characters under consideration, charList (cf. Supplementary Materials, appendices 1, 2, and 3, available at http://datadryad.org, doi:10.5061/dryad.3ft19). The resulting
graph (cf. Fig. 2) is a directed acyclic graph, consisting
of non-terminal nodes labelled with a character, edges
labelled with the states of the character of the previous
node, and terminal nodes labelled with taxa. At each
step of the algorithm, the best character among those
available is selected by the BEST_CHAR function, which
iterates over the available characters and returns the
character with the greatest discriminant power.
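The recursion can be sketched as follows. This is a deliberately simplified illustration: it assumes single-state categorical descriptions and approximates the discriminant power by a raw count of separated taxon pairs, whereas the actual algorithm also handles polymorphism, numerical characters, and missing data.

```python
def best_char(taxa, chars, descriptions):
    """Pick the character with the greatest discriminant power.
    Here discriminant power is approximated by the number of taxon pairs
    the character separates (a stand-in for the DPOWER function)."""
    def pairs_separated(c):
        return sum(
            1
            for i, ta in enumerate(taxa)
            for tb in taxa[i + 1:]
            if descriptions[ta][c] != descriptions[tb][c]
        )
    return max(chars, key=pairs_separated)

def build_key(taxa, chars, descriptions):
    """Recursive depth-first construction of a single-access key."""
    if len(taxa) <= 1 or not chars:
        return {"taxa": taxa}                      # terminal node, labelled with taxa
    c = best_char(taxa, chars, descriptions)
    remaining = [x for x in chars if x != c]
    node = {"character": c, "edges": {}}
    # One edge per observed state; each edge leads to the subtree of
    # taxa exhibiting that state.
    for state in {descriptions[t][c] for t in taxa}:
        subset = [t for t in taxa if descriptions[t][c] == state]
        node["edges"][state] = build_key(subset, remaining, descriptions)
    return node
```

Termination is guaranteed because the selected character is removed from the character list on each recursive call.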
Discriminant Power of a Character
During the last 50 years, estimating a character’s
discriminant power has been the central point of key-
construction algorithms. Many measurement methods
have been suggested, and a review of these methods can
be found in Gower and Payne (1975), Pankhurst (1991),
and Delgado-Calvo-Flores et al. (2006). In a statistical
context, examples include the Bayesian probability and
generalized entropy, such as the Shannon entropy used
in the ID3 algorithm developed by Quinlan (1986) or the
Gini index used in Breiman et al. (1984). In a context
where there is no probability associated with the states
of the characters, one can use the separation coefficient
or the variance as measurement methods of a character’s
discriminant power.
In IKey+, the discriminant power of a given character
C is calculated by the DPOWER function and is
an estimate of C’s ability to differentiate the taxa
of the current list. DPOWER is an extension of the
Gyllenberg separation factor (Gyllenberg 1963) and
is a measurement of the number of pairs of taxa
discriminated by C, which is a generalization of the
variance. Indeed, the variance of a variable can be computed by comparing all pairs of values (here, all pairs of taxon descriptions for C). With
different comparison functions, each adapted to a certain
type of character (e.g., a categorical or a numerical
character), it is possible to obtain a generalized formula
to estimate the discriminant power of a given character,
even with polymorphic characters or characters with
missing data. For categorical characters, DPOWER can
use a binary function (boolean comparison) or other
Downloaded from http://sysbio.oxfordjournals.org/ at Freie Universitaet Berlin on September 18, 2012
2012 BURGUIERE ET AL.—IKey+
FIGURE 2. Formal identification key example.
functions such as the Sokal and Michener coefficient (Sokal and Michener 1958) or the Jaccard coefficient (Jaccard 1901). For numerical characters (e.g., a size measured in millimetres), the set of values (i.e., all the values entered for C for all taxa) is split into two intervals. To determine the threshold value separating these two intervals, we consider the list of minimum and maximum values of C for each of the remaining taxa. We then choose from this list the value that separates the remaining taxa into two groups of equal size (±1 taxon). These two intervals are then treated as two discrete states for the calculation of the discriminant power.
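A minimal sketch of this threshold selection, assuming each remaining taxon's values for C are summarized as a (min, max) pair and picking the candidate value that yields the most balanced split (the tie-breaking rule here is an assumption):

```python
def numeric_threshold(ranges):
    """Pick a threshold that splits taxa into two groups of (almost) equal size.

    `ranges` maps each taxon to its (min, max) interval for the character C.
    Candidate thresholds are the observed min and max values themselves.
    """
    candidates = sorted({v for pair in ranges.values() for v in pair})

    def imbalance(t):
        # Taxa entirely at or below t versus taxa entirely above t.
        below = sum(1 for lo, hi in ranges.values() if hi <= t)
        above = sum(1 for lo, hi in ranges.values() if lo > t)
        return abs(below - above)

    return min(candidates, key=imbalance)
```

Note that a taxon whose interval straddles the chosen threshold counts in neither group here; handling such overlaps (effectively a polymorphic numeric state) is left out of the sketch.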
DPOWER iterates over the available taxa (i.e., those that are compatible with the current description) and determines, for each pair of taxa Ta and Tb, their dissimilarity (i.e., the possibility of discriminating Ta and Tb with C). This dissimilarity is based on the number of states of C common to Ta and Tb (n11), the number of states of C that occur only for Ta (n10), the number of states of C that occur only for Tb (n01), and the number of states of C that occur for neither Ta nor Tb (n00). The discriminant power is then calculated as

DP = Σ_(Ta,Tb) SCORE(n11, n10, n01, n00),

where the sum runs over all pairs of available taxa.
Three different SCORE measurements are currently
available in IKey+, the Xper coefficient (Ung et al. 2010;
Vignes et al. 1989), the Jaccard coefficient, and the Sokal
and Michener coefficient.
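Using the standard definitions of the Jaccard and Sokal and Michener (simple matching) coefficients (the exact form of the Xper coefficient is not reproduced in the article and is omitted here), a pair's SCORE contribution can be sketched as a dissimilarity, i.e., one minus the similarity:

```python
def jaccard(n11, n10, n01, n00):
    """Jaccard similarity: shared states over states present in either taxon."""
    denom = n11 + n10 + n01
    return n11 / denom if denom else 0.0

def sokal_michener(n11, n10, n01, n00):
    """Sokal-Michener similarity: agreements (including joint absences) over all states."""
    total = n11 + n10 + n01 + n00
    return (n11 + n00) / total if total else 0.0

def score(n11, n10, n01, n00, coeff=jaccard):
    """Dissimilarity of a taxon pair for character C: 1 - similarity.
    Summing this over all pairs of available taxa yields DP."""
    return 1.0 - coeff(n11, n10, n01, n00)
```

The two coefficients differ only in whether joint absences (n00) count as agreement, which is why they can rank characters differently.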
BENCHMARKS
Tests
In order to assess the performance of our web
service, we conducted a series of tests using the
same input file for every test. This file contains a
data set representing the subfamily Cichorieae of the
plant family Asteraceae with 303 observable characters,
144 taxa, and their descriptions. This data set was
generated and curated as an exemplar group for the
EDIT project (Hand et al. 2009) [the file is available at http://www.infosyslab.fr/vibrant/project/test/Cichorieae-fullSDD.xml (last accessed 13 August 2012)].
The aim of these tests was to evaluate the overall
performance of the algorithm, to assess the impact
of the various parameters available to the end-user
on performance, and to ensure that IKey+ would be
robust in a production environment. We measured the
time necessary for the web service to respond to 100
identification keys generation queries, while counting
the number of rejected queries due to CPU overload
(IKey+ rejects any query received when the CPU load
of the host server is greater than 80%). We ran one
reference test (described in Supplementary Materials,
appendix 4, doi:10.5061/dryad.3ft19), and several tests
with variations on the input parameters, the web service
communication protocol used or the parallelization
setups. The complete list of performance test setups
is available in Supplementary Materials, appendix 5,
doi:10.5061/dryad.3ft19. We also measured the time necessary to generate a single key, using the reference test configuration described in Supplementary Materials, appendix 4, doi:10.5061/dryad.3ft19. Finally, we tested the usability of IKey+ from an end-user perspective when generating the Cichorieae identification key using the web interface. We asked 11 people who were not involved in the development of IKey+ to test the web service and the web interface. They were given the SDD-formatted Cichorieae data set and were asked to use the web interface to create the identification key. We measured the time needed by each test subject to generate the key.
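The reference setup (4 simultaneous threads, 25 queries each, counting rejected queries) can be sketched as below; run_query is a hypothetical stand-in for one web-service call, returning False when the service rejects the query.

```python
import threading
import time

def run_benchmark(run_query, threads=4, queries_per_thread=25):
    """Time `threads` workers each issuing `queries_per_thread` queries and
    count rejections, mimicking the reference test (4 x 25 = 100 queries)."""
    results = []
    lock = threading.Lock()

    def worker():
        for _ in range(queries_per_thread):
            ok = run_query()        # one key-generation query
            with lock:
                results.append(ok)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.perf_counter() - start
    return elapsed, results.count(False)
```

Varying `threads` and `queries_per_thread` reproduces the sequential (1 x 100) and highly parallel (100 x 1) setups described below.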
Results
The average length of the paths leading to taxa
that were identified by the algorithm is 4.67 steps,
with the shortest path being 1 step long, and the
longest being 10 steps long. The generation of a single
key took roughly 1.8 s. The results of the other tests
are shown in Supplementary Materials, appendix 6,
doi:10.5061/dryad.3ft19. The reference test took roughly
50 s to complete, that is, 500 ms per query, which is
consistent with the time measured for the generation
of a single key, because the reference test uses 4
simultaneous threads to generate the keys. In this test,
few queries were rejected due to CPU overload. Among
the parameters available to the end-user, only the score
method parameter had a significant influence: when
using the Sokal and Michener score method or the
Jaccard score method, the tests took longer to finish
(∼80 s instead of 50 s). The score method parameter had
no impact on the number of rejected queries. Our tests
showed that the communication protocol used (SOAP or
REST) had no influence on the performance of IKey+.
Some taxa do not appear in the generated key because of insufficient data in the input file. The Cichorieae data set we used for our tests was created by an external team, and some errors were made during its creation: some taxa with unknown data were not explicitly marked as "unknown data" but were simply left unspecified. Although these ambiguously coded taxa do not appear in the resulting identification key by default, the end-user can choose to have them included.
When modifying the parallelization of the queries,
we observed significant performance variations. As expected, when launching 100 queries sequentially (instead of using 4 simultaneous threads launching 25 queries each), the test took four times as long and no queries were rejected. When we increased the parallelism
of the queries (e.g., 25 or 100 simultaneous queries),
more queries were rejected (up to 50%). However, when
launching 100 simultaneous queries, with a random
delay at the beginning of each thread, the results (both in
time and number of rejected queries) were comparable to
the performance of the reference test. In the usability test,
the average time needed to generate the identification
key was slightly above 60 s (60.36 s), with the shortest
time measured at 26 s, and the longest time measured
at 121 s. The complete results of the usability test
are available in Supplementary Materials, appendix 7,
doi:10.5061/dryad.3ft19.
DISCUSSION
IKey+ is available and can be installed on any J2EE
application server (e.g., Apache Tomcat). It can generate
single-access keys using a tree or a flat representation
in several output formats (HTML, Wiki, SDD, plain-
text, etc.).
Our tests showed that IKey+ performs well enough
to handle a large data set in a relatively short amount of
time and can generate well-optimized key files (average
number of steps to identify a taxon: 4.67). This, combined
with the web service accessibility, makes it possible to
integrate IKey+ in many workflows that might require
a fast and automated key generation process (e.g., a batch
key generation script). Furthermore, an end-user can use
the web interface available at www.identificationkey.fr
to quickly generate a customized identification key,
using the numerous parameters available (affecting the
topology, representation, file format, etc.).
Finally, our tests showed that IKey+ is likely to
be robust in a production environment (i.e., many
simultaneous queries) as it is able to withstand
simultaneous key generation queries (e.g., 4 threads
launching 25 consecutive queries). It is also protected
from cryptic failure, because we implemented a CPU-
load-watching mechanism that automatically rejects a
query (with an explicit error message) whenever the
CPU load exceeds a given threshold (80%). This prevents
a crash of the service, or the generation of incomplete or
corrupt key files.
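A minimal sketch of such a load guard (the names and the dictionary-based responses are illustrative; IKey+'s actual mechanism lives in its Java service layer):

```python
THRESHOLD = 80.0  # percent, the rejection threshold used by IKey+

def guard_query(handle, cpu_load):
    """Reject a query with an explicit error when CPU load exceeds the threshold.

    `handle` performs the key generation; `cpu_load` is a callable returning
    the current host CPU load in percent (how it is measured is
    deployment-specific and not shown here).
    """
    if cpu_load() > THRESHOLD:
        return {"error": "server overloaded, query rejected"}
    return {"result": handle()}
```

Rejecting early with an explicit error keeps the server from crashing or emitting incomplete or corrupt key files under load.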
CONCLUSION
IKey+ is the first key-generation tool available as
a web service with standardized input and output
formats. Our test showed that IKey+ is able to generate
keys rapidly and that it can also be used by an end-user
with the web interface. Finally, the modular and open-source nature of IKey+ makes it possible for anyone to reuse its components. For instance, we plan to reuse some components of the API to develop another web service that would provide free-access key identification.
LICENSING
As part of the ViBRANT project, IKey+’s source
code is freely available and is licensed under the GNU
General Public License version 2. It is already available in our Google Code SVN repository: http://ikey-plus.googlecode.com/svn/trunk/ (last accessed 13 August 2012). It will be actively maintained by our team for the
next 2 years.
SUPPLEMENTARY MATERIAL
Supplementary material, including algorithms and appendices, can be found at http://www.sysbio.oxfordjournals.org.
FUNDING
This work was supported by the European Union funded FP7 ViBRANT Project (contract number RI-261532; December 2010 to November 2013).
ACKNOWLEDGEMENTS
We sincerely thank Gregor Hagedorn (Julius Kühn
Institute, Berlin, Germany) and Andreas Müller
(Botanical Garden and Botanical Museum, Berlin,
Germany) for sharing their knowledge on the SDD
format. We are also grateful to Dave Roberts (Natural
History Museum, London, UK) for reviewing an early
version of the article and providing style improvements.
REFERENCES
BIOTA. Available from: URL http://www.edinburgh.ceh.ac.uk/
biota/ (last accessed 13 August 2012).
Breiman L., Friedman J.H., Olshen R.A., Stone C.J. 1984. Classification and regression trees. Belmont, CA: Wadsworth International Group.
Catalogue of Life. Available from: URL http://www.catalogueoflife.org/ (last accessed 13 August 2012).
Dallwitz M.J., Paine T.A., Zurcher E.J. 1993. User’s guide to the delta
system: a general system for coding taxonomic descriptions. 4th
ed. Available from: URL http://delta-intkey.com (last accessed
13 August 2012).
Delgado-Calvo-Flores M., Fajardo-Contreras W., Gibaja-Galindo E.L.,
Perez-Perez R. 2006. Xkey: a tool for the generation of identification
keys. Expert Syst. Appl. 30:337–351.
EDIT. European Distributed Institute of Taxonomy. Available from:
URL http://www.e-taxonomy.eu/ (last accessed 13 August 2012).
EOL. Encyclopaedia of Life. Available from: URL http://eol.org/.
GBIF. Global Biodiversity Information Facility. Available from: URL
http://www.gbif.org/ (last accessed 13 August 2012).
Gérard D., Vignes-Lebbe R. 2010. Mykey: a server-side software to
create customized decision trees. In: Nimis P.L., Vignes-Lebbe R.,
editors. Tools for identifying biodiversity: progress and problems.
Edizioni Università di Trieste, Trieste, Italy. p. 107–112.
Gower J.C., Payne R.W. 1975. A comparison of different criteria
for selecting binary tests in diagnostic keys. Biometrika
62:665–672.
Gyllenberg H.G. 1963. A general method for deriving determinative
schemes for random collections of microbial isolates. Ann. Acad.
Scient. Fenn. Ser. A IV. Biologica 1(69):1–23.
Hagedorn G. 2006. The structured descriptive data (SDD) W3C XML schema. Version 1.1. Available from: URL http://wiki.tdwg.org/twiki/bin/view/SDD/Version1dot1 (last accessed 13 August 2012).
Hagedorn G., Rambold G., Martellos S. 2010. Types of identification
keys. In: Nimis P.L., Vignes-Lebbe R. editors. Tools for identifying
biodiversity: progress and problems. Edizioni Università di Trieste,
Trieste, Italy. p. 59–64.
Hand R., Kilian N., Raab-Straube E. 2009. International Cichorieae network: Cichorieae portal. Available from: URL http://wp6-cichorieae.e-taxonomy.eu/portal/ (last accessed 13 August 2012).
Jaccard P. 1901. Étude comparative de la distribution florale dans une
portion des alpes et des jura. Bull. Soc. Vaud. Sci. Nat. 37:547–579.
J2EE. Java 2, Enterprise Edition. Available from: URL http://www.
oracle.com/technetwork/java/javaee/overview/index.html (last
accessed 13 August 2012).
KeyToNature. Available from: URL http://www.keytonature.eu/wiki/ (last accessed 13 August 2012).
Pankhurst R.J. 1988. Pankey programs. DELTA Newsletter 1:2.
Pankhurst R.J. 1991. Practical taxonomic computing. Cambridge
University Press, Cambridge, UK.
Quinlan J.R. 1986. Induction of decision trees. Mach. Learn. 1:81–106. ISSN 0885-6125. Available from: URL http://dx.doi.org/10.1007/BF00116251 (last accessed 13 August 2012).
REST Architecture. Available from: URL http://www.oracle.com/
technetwork/articles/javase/index-137171.html (last accessed
13 August 2012).
Scratchpads. Biodiversity Online. Available from: URL http://
scratchpads.eu/ (last accessed 13 August 2012).
SOAP. Simple Object Access Protocol, W3C Recommendation. Version 1.2. Available from: URL http://www.w3.org/TR/soap/ (last accessed 13 August 2012).
Sokal R., Michener C. 1958. A statistical method for evaluating systematic relationships. Univ. Kansas Sci. Bull. 38:1409–1438.
Ung V., Dubus G., Zaragüeta-Bagils R., Vignes-Lebbe R. 2010. Xper2:
introducing e-taxonomy. Bioinformatics 26(5):703–704.
ViBRANTa. Objectives. Available from: URL http://vbrant.eu/
node/1 (last accessed 13 August 2012).
ViBRANTb. Virtual Biodiversity Research and Access Network for
Taxonomy. Available from: URL http://vbrant.eu (last accessed
13 August 2012).
Vignes R., Lebbe J., Darmoni S. 1989. Symbolic-numeric approach for biological knowledge representation: a medical example with creation of identification graphs. In: Diday E., editor. Proceedings of the conference on Data analysis, learning symbolic and numeric knowledge. Nova Science Publishers, Inc., Commack, NY, USA. ISBN 0-941743-64-0. p. 389–398.
Wheeler Q.D., editor. 2008. The new taxonomy. CRC Press Inc., New York, USA.