Workflow Classification and Open-Sourcing Methods: Towards a New Publication ...Richard Littauer
Presented at the Open Knowledge Conference 2011 in Berlin.
This work is being done under the heading of DataONE. More information can be found at http://notebooks.dataone.org/workflows
Workflow Best Practices - DeVry UniversityTeradata
Andrea Passas with DeVry University presented Workflow Best Practices at the Aprimo Marketing Summit. For more information please visit www.Teradata.com
Handout for the ALAO 10/30/09 presentation.
Presentation description: Overview: What can technical services departments do to tackle automation, training, and communication issues when they have been doing more with less for years? This presentation will cover several free (or inexpensive) and easy to implement tools that Miami University’s Technical Services department has incorporated into daily operations. Specific software, including wikis, macros, and tutorials will be discussed along with quick ways these tools can help technical services address the above issues.
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...confluent
Apache Kafka is now nearly ubiquitous in modern data pipelines and use cases. While the Kafka development model is elegantly simple, operating Kafka clusters in production environments is a challenge. It’s hard to troubleshoot misbehaving Kafka clusters, especially when there are potentially hundreds or thousands of topics, producers and consumers and billions of messages.
The root cause of lag in real-time applications may be an application problem – like poor data partitioning or load imbalance – or a Kafka problem – like resource exhaustion or suboptimal configuration. Getting the best performance, predictability, and reliability for Kafka-based applications can therefore be difficult. In the end, the operation of your Kafka-powered analytics pipelines could itself benefit from machine learning (ML).
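One concrete, widely used health signal behind this kind of troubleshooting is consumer lag: the gap between a partition's log-end offset and the consumer group's committed offset. A minimal sketch (the topic name and offset values are hypothetical; in practice they come from Kafka's admin or metrics APIs):

```python
# Consumer lag = log-end offset minus the consumer group's committed offset.
# The offsets below are hypothetical, standing in for values fetched from Kafka.
log_end_offsets = {("clicks", 0): 10_500, ("clicks", 1): 9_800}
committed_offsets = {("clicks", 0): 10_100, ("clicks", 1): 9_800}

def partition_lag(end_offsets, committed):
    """Per-partition lag for a consumer group (missing commits count from 0)."""
    return {tp: end_offsets[tp] - committed.get(tp, 0) for tp in end_offsets}

lag = partition_lag(log_end_offsets, committed_offsets)
total_lag = sum(lag.values())  # 400: the group is 400 messages behind on partition 0
```

Whether a lag like this points to the application (slow consumer, skewed partitioning) or to Kafka itself (resource exhaustion, configuration) is exactly the diagnosis the talk argues ML can help with.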
Jumbune helps developers analyze the data flow within their code, perform distributed debugging of MapReduce applications, detect data anomalies, monitor clusters, and profile code.
iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to ...Bonnie Hurwitz
iMicrobe and iVirus: Extending the iPlant cyberinfrastructure from plants to microbes. Overview of work underway to add applications and computational analysis pipelines to iPlant for metagenomics and microbial ecology.
Yinyin Liu presents at the SD Robotics Meetup on November 8th, 2016. Deep learning has achieved great success in image understanding, speech, text recognition, and natural language processing. Deep learning also has tremendous potential to tackle the challenges in robotic vision and sensorimotor learning in a robotic learning environment. In this talk, we will discuss how current and future deep learning technologies can be applied to robotic applications.
[3.6] Beyond Data Sharing - Pieter van Gorp [3TU.Datacentrum Symposium 2014, ...3TU.Datacentrum
3TU.Datacentrum Symposium Research Data Management:
Funder requirements, Questions and Solutions
At this symposium the funding organisation NWO and the European Commission explained their vision, plans and requirements. Researchers from the three universities of technology shared their experiences of data management in different stages of research. And the Research Data Services team informed the audience about research data management services offered by 3TU.Datacentrum.
The 3TU.Datacentrum symposium took place at the TU Delft (26 May), University of Twente (2 June) and TU Eindhoven (11 June) for and with local researchers.
More information on: datacentrum.3tu.nl/over-3tudatacentrum/symposium-2014
Planets, OPF & SCAPE - presentation of tools on digital preservationSCAPE Project
Andrew Jackson from British Library presents digital preservation tools from the EU projects Planets and SCAPE and the Open Planets Foundation which is a network providing practical solutions and expertise in digital preservation.
Presented at 'Practical Tools for Digital Preservation: A Hack-a-thon' in York, September 28, 2011.
Deep Learning on Apache® Spark™: Workflows and Best PracticesDatabricks
The combination of Deep Learning with Apache Spark has the potential for tremendous impact in many sectors of the industry. This webinar, based on the experience gained in assisting customers with the Databricks Virtual Analytics Platform, will present some best practices for building deep learning pipelines with Spark.
Rather than comparing deep learning systems or specific optimizations, this webinar will focus on issues that are common to deep learning frameworks when running on a Spark cluster, including:
* optimizing cluster setup;
* configuring the cluster;
* ingesting data; and
* monitoring long-running jobs.
We will demonstrate the techniques we cover using Google’s popular TensorFlow library. More specifically, we will cover typical issues users encounter when integrating deep learning libraries with Spark clusters.
Clusters can be configured to avoid task conflicts on GPUs and to allow multiple GPUs per worker. Setting up pipelines for efficient data ingest improves job throughput, and monitoring helps with both configuration work and the stability of long-running deep learning jobs.
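For illustration, the kind of GPU-aware cluster setup the webinar alludes to later became first-class in Apache Spark 3.x's resource scheduling; the property names below are from that later API, not from the webinar itself, and the discovery-script path is the example bundled with the Spark distribution:

```properties
# Advertise 4 GPUs per executor and require 1 GPU per task,
# so concurrent tasks never contend for the same device.
spark.executor.resource.gpu.amount=4
spark.task.resource.gpu.amount=1
# Script Spark runs on each worker to discover GPU addresses.
spark.executor.resource.gpu.discoveryScript=/opt/spark/examples/src/main/scripts/getGpusResources.sh
```

With a ratio of 4 executor GPUs to 1 GPU per task, at most four tasks run per executor, which is one way to realize the "avoid task conflicts on GPUs" advice above.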
#OSSPARIS19: How ONLYOFFICE helps organize research work ...Paris Open Source Summit
ONLYOFFICE, developed by Ascensio System SIA, is an open-source office suite based on the HTML5 Canvas element, offering a complete range of online editing tools for text documents, spreadsheets, and presentations.
The presentation starts with an overview of the basics:
- support for all common formats,
- a rich set of formatting tools,
- identical rendering of content regardless of the browser used,
- resources for extending the editors' functionality,
- advanced co-editing capabilities,
- secure real-time data transfer.
The number of universities and schools opting for open-source alternatives to the popular solutions offered by the big brands grows every year. ONLYOFFICE solutions are currently used by more than 30 educational institutions in France, such as thirteen Sorbonne universities, the Université de Grenoble, the Université de Nantes, the École Nationale d'Ingénieurs de Brest, the public institution Campus Condorcet, etc.
In this part, Jeremy Maton, Systems and Network Administrator at the Institut de Biologie de Lille, presents how ONLYOFFICE is integrated within their research unit and helps organize its workflow.
Academic Research in the Blogosphere: Adapting to New Risks and Opportunities...Richard Littauer
This was a talk given at the Digital Humanities 2012 conference in Hamburg by Michael Pleyer, a coauthor on the paper and on the blog ReplicatedTypo.com.
Presentation of Marcu's 2000 ACL paper "The rhetorical parsing of unrestricted texts: A surface-based approach" for the Discourse Parsing and Language Technology seminar.
Recent studies have suggested that various anatomical changes, such as the widening of the hypoglossal canal, the descent of the larynx, and the loss of air sacs, are prerequisites for speech or occurred due to selective pressure on speech. Such studies have been used to suggest that Homo neanderthalensis as well as early Homo sapiens were capable of speech. However, using a broad literature review of multimodal languages, such as whistle languages, and of the articulation processes behind prosodic features, I will show that such studies ignore various aspects of language that would not require maximal discreteness in phonological features. I will suggest that these studies do not adequately account for prosodic features that would not require anatomical changes in early hominins when considering protolanguage, as they are based on a fundamentally modern view of language, one that places a heavier load on phonological features at the cost of prosodic load. Therefore, a reanalysis of anatomical changes in early hominins is necessary.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
- UI automation introduction
- UI automation sample
- Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
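As a taste of that Python binding, here is a minimal sketch using the pypowsybl package, assuming its bundled IEEE 14-bus example network and AC load-flow API; treat the exact calls as illustrative of the workflow shown in the notebook rather than a verified recipe:

```python
import pypowsybl as pp

# Load a bundled example network (IEEE 14-bus test case).
network = pp.network.create_ieee14()

# Run an AC power flow on it.
results = pp.loadflow.run_ac(network)
print(results[0].status)           # convergence status of the main synchronous component

# Inspect computed bus voltages as a pandas DataFrame.
print(network.get_buses().head())
```

The interactive notebook in the webinar walks through the same loop: build or load a network, run a simulation, then inspect the results as data frames.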
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He brings around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and on application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We also had a lovely workshop with the participants, exploring different ways to think about quality and testing in different parts of the DevOps infinity loop.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
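For a sense of the mechanics, a Slack notification of the kind described can be sketched with an incoming webhook and stdlib Python; the record name, status field, and message text below are illustrative, not Bonterra's actual schema:

```python
import json
from urllib import request

def build_slack_alert(record_name, status):
    """Build the JSON payload for a Slack incoming webhook.
    record_name/status are illustrative fields, not Bonterra's actual schema."""
    return {"text": f"Apricot record '{record_name}' changed status to {status}"}

def send_alert(webhook_url, payload):
    """POST the payload to a Slack incoming-webhook URL."""
    req = request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return request.urlopen(req)

payload = build_slack_alert("Client Intake #1042", "Approved")
# send_alert("https://hooks.slack.com/services/...", payload)  # URL is per-workspace
```

The same payload shape works for Microsoft Teams connectors with minor changes, which is why the webinar notes the solutions can be deployed for Teams as well.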
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities, spread across over 85 cities around the world. In addition, Sectrio runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply doing machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. Those gains come only when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this is illustrated with link prediction over knowledge graphs, but the argument is general.
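As a toy illustration of "predictable inference" (mine, not the talk's): when a relation has an actual semantics, e.g. it is known to be transitive like subClassOf, some links in a knowledge graph follow deterministically rather than being statistically guessed:

```python
# Toy knowledge graph: edges of a single transitive relation (e.g. subClassOf).
edges = {("cat", "mammal"), ("mammal", "animal"), ("dog", "mammal")}

def transitive_closure(pairs):
    """All links entailed by transitivity: (a,b) and (b,d) imply (a,d)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Links "predicted" purely by the semantics of the relation:
inferred = transitive_closure(edges) - edges
# inferred == {("cat", "animal"), ("dog", "animal")}
```

A learner that ignores this semantics must rediscover such links statistically; a neuro-semantic system gets them by predictable inference, which is the operational point of the talk.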
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Trends in Use of Scientific Workflows: Insights from a Public Repository and Recommendations for Best Practices
1. Trends in Use of Scientific Workflows: Insights from a Public Repository and Recommendations for Best Practices
Richard Littauer, Karthik Ram, Bertram Ludäscher, William Michener, Rebecca Koskela
DataONE
2. Scientific Workflows
Tools that help scientists:
• Automate repetitive or difficult work
• Provide reproducibility to their experiments
• Track provenance
• Share their data with other scientists
5. Example Workflow
http://www.myexperiment.org/workflows/140.html
6-8. Our Study
• How are workflows being used?
• How are they being shared?
• What sort of best practices can researchers follow to maximize the longevity and use of their work?
http://www.flickr.com/photos/eleaf/2536358399
10. Our Study
• www.myexperiment.org, est. 2007
• 5000+ users
• 2000+ workflows (mostly Taverna 1, Taverna 2, and RapidMiner)
• Minable RDF storage for workflows, groups, packs, users, and files
• Minable data gathered through the SCUFL XML language for the Taverna workflows
• Taverna 1: 479 workflows; Taverna 2: 684 workflows
11-12. Our Study
• We harvested information using a combination of SPARQL and Python (https://github.com/RichardLitt/Understanding-Workflows)
• Gathered user, workflow, file, pack, and group view and download statistics, metadata, descriptions, tags, and so on (http://thedatahub.org/dataset/myexperiment-screenscrape)
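The SPARQL-plus-Python harvesting approach can be sketched as follows. The actual scripts live in the linked GitHub repository; the query, predicates, and canned response below are illustrative assumptions, not myExperiment's real schema. Only the parsing of the standard W3C SPARQL JSON results format is shown, standing in for what an HTTP request to an endpoint would return.

```python
import json

# Hypothetical query; the class and predicate URIs are placeholders,
# not myExperiment's actual vocabulary.
QUERY = """
SELECT ?workflow ?title WHERE {
  ?workflow a <http://example.org/Workflow> ;
            <http://purl.org/dc/terms/title> ?title .
} LIMIT 10
"""

def parse_sparql_json(raw):
    """Parse the W3C SPARQL 1.1 JSON results format into plain dicts."""
    data = json.loads(raw)
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append({var: cell["value"] for var, cell in binding.items()})
    return rows

# Canned response in the standard results format, standing in for a
# live endpoint reply.
canned = json.dumps({
    "head": {"vars": ["workflow", "title"]},
    "results": {"bindings": [
        {"workflow": {"type": "uri", "value": "http://example.org/wf/140"},
         "title": {"type": "literal", "value": "Example Workflow"}},
    ]},
})

print(parse_sparql_json(canned))
# [{'workflow': 'http://example.org/wf/140', 'title': 'Example Workflow'}]
```

Flattening the bindings like this is what makes the RDF store "minable": the rows can be fed straight into counting and download-statistics scripts.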
13. Findings
• A large percentage of workflows consist of few components.
• The number of components ranges from 1 to 250; the average workflow supports 24.3 tasks.
• Complex workflows are downloaded more.
14. Findings
• Most workflow contributors submit a single workflow.
• Only 13 users have uploaded more than 30 workflows.
• Just over 5% of the users on myExperiment have uploaded workflows.
15. Findings
• Most workflows have only one version uploaded.
• When several versions do exist, the workflow is downloaded more frequently than “single-edition” workflows.
16. Findings
• Workflow use declined significantly a month after initial upload.
17-19. Findings
• A large percentage of workflow components – approx. 38% – are shims: components that make the output of one step conform to the format expected by a subsequent step.
• This is a problem for developers.
• This is 8% more than previous studies found (Lin et al.).
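To make the shim finding concrete, here is a minimal sketch of what a shim does: it carries no analysis of its own, only reshaping one step's output into the format the next step expects. The steps and formats are invented for illustration.

```python
import csv
import io

def step_a():
    """Upstream step: emits its results as CSV text."""
    return "sample,value\nA,1.5\nB,2.5\n"

def shim_csv_to_values(csv_text):
    """Shim: reshape CSV text into the list of floats step B expects.

    Pure format conversion; no scientific logic of its own.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row["value"]) for row in reader]

def step_b(values):
    """Downstream step: expects a plain list of numbers."""
    return sum(values)

print(step_b(shim_csv_to_values(step_a())))  # 4.0
```

When roughly 38% of components look like `shim_csv_to_values`, much of a workflow is plumbing rather than science, which is why the finding matters to workflow-system developers.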
20-22. Findings
• 60% of workflows have embedded workflows within them.
• Documentation on site (tags, descriptions) does not improve use…
• … but community engagement does.
23. Recommendations
Remember that workflows are evolving entities: they are updated in response to user feedback, engagement, and improvements in methodology.
24. Recommendations
Use relevant social annotation tools, but constrain them; for instance, through the use of a controlled tag vocabulary.
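A controlled tag vocabulary can be enforced with very little code. This sketch normalizes free-text tags and keeps only those in an approved set; the vocabulary below is invented for illustration, not drawn from myExperiment.

```python
# Hypothetical controlled vocabulary; a real one would be curated
# by the repository's community.
CONTROLLED_VOCAB = {"genomics", "text-mining", "visualization", "taverna"}

def validate_tags(tags):
    """Split free-text tags into (accepted, rejected) lists.

    Tags are normalized (trimmed, lowercased) before being checked
    against the controlled vocabulary.
    """
    accepted, rejected = [], []
    for tag in tags:
        norm = tag.strip().lower()
        (accepted if norm in CONTROLLED_VOCAB else rejected).append(norm)
    return accepted, rejected

ok, bad = validate_tags(["Genomics", "my-cool-workflow", "Taverna"])
print(ok)   # ['genomics', 'taverna']
print(bad)  # ['my-cool-workflow']
```

Rejected tags could still be queued for curator review, so the vocabulary can grow without losing the consistency that makes tags useful for search.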
28. Recommendations
Workflow re-use could benefit significantly from the assignment of stable identifiers, such as Digital Object Identifiers (DOIs).
29. Recommendations
Education is the key to more use: for instance, in professional society meetings, online courses, and undergraduate and graduate courses.
30. Impact on Science
Following these recommendations can help:
• Make science more efficient.
• Facilitate reproducible science.
• Support collaborative research.
• Speed up the peer review process.
• Increase your impact. (For instance, NSF has said these are valuable contributions.)
31. Links
• Mendeley Research Group: http://www.mendeley.com/groups/1189721/scientific-workflows-and-workflow-systems/
• GitHub: https://github.com/RichardLitt/Understanding-Workflows
• Data: http://thedatahub.org/dataset/myexperiment-screenscrape
• Notebook: https://notebooks.dataone.org/workflows
http://www.flickr.com/photos/wwworks/4759535950/