The document provides an introduction to open science and the European Open Science Cloud (EOSC). It discusses the concepts of open access, open data, open methods, and FAIR data principles. It describes the EOSC as a federation of research infrastructures and services that aims to enable multidisciplinary discovery and use. Key benefits of the EOSC for researchers include access to more services, funding for compute resources, easier discovery of related data, and greater collaboration abilities.
University of Liverpool Researcher KnowHow session presented by Judith Carr.
At the end of this session you will know what the FAIR data principles are, what is required and be in a position to think how these would relate to your research practice.
Introduction to Persistent Identifiers | www.eudat.eu | EUDAT
What are persistent identifiers? Why use persistent identifiers? Different persistent identifier systems; The HANDLE system; EPIC PID system; Policies; Use cases
Ver 2 July 2017
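As an illustrative sketch of the resolution idea behind the Handle and DOI systems covered in the EUDAT deck (the helper function and example handle below are assumptions for illustration, not from the slides), a persistent identifier is typically turned into a working link by prepending a resolver proxy:

```python
# Illustrative sketch: building resolver URLs for persistent identifiers.
# The proxy addresses are the public Handle and DOI proxies; the helper
# name and the non-DOI example identifier are invented for illustration.

HANDLE_PROXY = "https://hdl.handle.net/"
DOI_PROXY = "https://doi.org/"

def resolution_url(pid: str) -> str:
    """Return the HTTP URL at which a persistent identifier resolves.

    DOIs are handles whose prefix starts with '10.', so they are routed
    to the DOI proxy; other handles go to the general Handle proxy.
    """
    if pid.startswith("10."):
        return DOI_PROXY + pid
    return HANDLE_PROXY + pid

print(resolution_url("10.5281/zenodo.1065991"))
# https://doi.org/10.5281/zenodo.1065991
```

The point of the indirection is that the proxy can be repointed at a new location when the data moves, so citations keep working.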
An introduction to the FAIR principles and a discussion of key issues that must be addressed to ensure data is findable, accessible, interoperable and re-usable. The session explored the role of the CDISC and DDI standards for addressing these issues.
Presented by Gareth Knight at the ADMIT Network conference, organised by the Association for Data Management in the Tropics, in Antwerp, Belgium on December 1st 2015.
An overview on FAIR Data and FAIR Data stewardship, and the roadmap for FAIR Data solutions coordinated by the Dutch Techcentre for Life Sciences. This presentation was given at the Netherlands eScience Center's "Essential skills in data-intensive research" course week.
Open Data Institute Course - Open Data in a Day conducted by Registered ODI Trainer Ian Henshaw on October 14, 2015 in RTP, NC USA - Deck #1 Introduction to Open Data
The Data Plan as a Tool for Open Science | Lourdes Feria
Everything you need to know to prepare your research data plan: What type of data will you create? How will you document it? How will you look after sensitive data? What will you do with the data at the end? How will you share it? Every thesis writer and academic who produces scientific publications needs to know this tool.
Data Catalog as the Platform for Data Intelligence | Alation
Data catalogs are in wide use today across hundreds of enterprises as a means to help data scientists and business analysts find and collaboratively analyze data. Over the past several years, customers have increasingly used data catalogs in applications beyond their search & discovery roots, addressing new use cases such as data governance, cloud data migration, and digital transformation. In this session, the founder and CEO of Alation will discuss the evolution of the data catalog, the many ways in which data catalogs are being used today, the importance of machine learning in data catalogs, and discuss the future of the data catalog as a platform for a broad range of data intelligence solutions.
Presentation given at Macquarie University in support of the ARDC 'institutional role in the data commons' project on "Implementing FAIR: Standards in Research Data Management" https://ardc.edu.au/news/data-and-services-discovery-activities-successful-applicants/
Talk given at Fronteers 2015 in Amsterdam.
In a world where many of our digital spaces are becoming more closed than ever, open data is a concept that is rapidly on the rise.
In this talk we'll explore what open data is (and what it isn't), and why we should care about it. We'll look at how you can introduce it into your projects with regards to practical publication and consumption, and discuss some useful tools and reference points.
Open data isn't just dry and technical - it gives us great scope to be creative, and throughout this talk we'll go through some of the amazing things that it has been used for globally in the hope that it will inspire you to create something amazing yourself.
How a Semantic Layer Makes Data Mesh Work at Scale | DATAVERSITY
Data Mesh is a trending approach to building a decentralized data architecture by leveraging a domain-oriented, self-service design. However, the pure definition of Data Mesh lacks a center of excellence or central data team and doesn’t address the need for a common approach for sharing data products across teams. The semantic layer is emerging as a key component to supporting a Hub and Spoke style of organizing data teams by introducing data model sharing, collaboration, and distributed ownership controls.
This session will explain how data teams can define common models and definitions with a semantic layer to decentralize analytics product creation using a Hub and Spoke architecture.
Attend this session to learn about:
- The role of a Data Mesh in the modern cloud architecture.
- How a semantic layer can serve as the binding agent to support decentralization.
- How to drive self service with consistency and control.
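The Hub and Spoke idea above can be sketched in a few lines: a central team publishes shared metric definitions once, and domain teams reference them rather than re-implementing the logic. All metric, table, and function names below are invented for illustration and are not from the session:

```python
# Illustrative sketch of a semantic-layer idea: the "hub" team owns the
# metric definitions; "spoke" domain teams render them against their own
# tables, so every team computes the metric the same way.

SHARED_METRICS = {
    # metric name -> (business description, SQL expression)
    "active_users": ("Distinct users with an event in the period",
                     "COUNT(DISTINCT user_id)"),
    "revenue": ("Sum of completed order amounts",
                "SUM(amount) FILTER (WHERE status = 'completed')"),
}

def metric_sql(name: str, table: str) -> str:
    """Render the shared definition of a metric against a team's table."""
    _, expr = SHARED_METRICS[name]
    return f"SELECT {expr} FROM {table}"

# Two spoke teams reuse one definition, applied to their own tables:
print(metric_sql("active_users", "marketing.events"))
print(metric_sql("active_users", "product.events"))
```

The design choice the sketch illustrates is distributed ownership of data products with centralized ownership of definitions, which is the "binding agent" role the session ascribes to the semantic layer.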
Ethics of Big Data is about finding alignment between an organization's core values and their day-to-day actions in a way that balances risk and innovation. As Big Data brings business operations and practices deeper and more fully into individual lives, it is creating a forcing function that raises ethical questions about our values around concepts like identity, privacy, ownership, and reputation. How we understand those values and align them with our actions when innovating products and services using Big Data technologies benefits from a framework that provides a common vocabulary and encourages explicit discussion.
The material will address the intersection of ethics and Big Data; what it is and what it isn't. Specifically, how to approach and generate dialog about an abstract subject with direct, real-world implications. A general framework for talking about ethics in the context of Big Data will be introduced.
Aspects include:
1. Direct relevance to your data handling practices
2. How Big Data is influencing important concepts including identity, privacy, ownership, and reputation
3. Ethical Decision Points
4. Value Personas as a tool for encouraging discussion and generating agreement and alignment between values and actions
5. Balancing the benefits of Big Data innovation and the risks of harm
The webcast will present key concepts from the forthcoming book Ethics of Big Data.
Active Governance Across the Delta Lake with Alation | Databricks
Alation provides a single interface through which users and stewards can apply active and agile data governance across Databricks Delta Lake and the Databricks SQL Analytics Service. Understand how Alation can expand adoption of the data lake while enabling safe and responsible data consumption.
Juanjo Hierro - Introduction and overview of FIWARE Vision on Data Spaces.pdf | FIWARE
This session will bring you the opportunity to discover how FIWARE will make Data Spaces happen! Contents will give all the details and insights around the path taken in this strategic area. An introduction will provide the overall vision on Data Spaces, the status of the Data Spaces Business Alliance (DSBA) Technical Convergence activities, and initial considerations around the concept of FIWARE Data Space Connector, the first dataspace connector that will comply with the Data Space Business Alliance recommendations.
Different coordination and support actions of the Digital Europe Programme (DEP) in the Data Spaces domain will also be presented, as well as initial outputs from these projects. It will provide insights about the opportunities to influence and drive decisions within this important program of the European Union.
A series of presentations will deep dive into technical details about the minimum viable framework recommended in DSBA: the standards proposed and how they integrate together. Concretely, presentations will focus on the pillars linked to decentralized Trust, Identity & Access Management and the pillar for Data Value creation covering aspects for Monetization and Marketplace services.
Several presentations will tackle elements that open the discussion around the evolution of Data Spaces, as well as components expected to be integrated in the concept of Data Space Connector. They will be followed by use cases that provide insight on what is being developed and testimonies on how technologies based on Data Spaces concepts previously displayed are being used in real life scenarios.
RWDG Webinar: Data Steward Definition and Other Data Governance Roles | DATAVERSITY
The role of the Data Steward is critical to the success of a Data Governance program. There are several approaches to Stewardship, including assigning people to be Data Stewards, identifying existing Data Stewards, and recognizing Data Stewards according to their relationship to the data they define, produce, and use. However, Stewards are only one of several Data Governance roles that must be considered.
In this month’s RWDG webinar, Bob Seiner will discuss several approaches to defining the role of the Data Steward, as well as the other roles necessary for Data Governance program success. Data Governance roles must include operational, tactical, strategic, and supporting levels of responsibility. Spend an hour with Bob, where he will share a customizable Operating Model of Data Governance roles and responsibilities.
In this webinar, Bob will discuss:
• Several approaches to defining Data Stewards and Stewardship
• How to select the Stewardship approach that is right for you
• Different levels of Stewards required for a successful program
• An Operating Model of DG Roles that can be molded to fit in any culture
• Why the approach to defining DG roles can make or break the program
Data is everywhere, and delivering trustable data to anyone who needs it has become a challenge. But innovative technologies come to the rescue: through smart semantics, metadata management, auto-profiling, faceted search, and collaborative data curation there is a way to establish a Wikipedia-like approach for your data. Find out how Talend will help you operationalize more data faster and increase data usage for everyone with an Enterprise Data Catalog.
The ability to continuously innovate is crucial for business growth – and often necessary for survival. Leaders in an uncertain and fast-paced global business regularly seek innovation to revitalise rigid business models and processes. However, they are aware that ‘innovation is hard’ and fraught with uncertainty. I contend that Big Data Analytics – in addition to its many other business benefits – can guide the innovation process to make it more efficient, effective and predictable.
Big Data Analytics promotes the application of a data-driven mindset that ‘listens to the data’ for new insights and disrupts entrenched thinking that hinders innovation. It applies what-if analysis to assess impact of new ideas on key business metrics and uses evidence-based business performance analysis to track the impact of innovation. Integrating Big Data Analytics into the business planning and operational processes provides valuable feedback loops and enables an adaptive innovation process.
In short, Big Data Analytics can spark innovation, guide its refinement and adoption processes and sustain its ongoing implementation.
Data Catalogs Are the Answer – What is the Question? | DATAVERSITY
Organizations with governed metadata made available through their data catalog can answer questions their people have about the organization’s data. These organizations get more value from their data, protect their data better, gain improved ROI from data-centric projects and programs, and have more confidence in their most strategic data.
Join Bob Seiner for this lively webinar where he will talk about the value of a data catalog and how to build the use of the catalog into your stewards’ daily routines. Bob will share how the tool must be positioned for success and viewed as a must-have resource that is a steppingstone and catalyst to governed data across the organization.
Data Catalog for Better Data Discovery and Governance | Denodo
Watch full webinar here: https://buff.ly/2Vq9FR0
Data catalogs are en vogue, answering critical data governance questions like “Where all does my data reside?” “What other entities are associated with my data?” “What are the definitions of the data fields?” and “Who accesses the data?” Data catalogs maintain the necessary business metadata to answer these questions and many more. But that’s not enough. To be useful, data catalogs need to deliver these answers to business users right within the applications they use.
In this session, you will learn:
*How data catalogs enable enterprise-wide data governance regimes
*What key capability requirements should you expect in data catalogs
*How data virtualization combines dynamic data catalogs with delivery
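The governance questions quoted in the abstract above map directly onto the business metadata a catalog entry holds. A minimal sketch, with all dataset, field, and consumer names invented for illustration (this is not Denodo's data model):

```python
# Illustrative sketch of catalog business metadata: one entry records
# where the data resides, what its fields mean, and who accesses it --
# the questions the webinar says a catalog must answer.
from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    name: str
    location: str                                   # where the data resides
    field_definitions: dict                         # business definitions
    consumers: list = field(default_factory=list)   # who accesses the data

catalog = {
    "orders": CatalogEntry(
        name="orders",
        location="warehouse.sales.orders",
        field_definitions={"amount": "Order total in EUR, incl. VAT"},
        consumers=["finance-dashboard", "churn-model"],
    )
}

entry = catalog["orders"]
print(entry.location)   # warehouse.sales.orders
print(entry.consumers)  # ['finance-dashboard', 'churn-model']
```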
An open science presentation focusing on the benefits to be gained and basic practices to follow. This was given on behalf of FOSTER at the Open Science Boos(t)camp event at KU Leuven on 24th October 2014.
An introduction to open science, why it's important and how to do it. This presentation was given at the European Medical Students Association (EMSA) event, 'Open Access in Action' in Berlin on 14th-15th September 2015
Presentation investigating the state of FAIR practice and what is needed to turn FAIR data into reality, given at the Danish FAIR conference in Copenhagen on 20th November 2018. https://vidensportal.deic.dk/en/Programme/FAIR_Toolbox_Nov2018 The presentation reflects on recent FAIR studies and international initiatives and outlines the recommendations emerging from the European Commission's FAIR Data Expert Group report - http://tinyurl.com/FAIR-EG
On November 21st 2014 at the Tufts University Medford campus and November 25th 2014 at the campus of the University of Massachusetts Medical School in Worcester, the BLC and Digital Science hosted a workshop focused on better understanding the research information management landscape.
Mark Hahnel, CEO of Figshare discussed more specific aspects of the research data management landscape and various approaches to address the growing suite of mandates.
Open Data in a Big Data World: easy to say, but hard to do? | LEARN Project
Presentation at 3rd LEARN workshop on Research Data Management, “Make research data management policies work”
Helsinki, 28 June 2016, by Sarah Callaghan, STFC Rutherford Appleton Laboratory
A presentation offering an introduction to managing and sharing research data given at the Czech Open Science days as part of the EC-funded FOSTER project.
FAIR Data in trustworthy repositories: the basics | OpenAIRE
This video illustrates how certified digital repositories contribute to making and keeping research data findable, accessible, interoperable and reusable (FAIR). Trustworthy repositories support Open Access to data, as well as Restricted Access when necessary, and they offer support for metadata, sustainable and interoperable file formats, and persistent identifiers for future citation. Presented by Marjan Grootveld (DANS, OpenAIRE).
Main references
• Core Trust Seal for trustworthy digital repositories: https://www.coretrustseal.org/
• EUDAT FAIR checklist: https://doi.org/10.5281/zenodo.1065991
• European Commission’s Guidelines on FAIR data management: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• FAIR data principles: www.force11.org/group/fairgroup/fairprinciples
• Overview of metadata standards and tools: https://rdamsc.dcc.ac.uk/
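In the spirit of the EUDAT FAIR checklist referenced above, the elements the video highlights (persistent identifiers, licences, interoperable file formats) can be checked mechanically against a metadata record. This is an illustrative sketch, not the checklist itself; the criteria, field names, and format list are assumptions:

```python
# Illustrative sketch: flag the FAIR-relevant elements missing from a
# metadata record. The chosen criteria and the "sustainable formats"
# set are invented for illustration.
SUSTAINABLE_FORMATS = {"csv", "json", "xml", "txt", "pdf"}

def fair_gaps(record: dict) -> list:
    """Return the FAIR-relevant elements missing from a metadata record."""
    gaps = []
    if not record.get("identifier"):
        gaps.append("persistent identifier (findable/citable)")
    if not record.get("license"):
        gaps.append("licence (reusable)")
    if record.get("format", "").lower() not in SUSTAINABLE_FORMATS:
        gaps.append("sustainable file format (interoperable)")
    return gaps

record = {"identifier": "10.5281/zenodo.1065991", "format": "csv"}
print(fair_gaps(record))   # ['licence (reusable)']
```

A certified repository performs checks of this kind at deposit time, which is how it keeps data FAIR over the long term rather than relying on each depositor.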
How open data contribute to improving the world. The life science use case. The technical, social, ethical issues.
This was a talk given within the iGEM 2020 programme by the London Imperial College students group (https://2020.igem.org/Team:Imperial_College), in a webinar organised by the SOAPLab group on the topic of Ethics of Automation. Dr Brandon Sepulvado was the other speaker of the day.
Keynote presentation given at the Data Fellows 2023 workshop in Berlin on 22-23 June. Presentation gives examples of good communication to explain data management concepts and how to use games and other forms of interactivity in training events
Presentation given at the DMPonline 10 year anniversary week, reflecting on lessons learned developing the business model. See https://www.dcc.ac.uk/events/dmponline-10th-year-anniversary-celebration-week and #10yearsDMPonline
Keynote presentation given at the 10th anniversary of the 4TU.researchdata repository https://data.4tu.nl/info/en/news-events/training-events/news-item/4turesearchdatas-role-in-fostering-open-science-10th-anniversary-celebration-29-sep-2020-1530-1730-c/
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf | Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... | Neo4j
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... | James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
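The deployment bill of materials (DBOM) mentioned above can be pictured as a record of what was deployed, where, and with which artifact digests. The record structure and names below are assumptions for illustration, not the OpsMx format:

```python
# Illustrative sketch of capturing a deployment bill of materials (DBOM):
# each deployed artifact is recorded with a content digest so the exact
# bits running in an environment can later be verified.
import hashlib
import json

def artifact_digest(content: bytes) -> str:
    """Content-address an artifact with a SHA-256 digest."""
    return "sha256:" + hashlib.sha256(content).hexdigest()

def dbom_record(service: str, environment: str, artifacts: dict) -> str:
    """Serialize a deployment record with a digest per artifact."""
    return json.dumps({
        "service": service,
        "environment": environment,
        "artifacts": {name: artifact_digest(blob)
                      for name, blob in artifacts.items()},
    }, sort_keys=True)

record = dbom_record("payments", "production",
                     {"app.jar": b"example-bytes"})
print(record)
```

Because the digests are content-derived, comparing a fresh digest of what is running against the stored DBOM detects tampering or drift in production.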
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
20 Comprehensive Checklist of Designing and Developing a WebsitePixlogix Infotech
Dive into the world of Website Designing and Developing with Pixlogix! Looking to create a stunning online presence? Look no further! Our comprehensive checklist covers everything you need to know to craft a website that stands out. From user-friendly design to seamless functionality, we've got you covered. Don't miss out on this invaluable resource! Check out our checklist now at Pixlogix and start your journey towards a captivating online presence today.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Climate Impact of Software Testing at Nordic Testing Days
Introduction to Open Science and EOSC
1. Introduction to Open Science and EOSC
www.geant.org
Sarah Jones
EOSC Engagement Manager
sarah.jones@geant.org
Twitter: @sarahroams
Predictive Epigenetics PEP-NET training network
1st April 2020
4. Defining Open Science
"science carried out and communicated in a manner which allows others to contribute, collaborate and add to the research effort, with all kinds of data, results and protocols made freely available at different stages of the research process."
Research Information Network, Open Science case studies
www.rin.ac.uk/our-work/data-management-and-curation/open-science-case-studies
7. Open access to publications
• Free, immediate, online access to the results of research
• Two routes to make sure anyone can access your papers
 – Gold route: paying APCs (article processing charges) so the publisher makes the copy open
 – Green route: self-archiving an Open Access copy in a repository
• Find out what your publisher allows on SHERPA RoMEO
 – www.sherpa.ac.uk/romeo
8. Open data
"Open data and content can be freely used, modified and shared by anyone for any purpose"
http://opendefinition.org
Tim Berners-Lee's proposal for five star open data (http://5stardata.info):
1. make your stuff available on the Web (whatever format) under an open licence
2. make it available as structured data (e.g. Excel instead of a scan of a table)
3. use non-proprietary formats (e.g. CSV instead of Excel)
4. use URIs to denote things, so that people can point at your stuff
5. link your data to other data to provide context
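The jump from two to three stars is often just a serialisation change. A minimal sketch in Python, using only the standard library (the measurement records are invented for illustration):

```python
import csv
import io

# Hypothetical records that might otherwise be shared as a scanned
# table (one star) or a proprietary spreadsheet (two stars).
rows = [
    {"site": "A", "year": 2020, "temp_c": 9.4},
    {"site": "B", "year": 2020, "temp_c": 11.2},
]

def to_csv(records):
    """Serialise records to CSV, an open, non-proprietary format (three stars)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["site", "year", "temp_c"])
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

print(to_csv(rows))
```

Four and five stars would then add stable URIs for the things described and links out to related datasets, which are publishing decisions rather than code.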
9. Open methods
• Documenting and sharing workflows and methods
• Sharing code and tools to allow others to reproduce work
• Using web-based tools to facilitate collaboration and interaction from the outside world in your research
• Using tools like MyExperiment and Taverna
10. Reliance on specialist research software
[Chart: results of a survey of researchers from 15 UK Russell Group universities, conducted by the SSI between August and October 2014 (DOI: 10.5281/zenodo.14809). It asked "Do you use research software?" and "What would happen to your research without software?", and reported the proportions (56% and 71% shown) who develop their own software and who have no formal software training.]
Slide from Neil Chue-Hong, Software Sustainability Institute
12. Degrees of openness
• Open: content that can be freely used, modified and shared by anyone for any purpose (five star open data)
• Restricted: limits on who can use the data, how or for what purpose – charges for use, data sharing agreements, restrictive licences, peer-to-peer exchange, …
• Closed: unable to share, or under embargo
13. And what is FAIR?
• FAIR ≠ Open
• FAIR ensures data can be found, understood and reused
• Data can be shared under restrictions and still be FAIR
"As open as possible, as closed as necessary"
Image CC-BY-SA by SangyaPundir; image CC-BY by European Commission FAIR data expert group
14. What FAIR means: 15 principles
Findable
F1. (meta)data are assigned a globally unique and eternally persistent identifier.
F2. data are described with rich metadata.
F3. (meta)data are registered or indexed in a searchable resource.
F4. metadata specify the data identifier.
Accessible
A1. (meta)data are retrievable by their identifier using a standardized communications protocol.
A1.1 the protocol is open, free, and universally implementable.
A1.2 the protocol allows for an authentication and authorization procedure, where necessary.
A2. metadata are accessible, even when the data are no longer available.
Interoperable
I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (meta)data use vocabularies that follow FAIR principles.
I3. (meta)data include qualified references to other (meta)data.
Reusable
R1. meta(data) have a plurality of accurate and relevant attributes.
R1.1. (meta)data are released with a clear and accessible data usage license.
R1.2. (meta)data are associated with their provenance.
R1.3. (meta)data meet domain-relevant community standards.
Slide CC-BY by Erik Schultes, Leiden UMC; doi: 10.1038/sdata.2016.18
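Principles F1 and A1 combine neatly in practice: a DOI is a globally unique identifier, and prepending the doi.org resolver turns it into a retrieval URL over a standard protocol (HTTPS). A small offline sketch; the regex below is a deliberately simplified approximation of DOI syntax, not the full specification:

```python
import re

# Simplified DOI shape: "10.<registrant>/<suffix>" (an assumption, not
# the complete DOI syntax rules).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_resolver_url(doi):
    """Build the canonical resolver URL for a DOI.

    This illustrates principle A1: (meta)data are retrievable by their
    identifier using a standardized communications protocol.
    """
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"

# The FAIR principles paper itself, as cited on the slide:
print(doi_resolver_url("10.1038/sdata.2016.18"))
```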
15. The FAIR data principles explained
• Clarifications from GO FAIR
• Each principle is a link to further clarification, examples and context
https://www.go-fair.org/fair-principles
Example – R1. Meta(data) are richly described with a plurality of accurate and relevant attributes:
• By giving data many 'labels', it will be much easier to find and reuse the data.
• Provide not just metadata that allows discovery, but also metadata that richly describes the context under which that data was generated.
• "Plurality" indicates that metadata should be as generous as possible, even to the point of providing information that may seem irrelevant.
16. FAIR data checklist
• Findable
 – Persistent identifier
 – Metadata online
• Accessible
 – Data online
 – Restrictions where needed
• Interoperable
 – Use standards, controlled vocabs
 – Common (open) formats
• Reusable
 – Rich documentation
 – Clear usage licence
https://doi.org/10.5281/zenodo.5111307
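The checklist items can all be captured in a small machine-readable metadata record. A sketch below; the field names are our own shorthand, not a formal schema such as DataCite or Dublin Core, and the dataset URL is invented (the identifier is the checklist's own DOI from the slide):

```python
import json

# Illustrative record covering the checklist: identifier (Findable),
# access URL (Accessible), open format (Interoperable), licence and
# description (Reusable).
record = {
    "identifier": "https://doi.org/10.5281/zenodo.5111307",
    "title": "Example dataset",
    "description": "Rich documentation of how the data were produced.",
    "access_url": "https://example.org/data.csv",   # hypothetical
    "format": "text/csv",
    "license": "CC-BY-4.0",
}

# Serialising to JSON keeps the metadata itself in an open,
# machine-readable format.
metadata_json = json.dumps(record, indent=2)
print(metadata_json)
```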
17. FAIR is nothing new
• Various research communities have been sharing their data in a 'FAIR' way long before the term emerged
• Meaningful and memorable articulation of concepts
• Natural desire to want to be 'fair'
• FAIR is gaining significant international traction
20. Get a citation advantage
A study that analysed the citation counts of 10,555 papers on gene expression studies that created microarray data showed that "studies that made data available in a public repository received 9% more citations than similar studies for which the data was not made available".
Data reuse and the open data citation advantage, Piwowar, H. & Vision, T., https://peerj.com/articles/175
21. Increased use and economic benefit
The case of NASA Landsat satellite imagery of the Earth's surface (http://earthobservatory.nasa.gov/IOTD/view.php?id=83394&src=ve):
Up to 2008:
• Sold through the US Geological Survey for US$600 per scene
• Sales of 19,000 scenes per year
• Annual revenue of $11.4 million
Since 2009:
• Freely available over the internet; Google Earth now uses the images
• Transmission of 2,100,000 scenes per year
• Estimated to have created value for the environmental management industry of $935 million, with direct benefit of more than $100 million per year to the US economy
• Has stimulated the development of applications from a large number of companies worldwide
22. Funder imperatives...
"Open Research Europe requires open access to research data supporting articles under the principle 'as open as possible, as closed as necessary', according to the policy of Horizon Europe. Data should be deposited in trusted data repositories."
https://open-research-europe.ec.europa.eu/for-authors/data-guidelines#opendata
23. But there are also opportunity costs
By Emilio Bruna: http://brunalab.org/blog/2014/09/04/the-opportunity-cost-of-my-openscience-was-35-hours-690
For his paper he calculated the following:
1. Double-checking the main dataset and reformatting to submit to Dryad: 5 hours
2. Creating the complementary file and preparing metadata: 3 hours
3. Submission of these two files and the metadata to Dryad: 45 minutes
4. Preparing a map of the locations: 1 hour
5. Submission of the map to Figshare: 15 minutes
6. Cleaning up and documenting the code, uploading it to GitHub: 25 hours
7. Cost of archiving in Dryad: US$90
8. Page charges: $600
24. FAIR and Open both central to EOSC
• EC and Member States committed to FAIR and Open
• Pursue this in research policy and grant conditions
• Lots of investment in infrastructure to support data sharing
• Ultimately supports the science ecosystem and ensures greater return on investment
27. Large EC initiative
• Collaboration between the European Commission and Member States to "make Open Science the new normal"
• Established the EOSC Association as a legal entity to govern and oversee the implementation
• Huge investment in infrastructure – €350 million in the initial development phase and at least €1 billion co-investment foreseen for the next 7 years
[Diagram: governance structure linking the EOSC Association, Steering Board and European Commission]
28. Long history of political agreements and activity
Lots of groundwork since 2015:
• Council Conclusions
• Expert Group reports
• EC documents
• Major investment in EOSC-related projects to develop the infrastructure and platform
30. The EOSC platform
• A web of FAIR data and services
• Federation of eInfra and Research Infrastructures (RIs)
• Environment in which data can be brought together with services to perform analyses and address societal challenges
32. FAIR is central to principles in EOSC
• Is the glue that connects data and services
• Requirement for FAIR to support reuse
• Use community standards
• Share all types of output (openly)
35. EOSC Portal
• Currently the primary resource for navigating EOSC: https://eosc-portal.eu
• Includes a virtual tour for new users
• The catalogue and marketplace is how you discover, access and compose resources
37. Access to free storage, compute and support services
C-SCALE will federate compute and data resources from the Copernicus DIAS, the national Collaborative Ground Segments and the European Open Science Cloud (EOSC) towards a European open source Big (Copernicus) Data Analytics platform:
• Storage services: up to 12 PB
• Cloud services: up to 17,728,500 CPU hours
• HPC/HTC services: up to 3,100,000 CPU hours
• GPU services: up to 6,000 GPU hours
DICE makes available a set of data management services (and associated resources) for researchers and research communities from any scientific domain, including:
• Data archives (up to 25 PB)
• Policy-based data archives (up to 17 PB)
• Personal and project workspaces (up to 5 PB)
• Data repository services for data sharing (up to 8 PB)
• Data discovery services (with PID and DOI services and metadata harvesting)
EGI-ACE will deliver the EOSC Compute Platform and will contribute to the EOSC Data Commons. Services offered include compute and storage resources, compute platform services, data management services and related user support and training. The total capacity that EGI-ACE makes available through the call between 2021-2023 is:
• 80,000,000 CPU hours
• 250,000 GPU hours
• 20 PB storage
Further support services include:
• Support for the Argos DMP service by drafting discipline-specific DMPs; Horizon Europe DMP support
• Setting up your own community research gateway (connect.openaire.eu) and Zenodo communities
• Access to open science metrics for your projects, institution or community
• A service to anonymise your data and comply with GDPR
• Support and mentoring on Horizon Europe open access mandates
Another provider offers three core services for Research Lifecycle Management:
• ROHub: a tool to facilitate the exchange of information across the scientific community
• Text Enrichment and Mining: a service which automatically extracts valuable information and metadata from bibliographic sources and other text documents
• Datacube technology for Earth Observation (EO) data management: efficient access to extensive collections of multi-temporal and multi-dimensional EO imagery, also allowing interoperability among the different information layers
https://marketplace.eosc-portal.eu
39. Recommendations for users
EOSC Future is using AI techniques to make recommendations to users:
• relevant projects, data, publications, training materials
• potential collaborators (people, task forces, communities)
Recommendations are based on:
• viewing history
• order history
• general popularity
• popularity among users with a similar background/interests
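The "popularity among users with a similar background" signal above can be sketched in a few lines. This is a toy illustration only, not the EOSC Future recommender: the view records, interest tags and resource names are all invented.

```python
from collections import Counter

# Invented view log: (user, interest tag, resource viewed).
views = [
    ("alice", "life-sciences", "dataset-genomes"),
    ("bob",   "life-sciences", "dataset-genomes"),
    ("bob",   "life-sciences", "training-fair"),
    ("carol", "physics",       "dataset-lhc"),
]

def recommend(interest, seen=(), top_n=2):
    """Rank resources by view count among users sharing an interest tag,
    excluding anything the current user has already seen."""
    counts = Counter(res for _, tag, res in views
                     if tag == interest and res not in seen)
    return [res for res, _ in counts.most_common(top_n)]

print(recommend("life-sciences"))
```

A production system would blend this signal with the others listed on the slide (order history, general popularity, the user's own viewing history).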
40. Benefits of EOSC for researchers
• Federated identity management – ease of single sign-on
• Access to a greater number of services
• Funding provided to pay for compute, e.g. EGI-ACE, DICE
• Discovery of related data from other disciplines/sectors
• Greater ability to collaborate and address key research questions
43. How to make data open?
1. Choose your dataset(s)
 – What data can you make open? You may need to revisit this step if you encounter problems later.
2. Apply an open licence
 – Determine what IP exists. Apply a suitable licence, e.g. CC-BY.
3. Make the data available
 – Provide the data in a suitable format. Use repositories.
4. Make it discoverable
 – Post on the web, register in catalogues…
https://okfn.org
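Steps 2 and 3 above can be sketched as a small packaging helper: put the data in an open format next to an explicit licence statement, ready for deposit. File names here are illustrative only; repository deposit (step 3) and catalogue registration (step 4) happen on the repository's side.

```python
from pathlib import Path

def publish_open_data(out_dir, csv_text, licence="CC-BY-4.0"):
    """Write a dataset in an open format (CSV) alongside a clear
    licence statement, as a minimal open-data package."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "data.csv").write_text(csv_text)
    (out / "LICENSE.txt").write_text(
        f"This dataset is released under the {licence} licence.\n")
    return sorted(p.name for p in out.iterdir())

print(publish_open_data("open_dataset", "site,temp_c\nA,9.4\n"))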
47. How to select a repository?
• Look for provision from your community, university, publisher, funder, etc.
• Check they match your particular data needs, e.g. formats accepted; mixture of Open and Restricted Access.
• See if they provide guidance on how to cite the deposited data.
• Do they assign a persistent and globally unique identifier for sustainable citations and for links back to particular researchers and grants?
• Look for certification as a 'Trustworthy Digital Repository' with an explicit ambition to keep the data available in the long term.
48. Use metadata standards
Metadata Standards Directory
• Broad, disciplinary listing of standards and tools; maintained by an RDA group
• http://rd-alliance.github.io/metadata-directory
FAIRsharing
• A portal of data standards, databases, and policies
• Focused on life, environmental and biomedical sciences
• https://fairsharing.org
49. Choose appropriate file formats
If you want your data to be re-used and sustainable in the long term, you typically want to opt for open, non-proprietary formats.
• Tabular data – recommended: CSV, TSV, SPSS portable; avoid for data sharing: Excel
• Text – recommended: plain text, HTML, RTF (PDF/A only if layout matters); avoid: Word
• Media – recommended: MP4 or Ogg containers with Theora, Dirac or FLAC codecs; avoid: QuickTime, H.264
• Images – recommended: TIFF, JPEG2000, PNG; avoid: GIF, JPG
• Structured data – recommended: XML, RDF; avoid: RDBMS
Further examples: https://ukdataservice.ac.uk/learning-hub/research-data-management/format-your-data/recommended-formats
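The guidance above amounts to a simple lookup table. A sketch that flags file extensions the slide advises against sharing; the mapping is transcribed from the list above and would need extending for a real discipline-specific policy:

```python
from pathlib import Path

# Extensions the sharing guidance advises against, with the
# recommended alternative (transcribed from the format list above).
AVOID = {
    ".xls":  "use CSV or TSV",
    ".xlsx": "use CSV or TSV",
    ".doc":  "use plain text, HTML or RTF",
    ".docx": "use plain text, HTML or RTF",
    ".gif":  "use TIFF, JPEG2000 or PNG",
    ".jpg":  "use TIFF, JPEG2000 or PNG",
    ".mov":  "use an MP4 or Ogg container",
}

def sharing_advice(filename):
    """Return format advice for a file based on its extension."""
    ext = Path(filename).suffix.lower()
    return AVOID.get(ext, "format looks fine for sharing")

print(sharing_advice("results.xlsx"))
```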
51. More on life science tools and infrastructure coming up in Susanna's talk
Image: Sangharsh Lohakare https://unsplash.com/photos/Iy7QyzOs1bo
Journal prices have outpaced inflation by more than 250% over the past 30 years.
There are 15 entire disciplines where the average price of one journal for one year is over £1,000 (chemistry £4,227, physics £3,229). One journal, Tetrahedron, costs over £40,000.
It is irrational that scientists are paid by governments to do research and the papers are then locked away behind paywalls. Journals don't do the research, employ the people or pay the reviewers.
In the last four years, we have investigated and understood the challenges of the UK research community.
Anecdotally, people working in this area had a lot of evidence that researchers relied on software, but no studies had been conducted. So we did this ourselves.
Two questions were of interest: do you use software, and, possibly more important, what would happen to your research without software? The answers suggest around 170,000 researchers in the UK could not conduct their research without software.
This is more than just a reliance on Word or web browsers – specialist software is written into the research workflows of people from psychology to physics, from the life sciences to literature. The reliance isn't confined to the "traditionally" computationally intensive subjects; it's a feature of all disciplines.
This also means that around 140,000 researchers are relying on their own coding skills.
Certain research communities have also seen the benefit of sharing data as it speeds up the process of discovery. This article shows how researchers in the field of Alzheimer’s research have agreed as a community to share data immediately to make scientific breakthroughs.
There’s also a citation advantage for individual researchers. This study by Heather Piwowar and Todd Vision looked at 10,555 paper of gene expression studies that had shared the associated microarray data. Those studies that shared data received 9% more citations.
There’s also an economic benefit, as seen by the case of the NASA landsat satellite images. These were sold until 2008 for $600 a scene. Now they’re freely available and used by Google Earth. Previously they sold 19,000 images a year, whereas now they transmit 2.1 million. The revenue has gone up incredibly too from $11.4 million to an estimated value of $935 million with direct benefit of more than $100 million. The release has also stimulated the development of applications from companies worldwide.
This case study comes from the Royal Society Report on Science as an Open Enterprise.
The background to this is about making the most of the data that has been created through publicly funded research. The guidelines speak of:
Improved quality of results
Greater efficiency
Faster to market = faster growth
Improved transparency of the scientific process
It’s not all positive though – otherwise why isn’t everyone already doing this? There is a certain amount of effort and cost to open science, which this blog post by Emilio Bruna highlights. He calculated the cost of sharing his data for one paper and came to a total of 35 hours and $690. He breaks this down into the cost of preparing the dataset, creating complementary metadata and associated files, cleaning up and documenting the code (which involves a big mental leap), and the charges applied.
What exactly EOSC is remains a question we are asking ourselves, but some commonality of vision is emerging.
I like this picture as it represents some of that for me:
Federation of services
Interconnecting / interoperable
User in the centre
Greenfield site? Open to ideas / creativity?
Guidance from the DCC can also help researchers to understand data licensing. This guide outlines the pros and cons of each approach e.g. the limitations of some CC options
The OA guidelines under Horizon 2020 point to CC-0 or CC-BY as a straightforward and effective way to make it possible for others to mine, exploit and reproduce the data. See p11 at: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf