This document outlines an agenda for a data management training session. The full-day session will cover basics in the morning, advanced topics after lunch, and end with a question and answer period and required homework. Attendees will learn about account creation and login procedures for various research platforms, file labeling standards, and data management best practices including uploading, downloading, sharing and archiving data throughout its lifecycle. The document provides details on specific topics to be covered as well as templates and guidelines for research activities like field and column experiments.
This document provides an agenda for Part II of an SPP 2089 data management training. The agenda includes topics such as troubleshooting common data upload issues, improving dataset quality, and attaching metadata to data. Techniques for updating datasets, ensuring data consistency and completeness, linking related datasets, and adding explanatory information to datasets are discussed. The training emphasizes using the BEXIS2 data management platform to properly store, organize, and document research data over the full data lifecycle in accordance with SPP 2089 guidelines.
5. 5
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
6. 6
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
7. 7
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
9. 9
Login to SPP website
https://www.ufz.de/spp-rhizosphere/index.php?en=43202
10. 10
Login to SPP website
https://www.ufz.de/spp-rhizosphere/index.php?en=43517
Username: first letter of the given name + family name
Password: 3 letters and 3 numbers
20. 20
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
22. Universal labelling code
Please use the universal labelling code.
Labelling of samples
• Substrate: loam = L, sand = S (don't use "clay", "sandy loam", …)
• Genotype: wild type = WT, hairless mutant = rth3 (don't use "mutant +/-", …)
• Biological replicates: REP1, REP2, REP3, …
22
23. Universal labelling code
• Labelling of soil column experiments
[projectnumber_SCE#_C#];
• Soil column experiment = SCE01, …
• Column in the experiment = C01
• Example: P21_SCE01_C01
• Labelling of sampling campaigns in the soil plot experiment
[projectnumber_SPE_sampling date_FP#_type of sample#];
• Field plot = FP01, FP02, …
• Depth 0-20 cm = D00_20
• Sampling of several points within each plot = a, b, c
• Example: P21_SPE_20181105_FP01_UC#
• You may extend the name by providing further details if required (e.g. bulk/rhizosphere/rhizoplane); a small sketch after this slide shows how such labels can be built and checked automatically.
• If you extend the details, communicate that to your cooperation partners.
23
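Because the labelling code is just a convention on strings, it can be checked automatically. The following R sketch only illustrates that idea; the helper names and the regular expression are my own, not part of the SPP 2089 guidelines:

# Build a soil column experiment label [projectnumber_SCE#_C#],
# e.g. make_sce_label(21, 1, 1) returns "P21_SCE01_C01"
make_sce_label <- function(project, experiment, column) {
  sprintf("P%d_SCE%02d_C%02d", project, experiment, column)
}

# Check that a label starts with the agreed pattern; extensions such as
# "_rhizosphere" after the core pattern remain allowed
is_valid_sce_label <- function(label) {
  grepl("^P[0-9]+_SCE[0-9]{2}_C[0-9]{2}", label)
}

make_sce_label(21, 1, 1)                         # "P21_SCE01_C01"
is_valid_sce_label("P21_SCE01_C01_rhizosphere")  # TRUE
is_valid_sce_label("P21_column1")                # FALSE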
24. Planned SPE
We are planning to sow the maize on the 26th and 27th of April 2022.
Furthermore, we are planning 4 samplings this year:
BBCH14: 8th-10th of June
BBCH19: 29th of June - 1st of July
BBCH59: 10th-12th of August
BBCH83: 28th-30th of September
The final harvest is planned for the 12th-14th of October 2022.
SPE_annual variation in precipitation
SPE_legacy
24
25. Planned SCE
Please provide a date and description so that other projects can join
SCE_drought P7, P3, P6
SCE_drought-long P3, P6
SCE_compaction P21
SCE_contact
SCE_biopore
SCE_decay P19, P21
SCE_mucilage P4
SCE_nutrient deficiency P7
25
26. New members of staff
Please provide the following information for each new member of staff
• Name
• Email address
• Postal address
• Position (PhD, PostDoc, PI…)
• Photo
Please inform the coordination if a member of staff
• leaves the project permanently or for a longer period of time
• moves to a new institute
• changes their name or email address
26
28. 28
Part I: Basic information
• Welcome (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
29. 29
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
30. 30
Research data
It is data collected or produced in the course of scientific research activities and used as evidence in the research process, or commonly accepted in the research community as necessary to validate research findings and results (European Open Science Cloud [1]).
Research data might include measurement data, laboratory values, audiovisual information, texts, survey data, objects from collections, or samples that were created, developed or evaluated during scientific work. Methodical forms of testing such as questionnaires, software and simulations may also produce important results for scientific research and should therefore also be categorized as research data (DFG Guidelines on the Handling of Research Data [2]).
31. 31
Research data life cycle
• Generate: data samples from the field; data collected from sensors or devices, such as CT scan images.
• Store: write data in a notebook, enter data in an Excel sheet, record data on a hard drive, … Be aware of using the agreed labelling method.
• Use: access, study, or process data to do analysis and draw conclusions.
• Share: share data amongst internal colleagues or partners outside of your organization, e.g. with SPP 2089 colleagues. Use email, a transfer site like NextCloud, or a hard drive.
• Archive: store data in a remote and secure location, like a data repository or a hard drive located in the library, to keep data safe for a long time.
• Destroy: wrong or out-of-date data must be permanently erased. Note: end-of-life data destruction is the responsibility of all stakeholders.
32. 32
Why use a data management platform (DMP)?
1. A DMP supports data throughout its life cycle.
2. All components of the research process must be available to ensure transparency, reproducibility, and reusability [3].
3. A DMP gathers research data in one place and keeps it usable for a long time.
4. A DMP has to deal with security and privacy concerns because it collects private data.
5. Using a data management system is a DFG requirement, and it is mentioned in the SPP 2089 bylaws.
34. 34
Why use BEXIS2?
BEXIS2 is free and open-source software that supports researchers in managing their data throughout the data life cycle, from storing to sharing research data [4].
• Generate/Store: start storing data in BEXIS2 at this point of your work.
• Use: BEXIS2 keeps track of the evolution of a dataset and can return to any previous version if needed.
• Share: data security is a major concern for BEXIS2. It specifies fine-grained permissions on who can view, access, or update a dataset.
• Archive: BEXIS2 can be used for long-term data archiving, even to meet publication requirements. In the near future, you will be able to get a DOI for each dataset.
• Destroy: the BEXIS2 administrator can remove incorrect or useless data permanently. Of course, this requires special permission from the data owner.
36. 36
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
37. 37
Data store workflow in BEXIS2
(Life cycle stage: Store)
1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload a data table
2. Upload a small file
3. Upload big files
38. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
38
Data store workflow
43. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
43
Data store workflow
44. The metadata structure is designed for SPP 2089 purposes.
A minimum set of meta information is required.
44
2. Provide the metadata
46. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
46
Data store workflow
47. Example of a data table
47
3. Design the data structure
48. Example of a data structure
48
3. Design the data structure
49. BEXIS2 assigns an empty data structure to a dataset.
49
3. Design the data structure
51. Select a variable template
51
3. Design the data structure
Check
• Description
• Unit
• Data type
52. Search for an existing variable template
52
3. Design the data structure
Search for
• Name
• Description
• Unit
• Data type
53. Create a new variable template
Enter a reusable name and description
53
3. Design the data structure
Example: a reusable name such as "Weight" can cover mucilage weight, root weight, or dried root weight.
54. Select proper variable templates
54
3. Design the data structure
Data types: String = text, Integer = whole number, Double/Decimal = real number (see the sketch below).
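To make the type mapping concrete, here is a minimal R sketch of a data table whose columns match such variable templates; the variable names and values are invented for illustration, and exporting to CSV is only one possible way to fill the upload template:

# Columns matching three variable templates:
# SPP_ID (string = text), replicate (integer = whole number),
# root_weight in g (double = real number)
example_table <- data.frame(
  SPP_ID      = c("P21_SCE01_C01", "P21_SCE01_C02"),
  replicate   = c(1L, 2L),
  root_weight = c(0.83, 0.91),
  stringsAsFactors = FALSE
)

str(example_table)   # verify the column types match the data structure
write.csv(example_table, "P21_SCE01_root_weight.csv", row.names = FALSE)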
65. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
65
Data store workflow
70. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
70
Data store workflow
• Check the acceptable file extensions, such as PDF, CSV, or ZIP.
• You can upload only one small file at a time.
• The maximum file size is 1 GB.
71. Create a file format dataset!
71
4.2. Upload a small file
72. The maximum file size is 1 GB.
72
4.2. Upload a small file: Select file
Acceptable file extensions
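Since a rejected upload costs a round trip, it can help to check the two constraints from the previous slides (accepted extension, at most 1 GB) locally first. A minimal R sketch, assuming only the example extensions shown on slide 70 rather than the full list configured in BEXIS2:

# Quick local check before uploading a small file
check_small_file <- function(path,
                             allowed = c("pdf", "csv", "zip"),  # example list only
                             max_bytes = 1024^3) {              # 1 GB
  ext <- tolower(tools::file_ext(path))
  ok_ext  <- ext %in% allowed
  ok_size <- file.size(path) <= max_bytes
  if (!ok_ext)  message("Extension '", ext, "' is not in the accepted list.")
  if (!ok_size) message("File is larger than 1 GB; use the big-file workflow.")
  ok_ext && ok_size
}

check_small_file("P21_SCE01_protocol.pdf")   # hypothetical file name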
75. 1. Create a dataset
2. Provide the metadata
3. Design the data structure
4. Upload data
1. Upload data table
2. Upload a small file
3. Upload big files
75
Data store workflow
76. 1. Upload data into a data repository
– Any data repository, such as PANGAEA or Zenodo
– UFZ offers its archive system
2. Enter the information in BEXIS2
76
4.3. Upload big files
77. 77
4.3. Upload big files: Create a tabular dataset
- Upload big files to a data repository
- Collect the links and information in BEXIS2
Select the "SPP External Data Storage" data structure
78. 78
4.3. Upload big files: Provide metadata
1. Upload big files to a data repository such as PANGAEA. The UFZ archive system is one option you can use.
2. Enter the link to the archived data as a remark.
Note: If you have more than one link, mention that in the remark and upload a link data table.
79. 79
Row number: An ordinal number like 1, 2, 3
SPP_ID: A combination of the project number and the purpose (e.g., P10_SCE01_Paper1)
Link to Archive: Link to the respective data in a data repository
Name of external drive: Name of the external hard drive, if applicable (e.g., SPP_P10_SCE01_Part1a)
4.3. Upload big files: Upload a list of links
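The four columns described above form an ordinary data table, so the list of links can be assembled in R and exported for upload like any other tabular dataset. A hedged sketch with invented placeholder values; the column names follow this slide, but check the exact spelling in the "SPP External Data Storage" data structure in BEXIS2:

# List of links to big files stored in an external repository
link_table <- data.frame(
  row_number             = 1:2,
  SPP_ID                 = c("P10_SCE01_Paper1", "P10_SCE01_Paper2"),
  link_to_archive        = c("https://example.org/archive/record-1",   # placeholder links
                             "https://example.org/archive/record-2"),
  name_of_external_drive = c("SPP_P10_SCE01_Part1a", NA),
  stringsAsFactors = FALSE
)

write.csv(link_table, "P10_SCE01_link_table.csv", row.names = FALSE)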
81. 81
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
86. 3. Access data via R
3.2. Download and install the rBExIS package
1. The rBExIS package is available on the SPP intranet data management web page.
2. Install the package from your computer:
• devtools::install("PATH_TO_THE_rBExIS")
3. Load the rBExIS package:
• library(rBExIS)
• load_all("rBExIS")
• check("rBExIS")
• require(rBExIS)
86
87. 3. Access data via R
3.3. Set options for the rBExIS package
1. Find your token
2. Set the rBExIS options:
bexis.options("token" = "YOUR_TOKEN")
bexis.options("base_url" = "https://spp2089.ufz.de:4433")
87
88. 3. Access data via R
3.4. rBExIS functions
1. A list of all dataset IDs:
bexis.get.datasets()
2. Retrieve data from a dataset specified by the dataset ID:
bexis.get.dataset_by(id = xy)
88
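Putting slides 86-88 together, a typical read-only R session could look like the sketch below; the token and dataset ID are placeholders, and the function names are the ones shown on the slides:

# Load the rBExIS package (installed from the SPP intranet, slide 86)
library(rBExIS)

# Point the package at the SPP 2089 instance and authenticate (slide 87)
bexis.options("base_url" = "https://spp2089.ufz.de:4433")
bexis.options("token" = "YOUR_TOKEN")    # personal token, see slide 87

# List the IDs of all datasets you are allowed to see
datasets <- bexis.get.datasets()

# Retrieve the primary data of one dataset by its ID (123 is a placeholder)
my_data <- bexis.get.dataset_by(id = 123)
head(my_data)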
89. If you cannot see the primary data, contact the data owner or the contact person.
89
4. Contact data owners
91. 91
Part I: Basic information
• Introduction (Doris)
• Accounts (Susanne)
• Labeling (Susanne)
• Break
• Research data management
• Store data: upload data
• Use data: download data
• Share data: data security
93. 93
Share a dataset: Adjust permission settings
• Read: reading and downloading primary data
• Write: editing metadata and uploading/updating data
• Delete: only the BEXIS2 administrator can delete a dataset
• Grant: seeing the permission tab
• SPP2089 Group: applies the settings to all SPP 2089 members
95. 95
Archive data
• The SPP 2089 BEXIS2 platform will be available forever!
• You can use the SPP 2089 BEXIS2 as the data repository required for publications.
– This needs special settings.
96. 96
Destroy a dataset
• The BEXIS2 administrator can permanently delete a whole dataset or its latest version.
• You can delete a data structure yourself.
• You can delete unused variable templates yourself.
98. 98
The End of Part I
Thank you for your attention!
We will start Part II at 1 p.m.
99. 99
References
[1] EOSC glossary: https://eosc-portal.eu/glossary
[2] DFG Guidelines on the Handling of Research Data: https://www.dfg.de/download/pdf/foerderung/grundlagen_dfg_foerderung/forschungsdaten/guidelines_research_data.pdf
[3] Wilkinson, M. D. et al. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data. https://www.nature.com/articles/sdata201618
[4] BEXIS Research Data Management: https://fusion.cs.uni-jena.de/bpp/