If Big Data is data that exceeds the processing capacity of conventional systems, thereby necessitating alternative processing measures, we are looking at an essentially technological challenge that IT managers are best equipped to address.
The DCC is currently working with 18 HEIs to support and develop their capabilities in the management of research data and, whilst the aforementioned challenge is not usually core to their expressed concerns, are there particular issues of curation inherent to Big Data that might force a different perspective?
We have some understanding of Big Data from our contacts in the Astronomy and High Energy Physics domains, and the scale and speed of development in Genomics data generation is well known, but the inability to provide sufficient processing capacity is not one of their more frequent complaints.
That’s not to say that Big Science and its Big Data are free of challenges in data curation; only that they are shared with their lesser cousins, where one might say that the real challenge is less one of size than diversity and complexity.
This brief presentation explores those aspects of data curation that go beyond the challenges of processing power but which may lend a broader perspective to the technology selection process.
An update on BeSTGRID activity and plans, in particular in preparation for the planned future developments of a unified approach to high performance and distributed computing in NZ.
Presentation on the work we've done within BeSTGRID as it relates to bioinformatics in NZ, for the 2010 Bioinformatics Symposium https://www.bestgrid.org/NZ-Bioinformatics-Symposium-2010
Presentation from the 2013 Bio-IT World conference. It describes the design and implementation of data and compute infrastructure for the New York Genome Center.
A description of software as infrastructure at NSF, and how Apache projects may be similar. What lessons can be shared from one organization to the other? How does science software compare with more general software?
A brief overview of the development and current workflows for Research Data Management at Imperial College London, presented to colleagues at the University of Copenhagen and Roskilde University in Denmark.
OpenData Public Research
Open Access Events: The Case for Open Data, Why you should Care
Map & Data Library - 5th Floor Robarts Library, University of Toronto
Thursday, Oct. 25 from 10:00-12:00
Organized by Data and Map Librarians, Marcel Fortin and Berenica Vejvoda
Crowdsourcing Approaches to Big Data Curation - Rio Big Data Meetup (Edward Curry)
Data management efforts such as Master Data Management and Data Curation are popular approaches to high-quality enterprise data. However, Data Curation can be heavily centralised and labour intensive, and the cost and effort can become prohibitively high. The concentration of data management and stewardship onto a few highly skilled individuals, like developers and data experts, can be a significant bottleneck. This talk explores how to effectively involve a wider community of users in big data management activities. The bottom-up approach of involving crowds in the creation and management of data has been demonstrated by projects like Freebase, Wikipedia, and DBpedia. The talk discusses how crowdsourcing data management techniques can be applied within an enterprise context.
Topics covered include:
- Data Quality And Data Curation
- Crowdsourcing
- Case Studies on Crowdsourced Data Curation
- Setting up a Crowdsourced Data Curation Process
- Linked Open Data Example
- Future Research Challenges
Case Study Big Data: Socio-Technical Issues of HathiTrust Digital Texts (Beth Plale)
Invited talk at the TRUST Women's Institute for Summer Enrichment (WISE), Cornell, NY, Jun 16, 2014. Infrastructure support for text mining research of a big data repository like HathiTrust raises challenges in access and security when the bulk of the repository is protected by copyright.
dkNET Webinar: FAIR Data & Software in the Research Life Cycle 01/22/2021 (dkNET)
Abstract
Good data stewardship is the cornerstone of knowledge, discovery, and innovation in research. The FAIR Data Principles address data creators, stewards, software engineers, publishers, and others to promote maximum use of research data. The principles can be used as a framework for fostering and extending research data services.
This talk will provide an overview of the FAIR principles and the drivers behind their development by a broad community of international stakeholders. We will explore a range of topics related to putting FAIR data into practice, including how and where data can be described, stored, and made discoverable (e.g., data repositories, metadata); methods for identifying and citing data; interoperability of (meta)data; best-practice examples; and tips for enabling data reuse (e.g., data licensing). Practical examples of how FAIR is applied will be provided along the way.
Presenter: Christopher Erdmann, Engagement, support, and training expert on the NHLBI BioData Catalyst project at University of North Carolina Renaissance Computing Institute
dkNET Webinars Information: https://dknet.org/about/webinar
DEVELOPING A KNOWLEDGE MANAGEMENT SPIRAL FOR THE LONG-TERM PRESERVATION SYSTE... (cscpconf)
The goal of long-term preservation (LTP) is to sustain archives for the foreseeable future. These efforts are hampered primarily by the lack of standards, formal methodology, and workflow models for archiving. This research aims to explore the LTP of various kinds of documents, independent of the passage of time and of changes in technique within digital environments. Basic requirements arise from the integration of storage management and information management: securing the preservation of data, metadata, indexes, etc. This paper presents the evolutionary development of an LTP process for governmental archive and knowledge management. Further tasks include effective search across resources, efficient storage of and access to data, recovery drawing on co-located backups, and dynamic regulation of authentication and security management. A pilot Semantic Data Grid and service matching mechanisms are then described, in which the ontology plays a crucial role.
Introduction to research data management; Lecture 01 for GRAD521 (Amanda Whitmire)
Lesson 1: Introduction to research data management. From a series of lectures from a 10-week, 2-credit graduate-level course in research data management (GRAD521, offered at Oregon State University).
The course description is: "Careful examination of all aspects of research data management best practices. Designed to prepare students to exceed funder mandates for performance in data planning, documentation, preservation and sharing in an increasingly complex digital research environment. Open to students of all disciplines."
Major course content includes: Overview of research data management, definitions and best practices; Types, formats and stages of research data; Metadata (data documentation); Data storage, backup and security; Legal and ethical considerations of research data; Data sharing and reuse; Archiving and preservation.
See also, "Whitmire, Amanda (2014): GRAD 521 Research Data Management Lectures. figshare. http://dx.doi.org/10.6084/m9.figshare.1003835. Retrieved 23:25, Jan 07, 2015 (GMT)"
This presentation was delivered at the Elsevier Library Connect Seminar on 6 October 2014 in Johannesburg, 7 October 2014 in Durban and 9 October 2014 in Cape Town and gives an overview of the potential role that librarians can play in research data management
Building the FAIR Research Commons: A Data Driven Society of Scientists (Carole Goble)
Science is knowledge work. The scientific method and scholarly communication are about facilitating “knowledge turns” – that is, the turning of observation and hypothesis through experimentation, comparison, and analysis into new, pooled knowledge. Turns depend on the FAIR flow and availability of data, methods for automated processing, reproducible results and on a society of scientists coordinating and collaborating. We need to build a new form of Research Commons and I will present my steps towards this.
Presented at Symposium: The Future of a Data-Driven Society, Maastricht University, 25 Jan 2018 that accompanied the 42nd Dies Natalis where I was awarded an honorary doctorate
Personal video:
https://www.youtube.com/watch?v=k5WN6KDDatU&index=4&list=PLzi-FBaZlOOagma5dCW7WSA5lv22tmNMD
Video of the symposium:
https://www.youtube.com/watch?v=JN9eMMtCHf8&t=19s&index=6&list=PLzi-FBaZlOOagma5dCW7WSA5lv22tmNMD
Data accessibility and the role of informatics in predicting the biosphere (Alex Hardisty)
The variety, distinctiveness and complexity of life – biodiversity in other words and by implication the ecosystems in which it is situated – is our life support system. It is absolutely essential and more important than almost everything else but it is typically taken for granted. Today’s big societal challenges – food and water security, coping with environmental change and aspects of human health – are beyond the abilities of any one individual or research group to solve. Solving them depends not only on collaboration to deliver the appropriate scientific evidence but increasingly on vast amounts of data from multiple sources (environmental, taxonomic, genomic and ecological) gathered by manual observation and automated sensors, digitisation, remote sensing, and genetic sequencing. In April 2012 we called the biodiversity and ecosystems research communities to arms to formulate a consensus view on establishing an infrastructure to improve the accessibility of the ever-increasing volumes of biological data. We published the whitepaper: “A decadal view of biodiversity informatics: challenges and priorities” that has since been viewed more than 24,000 times. We envisage a shared and maintained multi-purpose network of computationally-based processing services sitting on top of an open data domain. By open data domain we mean data that is accessible i.e., published, registered and linked. BioVeL, pro-iBiosphere, ViBRANT and other FP7 funded projects have all explored aspects of this vision.
Introduction to research data management (Michael Day)
Slides from a presentation given at the JIBS User Group / RLUK joint event "Demystifying research data: don't be scared, be prepared" held at the SOAS Brunei Gallery, London, 17 July 2012.
How the University of Waterloo Centre for Education in Mathematics and Computing is using Maplesoft’s experience and technologies, Maple, Maple T.A. and Maple.net, to bring STEM Courses Online in an environment that includes: natural maths notation, visualizations and assessment. Presented to Eduserv's Maths and Stats Software Group December 2014
A talk delivered by Ivan Harris at the London G-Cloud meet-up, January 2014.
Topics covered:
• Government security classifications
• PSN connectivity
• Hybrid clouds
• Application development
An introduction to Eduserv's UMF Cloud Pilot scheme for higher education. Eduserv has created the Education Cloud as part of the University Modernisation Fund (UMF) cloud projects.
This presentation reviews the cloud infrastructure and pricing created for the cloud.
This presentation was delivered by Andy Powell at our shared services in HE event on 24 November 2011.
A talk delivered by Dr Tim Cockle at Public Sector Enterprise 2013.
The presentation looks at how various agile techniques can be applied in public sector organisations.
Topics covered:
• Understanding what agile means for you
• What are the core principles and implications
• How to find the balance when moving towards agile
A presentation given by Steve Warburton of KCL at the Where Next for Digital Identity event organised by Eduserv and held at the British Library in January 2010.
Case study: Building a business case for cloud, migration in practice and spr... (Eduserv)
A talk delivered by Rocco Labellarte, Head of Technology and Change Delivery, Corporate Services at Royal Borough of Windsor and Maidenhead. This presentation was given at Cloud Control: Implementing Cloud Computing, a seminar hosted by Civil Service World and Eduserv.
Topics covered include the process of building a case for cloud, the benefits and lessons learnt.
Supporting Libraries in Leading the Way in Research Data Management (Marieke Guy)
Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK, presents on Supporting Libraries in Leading the Way in Research Data Management at Online Information, London, 20th-21st November 2012.
High Performance Data Analytics and a Java Grande Run Time (Geoffrey Fox)
There is perhaps a broad consensus as to important issues in practical parallel computing as applied to large scale simulations; this is reflected in supercomputer architectures, algorithms, libraries, languages, compilers and best practice for application development.
However, the same is not so true for data-intensive computing, even though commercial clouds devote many more resources to data analytics than supercomputers devote to simulations.
Here we use a sample of over 50 big data applications to identify characteristics of data intensive applications and to deduce needed runtime and architectures.
We propose a big data version of the famous Berkeley dwarfs and NAS parallel benchmarks.
Our analysis builds on the Apache software stack that is well used in modern cloud computing.
We give some examples including clustering, deep-learning and multi-dimensional scaling.
One suggestion from this work is the value of a high-performance Java (Grande) runtime that supports both simulations and big data.
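The clustering mentioned among the examples above is one of the canonical data-analytics kernels such benchmarks exercise. A minimal k-means sketch (a generic illustration in pure Python for clarity, not code from the talk, which targets high-performance runtimes) shows the iterative assign-then-update structure that these runtimes must parallelise:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: repeatedly assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: bucket points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Update step: move each centroid to its cluster mean
        # (keep the old centroid if a cluster is empty).
        centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
       (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
cents, cls = kmeans(pts, 2)
```

At scale, both the assignment and update steps become data-parallel reductions, which is exactly where a high-performance runtime pays off.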
CINECA webinar slides: Data Gravity in the Life Sciences: Lessons learned fro... (CINECAProject)
We live in an era of cloud computing. Many of the services in the life sciences are keenly planning cloud transformations, seeking to create globally distributed ecosystems of harmonised data based on standards from organisations like GA4GH. CINECA faces similar challenges, gathering cohort datasets from all over the globe, many of which are pinned in place, due to their size, legal restrictions, or other considerations. But is “bringing compute to the data” always the right choice? In this webinar, based on experiences from the Human Cell Atlas Data Coordination Platform and other projects from EMBL-EBI, we will explore the concept of “data gravity”: The idea that whilst there are forces that may hold data in one place, there are others that require it to be mobile. We’ll consider how effectively planning a cloud strategy requires consideration of the gravity of datasets, and the impact it may have on team skills required, incentives for good practice, and storage and compute costs.
The CINECA webinar series aims to discuss ways to address common challenges and share best practices in the field of cohort data analysis, as well as to disseminate CINECA project results. All CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions. Please note that all webinars are recorded and available for later viewing.
This webinar took place on 12th November 2020 and is part of the CINECA webinar series.
For previous and upcoming CINECA webinars see:
https://www.cineca-project.eu/webinars
Sirris innovate2011 - Smart Products with smart data - introduction, Dr. Elen... (Sirris)
This lecture highlights current trends, challenges and opportunities related to the emergence of large amounts of data. It also presents Sirris’s recent research activities in this domain.
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...) (datacite)
2013 DataCite Summer Meeting - Making Research better
DataCite. Co-sponsored by CODATA.
Thursday, 19 September 2013 at 13:00 - Friday, 20 September 2013 at 12:30
Washington, DC. National Academy of Sciences
http://datacite.eventbrite.co.uk/
Meeting Federal Research Requirements for Data Management Plans, Public Acces... (ICPSR)
These slides cover evolving federal research requirements for sharing scientific data. Provided are updates on federal agency responses to the 2013 OSTP memo, guidance on data management plans, resources for data management and curation training for staff/researchers, and tips for evaluating public data-sharing services. ICPSR's public data-sharing service, openICPSR, is also presented. Recording of this presentation is here: https://www.youtube.com/watch?v=2_erMkASSv4&feature=youtu.be
To foster greater and more consistent use of the new 100 Gbps connections being deployed in the national RNP backbone, the e-Cyber project aims to deliver high-performing services to the most infrastructure-demanding research centers in Brazil. To do this, the project draws inspiration from the “superfacility” concept adopted by initiatives like GRP (Global Research Platform) and EOSC (European Open Science Cloud). However, one of our biggest challenges is to engage the client institutions and bring them to co-create solutions and participate in the project governance.
Stuart Macdonald steps through the process of creating a robust data management plan for researchers. Presented at the European Association for Health Information and Libraries (EAHIL) 2015 workshop, Edinburgh, 11 June 2015.
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving... (Sarah Anna Stewart)
Presentation given at the M25 Consortium of Academic Libraries, CPD25 Event on 'The Role of the Library in Supporting Research'. Provides an introduction to data, software and PIDs and a brief look at how libraries can enable researchers to gain impact and credit for their research data and software.
Research Data Management: An Introductory Webinar from OpenAIRE and EUDAT (Tony Ross-Hellauer)
OpenAIRE and EUDAT co-present this webinar which aims to introduce researchers and others to the concept of research data management (RDM). As well as presenting the benefits of taking an active approach to research data management – including increased speed and ease of access, efficiency (fund once, reuse many times), and improved quality and transparency of research – the webinar will advise on strategies for successful RDM, resources to help manage data effectively, choosing where to store and deposit data, the EC H2020 Open Data Pilot and the basics of data management, stewardship and archiving.
Webinar recording available: http://www.instantpresenter.com/eifl/EB57D6888147
Phase two of OpenAthens SP evolution including OpenID Connect option (Eduserv)
David Orrell, System Architect and Phil Leahy, Service Relationship Manager, talk about Phase II of the OpenAthens Cloud Service Provider project, and also about how OpenAthens is being used as an identity provider service in the corporate sector.
Tim Lull, Vice President of Sales, and Gar Sydnor, Vice President of Discovery Innovation, showcase EBSCO and how this product benefits the identity and access management community.
Phil Leahy, Service Relationship Manager covers our commitment to the publishing community as part of our Publisher Manifesto. David Orrell, System Architect, runs through phase one of our new service provider product.
Neil Scully, Head of Development and Service Delivery, shares the AGILE SCRUM and SPRINT process used in our product development methodology and the benefits this brings.
Tracy Gardner from Simon Inger Consulting presents the results of their 12 month research project, which included a survey of how over 40,000 readers discover scholarly content. The findings are pertinent to publishers and information professionals alike across sectors.
Jon Bentley, Commercial Director, shares the vision for our products, explains our brand evolution and presents key milestones in the development of our identity and access management (IAM) solutions. He also highlights the range of applications that work with OpenAthens.
Mike Brooksbank, Executive Director of OpenAthens, runs through the schedule of the day, plus an overview of OpenAthens and Eduserv, our last FY year and the year ahead.
Eduserv's Marketing Manager, Alex Bacon, presented at the B2B Network about his experience of content marketing and how to deliver valuable and engaging content to your audiences whilst generating leads at the same time.
This presentation by Jonathan Watkins of Maplesoft and the University of Birmingham was given to the Eduserv Maths and Stats Software Focus Group in June 2016. Möbius is a comprehensive online courseware environment that focuses on science, technology, engineering, and mathematics (STEM). Students can explore important concepts using engaging, interactive applications, visualize problems and solutions, and test their understanding by answering questions that are graded instantly.
This presentation was given to the Eduserv Maths and Stats Software Focus Group in June 2016. It focuses on updates to NVivo 11 for Windows and Mac, the new QSR Certification Programme and how QSR and the academic community might work more closely together.
Nick Wallace, Government Analyst, Public Sector Ovum
Momentum for the adoption of cloud services continues to grow in the public sector as services mature and agencies' experience in buying and using cloud services grows. As agencies steadily incorporate various cloud components into their environment, it is clear that public sector organisations are starting to realise the benefits of cloud. In fact, if one were creating a “greenfield” service, “in the cloud” would be the default approach. However, the reality is that most institutions are not in this position. Most have to manage a legacy environment that comprises aging technology and duplicate, inefficient and inconsistent business processes. Developing and implementing a staged migration to cloud will be pivotal in determining whether the “as-a-service” promise facilitates innovation or undermines organisational integrity.
Planning your cloud strategy: Adur and Worthing Councils – Eduserv
Paul Brewer, Director for Digital & Resources at Adur & Worthing Council.
How do you assess your organisation's readiness to move to the cloud and adopt new platforms to drive business change? Paul Brewer from Adur and Worthing Councils will share how they evaluated whether cloud was right for them. The talk will cover how they assessed the benefits, costs and risks of moving to the cloud, and how they used this assessment to support and build their cloud strategy.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024 – Tobias Schneck
As AI technology pushes into IT, I asked myself, as an “infrastructure container Kubernetes guy”, how does this fancy AI technology get managed from an infrastructure operations view? Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we discuss what cloud/on-premise strategy we may need to apply AI to our own infrastructure and get it working from an enterprise perspective. I want to give an overview of infrastructure requirements and technologies, and of what could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already got working for real.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... – DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
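The workshop above centres on PowSyBl's power-flow tooling. As a purely illustrative sketch (this is not PowSyBl's API; the three-bus network, reactances and injections are invented for the example), the following pure-Python snippet shows the kind of DC power-flow calculation such tools perform:

```python
# Toy DC power flow for a hypothetical 3-bus network (illustrative only;
# real tools such as PowSyBl solve full AC/DC flows on much larger grids).
# Bus 0 is the slack bus; injections are in per-unit (+ = generation).

lines = {(0, 1): 0.1, (1, 2): 0.1, (0, 2): 0.1}  # per-unit reactance x per line
injections = {1: 0.5, 2: -1.0}                   # bus 1 generates, bus 2 consumes

# DC approximation: P = B * theta, with theta at the slack bus fixed to 0.
# Build the 2x2 susceptance matrix for the non-slack buses 1 and 2.
b01, b12, b02 = (1 / lines[k] for k in [(0, 1), (1, 2), (0, 2)])
B = [[b01 + b12, -b12],
     [-b12, b12 + b02]]
P = [injections[1], injections[2]]

# Solve the 2x2 linear system by Cramer's rule.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
theta1 = (P[0] * B[1][1] - B[0][1] * P[1]) / det
theta2 = (B[0][0] * P[1] - P[0] * B[1][0]) / det
theta = {0: 0.0, 1: theta1, 2: theta2}

# Line flows follow from the voltage-angle differences.
flows = {(i, j): (theta[i] - theta[j]) / x for (i, j), x in lines.items()}
print(flows)
```

In a real PowSyBl session the same idea is exposed through its Python binding at a much higher level, alongside the network editing, visualization and security-analysis features listed above.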
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova... – Ramesh Iyer
In today's fast-changing business world, companies that fail to adapt and embrace new ideas often struggle to keep up with the competition. However, fostering a culture of innovation takes work: it takes vision, leadership and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview – Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GraphRAG is All You need? LLM & Knowledge Graph – Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI support – Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
UiPath Test Automation using UiPath Test Suite series, part 4 – DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
"Impact of front-end architecture on development cost", Viktor Turskyi – Fwdays
I have heard many times that architecture is not important for the front end. I have also often seen developers implement front-end features by simply following a framework's standard rules, believing that this is enough to launch the project successfully, and then the project fails. How can you prevent this, and which approach should you choose? I have launched dozens of complex projects, and during the talk we will analyse which approaches have worked for me and which have not.
Search and Society: Reimagining Information Access for Radical Futures – Bhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Graham Pryor
1. Because good research needs good data
Big data
– no big deal for curation?
Graham Pryor, Associate Director, UK Digital Curation Centre
Eduserv Symposium 2012: Big Data, Big Deal?
This work is licensed under a Creative Commons Attribution 2.5 UK: Scotland License
2. Big data – big deal or same deal?
“What need the bridge much broader than the flood?
The fairest grant is the necessity.
Look, what will serve is fit…”
Much Ado About Nothing, Act 1 Scene 1
3. Eduserv Symposium 2012 –
Speakers’ Research Areas
• Operating Systems & Networking
• Computer and Network Security
• Distributed Systems
• Mobile Computing
• Wireless Networking
• Software Engineering
• High performance compute clusters
• Cloud and grid technologies
• Effective management of large clusters and
cluster file-systems
• Very large database systems (architecture,
management and application optimization)
4. The Digital Curation Centre
• a consortium comprising units from the Universities of Bath
(UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
• launched 1st March 2004 as a national centre for solving
challenges in digital curation that could not be tackled by
any single institution or discipline
• funded by JISC to build capacity, capability and skills in
research data management across the UK HEI community
• awarded additional HEFCE funding 2011/13 for
• the provision of support to national cloud services
• targeted institutional development
5. Three perspectives
Scale and complexity
– Volume and pace
– Infrastructure
– Open science
Policy
– Funders
– Institutions
– Ethics & IP
Management
– Storage
– Incentives
– Costs & Sustainability
http://www.nonsolotigullio.com/effettiottici/images/escher.jpg/
6. Challenges of scale and complexity
• Globally, >100,000 neuroscientists study the CNS, generating massive, intricate and highly interrelated datasets
• Analysts require access to these data to develop algorithms, models and schemata that characterise the underlying system
• Resources and actors are rarely collocated and are therefore difficult to combine.
• The virtual laboratory is a federation of server nodes that allows distributed data to be stored local to acquisition
• Analysis codes can be uploaded and executed on the nodes so that derived datasets need not be transported over low-bandwidth connections
• Data and analysis codes are described by structured metadata, providing an index for search, annotation and audit over workflows leading to scientific outcomes
• Users access the distributed resources through a web portal emulating a PC desktop
But this is only talking terabytes…
http://www.carmen.org.uk/
7. Big data? – The Large Hadron Collider
Searching for the Higgs Boson
• Predicted annual generation of around 15
petabytes (15 million gigabytes) of data
• Would need >1,700,000 dual layer DVDs
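The DVD comparison on this slide is easy to verify. A quick sanity check, assuming decimal petabytes and the nominal 8.5 GB capacity of a dual-layer DVD:

```python
# Sanity-check the slide's claim: 15 PB of annual LHC data as dual-layer DVDs.
PETABYTE_IN_GB = 1_000_000   # decimal units, as the slide appears to use
annual_data_gb = 15 * PETABYTE_IN_GB
dvd_capacity_gb = 8.5        # nominal dual-layer DVD capacity (assumed)

dvds_needed = annual_data_gb / dvd_capacity_gb
print(f"{dvds_needed:,.0f} dual-layer DVDs per year")
```

This comes out at roughly 1.76 million discs, consistent with the ">1,700,000" figure above.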
8. Big data – the GridPP solution
Crowd sourcing for the LHC
“With GridPP you need never have those data processing blues again…”
Home and office computer users can sign up to the LHC at home project (based at Queen Mary, University of London), which makes use of idle CPU time. So far, 40,000 users in more than 100 countries have contributed the equivalent of 3000 years on a single computer to the project.
http://www.gridpp.ac.uk/about
With the Large Hadron Collider running at CERN the grid is
being used to process the accompanying data deluge. The UK
grid is contributing more than the equivalent of 20,000 PCs to
this worldwide effort.
9. Yet… Data Preservation in High Energy Physics?
Data from high-energy physics (HEP) experiments are collected with significant financial and human effort and are in many cases unique. At the same time, HEP has no coherent strategy for data preservation and re-use, and many important and complex data sets are simply lost.
David M. South, on behalf of the ICFA DPHEP Study Group
arXiv:1101.3186v1 [hep-ex]
10. Big data in genomics
These studies are generating
valuable datasets which, due to
their size and complexity, need to
be skilfully managed…
11. There’s a bigger deal than big data…
Socio-technical, information systems and research practice perspectives on three stages of activity:
1.
• Identify drivers and champions
• Analyse stakeholders, issues
• Identify capability gaps
2.
• Inventory data assets
• Profile norms, roles, values
• Analyse current workflows
• Identify capability gaps
• Assess costs, benefits, risks
3.
• Produce feasible, desirable changes
• Evaluate fitness for purpose
Adapted from Developing Research Data Management Capabilities by Whyte et al, DCC, 2012
12. The DCC - building capacity and capability
through targeted institutional development
• 18 institutional engagements, 14 roadshows
• advice and assistance in strategy and policy
• use of curation tools for audit and planning
• training and skills transfer
13. Why do we do this?
1. Reports that researchers are often unaware
of threats and opportunities
14. http://www.flickr.com/photos/mattimattila/3003324844/
“Departments don’t have guidelines or
norms for personal back-up and researcher
procedure, knowledge and diligence varies
tremendously. Many have experienced
moderate to catastrophic data loss”
Incremental Project Report, June 2010
15. Why do we do this?
1. Reports that researchers are often unaware
of threats and opportunities
2. There is a lack of clarity in terms of skills
availability and acquisition
16. …researchers are
reluctant to adopt new tools and
services unless they know
someone who can recommend
or share knowledge about
them. Support needs to be
based on a close understanding
of the researchers’ work, its
patterns and timetables.
17. Why do we do this?
1. Reports that researchers are often unaware
of threats and opportunities
2. There is a lack of clarity in terms of skills
availability and acquisition
3. Many institutions are unprepared to meet
the increasingly prescriptive demands of
funders
18. EPSRC expects all those institutions it funds
• to have developed a roadmap aligning their policies
and processes with EPSRC’s nine expectations by
1st May 2012
• to be fully compliant with each of those expectations
by 1st May 2015
• to recognise that compliance will be monitored and
non-compliance investigated and that
• failure to share research data could result in the
imposition of sanctions
19. Why do we do this?
1. Reports that researchers are often unaware
of threats and opportunities
2. There is a lack of clarity in terms of skills
availability and acquisition
3. Many institutions are unprepared to meet
the increasingly prescriptive demands of
funders
4. …and legislators
20. Rules and regulations…
Compliance
• Data Protection Act 1998 – Rights, Exemptions, Enforcement
• Freedom of Information Act 2000 – Climategate, Tree Rings, Tobacco and…(what’s next?)
• Computer Misuse Act 1990
• etc. etc. etc………..
21. Why do we do this?
1. Reports that researchers are often unaware
of threats and opportunities
2. There is a lack of clarity in terms of skills
availability and acquisition
3. Many institutions are unprepared to meet
the increasingly prescriptive demands of
funders
4. …and legislators
5. The advantages from planning, openness
and sharing are not understood
22. Open to all? Case studies of openness
in research
Choices are made according to context, with
degrees of openness reached according to:
• The kinds of data to be made available
• The stage in the research process
• The groups to whom data will be made
available
• On what terms and conditions it will be
provided
Default position of most:
• YES to protocols, software, analysis tools,
methods and techniques
• NO to making research data content freely
available to everyone
After all, where is the incentive?
Angus Whyte, RIN/NESTA, 2010
24. Main institutional concerns
– Compliance
– Asset management
– Cost benefits
– Incentivisation
– Complexity of the data environment
And big data? There has been no mention yet of any specific challenge from big data, but… Institutions are providing resources to work on big data, both equipment and people, and more importantly… the issues central to effective data management are common across the data spectrum, irrespective of size.
25. Some current institutional engagements
• Assessing needs
• RDM roadmaps
• Policy development
• Policy implementation
• Piloting tools, e.g. DataFlow
26. Support offered by the DCC
The DCC support team and services offer:
• Assess needs: DAF & CARDIO assessments, workflow assessment
• Pilot RDM tools: institutional data catalogues
• Develop RDM policy
• Customised Data Management Plans
• Guidance and training
• Advocacy to senior management: make the case
• …and support policy implementation
28. Your Data as Assets: DAF
• What are the characteristics of your
research data assets?
– Number?
– Scale?
– Complexity?
– Dependencies?
– Liabilities?
• Why do researchers act the way they do
with respect to data?
• Which data do they need to undertake
productive research?
29. DMP Online is a web-based data management
planning tool that allows you to build and edit plans
according to the requirements of the major UK
funders.
The tool also contains helpful guidance and links for
researchers and other data professionals.
http://www.dcc.ac.uk/dmponline
30. An online tool for departments or research groups to
identify their current data management capabilities
and identify coordinated pathways to future
enhancement via a dedicated knowledge base.
CARDIO emphasises a collaborative, consensus-
driven approach, and enables benchmarking with
other groups and institutions.
http://cardio.dcc.ac.uk/
31. DRAMBORA is an audit methodology and tool for
identifying and planning for the management of risks
which may threaten the availability and/or usability of
content in a digital repository or archive.
http://www.repositoryaudit.eu
32. So, big data
– no big deal for curation?
• Yes, it’s big
• It’s also very complex
• There is no single technology solution
• Issues of human infrastructure are
possibly a bigger challenge
• But for big data aficionados the
technology challenges are big enough
33. Data Management – infrastructure
and data storage challenges...
Scalability
Cost-effectiveness
Security (privacy and IPR)
Robustness and resilience
Low entry barrier
Ease of use
Data-handling / transfer / analysis capabilities
The case for cloud computing in genome informatics.
Lincoln D Stein, May 2010