Data Herding for Scientists - UC Davis OA WeekCarly Strasser
Presentation for UC Davis Open Access Week. Covers the current status of data management in the sciences, best practices for data management, data management planning, and tools for researchers.
Data Management: Scientist Perspective - DLF 2012Carly Strasser
Presentation at the 2012 Digital Libraries Federation Fall Forum in Denver, CO. Workshop on Data Management Services, held 5 Nov 2012. http://www.diglib.org/forums/2012forum/data-management-services-at-the-library-the-3-hour-tour/
Presentation on all things open at the 2012 Digital Libraries Federation Fall Forum in Denver CO. Part of a workshop on data management services, 5 Nov 2012.
CDL has recently launched a new project dubbed Digital Curation for Excel (DCXL), funded by the Gordon and Betty Moore Foundation and Microsoft Research. The goal of the DCXL project is to facilitate data management, sharing, and archiving for earth, environmental, and ecological scientists. The main result from the project will be an open source add-in for Microsoft Excel that will assist scientists in preparing their Excel data for sharing.
Overview of data management policies and data management plans, including the DMPTool. For Ecological Society of America 2013 Meeting in Minneapolis, MN 5 August 2013.
RDAP 15: You’re in good company: Unifying campus research data servicesASIS&T
Research Data Access and Preservation Summit, 2015
Minneapolis, MN
April 22-23
Cynthia Hudson-Vitale, Digital Data Outreach Librarian, Washington University
Brianna Marshall, Digital Curation Coordinator, University of Wisconsin-Madison
Amy Nurnberger, Research Data Manager, Columbia University
The Internet, Science, and Transformations of KnowledgeEric Meyer
Talk on June 7, 2012 in the Harvard SAP Speaker Series (Office of the Senior Associate Provost for the Harvard Library).
http://www.provost.harvard.edu/harvard_library/sap_speakers_series.php
Security and Data Ownership in the Cloud
Andrew K. Pace, Executive Director, Networked Library Services, OCLC; Councilor-at-large, American Library Association
10-15-13 “Metadata and Repository Services for Research Data Curation” Presen...DuraSpace
“Hot Topics: The DuraSpace Community Webinar Series,” Series Six: “Research Data in Repositories.” Curated by David Minor, Research Data Curation Program, UC San Diego Library. Webinar 2: “Metadata and Repository Services for Research Data Curation”
Presented by Declan Fleming, Chief Technology Strategist; Arwen Hutt, Metadata Librarian; and Matt Critchlow, Manager of Development and Web Services, UC San Diego Library.
Duraspace Hot Topics Series 6: Metadata and Repository ServicesMatthew Critchlow
Presented by Declan Fleming, Arwen Hutt, and Matt Critchlow. The second in a three part Webinar series on Research Data Curation at UC San Diego, as part of the larger Research Cyberinfrastructure initiative.
Adoption of Cloud Computing in Scientific ResearchYehia El-khatib
Some might say the scientific research community is somewhat behind the curve of adopting the cloud. In this talk, I present a few examples of adopting the cloud from the wider research community. I also highlight some of the aspects by which cloud computing could affect scientific research in the near future and the associated challenges.
Dataverse, Cloud Dataverse, and DataTagsMerce Crosas
Talk given at Two Sigma:
The Dataverse project, developed at Harvard's Institute for Quantitative Social Science since 2006, is a widely used software platform to share and archive data for research. There are currently more than 20 Dataverse repository installations worldwide, with the Harvard Dataverse repository alone hosting more than 60,000 datasets. Dataverse provides incentives to researchers to share their data, giving them credit through data citation and control over terms of use and access. In this talk, I'll discuss the Dataverse project, as well as related projects such as DataTags to share sensitive data and Cloud Dataverse to share Big Data.
Funders and publishers have something in common: for better or worse, we have the ability to influence the behavior of researchers. This talk will focus on what both groups can do to improve research now and in the future.
ESA Ignite talk on UC3 Dash platform for data sharingCarly Strasser
Ignite talk (20 slides / 15 seconds per slide) for ESA 2014 meeting in Sacramento, CA 12 August 2014. On the Dash platform for helping researchers manage and share their data via institutional repositories
Data Management for Mountain Observatories WorkshopCarly Strasser
Keynote presentation for 2014 Mountain Observatories Workshop, 16 July 2014.
Abstract:
While methods for collecting data are well taught, there is less emphasis on managing the resulting data effectively. New mandates, announcements, memos, and requirements from agencies and publishers are emerging that encourage better data management, data sharing, and data preservation. Scientists with good management skills will be able to maximize the productivity of their own research, effectively and efficiently share their data with the community, and benefit from the re-use of their data by others. I will offer an overview of the data management landscape, discussing recent events, resources, and new directions for data stewardship. I will also cover best practices for data management, which will facilitate data sharing and reuse, and introduce tools researchers can use to help in their data stewardship endeavours.
Libraries & Research Data Management for CO Alliance of Resrch LibrariesCarly Strasser
Keynote presentation for the Colorado Alliance of Research Libraries 2014 Research Data Management Conference, 11 July 2014. Focuses on why data management and sharing is important, and the role of libraries.
Open Science for Australian Institute of Marine Science WorkshopCarly Strasser
*Please excuse the typos :)
Presentation on open science and open data for the Australian Institute of Marine Science (AIMS) workshop on "Raising your research profile using research data". 18 June 2014.
Data management overview and UC3 tools for IASSIST 2014Carly Strasser
Presentation to introduce current landscape of data management and UC3 tools and services that support data sharing. For IASSIST in Toronto, 5 June 2014.
Data Publication for UC Davis Publish or PerishCarly Strasser
Intro presentation for panel on going beyond publishing journal articles. UC Davis "Publish or Perish?" Event, 13 Feb 2014. Sorry about missing gradient on some of slides!
October 18, 2013 @ Kennedy Library, Data Studio, Cal Poly. We hear about all things “open” these days: open access, open source, open data, open science, et cetera. But what does it really mean for how we do science? How are things changing, and what are the implications for individual researchers?
DataUp: Data Curation for Excel
1. Facilitating data stewardship practices for scientists
Carly Strasser | carly.strasser@ucop.edu | www.carlystrasser.net
Open Access symposium | University of North Texas | May 2012
2. UGLY TRUTH
Many Earth | Environmental | Ecological scientists…
• are not taught data management
• don’t know what metadata are
• can’t name data centers or repositories
• don’t share data publicly or store it in an archive
• aren’t convinced they should share data
(Image: 5shortessays.blogspot.com)
3. Where data end up
[Figure with labels: Data, Metadata, www. Recreated from Klump et al. 2006. Images from Flickr by diylibrarian and csessums, and from blog.order2disorder.com]
4. Where data end up
[Figure with labels: Data, Metadata, www. Recreated from Klump et al. 2006. Images from Flickr by diylibrarian and torkildr]
6. Frequency of Excel use
[Chart: percent of respondents (0 to 100) who use Excel for these tasks: organizing data, visualizing data, statistics, sharing data. Legend: rare or occasional use; moderate use; every day or almost every day]
7.
8. Facilitate: data management & organization, archiving, data sharing, reuse, reproducibility, publishing
9.
• Open source add-in & web application
• Facilitate data management, sharing, archiving for scientists
• Focus on atmospheric, ecological, hydrological, and oceanographic data
• Collect requirements for add-in from scientists, data centers, libraries
10. Add-in & Web Application?
Add-in
• Little pieces of software
• Download to extend the capabilities of Excel
• Appear as “ribbon” in Excel
• Only work with Windows Excel 2007+
• Available offline but updates difficult
(Image: www.ablebits.com)
11. Add-in & Web Application?
Add-in
• Little pieces of software
• Download to extend the capabilities of Excel
• Appear as “ribbon” in Excel
• Only work with Windows Excel 2007+
• Available offline but updates difficult
Web-based application
• Websites that do something with info/files provided by user
• Examples: Facebook, YouTube
• No program download required but updates easy
• New user interface to learn
13. ~ 150 scientists
• No data preservation
  – Unaware of archives
  – Resistant to sharing
• Poor data documentation
• 90% use other programs along with Excel
14. Requirements
1. Must work for Excel users without the add-in
2. No additional software necessary
3. Can be used offline
4. Perform CSV compatibility checks, reporting, and automated fixes
5. Add metadata to data file
   a. Can use existing metadata as a template
   b. Add-in can automatically generate some of the metadata where the info is available from the file
6. Generate a citation for the data file
7. Deposit data and metadata in a repository
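Requirement 4 (compatibility checks, reporting, and automated fixes) could look roughly like the sketch below. The slides do not say which checks the add-in runs, so the three checks here (empty headers, embedded line breaks, stray whitespace, plus ragged rows) and the function names `check_grid` and `fix_grid` are illustrative assumptions, not the tool's actual behavior.

```python
from typing import List, Tuple

Grid = List[List[str]]  # a worksheet as rows of cell text (hypothetical representation)


def check_grid(grid: Grid) -> List[Tuple[int, int, str]]:
    """Report (row, column, problem) for cells that may not export cleanly to CSV.

    A column of -1 marks a problem with the row as a whole.
    These particular checks are assumptions chosen for illustration.
    """
    issues: List[Tuple[int, int, str]] = []
    if not grid:
        return issues
    width = len(grid[0])  # the first row is treated as the header
    for r, row in enumerate(grid):
        if len(row) != width:
            issues.append((r, -1, "row length differs from header"))
        for c, cell in enumerate(row):
            if r == 0 and cell.strip() == "":
                issues.append((r, c, "empty column header"))
            if "\n" in cell or "\r" in cell:
                issues.append((r, c, "embedded line break"))
            if cell != cell.strip():
                issues.append((r, c, "leading or trailing whitespace"))
    return issues


def fix_grid(grid: Grid) -> Grid:
    """Automated fix: collapse every run of whitespace (including line breaks)
    to a single space and trim the ends of each cell. Ragged rows and empty
    headers are left for the user to resolve."""
    return [[" ".join(cell.split()) for cell in row] for row in grid]


if __name__ == "__main__":
    sheet = [["site", " temp "], ["A", "12\n13"], ["B"]]
    for row, col, problem in check_grid(sheet):
        print(f"row {row}, column {col}: {problem}")
    print(fix_grid(sheet))
```

Keeping the report step separate from the fix step means a scientist can see what would change before anything is rewritten, which fits the "checks, reporting, and automated fixes" ordering of the requirement.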
15. Requirements
Features
1. Compatibility Check
2. Generate metadata
3. Generate citation
4. Post data to repository
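As a rough illustration of features 2 and 3, the sketch below derives a little metadata directly from a data file and assembles a citation string. The metadata fields, the citation layout (creator, year, title, publisher, identifier), and the names `describe_table` and `format_citation` are assumptions made for this example; the slides do not specify the metadata schema or citation format the tool uses.

```python
import csv
import io


def describe_table(csv_text: str) -> dict:
    """Generate the metadata that can be read straight from the file:
    column names and the number of data rows (hypothetical field names)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return {"columns": [], "row_count": 0}
    header, body = rows[0], rows[1:]
    return {"columns": header, "row_count": len(body)}


def format_citation(creator: str, year: int, title: str,
                    publisher: str, identifier: str) -> str:
    """Assemble a data citation from user-supplied fields.
    The layout is an assumed, generic pattern, not a prescribed standard."""
    return f"{creator} ({year}): {title}. {publisher}. {identifier}"


if __name__ == "__main__":
    # All values below are placeholders, not real datasets or identifiers.
    data = "site,temp_c\nA,12.5\nB,13.1\n"
    print(describe_table(data))
    print(format_citation("Doe, J.", 2012, "Example tide pool temperatures",
                          "Example Repository", "doi:10.0000/example"))
```

Only the structural facts (columns, row count) are derived automatically here; descriptive fields such as creator and title still come from the person depositing the data, which matches the slide's wording that the add-in generates "some of the metadata where the info is available from the file".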
18. Vision for Future
• Community adoption
• Extension to other programs
  – Google Docs, OpenOffice
• Incorporation of other metadata schemas
• Repository adoption
• Partnerships: FigShare, F1000, USGS, etc.