SlideShare a Scribd company logo
1 of 25
Download to read offline
1
2
3
I’m a former chemistry researcher who was really bad at the data management game
the first time I played it.
Now I’m a data services librarian who has produced a book, a blog, and videos in this
area.
I want to make the data management game easy and understandable to all players.
This presentation will not only show you tools but also provide tips on leveling up
during the game.
4
5
6
Beware flash drives as a storage option.
7
Cloud storage is a great option for the 3-2-1 Rule’s offsite copy.
Not all cloud storage is made equal (read Google Drive’s terms of service). And don’t
rely only on cloud storage for your data (several horror stories here).
Many cloud storage providers offer free storage up to a certain amount, and then it’s a
paid plan.
I like SpiderOak. This is primarily a cloud backup solution, which is less good for file
sharing (other options are available for that).
It’s billed as “zero knowledge” cloud storage. Files get encrypted on your computer
before sending to their servers, meaning the company can’t read your files and they
stay secure when travelling across the internet (this is really important).
I combine this with my local computer and an external hard drive to make my 3 copies.
8
9
10
I don’t use Bulk Rename Utility often, but it’s so useful when I do.
Bulk Rename Utility is free for personal users on Windows.
It allows you to rename a large number of files at the same time (such as when you
have a file naming convention you want to apply to existing files).
The interface looks complicated but that is because it is so powerful.
You can: replace particular characters, add or remove things at a particular position,
easily add numbering or dates, swap parts of the file name around, etc.
It takes a few minutes to learn, but it’s a great tool to have in your back pocket.
11
12
Regular expressions (regex) are an amazing tool for search and replace.
Regex doesn’t stand alone, but rather plugs into other tools like Bulk Rename Utility,
notepad++, Java, etc.
Regex works by pattern matching, allowing you to search for all social security numbers
in a document, reformat any phone numbers, change the order of sections in a
document but keep the text the same, etc.
Regex takes a bit more learning but is incredibly useful for anyone doing text
manipulation or clean up.
The first link on this slide is to a tutorial I like.
The second link is to a tool, RegExr, that allows you to test your written regular
expressions against text.
13
14
15
Versioning files by hand takes up a lot of hard drive space.
A version control system, like Git, only saves the differences between one version and
the next instead of the whole file. It also streamlines the versioning process.
Such tools came out of computer science but are being used by many researchers.
Git is free and open source.
Git is different than GitHub – Git basically handles the version control, while GitHub
hosts the files and versions and can make them available to others.
Git is really useful but has a learning curve. Because of that, I recommend starting with
the GUI version unless you are comfortable with the command line.
16
17
This tool originated in computer code
Don’t need anything more complicated than a text editor to make one! I use
notepad++.
18
19
20
21
Excel is a useful tool but isn’t always the best tool for cleaning data.
It’s especially bad with dates and tends to mangle them.
22
OpenRefine is a free, open source tool that was previously known as GoogleRefine.
It is the best tool for cleaning up tabular data.
OpenRefine can break data down by “facet” (variable values or ranges), allowing you to
do quick parsing, counting, or editing.
Editing includes straight replacement, math, basic text manipulation (uppercase to
lowercase, etc.), or other functions using Google Refine Expression Language (GREL).
You can also break multi-component cells apart or combine them into one.
The tool also allows for text clean up, providing a number of different algorithms for
text matching.
23
24
25

More Related Content

What's hot

Serverless Big Data Architecture on Google Cloud Platform at Credit OK
Serverless Big Data Architecture on Google Cloud Platform at Credit OKServerless Big Data Architecture on Google Cloud Platform at Credit OK
Serverless Big Data Architecture on Google Cloud Platform at Credit OKKriangkrai Chaonithi
 
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure Chest
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure ChestWeb Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure Chest
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure ChestAxiell ALM
 
Couch DB/PouchDB approach for hybrid mobile applications
Couch DB/PouchDB approach for hybrid mobile applicationsCouch DB/PouchDB approach for hybrid mobile applications
Couch DB/PouchDB approach for hybrid mobile applicationsIhor Malytskyi
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps TechnologiesKriangkrai Chaonithi
 
SQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The MoveSQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The MoveIBM Cloud Data Services
 
Easy Data for PhoneGap apps with PouchDB
Easy Data for PhoneGap apps with PouchDBEasy Data for PhoneGap apps with PouchDB
Easy Data for PhoneGap apps with PouchDBHolly Schinsky
 
Codemotion madrid 2017 Arquitectura kappa 2.0
Codemotion madrid 2017  Arquitectura kappa 2.0Codemotion madrid 2017  Arquitectura kappa 2.0
Codemotion madrid 2017 Arquitectura kappa 2.0Juantomás García Molina
 
Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid arupmalakar
 
Introduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSIntroduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSSammy Fung
 
Tikal Fuse Day Access Layer Implementation (C#) Based On Mongo Db
Tikal Fuse Day   Access Layer Implementation (C#) Based On Mongo DbTikal Fuse Day   Access Layer Implementation (C#) Based On Mongo Db
Tikal Fuse Day Access Layer Implementation (C#) Based On Mongo DbTikal Knowledge
 
Collaborative data science and how to build a data science toolchain around n...
Collaborative data science and how to build a data science toolchain around n...Collaborative data science and how to build a data science toolchain around n...
Collaborative data science and how to build a data science toolchain around n...Moon Soo Lee
 
DocumentDB - NoSQL on Cloud at Reboot2015
DocumentDB - NoSQL on Cloud at Reboot2015DocumentDB - NoSQL on Cloud at Reboot2015
DocumentDB - NoSQL on Cloud at Reboot2015Vidyasagar Machupalli
 

What's hot (17)

Serverless Big Data Architecture on Google Cloud Platform at Credit OK
Serverless Big Data Architecture on Google Cloud Platform at Credit OKServerless Big Data Architecture on Google Cloud Platform at Credit OK
Serverless Big Data Architecture on Google Cloud Platform at Credit OK
 
Fluentd - Unified logging layer
Fluentd -  Unified logging layerFluentd -  Unified logging layer
Fluentd - Unified logging layer
 
Google BigQuery
Google BigQueryGoogle BigQuery
Google BigQuery
 
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure Chest
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure ChestWeb Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure Chest
Web Browser Controls in Adlib: The Hidden Diamond in the Adlib Treasure Chest
 
Big Data Applications
Big Data ApplicationsBig Data Applications
Big Data Applications
 
Couch DB/PouchDB approach for hybrid mobile applications
Couch DB/PouchDB approach for hybrid mobile applicationsCouch DB/PouchDB approach for hybrid mobile applications
Couch DB/PouchDB approach for hybrid mobile applications
 
Introduction to Modern DevOps Technologies
Introduction to  Modern DevOps TechnologiesIntroduction to  Modern DevOps Technologies
Introduction to Modern DevOps Technologies
 
SQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The MoveSQL To NoSQL - Top 6 Questions Before Making The Move
SQL To NoSQL - Top 6 Questions Before Making The Move
 
Easy Data for PhoneGap apps with PouchDB
Easy Data for PhoneGap apps with PouchDBEasy Data for PhoneGap apps with PouchDB
Easy Data for PhoneGap apps with PouchDB
 
Google Big Query UDFs
Google Big Query UDFsGoogle Big Query UDFs
Google Big Query UDFs
 
Codemotion madrid 2017 Arquitectura kappa 2.0
Codemotion madrid 2017  Arquitectura kappa 2.0Codemotion madrid 2017  Arquitectura kappa 2.0
Codemotion madrid 2017 Arquitectura kappa 2.0
 
CCT Check and Calculate Transfer
CCT Check and Calculate TransferCCT Check and Calculate Transfer
CCT Check and Calculate Transfer
 
Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid
 
Introduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMSIntroduction of Open Source Job Board with Drupal CMS
Introduction of Open Source Job Board with Drupal CMS
 
Tikal Fuse Day Access Layer Implementation (C#) Based On Mongo Db
Tikal Fuse Day   Access Layer Implementation (C#) Based On Mongo DbTikal Fuse Day   Access Layer Implementation (C#) Based On Mongo Db
Tikal Fuse Day Access Layer Implementation (C#) Based On Mongo Db
 
Collaborative data science and how to build a data science toolchain around n...
Collaborative data science and how to build a data science toolchain around n...Collaborative data science and how to build a data science toolchain around n...
Collaborative data science and how to build a data science toolchain around n...
 
DocumentDB - NoSQL on Cloud at Reboot2015
DocumentDB - NoSQL on Cloud at Reboot2015DocumentDB - NoSQL on Cloud at Reboot2015
DocumentDB - NoSQL on Cloud at Reboot2015
 

Viewers also liked (11)

Smith - Developing Campus Stakeholders' Collaborations - Sept 8
Smith - Developing Campus Stakeholders' Collaborations - Sept 8Smith - Developing Campus Stakeholders' Collaborations - Sept 8
Smith - Developing Campus Stakeholders' Collaborations - Sept 8
 
McDanold-1-jun15
McDanold-1-jun15McDanold-1-jun15
McDanold-1-jun15
 
Lawless-3-jun15
Lawless-3-jun15Lawless-3-jun15
Lawless-3-jun15
 
Thompson 6-jun15-final
Thompson 6-jun15-finalThompson 6-jun15-final
Thompson 6-jun15-final
 
Lauruhn-5-jun15
Lauruhn-5-jun15Lauruhn-5-jun15
Lauruhn-5-jun15
 
Hansen-2-jun15
Hansen-2-jun15Hansen-2-jun15
Hansen-2-jun15
 
Wacker-4-june15
Wacker-4-june15Wacker-4-june15
Wacker-4-june15
 
Gonzalez-8-jun15
Gonzalez-8-jun15Gonzalez-8-jun15
Gonzalez-8-jun15
 
Stahmer-9-Jun15-final
Stahmer-9-Jun15-finalStahmer-9-Jun15-final
Stahmer-9-Jun15-final
 
Wiggins-7-jun15
Wiggins-7-jun15Wiggins-7-jun15
Wiggins-7-jun15
 
McIlroy - Book Publishing Start-Ups
McIlroy - Book Publishing Start-UpsMcIlroy - Book Publishing Start-Ups
McIlroy - Book Publishing Start-Ups
 

Similar to Briney - Leveling Up Data Management - With Notes

SessionThree_IntroductionToVersionControlSystems
SessionThree_IntroductionToVersionControlSystemsSessionThree_IntroductionToVersionControlSystems
SessionThree_IntroductionToVersionControlSystemsHellen Gakuruh
 
Collaborative Data Projects
Collaborative Data ProjectsCollaborative Data Projects
Collaborative Data Projectsdatacommons
 
Top 10 web development tools in 2022
Top 10 web development tools in 2022Top 10 web development tools in 2022
Top 10 web development tools in 2022intouchgroup2
 
Google software engineering practices by handerson
Google software engineering practices by handersonGoogle software engineering practices by handerson
Google software engineering practices by handersonmustafa sarac
 
Introduction to go lang
Introduction to go langIntroduction to go lang
Introduction to go langAmal Mohan N
 
Must be similar to screenshotsI must be able to run the projects.docx
Must be similar to screenshotsI must be able to run the projects.docxMust be similar to screenshotsI must be able to run the projects.docx
Must be similar to screenshotsI must be able to run the projects.docxherthaweston
 
Digital Work Tools for the rest of us (2015)
Digital Work Tools for the rest of us (2015)Digital Work Tools for the rest of us (2015)
Digital Work Tools for the rest of us (2015)Filip Modderie
 
Seminar Report on Google File System
Seminar Report on Google File SystemSeminar Report on Google File System
Seminar Report on Google File SystemVishal Polley
 
Windows registry troubleshooting (2015)
Windows registry troubleshooting (2015)Windows registry troubleshooting (2015)
Windows registry troubleshooting (2015)James Konol
 
Will Google Docs Spreadsheet Replace Excel?
Will Google Docs Spreadsheet Replace Excel?Will Google Docs Spreadsheet Replace Excel?
Will Google Docs Spreadsheet Replace Excel?lenorajohnson
 
Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)softwaresatish
 
Introduction to Operating Systems
Introduction to Operating SystemsIntroduction to Operating Systems
Introduction to Operating SystemsSuhreed Sarkar
 
Evernote Demo Vs Github Demo.pdf
Evernote Demo Vs Github Demo.pdfEvernote Demo Vs Github Demo.pdf
Evernote Demo Vs Github Demo.pdfSoftware Finder
 
Useful Shareware / Freeware for Technical Communicators
Useful Shareware / Freeware for Technical CommunicatorsUseful Shareware / Freeware for Technical Communicators
Useful Shareware / Freeware for Technical CommunicatorsSTC-Philadelphia Metro Chapter
 
Cs121 Unit Test
Cs121 Unit TestCs121 Unit Test
Cs121 Unit TestJill Bell
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefineOpen Knowledge Belgium
 
Software for paper formatting
Software for paper formatting Software for paper formatting
Software for paper formatting salonibansal21
 

Similar to Briney - Leveling Up Data Management - With Notes (20)

SessionThree_IntroductionToVersionControlSystems
SessionThree_IntroductionToVersionControlSystemsSessionThree_IntroductionToVersionControlSystems
SessionThree_IntroductionToVersionControlSystems
 
Collaborative Data Projects
Collaborative Data ProjectsCollaborative Data Projects
Collaborative Data Projects
 
Toolboxes for data scientists
Toolboxes for data scientistsToolboxes for data scientists
Toolboxes for data scientists
 
Datasciencetools
DatasciencetoolsDatasciencetools
Datasciencetools
 
Top 10 web development tools in 2022
Top 10 web development tools in 2022Top 10 web development tools in 2022
Top 10 web development tools in 2022
 
Google software engineering practices by handerson
Google software engineering practices by handersonGoogle software engineering practices by handerson
Google software engineering practices by handerson
 
Introduction to go lang
Introduction to go langIntroduction to go lang
Introduction to go lang
 
Must be similar to screenshotsI must be able to run the projects.docx
Must be similar to screenshotsI must be able to run the projects.docxMust be similar to screenshotsI must be able to run the projects.docx
Must be similar to screenshotsI must be able to run the projects.docx
 
Digital Work Tools for the rest of us (2015)
Digital Work Tools for the rest of us (2015)Digital Work Tools for the rest of us (2015)
Digital Work Tools for the rest of us (2015)
 
Seminar Report on Google File System
Seminar Report on Google File SystemSeminar Report on Google File System
Seminar Report on Google File System
 
Windows registry troubleshooting (2015)
Windows registry troubleshooting (2015)Windows registry troubleshooting (2015)
Windows registry troubleshooting (2015)
 
Will Google Docs Spreadsheet Replace Excel?
Will Google Docs Spreadsheet Replace Excel?Will Google Docs Spreadsheet Replace Excel?
Will Google Docs Spreadsheet Replace Excel?
 
Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)Exercises portfolio-Digital Curation Tools (IS40620)
Exercises portfolio-Digital Curation Tools (IS40620)
 
Mke15
Mke15Mke15
Mke15
 
Introduction to Operating Systems
Introduction to Operating SystemsIntroduction to Operating Systems
Introduction to Operating Systems
 
Evernote Demo Vs Github Demo.pdf
Evernote Demo Vs Github Demo.pdfEvernote Demo Vs Github Demo.pdf
Evernote Demo Vs Github Demo.pdf
 
Useful Shareware / Freeware for Technical Communicators
Useful Shareware / Freeware for Technical CommunicatorsUseful Shareware / Freeware for Technical Communicators
Useful Shareware / Freeware for Technical Communicators
 
Cs121 Unit Test
Cs121 Unit TestCs121 Unit Test
Cs121 Unit Test
 
Let your data shine... with OpenRefine
Let your data shine... with OpenRefineLet your data shine... with OpenRefine
Let your data shine... with OpenRefine
 
Software for paper formatting
Software for paper formatting Software for paper formatting
Software for paper formatting
 

More from National Information Standards Organization (NISO)

More from National Information Standards Organization (NISO) (20)

Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"Bazargan "NISO Webinar, Sustainability in Publishing"
Bazargan "NISO Webinar, Sustainability in Publishing"
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"Compton "NISO Webinar, Sustainability in Publishing"
Compton "NISO Webinar, Sustainability in Publishing"
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
Hazen, Morse, and Varnum "Spring 2024 ODI Conformance Statement Workshop for ...
 
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
Mattingly "AI & Prompt Design" - Introduction to Machine Learning"
 
Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"Mattingly "Text and Data Mining: Building Data Driven Applications"
Mattingly "Text and Data Mining: Building Data Driven Applications"
 
Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"Mattingly "Text and Data Mining: Searching Vectors"
Mattingly "Text and Data Mining: Searching Vectors"
 
Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"Mattingly "Text Mining Techniques"
Mattingly "Text Mining Techniques"
 
Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"Mattingly "Text Processing for Library Data: Representing Text as Data"
Mattingly "Text Processing for Library Data: Representing Text as Data"
 
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
Carpenter "Designing NISO's New Strategic Plan: 2023-2026"
 
Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"Ross and Clark "Strategic Planning"
Ross and Clark "Strategic Planning"
 
Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"Mattingly "Data Mining Techniques: Classification and Clustering"
Mattingly "Data Mining Techniques: Classification and Clustering"
 
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...Straza "Global collaboration towards equitable and open science: UNESCO Recom...
Straza "Global collaboration towards equitable and open science: UNESCO Recom...
 
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
Lippincott "Beyond access: Accelerating discovery and increasing trust throug...
 
Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"Kriegsman "Integrating Open and Equitable Research into Open Science"
Kriegsman "Integrating Open and Equitable Research into Open Science"
 
Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"Mattingly "Ethics and Cleaning Data"
Mattingly "Ethics and Cleaning Data"
 
Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"Mercado-Lara "Open & Equitable Program"
Mercado-Lara "Open & Equitable Program"
 
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
Ratner "Enhancing Open Science: Assessing Tools & Charting Progress"
 
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
Pfeiffer "Enhancing Open Science: Assessing Tools & Charting Progress"
 

Recently uploaded

Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxAshokKarra1
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 

Recently uploaded (20)

Karra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptxKarra SKD Conference Presentation Revised.pptx
Karra SKD Conference Presentation Revised.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptxFINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
FINALS_OF_LEFT_ON_C'N_EL_DORADO_2024.pptx
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 

Briney - Leveling Up Data Management - With Notes

  • 1. 1
  • 2. 2
  • 3. 3
  • 4. I’m a former chemistry researcher who was really bad at the data management game the first time I played it. Now I’m a data services librarian who has produced a book, a blog, and videos in this area. I want to make the data management game easy and understandable to all players. This presentation will not only show you tools but also provide tips on leveling up during the game. 4
  • 5. 5
  • 6. 6
  • 7. Beware flash drives as a storage option. 7
  • 8. Cloud storage is a great option for the 3-2-1 Rule’s offsite copy. Not all cloud storage is made equal (read Google Drive’s terms of service). And don’t rely only on cloud storage for your data (several horror stories here). Many cloud storage providers offer free storage up to a certain amount, and then it’s a paid plan. I like SpiderOak. This is primarily a cloud backup solution, which is less good for file sharing (other options are available for that). It’s billed as “zero knowledge” cloud storage. Files get encrypted on your computer before sending to their servers, meaning the company can’t read your files and they stay secure when travelling across the internet (this is really important). I combine this with my local computer and an external hard drive to make my 3 copies. 8
  • 9. 9
  • 10. 10
  • 11. I don’t use Bulk Rename Utility often, but it’s so useful when I do. Bulk Rename Utility is free for personal users on Windows. It allows you to rename a large number of files at the same time (such as when you have a file naming convention you want to apply to existing files). The interface looks complicated but that is because it is so powerful. You can: replace particular characters, add or remove things at a particular position, easily add numbering or dates, swap parts of the file name around, etc. It takes a few minutes to learn, but it’s a great tool to have in your back pocket. 11
  • 12. 12
  • 13. Regular expressions (regex) are an amazing tool for search and replace. Regex doesn’t stand alone, but rather plugs into other tools like Bulk Rename Utility, notepad++, Java, etc. Regex works by pattern matching, allowing you to search for all social security numbers in a document, reformat any phone numbers, change the order of sections in a document but keep the text the same, etc. Regex takes a bit more learning but is incredibly useful for anyone doing text manipulation or clean up. The first link on this slide is to a tutorial I like. The second link is to a tool, RegExr, that allows you to test your written regular expressions against text. 13
  • 14. 14
  • 15. 15
  • 16. Versioning files by hand takes up a lot of hard drive space. A version control system, like Git, only saves the differences between one version and the next instead of the whole file. It also streamlines the versioning process. Such tools came out of computer science but are being used by many researchers. Git is free and open source. Git is different than GitHub – Git basically handles the version control, while GitHub hosts the files and versions and can make them available to others. Git is really useful but has a learning curve. Because of that, I recommend starting with the GUI version unless you are comfortable with the command line. 16
  • 17. 17
  • 18. This tool originated in computer code Don’t need anything more complicated than a text editor to make one! I use notepad++. 18
  • 19. 19
  • 20. 20
  • 21. 21
  • 22. Excel is a useful tool but isn’t always the best tool for cleaning data. It’s especially bad with dates and tends to mangle them. 22
  • 23. OpenRefine is a free, open source tool that was previously known as GoogleRefine. It is the best tool for cleaning up tabular data. OpenRefine can break data down by “facet” (variable values or ranges), allowing you to do quick parsing, counting, or editing. Editing includes straight replacement, math, basic text manipulation (uppercase to lowercase, etc.), or other functions using Google Refine Expression Language (GREL). You can also break multi-component cells apart or combine them into one. The tool also allows for text clean up, providing a number of different algorithms for text matching. 23
  • 24. 24
  • 25. 25