Data101 pmcb retreat_09-20-13_final

  • JW
  • JW
  • JW
  • JW
  • JW
  • JW
  • Ask them to think about what type of data they deal with/generate. Give a couple minutes.
  • Ask if they have additional data types that they brainstormed
  • JW
  • These are all things that the library can help you do
  • JW
  • JW
  • http://patenteux.com/Messy_desktop/messy_wallpaper-1280x1024.jpg
  • If you work on the command line, you can see all the file paths
  • JW
  • Show examples of versions. Can go back when you make mistakes or when changes are made. Share work with other people. Both can work on things at the same time and merge back together. Akin to a game of telephone: version control lets you see exactly when a change was made.
  • NEW SLIDES: Examples of versions of data (Data101_NV_v1, Data101_NV_v2). Simple software solutions: some software keeps versions for you; show where to go get it. Version control software: SVN, Git; show example of Google Code. Can write a commit message for each version you commit.
  • NICOLE
  • Central servers will have multiple redundancy and backups of backups. High-quality secure USB drives with passwords and encryption, or burn to disk.
  • JW
  • !
  • Move this
  • Information science is a parent
  • Ontologies classify terms and the relationships between them.
  • JW
  • Software that can rename your files, if you already have them named
  • Goal is to solve the author/contributor name ambiguity problem in scholarly communications by creating a central registry of unique identifiers for individual researchers. Identifiers, and the relationships among them, can be linked to the researcher.
  • JW
  • JW
  • JW
  • JW
  • JW
  • JW
  • JW
  • Maybe discuss the PlumX project?
  • JW
  • Say that we won an award to sponsor this program

    1. 1. DATA MANAGEMENT 101 Nicole Vasilevsky, Jackie Wirz and Melissa Haendel PMCB New Student Orientation 20 September 2013
    2. 2. 1 | Data definitions 2 | Dealing with data 3 | How the OHSU Library can help
    3. 3. Nicole Vasilevsky, PhD, Project Manager, Ontology Development Group; Jackie Wirz, PhD, Assistant Professor, Bioinformation Specialist; Melissa Haendel, PhD, Assistant Professor, Lead, Ontology Development Group
    4. 4. 1 | Data definitions
    5. 5. Data does not speak for itself…
    6. 6. YOU speak for YOUR data
    7. 7. But First, you need to manage it
    8. 8. But, even more fundamentally…
    9. 9. data means many things…
    10. 10. what does data mean to you?
    11. 11. What are data? Experimental data Social data School related data Personal data
    12. 12. Do you know what metadata is? a. Philosophy b. describes data c. dating site d. data
    13. 13. 2 | dealing with data
    14. 14. Do you get frustrated with any of the following? a. Storing data b. Backing up data c. Analyzing/manipulating data d. Finding data produced by other researchers/clinicians e. Ensuring data are secure f. Making data accessible to other researchers g. Controlling access to data h. Tracking updates to data (i.e., versioning) i. Creating metadata (i.e., describing the data to be more useful at a later time or by others) j. Protecting intellectual property rights k. Ensuring appropriate professional credit/citation is given to data sets/generated
    15. 15. Why? Personal organization Efficiency Credit where credit is due Accelerate scientific and clinical discovery Reproducibility of science and medicine
    16. 16. naming | metadata | tools | standards How?
    17. 17. naming
    18. 18. File naming
    19. 19. Naming conventions Project_instrument_location_YYYYMMDDhhmmss_extra.ext Index/grant conditions Leading zero! s/n, variable Retain order
    20. 20. Naming: Directory Structure
    21. 21. PMCB presentation Library presentation DMICE presentation Presentations PMCB Library DMICE
    22. 22. http://ftp.ihmc.us/
    23. 23. ReadMe
    24. 24. Version Control
    25. 25. Versioning • Save a copy of every version of a file • Follow a file naming convention Data101_PMCB_Retreat_09-20-13_v1 Data101_PMCB_Retreat_09-20-13_v2 Data101_PMCB_Retreat_09-20-13_Final
    26. 26. Versioning
    27. 27. Versioning
    28. 28. Versioning Version control software: • Git • SVN
    29. 29. Backups
    30. 30. Which of the following do you do? a. Save copies of data on a disk, USB drive, or computer hard drive b. Save copies of data on a local server c. Save copies of data on a central campus server d. Save copies of data on a web-based or cloud server e. Store data in a repository or archives f. Automatically backup files g. Manually generate backup h. Restrict access to files
    31. 31. Where can you back up your data? • 1 on your local workstation • 1 local/removable, such as external hard drive • 1 on central server • 1 remote, such as on a cloud server* *Depending on the type of data, as cloud servers are not always secure
    32. 32. Metadata
    33. 33. What is Metadata? Title Author Call number Publisher ISBN
    34. 34. Metadata: "Your metadata should make your data understandable to others without your involvement" - Anne Gilliland
    35. 35. Are you aware of data standards in your field?
    36. 36. data standards Data standards are the rules by which data are described and recorded. In order to share, exchange, and understand data, we must standardize the format as well as the meaning. http://www.usgs.gov/datamanagement/plan/datastandards.php
    37. 37. Controlled vocabularies
    38. 38. Structured data helps with searching Craigslist search: Chaise Craigslist matches on strings only Craigslist search: Fainting couch
    39. 39. Structured data helps with searching PubMed indexes articles with MeSH Terms
    40. 40. Structured data helps with searching
    41. 41. Why are CVs and Ontologies useful? • Can be used to structure your metadata • Are often used to structure information in databases Cell Ontology Linnean Taxonomy Order Genus Species Phylum Class Family Kingdom
    42. 42. tools
    43. 43. File renaming applications • Bulk Rename Utility (Windows) • Renamer (Mac) • PSRenamer
    44. 44. Data Management tools and repositories • Purpose: Software where you can organize, store and/or share data • Often contain metadata to assist with data entry and create structured data
    45. 45. Tools for data management
    46. 46. Repositories use Unique IDs • Digital Object Identifier (DOI) • Example: DOIs for publications – doi: 10.1371/journal.pbio.1001339 • Uniform Resource Identifier (URI) • A URI will resolve to a single location on the web • URIs for people
    47. 47. • Example: • John L Campbell, Research Ecologist, Oregon State University, Corvallis OR • John L Campbell, Research Ecologist, Center for Research on Ecosystem Change, Durham, NC
    48. 48. standards
    49. 49. nomenclature
    50. 50. antibodies Western Blot Immunohistochemistry ELISA Co-immunoprecipitation ChIP Radioimmunoassay
    51. 51. FACS analysis of T cells from LNs and tumors T cells were liberated from LNs by disruption between two frosted glass slides. Cells from LNs and tumors were stained with various combination of the following Abs: FITC- CD4, allophycocyanin-CD25, PE Cy7-CD8, APC-CD62L, PE- CD25, PE Cy7-CD25, and biotinylated-KJ-126 and in some experiments made permeable with fixation/permeablization buffers and stained with PE-FoxP3 (eBioscience). Harvested samples, isotype controls, and single stain controls were run on the FACSCalibur (BD Biosciences). Ruby and Weinberg (2009) J Immunol. 182(3):1481-9.
    52. 52. Which antibody did they use in the paper?
    53. 53. A Solution: Antibody Registry antibodyregistry.org
    54. 54. Meet the Urban Lab
    55. 55. A+ organization! The Urban lab antibodies
    56. 56. [Bar chart, y-axis: percent identifiable (0% to 90%); categories: commercial Ab identifiable, catalog number reported, source organism reported, target uniquely identifiable] Of 14 antibodies published in 45 articles, only 38% were identifiable
    57. 57. http://www.force11.org/node/4463 http://biosharing.org/bsg-000532
    58. 58. http://www.biosharing.org/standards/mibbi Minimum Information for Biological and Biomedical Investigations
    59. 59. data publication and sharing
    60. 60. Why share data? • Data sharing mandates • Further science and medicine • Build collaborations • Enable new discoveries with your data • Can be required at time of publication
    61. 61. Distribution of 2004–2005 citation counts of 85 trials by data availability.
    62. 62. How?
    63. 63. Beyond the PDF: What can be published (and cited)? Raw Science Nanopublications Self-publishing
    64. 64. Beyond the PDF: What can be published (and cited)? Raw Science Nanopublications Self-publishing Datasets Code Experimental design Argument or passage Blogging Microblogging Comments on existing work Annotations on existing work Single figure publications
    65. 65. How? Data Journals and Repositories • FigShare • Dryad • DataVerse (social science) • Institutional repositories
    66. 66. www.impactstory.org
    67. 67. 3 | How the OHSU Library can help
    68. 68. 1 | Large Lecture: Data Management 101 2 | 10–15 Small Groups: data playground • 1 researcher paired with 2 or 3 library staff • Tailored analysis of data reporting and instruction Save the date: 10/09/13 4-6pm 1k challenge award recipients
    69. 69. Thank you!
    70. 70. URLs to resources Go to: http://libguides.ohsu.edu/data
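The naming convention on slide 19 (Project_instrument_location_YYYYMMDDhhmmss_extra.ext, with leading zeros so that order is retained) could be sketched in Python roughly as follows. This is only an illustration, not something from the talk: the helper names `build_filename` and `numbered`, the example field values, and the choice to replace spaces and underscores inside a field with hyphens are all assumptions.

```python
from datetime import datetime


def build_filename(project, instrument, location, when, extra, ext):
    """Return Project_instrument_location_YYYYMMDDhhmmss_extra.ext (hypothetical helper)."""
    # A fixed-width timestamp means alphabetical order is also chronological order.
    stamp = when.strftime("%Y%m%d%H%M%S")
    parts = [project, instrument, location, stamp, extra]
    # Assumption: underscores separate fields, so spaces and underscores
    # inside a field are replaced with hyphens to keep the fields distinct.
    cleaned = [p.strip().replace(" ", "-").replace("_", "-") for p in parts]
    return "_".join(cleaned) + "." + ext.lstrip(".")


def numbered(prefix, n, width=3):
    """Zero-pad a serial number so that names sort in numeric order."""
    return f"{prefix}{n:0{width}d}"


# Example values are made up for illustration.
name = build_filename(
    "Data101", "FACSCalibur", "PMCB", datetime(2013, 9, 20, 16, 0, 0), "v1", "csv"
)
print(name)  # Data101_FACSCalibur_PMCB_20130920160000_v1.csv

# Leading zeros retain order; without them, 10 sorts before 2.
print(sorted(numbered("sample", n) for n in (10, 2, 1)))
# ['sample001', 'sample002', 'sample010']
print(sorted(f"sample{n}" for n in (10, 2, 1)))
# ['sample1', 'sample10', 'sample2']
```

The last two lines show why the slide calls out the leading zero: plain string sorting puts `sample10` ahead of `sample2`, while the padded names sort in the order the samples were numbered.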
