  • Academics and biologists have embraced open source. The limits of librarians' and scientists' patience are being reached, and the tie-ins and sweeteners for initial multiyear online deals are now running out.

Presentation Transcript

  • Text mining and Open Access publishing Matthew Cockerill Technical Director, BioMed Central
  • Summary
    • What is Open Access publishing?
    • Open Access publishing and text mining
    • About BMC Bioinformatics
    • The BioCreative supplement
  • The current model of publishing scientific research
    • Scientists carry out research
    • They write up their results
    • They submit them to a journal
    • Other scientists act as peer reviewers and editorial advisers
    • Finally, the publisher sells access to that research back to the scientific community
  • What’s wrong with this status quo?
    • Restricted access to scientific research is contrary to the interests of
      • the scientists who do the research
      • the funders who pay for it
      • society as a whole
    • It is an historical artefact of the economics of print publishing
    • It is a serious obstacle to mining of full text information
  • BioMed Central The Open Access publisher
    • Commercial organization
    • Published first article in mid-2000
    • Strict policy of immediate Open Access to all research articles
  • Growth of BioMed Central
  • Momentum for Open Access
    • PubMed Central
    • Public Library of Science
    • Open Access declarations: Budapest/Bethesda/Berlin
    • Software open-source movement
    • Mass cancellation of titles from traditional publishers
  • BioMed Central’s business model for open access publishing
    • Keep costs down via
      • Online submission and peer review
      • Automated tools to streamline article processing, conversion and layout
    • Processing charge (currently $525) for accepted articles
    • No processing charge for authors at member institutions
  • Institutional membership
    More than 400 institutions are members of BioMed Central, including, to name just a few:
    • CalTech
    • Cancer Research UK
    • Columbia University
    • Cornell University
    • University of California
    • Dana-Farber Cancer Institute
    • Harvard University
    • INSERM
    • Imperial College
    • Institut Pasteur
    • John Innes Centre
    • Johns Hopkins University
    • Kyoto University
    • Max Planck Institutes
    • Memorial Sloan-Kettering Cancer Center
    • MRC Laboratory of Molecular Biology
    • National Institutes of Health
    • National Institute for Medical Research
    • NHS England
    • Princeton University
    • Rockefeller University
    • TIGR
    • TSRI
    • Tufts University
    • Wellcome Trust Sanger Institute
    • University of Wisconsin
    • World Health Organization
    • Yale University
  • Summary
    • What is Open Access publishing?
    • Open Access publishing and text mining
    • About BMC Bioinformatics
    • The BioCreative supplement
  • Mining the full text
    • Analysing results of high-throughput experiments means biologists increasingly need text-mining tools
    • PubMed is currently the primary resource for text mining (“it’s what’s available”) but:
      • Abstracts omit critical information
      • Techniques developed for abstracts may not effectively use extra information in full text
    • Fully Open Access corpora, in standard XML formats, will help
  • Data mining - BioMed Central
    • Entire corpus of full text XML downloadable by ftp as a single zip file
    • Various groups working with the data
      • E.g. Pre-BIND (automatic extraction of possible protein-protein interaction information from full text)
    • No restrictions on redistribution
    • This means other groups can use same corpus to repeat and build on results
    http://www.biomedcentral.com/info/about/datamining
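
As a rough illustration of working with a corpus shipped as a single zip file of XML articles, the sketch below walks an archive and counts words per article. It is only a sketch under stated assumptions: the archive is built in memory as a stand-in for the real download, and the file names and the `article`, `title` and `body` elements are hypothetical, not BioMed Central's actual layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_articles(zip_bytes):
    """Yield (filename, root element) for every .xml member of the archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if name.endswith(".xml"):
                with archive.open(name) as handle:
                    yield name, ET.parse(handle).getroot()


def word_count(root):
    """Count whitespace-separated tokens in all text under the element."""
    return len(" ".join(root.itertext()).split())


def build_demo_zip():
    """Build a tiny in-memory archive standing in for the real corpus file.

    File names and element names here are illustrative assumptions.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "a1.xml",
            "<article><title>First</title>"
            "<body>Protein A binds protein B.</body></article>",
        )
        archive.writestr(
            "a2.xml",
            "<article><title>Second</title>"
            "<body>No interaction seen.</body></article>",
        )
        archive.writestr("readme.txt", "not an article")
    return buffer.getvalue()


# Map each article file to its word count; non-XML members are skipped.
summary = {name: word_count(root) for name, root in iter_articles(build_demo_zip())}
print(summary)
```

Because the corpus carries no redistribution restrictions, a second group could run the same loop over the same file and compare results directly.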
  • Data mining - BioMed Central (screen shot)
  • Data mining - PubMed Central
    • Standard NLM archiving/interchange XML DTD: common format across multiple publishers
    • Only a subset of PubMed Central participating publishers allow download of full text XML
      • BioMed Central
      • Public Library of Science
    • Hopefully, more will follow…
    • XML made available via OAI interface
    http://www.pubmedcentral.com/about/oai.html
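
An OAI interface is normally queried with plain HTTP requests defined by the OAI-PMH protocol. The sketch below builds a `ListRecords` request URL and pulls record identifiers out of a response. The base URL `https://example.org/oai` and the sample response are placeholders, not PubMed Central's real endpoint or data, and `oai_dc` is simply the Dublin Core prefix every OAI-PMH repository must support; the prefix that returns full-text XML is not given in the slides.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Placeholder response in the OAI-PMH envelope; identifiers are made up.
SAMPLE_RESPONSE = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords>"
    "<record><header><identifier>oai:example.org:1</identifier></header></record>"
    "<record><header><identifier>oai:example.org:2</identifier></header></record>"
    "</ListRecords>"
    "</OAI-PMH>"
)

url = list_records_url("https://example.org/oai", "oai_dc")
identifiers = record_identifiers(SAMPLE_RESPONSE)
print(url)
print(identifiers)
```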
  • Data mining - PubMed Central
  • Adding structure to full text data
    • Some examples of useful structure:
      • Structure of article itself (figure legends, materials and methods, references, etc.)
      • MathML, CML, etc.
      • Disambiguated references to genes/proteins…
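
To suggest why article structure matters for mining, the sketch below pulls the methods section, the figure legends and the reference count out of a structured article. It is a minimal sketch, assuming element names (`sec`, `title`, `p`, `fig`, `caption`, `ref`) loosely modelled on NLM-style markup; the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample article; element names are assumptions, not a real DTD excerpt.
SAMPLE = (
    "<article>"
    "<body>"
    "<sec><title>Background</title><p>Interactions matter.</p></sec>"
    "<sec><title>Methods</title><p>Yeast two-hybrid screen.</p>"
    "<fig><caption><p>Figure 1. Bait and prey constructs.</p></caption></fig></sec>"
    "</body>"
    "<back><ref-list><ref>Ref one</ref><ref>Ref two</ref></ref-list></back>"
    "</article>"
)


def section_text(root, title):
    """Return the paragraph text of the first section with the given title."""
    for sec in root.iter("sec"):
        heading = sec.findtext("title", default="")
        if heading.strip().lower() == title.lower():
            # Only direct <p> children, so figure captions are not mixed in.
            return " ".join(p.text or "" for p in sec.findall("p"))
    return None


def figure_legends(root):
    """Return the text of every figure caption in the article."""
    return [" ".join(cap.itertext()).strip() for cap in root.iter("caption")]


root = ET.fromstring(SAMPLE)
methods = section_text(root, "Methods")
legends = figure_legends(root)
reference_count = len(root.findall(".//ref"))
print(methods, legends, reference_count)
```

With flat text, separating a methods paragraph from a figure legend takes heuristics; with explicit structure it is a lookup.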
  • Authoring tools are key
    • Manuscript structure: EndNote, TeX/BibTeX pretty good already
    • MathML: Publicon, TeX, etc.
    • CML: Chemsketch, etc.
    • Gene/protein reference markup: ?
      • Semi-automatic markup during authoring
      • Author reviews and confirms markup
      • System prompts author to clarify ambiguity (cf. grammar checker, code intelligence)
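
The semi-automatic workflow above (the tool suggests, the author confirms, ambiguity is flagged) might look roughly like the sketch below. Everything in it is hypothetical: the lexicon, the placeholder identifiers such as `GENE:0001`, and the `<gene>` tag are illustrative only and do not describe an existing authoring tool.

```python
import re

# Hypothetical lexicon: surface form -> candidate identifiers (placeholder IDs).
LEXICON = {
    "BRCA1": ["GENE:0001"],
    "CAT": ["GENE:0002", "GENE:0003"],  # one symbol, two candidates
}


def suggest_markup(text, lexicon=LEXICON):
    """Find lexicon matches and flag those the author must disambiguate."""
    suggestions = []
    for match in re.finditer(r"\b[A-Za-z0-9-]+\b", text):
        candidates = lexicon.get(match.group())
        if candidates:
            suggestions.append({
                "mention": match.group(),
                "start": match.start(),
                "candidates": candidates,
                "needs_author_input": len(candidates) > 1,
            })
    return suggestions


def apply_markup(text, suggestions, confirmed):
    """Tag only the mentions the author confirmed (start offset -> chosen ID)."""
    out, cursor = [], 0
    for s in suggestions:
        gene_id = confirmed.get(s["start"])
        if gene_id is None:
            continue  # not confirmed: leave the text untouched
        out.append(text[cursor:s["start"]])
        out.append('<gene id="%s">%s</gene>' % (gene_id, s["mention"]))
        cursor = s["start"] + len(s["mention"])
    out.append(text[cursor:])
    return "".join(out)


sentence = "BRCA1 and CAT were assayed."
suggestions = suggest_markup(sentence)
# The author confirms the unambiguous mention and defers the ambiguous one.
marked_up = apply_markup(sentence, suggestions, {0: "GENE:0001"})
print(marked_up)
```

The design mirrors a grammar checker: nothing is written into the manuscript without the author's confirmation, and ambiguous mentions surface as prompts rather than silent guesses.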
  • Summary
    • What is Open Access publishing?
    • Open Access publishing and text mining
    • BMC Bioinformatics
    • The BioCreative supplement
  • BMC series of online journals
    • BMC Biochemistry
    • BMC Bioinformatics
    • BMC Biotechnology
    • BMC Cell Biology
    • BMC Chemical Biology
    • BMC Developmental Biology
    • BMC Ecology
    • BMC Evolutionary Biology
    • BMC Genetics
    • BMC Genomics
    • BMC Immunology
    • BMC Microbiology
    • BMC Molecular Biology
    • BMC Neuroscience
    • BMC Pharmacology
    • BMC Physiology
    • BMC Plant Biology
    • BMC Structural Biology
    • BMC Anesthesiology
    • BMC Blood Disorders
    • BMC Cancer
    • BMC Cardiovascular Disorders
    • BMC Clinical Pathology
    • BMC Clinical Pharmacology
    • BMC Complementary and Alternative Medicine
    • BMC Dermatology
    • BMC Ear, Nose and Throat Disorders
    • BMC Emergency Medicine
    • BMC Endocrine Disorders
    • BMC Family Practice
    • BMC Gastroenterology
    • BMC Geriatrics
    • BMC Health Services Research
    • BMC Infectious Diseases
    • BMC International Health and Human Rights
    • BMC Medical Education
    • BMC Medical Ethics
    • BMC Medical Genetics
    • BMC Medical Imaging
    • BMC Medical Informatics and Decision Making
    • BMC Medical Research Methodology
    • BMC Musculoskeletal Disorders
    • BMC Nephrology
    • BMC Neurology
    • BMC Nuclear Medicine
    • BMC Nursing
    • BMC Ophthalmology
    • BMC Oral Health
    • BMC Palliative Care
    • BMC Pediatrics
    • BMC Pregnancy and Childbirth
    • BMC Psychiatry
    • BMC Public Health
    • BMC Pulmonary Medicine
    • BMC Surgery
    • BMC Urology
    • BMC Women's Health
  • BMC Bioinformatics
  • RSS feeds
  • Open access leads to high visibility
    • Indexing/Linking
      • PubMed
      • MEDLINE
      • ISI
      • BIOSIS
      • CAS
      • CrossRef
      • Scirus
      • Open Archives Initiative
      • Citebase
      • Google
    • Archiving
      • PubMed Central
      • INIST
      • LOCKSS
      • Max Planck
      • OhioLINK
  • BMC Bioinformatics - citation impact
  • Summary
    • What is Open Access publishing?
    • Open Access publishing and text mining
    • About BMC Bioinformatics
    • The BioCreative supplement
  • Process for publishing in BMC Bioinformatics supplement
    • Follow BMC Bioinformatics ‘Research Article’ instructions for authors
    • Send articles to BioCreative organizers who will coordinate peer review [do not submit articles online]
    • Supplement passed on to BioMed Central for XML markup and publication
    • $400 processing charge/article
  • Instructions for authors
  • Access to supplement
    • All articles in supplement covered by BioMed Central’s Open Access licence agreement
      • Free access
      • Free re-distribution/re-use
    • Supplement indexed in PubMed and permanently archived in PubMed Central
  • That’s it