Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Towards Lensfield


Published on

Lensfield is a desktop and filesystem-based tool designed as a “personal data management assistant” for the scientist. It combines distributed version control (DVCS), software transaction memory (STM) and linked open data (LOD) publishing to create a novel data management, processing and publication tool. The application “just looks after” these technologies for the scientist, providing simple interfaces for typical uses. It is built with Clojure and includes macros which define steps in a common workflow. Functions and Java libraries provide facilities for automatic processing of data which is ultimately published as RDF in a web application. The progress of data processing is tracked by a fine-grained data structure that can be serialized to disk, with the potential to include manual steps and programmatic interrupts in largely automated processes through seamless resumption. Flexibility in operation and minimizing barriers to adoption are major design features.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Towards Lensfield

  1. 1. Towards Lensfield: data management, processing and semantic publication for vernacular e-science Nick Day, Jim Downing, Lezan Hawizy, Nico Adams and Peter Murray-Rust Unilever Centre for Molecular Science Informatics, University of Cambridge This presentation: CC-By-SA Jim Downing
  2. 2. Linked Data CC-By-SA-NC jmelchio CC Images from Flickr
  3. 3. Selling Linked Data Make it transparent Make it easy CC-By mrslogic
  4. 4. Selling Linked Data Citations
  5. 5. Selling Linked Data • Visualizations • Data management • Automation
  6. 6. Demo
  7. 7. Lensfield Principles • Make it easier to do the right thing • Vernacular • KISS and Embrace constraints
  8. 8. Constraints • Work on the desktop without infrastructure installation • Processing tasks could be anything and aren’t predictable
  9. 9. Re-use
  10. 10. Jumbo-Converters • Library of chemistry file format converters, semantifiers and enhancers • Part of the CML Java libraries •
  11. 11. Version Control • Mercurial • Track script changes with data • Excellent support for experimentation • Automatically ignore deterministic • Backup to remote intermediates machine • P2P sharing
  12. 12. Build metaphor • Describing state transitions rather than process better for provenance tracking • Alternative to graphical programming languages / workflow packages • hard problems are re-use and comprehension
  13. 13. Clojure • Strong on concurrency • Functional • Software Transactional Memory • Lisp • Snapshots, pause and resume, continuations
  14. 14. Future Development • Templated Parameter Sweeps & sensitivity analysis • Design of Experiments • Multicore performance testing • Grid processing
  15. 15.
  16. 16. Users • CLARION project • Embargo management and publication of Electronic Lab Notebook data. • OREChem • Distributed chemistry eScience using Linked Data. • Computational Chemical engineering
  17. 17. CC-By-NC ilonameagher Users You? ... to use Lensfield!
  18. 18. Thanks Colleagues Funds Nick Day John Aspden Lezan Hawizy Peter Murray-Rust Collaboration and Inspiration Nico Adams (Dept of Genetics, Cambridge) Jerry Winter (Unilever) Noel Ruddock (Unilever) Markus Kraft, Weerapong Phadungsukanan (Chemical Engineering, Cambridge)