2. Project background
• JISC Information Environment Programme 2009-2011
• Strand A2: Developing e-Infrastructure to support research disciplines
• Project partners: Oxford, Southampton, Reading
– An e-Research South Consortium project
3. The Neuroscience partners
• The Oxford group is focused on specific issues such as the study of
the molecular basis of synapse formation, plasticity and the regulation
of neuronal morphology in the normal and diseased brain. They are
examining the mechanisms of an activity-dependent form of neural
plasticity known as long-term potentiation (LTP).
• Research at CINN (Reading) focuses on physiological and psychological
mechanisms underpinning complex cognitive behaviours, targeting
typical and atypical development and decline in individuals. There is
also a strong research group working on issues in signal analysis and
computational modelling.
• A focus in Southampton is the integrative analysis of brain
function and dysfunction, modelling aspects of this across levels of
biological organization, from molecules, cells, tissue and systems
to animal behaviour.
4. Challenges
• Interdisciplinary teams – different expectations, cultures,
requirements
• Agreed standards
– Different data formats
Microscopes (Multi-photon or Confocal)
Live cell fluorescent imaging
Electrophysiology recordings
– Metadata standards
• Complexity of tools used in community
• Ability to share images, data, analysis
• Network connectivity not the best
10. What we would like to see:
1. VRE – single point of contact
2. A consistent annotation method for data archiving
3. Web- and shared-filesystem-based repository for data,
supporting many file formats
4. A searchable database for images
5. A searchable database for video images
6. A document share tool for ‘live’ manuscript editing
7. File space for literature sharing (PDFs)
8. Blog area
11. Release 2.0
• Drupal – Frontend content management. Based on Drupal Commons,
• Alfresco – Backend data management. Modified Alfresco module,
• Apache Solr – Search engine,
• Apache Tika – Metadata extraction toolkit for documents,
• Google services – Docs and Calendar,
• Cloud-based computation using GPUs,
• NCBO ontology-based tagging,
• LDAP – Single sign-on,
• Digital Pens – Used for recording experiments,
• XML-RPC desktop client – uploading and generating content.
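
As an illustration of the XML-RPC desktop client idea, the sketch below builds an upload request with Python's standard `xmlrpc.client` module. The method name `neurohub.upload` and the field names are hypothetical; the slides do not specify the actual NeuroHub endpoint or call signature.

```python
import xmlrpc.client

# Hypothetical method name; the real NeuroHub XML-RPC interface is not given in the slides.
UPLOAD_METHOD = "neurohub.upload"


def build_upload_call(title, filename, data):
    """Serialise one upload request as an XML-RPC payload (no network access)."""
    params = ({
        "title": title,
        "filename": filename,
        # Binary wraps raw bytes so they are base64-encoded in the XML body.
        "data": xmlrpc.client.Binary(data),
    },)
    return xmlrpc.client.dumps(params, methodname=UPLOAD_METHOD)


payload = build_upload_call("LTP recording 1", "slice01.tif", b"\x00\x01")

# Round-trip the payload to show what a server would receive.
method_params, method_name = xmlrpc.client.loads(payload)
```

A real client would send the same call through `xmlrpc.client.ServerProxy(url)` rather than building the payload by hand.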
12. What is Alfresco?
• Open source document management system (also enterprise edition),
• Alternative to Microsoft SharePoint,
• It provides the following features:
– Unified repository – Manages documents, images, video, audio,
etc…
– Network share services – CIFS/Samba, WebDAV, IMAP and
SharePoint protocol.
– Connectivity – CMIS, JSR 168, REST, Microsoft Office integration.
– Version control – Tracking of major and minor document versions.
– Folder-based Rules and Actions – Support for document workflows.
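
The major/minor version tracking mentioned above can be pictured with a small sketch. It assumes labels of the form `major.minor` (for example 1.0, 1.1, 2.0); the function is illustrative only and is not part of Alfresco's API.

```python
def next_version(label, major=False):
    """Return the next version label, assuming a 'major.minor' scheme."""
    major_part, minor_part = (int(part) for part in label.split("."))
    if major:
        # A major revision bumps the first number and resets the minor number.
        return f"{major_part + 1}.0"
    # A minor revision only bumps the second number.
    return f"{major_part}.{minor_part + 1}"


history = ["1.0"]
history.append(next_version(history[-1]))              # minor edit
history.append(next_version(history[-1]))              # minor edit
history.append(next_version(history[-1], major=True))  # major revision
```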
13. What is Apache Solr?
• Open source search platform based on Apache Lucene,
• Apache Solr's features include:
– Full-text searching,
– Faceted searching,
– Dynamic clustering,
– Database integration,
– Caching and replication,
– Document handling via Apache Tika – e.g. Word, PDF, etc.
– Connectivity – HTTP/XML, JSON API
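
To show how faceted searching looks over Solr's HTTP interface, the sketch below builds a query URL using the standard `q`, `rows`, `wt`, `facet` and `facet.field` request parameters. The host, the core name `neurohub` and the field names `file_type` and `site` are assumptions made for illustration, not details taken from the project.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_facet_query(base_url, text, facet_fields, rows=10):
    """Build a Solr /select URL that asks for JSON results plus facet counts."""
    params = [("q", text), ("rows", rows), ("wt", "json"), ("facet", "true")]
    # facet.field may be repeated, once per field to count.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names.
url = build_facet_query(
    "http://localhost:8983/solr/neurohub", "synapse", ["file_type", "site"]
)
query = parse_qs(urlsplit(url).query)
```

Fetching this URL from a running Solr instance would return matching documents together with per-field counts that a frontend can display as filters.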
14. What is Apache Tika?
• Open source content analysis toolkit,
• Detection and extraction of metadata and structured text from various
documents,
– Compressed formats – tar, jar, zip, bzip2, gz, tgz.
– Text Documents – Word, Excel, PowerPoint, RTF, PDF, HTML,
XHTML, OpenDocument, Plain text.
– Images – BMP, GIF, PNG, JPEG, TIFF.
– Audio – MP3, AIFF, AU, MIDI, WAV.
• Extensible parser – Allows for custom parsers to be developed for
other document types.
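
The detect-then-parse pattern with pluggable parsers can be sketched in a few lines. This is a simplified Python illustration of the idea, not Tika's actual (Java) API; the registry, the function names and the short list of file signatures are all assumptions.

```python
# Leading "magic" bytes for a few common formats (illustrative subset).
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]

PARSERS = {}  # content type -> parser function


def detect_type(header):
    """Guess the content type from the first bytes of a file."""
    for magic, mime in MAGIC:
        if header.startswith(magic):
            return mime
    return "application/octet-stream"


def register_parser(mime):
    """Decorator that plugs a custom parser in for one content type."""
    def decorator(func):
        PARSERS[mime] = func
        return func
    return decorator


@register_parser("image/png")
def parse_png(data):
    # In the IHDR chunk, width and height are big-endian 32-bit integers
    # stored at byte offsets 16-19 and 20-23 of the file.
    return {
        "width": int.from_bytes(data[16:20], "big"),
        "height": int.from_bytes(data[20:24], "big"),
    }


def extract_metadata(data):
    """Detect the type, then run the matching parser if one is registered."""
    mime = detect_type(data[:16])
    metadata = {"content_type": mime}
    parser = PARSERS.get(mime)
    if parser is not None:
        metadata.update(parser(data))
    return metadata
```

Supporting a new format then means adding a signature to `MAGIC` and registering one more parser function, without touching `extract_metadata`.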
16. Site Usage
• Oxford:
– Have tested different input methods including digital pens, iPad
and tablet PC.
– Upload of their 6,000 image files is working well, with ~250GB of
data stored.
• Reading:
– Want to use NeuroHub as a frontend for their whole centre.
– Workflow development has improved scientists' working patterns
and supported integration with other complex activities.
• Southampton:
– Have ~100GB of data stored and have started to organise their
files.
– Interested in digital pen input and want to continue uploading their
files to NeuroHub.