WikiLIMS
BioTeam.net
[Dilbert cartoon 1]
1st Next Gen Sequencer
- Centerpiece of a lab
- Generates new workflows
- These cannot be known in advance
- When they order 2 more sequencers, they still want a single repository for all runs
Tasks/Workflows
- Production: few tasks, all repeated many times; rigorous standards; ideal for software
- Research: many one-off tasks; ad hoc standards; difficult for software
Sequencer’s Input
- Infinite variety of samples, handling, and lab prep
- All details might matter
- Usually only a few do
The 454 Solution
- A single strict [A-Z0-9]+ field
- Intended as an external primary key
- Makes sample tracking an upstream problem
- Part of the results directory name: R_TIMESTAMP_MACHINEID_USER_YOURFIELD
- A clean technical solution
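The fixed layout makes the directory name trivially machine-parseable. A minimal Perl sketch, assuming a YYYY_MM_DD_HH_MM_SS timestamp and that machine IDs and usernames contain no underscores (the example directory name is invented):

use strict;
use warnings;

# Hypothetical run directory; the timestamp layout is an assumption.
my $dir = 'R_2008_03_19_16_23_11_FLX02_jdoe_P123ABC';

if ( $dir =~ /^R_(\d{4}(?:_\d{2}){5})_([^_]+)_([^_]+)_([A-Z0-9]+)$/ ) {
    my ( $stamp, $machine, $user, $field ) = ( $1, $2, $3, $4 );
    print "time=$stamp machine=$machine user=$user field=$field\n";
}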
These are Researchers
- Apparently they wanted a LIMS
- Found a way to cram it in: PROJIDxxSPECIESxxSAMPLExxDESCxxNOTES
- More or less consistent
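Once that ad hoc convention exists, recovering the crammed-in fields is a one-liner. A sketch, assuming the 'xx' delimiter shown above; the field names come from the slide and the sample value is invented:

use strict;
use warnings;

# Split the researchers' improvised record back into named fields.
my $field = 'P123xxECOLIxxS42xxHEATSHOCKxxREDO';
my %meta;
@meta{qw(projid species sample desc notes)} = split /xx/, $field;
print "$_ = $meta{$_}\n" for sort keys %meta;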
Additional Details
- 3 machines
- Signs of strain by the 50th run
- Difficult to look across machines
- Too many DESCRIPTION variants
- Desire to rename old data
[Dilbert cartoon 2]
Key Terms
- Wiki: “fast” in Hawaiian
- LIMS: Laboratory Information Management System
- Mediawiki: the software that runs Wikipedia
Wikipedia / UC Berkeley
flexible --- database (1/3)

flexible --- database (2/3, 3/3)
- No need to abuse a ‘comments’ field
- Everything is a comment until you make it structured
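In MediaWiki terms, “making it structured” can be as little as wrapping the same comment in a template. A hypothetical illustration; the {{454Run}} template and its parameter names are invented here:

Free-text comment on a run page:
  Run of E. coli sample S42, heat shock, repeat of run 37.

The same content once structured:
  {{454Run|species=ECOLI|sample=S42|desc=HEATSHOCK|notes=repeat of run 37}}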
Full History
- Audit trail
- Differences between any 2 versions
Version Differences
Next Gen Data Store
- File data: RAID
- Metadata: wiki
Next Gen Data Analysis
- File data: RAID
- Metadata: wiki
- People
- Programs
Automatic data capture - Raw
- Most structured content can be captured and recorded by programs as it is generated
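For example, a small job on the sequencer's workstation can push each finished run onto its own wiki page. A sketch using the Perlwikipedia module introduced later in this deck; the hostname, credentials, page naming, and the {{454Run}} template are assumptions:

use strict;
use warnings;
use Perlwikipedia;

# Connect to the lab wiki (hostname, path, and credentials are placeholders).
my $bot = Perlwikipedia->new;
$bot->set_wiki( 'wiki.example.org', 'w' );
$bot->login( 'CaptureBot', 'secret' );

# One wiki page per run, named after the results directory,
# tagged into the category the later bot examples iterate over.
my $run  = 'R_2008_03_19_16_23_11_FLX02_jdoe_P123ABC';
my $text = "{{454Run|dir=$run}}\n[[Category:Is_a_454_Run]]\n";
$bot->edit( $run, $text, 'automatic capture at run completion' );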
Automatic data capture - Pretty (1/3 to 3/3)
- All captured automatically, at the 454 machine
File Browser
- Access raw files
Next Gen Data Analysis + CGI
- File data: RAID
- Metadata: wiki
- People
- Programs
- CGI
Custom HTML
Tricks you’ve never seen Wikipedia do:
- Adding a record via a form
- Running custom Perl/PHP code
- Generating *any* HTML on the fly
- AJAX
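A flavor of the “generate any HTML on the fly” point: a plain Perl CGI script that a wiki page can link to, offering a form and echoing the submitted record. The form field name is invented; everything here uses the stock CGI module:

use strict;
use warnings;
use CGI;

my $q   = CGI->new;
my $run = $q->param('run') || 'none yet';

# Emit a complete HTML page: a form for adding a record,
# plus whatever was just submitted.
print $q->header('text/html'),
      $q->start_html('Add a 454 run'),
      $q->h1("Last record submitted: $run"),
      $q->start_form,
      $q->textfield( -name => 'run' ),
      $q->submit('Add record'),
      $q->end_form,
      $q->end_html;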
Project Dashboard
- Steer the ongoing analysis
User Interface
- Traditional LIMS UI
  - Must be done up-front
  - Can be the hardest part to get right
- Wiki provides a minimal UI
  - Instantaneous and consistent
  - Focus on data first
  - Improve it when and where needed
As Details Emerge
- Users can edit data with only a browser
- They won’t make 5000 changes by hand, but 50 is faster and cheaper than calling in a coder
- Write software only for the heavy lifting
- Software is cost effective only if we will do something many times
- Defer it until patterns emerge and become tedious
Reading Wiki From Perl

use Perlwikipedia;

# Create a bot object, point it at your wiki, and log in.
my $bot = Perlwikipedia->new;
$bot->set_wiki( $hostname, $directory );   # e.g. ('wiki.example.org', 'w')
$bot->login( $username, $password );

# Fetch the wikitext of a page.
my $pagetext = $bot->get_text('Main Page');
Edit Wiki Pages

# Append a note to every page in the 454-run category,
# recording $comment as the edit summary.
my $comment = 'changed by bot';
my @pages   = $bot->get_all_pages_in_category('Category:Is_a_454_Run');
foreach my $page (@pages) {
    my $oldtext = $bot->get_text($page);
    my $newtext = "$oldtext changed by bot";
    $bot->edit( $page, $newtext, $comment );
}
[Dilbert cartoon 3]
SPARQL
Select all African capital cities from Wikipedia:

PREFIX abc: <http://mynamespace.com/exampleOntologie#>
SELECT ?capital ?country
WHERE {
  ?x abc:cityname ?capital .
  ?y abc:countryname ?country .
  ?x abc:isCapitalOf ?y .
  ?y abc:isInContinent abc:africa .
}
DBpedia.org
- Use SPARQL to query directly against Wikipedia
- Or make a local relational cache and query it with SQL
- You hide your SQL behind a layer anyway… right?
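Querying DBpedia from the same toolchain needs nothing more than an HTTP GET against its public SPARQL endpoint. A sketch; the query is illustrative, and the dbo:capital property and the endpoint's parameter names are assumptions about DBpedia's Virtuoso service:

use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Ask DBpedia's endpoint for a few country/capital pairs.
my $query = <<'SPARQL';
SELECT ?country ?capital WHERE {
  ?country <http://dbpedia.org/ontology/capital> ?capital .
} LIMIT 10
SPARQL

my $uri = URI->new('http://dbpedia.org/sparql');
$uri->query_form( query => $query, format => 'json' );

my $res = LWP::UserAgent->new->get($uri);
print $res->is_success ? $res->decoded_content : $res->status_line, "\n";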
Concerns
- “It can’t scale”: see http://en.wikipedia.org
- “No theoretical basis”: this is a semantic web
Conclusion
- Extremely flexible database
- Unifies next gen, microarrays, inventory, …
- History of all changes
- Initiate/steer tasks
- Perl for deep customization
- The human intelligence of a wiki
