Sucheta Tripathy, Akash Gupta, Brett M Tyler
This is work in progress…
Yet Again Another Database?
It's raining sequences!
Sequencing is outrunning the ability to store, transmit and analyze the data - NYTimes
FungiDB
VMD -> EuMicrobedb.org
Transcriptomicsdb
Eumicrobedb
◦ Based on Oracle and GUS
◦ Administered at Virginia Tech
◦ Based on MySQL
◦ Front end remains the same (based on Perl CGI, GD and PHP)
◦ Name spaces are downsized
◦ Number of tables/views downsized
◦ Removed dependencies
                 Eumicrobedb-Light         EuMicrobedb-Oracle
Name spaces      3                         5
Tables           (20+7+10)                 (179+39+40+15+56)
Views            (18)                      (84+4+15+24)
Oracle           Independent of Oracle     Needs Oracle license
BioPerl          Independent of BioPerl    Needs BioPerl
Transcriptomics database
                           EuMicrobedb-Oracle    Eumicrobedb-Light
Total number of tables     329                   37
Total number of views      127                   18
Query time                 10 secs               1.2 secs
Time for genome upload     12-14 hours           2 hours
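The figures on the two comparison slides can be cross-checked with a little arithmetic. The sketch below sums the per-name-space counts and derives the reduction and speed-up ratios. It assumes that (20+7+10) and (18) belong to the MySQL-based Light version and that (179+39+40+15+56) and (84+4+15+24) belong to the Oracle version; that reading is inferred from the totals of 37 and 329 tables and 18 and 127 views, not stated explicitly on the slides. The variable names are illustrative only.

```python
# Per-name-space counts as read from the slides (the column assignment is an assumption).
oracle_tables = [179, 39, 40, 15, 56]
light_tables = [20, 7, 10]
oracle_views = [84, 4, 15, 24]
light_views = [18]

# Timings quoted on the comparison slide.
oracle_query_secs, light_query_secs = 10.0, 1.2
oracle_upload_hours = (12.0, 14.0)  # quoted as a range
light_upload_hours = 2.0

# Totals should reproduce the summary numbers (329/37 tables, 127/18 views).
total_oracle_tables = sum(oracle_tables)
total_light_tables = sum(light_tables)
total_oracle_views = sum(oracle_views)
total_light_views = sum(light_views)

# Reduction and speed-up factors, Oracle version relative to Light version.
table_reduction = total_oracle_tables / total_light_tables
view_reduction = total_oracle_views / total_light_views
query_speedup = oracle_query_secs / light_query_secs
upload_speedup = tuple(h / light_upload_hours for h in oracle_upload_hours)

print(f"tables: {total_oracle_tables} -> {total_light_tables} ({table_reduction:.1f}x fewer)")
print(f"views:  {total_oracle_views} -> {total_light_views} ({view_reduction:.1f}x fewer)")
print(f"query time: {query_speedup:.1f}x faster")
print(f"genome upload: {upload_speedup[0]:.0f}x to {upload_speedup[1]:.0f}x faster")
```

Under that reading the per-name-space counts add up to the quoted totals, and the Light version comes out roughly 8x faster on queries and 6x to 7x faster on genome upload.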
Annotation
◦ P. sojae V1.0
◦ P. sojae V5.0
◦ P. ramorum V1.0
◦ H. arabidopsidis V8.3
[Diagram labels: Sequence fasta, GFF, C++ API Toolkit, Database]
Genome tools
[Diagram labels: GFF, Sequence fasta, Annotation, C++ API Toolkit, Database]
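To make the GFF input to the toolkit concrete, here is a minimal sketch of reading GFF3 feature lines into records that could then be loaded into a database. It is written in Python for brevity and is not the C++ API toolkit from the slides, whose interface is not shown here. The names `GffFeature` and `parse_gff3` and the example record are hypothetical; the nine tab-separated columns follow the GFF3 format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator
from urllib.parse import unquote


@dataclass
class GffFeature:
    """One GFF3 feature line (hypothetical record type, not the toolkit's)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    strand: str
    attributes: Dict[str, str]


def parse_gff3(lines: Iterable[str]) -> Iterator[GffFeature]:
    """Yield one GffFeature per feature line, skipping blanks and '#' lines."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
        # Column 9 holds semicolon-separated key=value pairs, URL-escaped.
        attrs: Dict[str, str] = {}
        for pair in cols[8].split(";"):
            if "=" in pair:
                key, value = pair.split("=", 1)
                attrs[key.strip()] = unquote(value.strip())
        yield GffFeature(
            seqid=cols[0],
            source=cols[1],
            type=cols[2],
            start=int(cols[3]),
            end=int(cols[4]),
            strand=cols[6],
            attributes=attrs,
        )


# Hypothetical example record; not taken from any of the genomes listed above.
EXAMPLE = [
    "##gff-version 3",
    "scaffold_1\texample\tgene\t1200\t3400\t.\t+\t.\tID=gene0001;Name=hypothetical_gene",
]
features = list(parse_gff3(EXAMPLE))
print(features[0])
```

Each parsed record carries the coordinates and attributes that a loader would need in order to insert a row per feature; how the actual toolkit maps these onto its tables is not specified in the slides.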
3 Dell PowerEdge R420 servers: 16 GB RAM, 1.5 TB each with NFS.
◦ Data analysis server
◦ Web server
◦ Data storage
R820 server: 128 GB RAM, 16 TB storage.
IICB has 2 compute clusters with 64 nodes, each node having 192 GB memory.
CSIR-cMMACS has India’s fastest supercomputer.
One sequencing and one bioinformatics support.
Labs sequencing genomes at a rapid pace
◦ Draft assembly
◦ Data not yet in GenBank
◦ GFF annotation available: view on browser
Labs with limited hosting facilities
◦ Data hosting
◦ Data analysis
Source will be released soon for people to replicate the database.
◦ No Oracle license ($$$).
◦ Independent of many packages.
◦ Installation time reduced.
◦ Simple user experience.