RNA-Rocket demo


A How-to for using the RNA-Rocket site.

  1. RNA-Seq Analysis Using Pathogen Portal’s RNA-Seq Analysis Pipeline, RNA-Rocket. Overview: creating an account; exploring the site; getting data; checking quality; starting analysis; further analysis.
  2. Create an account. Step 1: Create a login account: I. Go to http://pathogenportal.org. II. Click on RNA Rocket. III. Click on Create account. IV. Fill in the required information.
  3. Exploring the site: Launch Pad. Interactive concept diagram; task-oriented menu system; designed for novice users.
  4. Exploring the site: Launch Pad → Trim Reads. User guide to the why, what, and how; details required inputs and expected outputs; helps organize files into project spaces.
  5. Exploring the site: Project View. View existing projects; download files; view metadata; stream to BRC sites; manage space allocation; share projects.
  6. Exploring the site: Shared Data → Published Projects. View shared projects; import into your project space; share with collaborators; provide data for presentations…
  7. Getting Data: A. Importing shared data; B. Transferring ENA/SRA data; C. Uploading your own. For option A: 1. Click on “Shared Data” → “Published Projects”. 2. Click on the title of the Project you wish to import.
  8. Importing shared data (continued): 3. Click “Import History” to import the Project into your Project View.
  9. Getting Data: A. Importing shared data; B. Transferring ENA/SRA data; C. Uploading your own. For option B: 1. Navigate to the ‘Launch Pad’ page and click the ‘Get fastq files from SRA/ENA’ link. 2. Click the ‘Continue’ button.
  10. Transferring ENA/SRA data (continued): 3. Search for the SRA or ENA accession in the search box provided. Alternatively, search for GEO, ArrayExpress, SRA, or ENA identifiers in the global search box at the top. 4. Click on the title of the Nucleotide Sequences record you wish to import.
  11. Transferring ENA/SRA data (continued): 5. On the subsequent ENA record page, click the ‘File’ link in the ‘Fastq files (galaxy)’ column for each file you wish to transfer.
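For readers who want to see which FASTQ files an accession has before clicking through, the lookup can be sketched outside the web interface. This is a minimal sketch, not part of RNA-Rocket: it assumes ENA's Portal API "filereport" endpoint and its `fastq_ftp` field, and the accession and host names in the usage note are placeholders. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API; check ENA's documentation before relying on it.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file-report URL asking for the FASTQ locations of an accession."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_fastq_urls(tsv_text):
    """Map each run accession to its list of FASTQ URLs from a TSV report."""
    lines = [line for line in tsv_text.splitlines() if line.strip()]
    if not lines:
        return {}
    header = lines[0].split("\t")
    run_col = header.index("run_accession")
    ftp_col = header.index("fastq_ftp")
    urls_by_run = {}
    for line in lines[1:]:
        cols = line.split("\t")
        # Paired-end runs list several files separated by ';'.
        urls = [u if "://" in u else "ftp://" + u
                for u in cols[ftp_col].split(";") if u]
        urls_by_run[cols[run_col]] = urls
    return urls_by_run
```

Fetching the URL returned by `filereport_url("SRR000001")` (a placeholder accession) and passing the response text to `parse_fastq_urls` would give one list of links per run; those links could then be pasted into the ‘URL/Text’ upload box.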
  12. Getting Data: A. Importing shared data; B. Transferring ENA/SRA data; C. Uploading your own. For option C: 1. To upload data from your computer or a remote computer, click the ‘Upload Files’ link on the Launch Pad page. 2. On the subsequent page, use the ‘Choose File’ button to upload files from your own computer (limited to 2 GB), the ‘URL/Text’ box to paste URLs for files on remote computers, or the FTP instructions to transfer files over FTP (better for larger files). [Screenshot labels: Choose files from your computer here; Paste the FastQ URLs here; Instructions for using FTP]
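The choice between the browser upload and FTP comes down to file size. A small sketch of that check follows; the 2 GB figure is the limit quoted above, while its exact byte count (2 × 1024³ here) and the function name are assumptions.

```python
import os

# 2 GB limit quoted on the upload page; the exact byte count is an assumption.
BROWSER_UPLOAD_LIMIT = 2 * 1024 ** 3


def suggest_transfer(path, limit=BROWSER_UPLOAD_LIMIT):
    """Return 'browser upload' if the file fits under the limit, else 'FTP'."""
    size = os.path.getsize(path)
    return "browser upload" if size <= limit else "FTP"
```

Calling `suggest_transfer("reads_1.fastq")` on a local file (the file name is a placeholder) returns which route is likely to work.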
  13. Checking quality. Read base quality can affect how the reads map to the genome. Different sequencing technologies can have different quality and base-call error profiles. Depending on the quality of the base calls, you may wish to trim your read sequences or adjust the alignment parameters to account for it. Two tools are available: FastQC, for checking the average base-call quality in a FASTQ file, and SAMStat, for checking the number of reads aligned. An example is provided in Shared Data → Published Projects → RNASeq_QC_Demo. It shows two classes of files: 1. the original reads; 2. a trimmed version of those reads with low-quality ends removed. For both classes we give the FastQC and SAMStat reports. Click the eye icon to see the contents of a file or report. [Screenshot labels: Original fastq & analysis; Trimmed fastq & analysis]
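To make "average base-call quality" and "low-quality ends removed" concrete, here is a minimal sketch of both ideas on plain FASTQ text. It assumes Phred+33 quality encoding and well-formed four-line records; the threshold of 20 is only an illustration. It is not how FastQC or the site's Trim Reads tool is implemented.

```python
def phred_scores(quality_line, offset=33):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_line]


def read_fastq(lines):
    """Yield (name, sequence, quality) from well-formed four-line FASTQ records."""
    stripped = (line.rstrip("\n") for line in lines if line.strip())
    for header in stripped:
        sequence = next(stripped)
        next(stripped)  # skip the '+' separator line
        quality = next(stripped)
        yield header[1:], sequence, quality


def mean_quality_per_position(records):
    """Average Phred score at each read position across all records."""
    totals, counts = [], []
    for _, _, quality in records:
        for i, score in enumerate(phred_scores(quality)):
            if i == len(totals):
                totals.append(0)
                counts.append(0)
            totals[i] += score
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


def trim_3prime(sequence, quality, min_q=20):
    """Drop trailing bases whose score is below min_q (illustrative threshold)."""
    scores = phred_scores(quality)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return sequence[:end], quality[:end]
```

Comparing `mean_quality_per_position` before and after applying `trim_3prime` to every record mirrors, in a simplified way, the comparison between the original and trimmed FastQC reports in the demo project.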
  14. From the FastQC report we see that trimming the reads improves the average base-call quality. From the SAMStat report we see that the number of unaligned reads improves only slightly with trimming. Modern alignment software can often account for base-call quality when determining alignments. Also of note: the ‘Mean Base Quality’ profile is not substantially different for MAPQ >= 30 and MAPQ < 3.
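The counts discussed above can be approximated from a SAM file directly. The sketch below relies on two facts from the SAM format: the FLAG column marks unmapped reads with bit 0x4, and the fifth column holds MAPQ. The three MAPQ bins are an illustrative choice and not necessarily SAMStat's exact binning; secondary and supplementary records are counted like any other line.

```python
from collections import Counter


def summarize_sam(lines):
    """Count unmapped reads and coarse MAPQ bins from SAM text lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:  # bit 0x4: segment unmapped
            counts["unmapped"] += 1
        elif mapq >= 30:
            counts["MAPQ>=30"] += 1
        elif mapq < 3:
            counts["MAPQ<3"] += 1
        else:
            counts["3<=MAPQ<30"] += 1
    return counts
```

Running this on the alignments of the original and trimmed reads and comparing the `unmapped` counts gives a rough, text-only version of the comparison made from the SAMStat reports.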
  15. Starting Analysis. Test datasets for starting an alignment and transcript assembly job are provided at Shared Data → Published Projects → RNASeq_Run_Demo. To begin, import this history into your own workspace using the ‘Import History’ function demonstrated previously. After the Project is imported, it should appear in your ‘Project View’.
  16. Proceed to the ‘Launch Pad’ page and click the ‘Align Reads & Assemble Transcripts’ link. On the next page, choose the type of analysis (here, a paired-end prokaryotic sample). Next, select the target project from the drop-down menu; you should have a project called ‘imported: RNASeq_Run_Demo’. Once you select the correct project, you should see the two FASTQ files listed. Then click ‘Continue’.
  17. The following page lets you configure the parameters for the tools that will run as part of the analysis you have selected. Here we describe the bare minimum for running a job; take more care when customizing the analysis for your own data. First, populate the Upstream and Downstream Read Files with READ1_SHORT.fastq and READ2_SHORT.fastq, respectively. Then select the reference organism ‘Salmonella enterica subsp. Typhimurium 14028S’ from the dropdown. Because of the number of organisms, the dropdown may take a moment to appear once clicked.
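Before submitting your own paired-end data, it can help to confirm that the upstream and downstream files really belong together. Paired-end aligners generally expect mates to appear in the same order in both files. The sketch below checks that, assuming four-line FASTQ records and read names that differ at most by a trailing `/1` or `/2`; the function names are hypothetical and nothing here is specific to RNA-Rocket.

```python
def read_names(lines):
    """Return read names from FASTQ lines, without description or /1, /2 suffix."""
    records = [line.rstrip("\n") for line in lines if line.strip()]
    names = []
    for header in records[0::4]:  # every fourth line is a record header
        name = header[1:].split()[0]
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        names.append(name)
    return names


def pairs_match(upstream_lines, downstream_lines):
    """True if both files list the same reads in the same order."""
    return read_names(upstream_lines) == read_names(downstream_lines)
```

For example, opening READ1_SHORT.fastq and READ2_SHORT.fastq as text files and passing the two file objects to `pairs_match` reports whether the files are consistent under these assumptions.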
  18. Select ‘Run Workflow’ at the bottom of the page. If the workflow is successfully queued, you should see a confirmation (shown as a screenshot on the slide).
  19. Next, go to the ‘Project View’ page to see the status of your jobs. In the rightmost panel, grey jobs are pending, yellow jobs are running, and green jobs are complete.
  20. Further Analysis. Test datasets have been provided for trying the RNA-Seq visualization capabilities at PATRIC. Navigate to Shared Data → Published Projects → RNASeq_Analysis_Demo. Each of the files displayed has a visualization component on the PATRIC site. To view it, first click the dataset title to expand the dataset section, then click the ‘display at PATRIC’ link. [Screenshot labels: Read Quality View; Displaying BAM at PATRIC; Expression View]
  21. Displaying BigWig at PATRIC; displaying GFF at PATRIC.
  22. Displaying GeneList file at PATRIC.