3. Download the raw FASTQ files using a bash script from SRA Explorer
Search for the project or sample accession
Copy the bash script for terminal download
4. Create a bash script named FQdownload.sh in the terminal using the nano editor
login-20-26:/proj$ nano FQdownload.sh
Submit FQdownload.sh from the terminal with sbatch (I use my server because the sample set is too large for my PC)
login-20-26:/proj$ sbatch FQdownload.sh
Check the job submission and running time with squeue
login-20-26:/proj$ squeue -u mosabdel
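The commands above assume that FQdownload.sh is a valid Slurm batch script. A minimal sketch of what such a script might look like is given here. The #SBATCH values (job name, time limit, log file name) and the verify_fastq_gz helper are assumptions for illustration and do not come from the original slides; the download commands themselves must be pasted in from SRA Explorer.

```shell
#!/bin/bash
#SBATCH --job-name=FQdownload        # assumed job name
#SBATCH --time=12:00:00              # assumed wall-time limit; adjust to your cluster
#SBATCH --output=FQdownload_%j.log   # %j expands to the Slurm job ID

# Abort on the first failed command or unset variable.
set -eu

# --- Paste the download commands copied from SRA Explorer below this line ---

# Hypothetical helper: test the gzip integrity of every file passed to it.
# Returns 1 if at least one file is corrupt, 0 otherwise.
verify_fastq_gz() {
  status=0
  for f in "$@"; do
    [ -e "$f" ] || continue          # skip an unmatched glob pattern
    if gzip -t "$f" 2>/dev/null; then
      echo "OK      $f"
    else
      echo "CORRUPT $f"
      status=1
    fi
  done
  return "$status"
}

# Check everything that was downloaded into the working directory.
verify_fastq_gz ./*.fastq.gz
```

Because of set -e, a corrupt or truncated download makes the job exit with a non-zero status, which shows up as a failed job in the scheduler's accounting rather than surfacing later during QC.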
5. QC using FastQC
Loop over the Read 1 files, derive the matching Read 2 file name, print both names, and then run fastqc on the pair
for R1 in *_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz
do
    R2=${R1//_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz/_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz}
    echo "$R1" "$R2"
    fastqc "$R1" "$R2"
done
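The loop above relies on every Read 1 file having a matching Read 2 file. As a hedged sketch, the pairing can be made explicit and checked before fastqc is called. The mate_of helper is hypothetical, and stripping the suffix with `%` (instead of the `//` substitution used above) is a swapped-in choice so that only the end of the file name is rewritten.

```shell
#!/bin/bash
# File name suffixes used in this workflow.
R1_SUFFIX="_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz"
R2_SUFFIX="_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz"

# Hypothetical helper: print the Read 2 file name that belongs to a Read 1 file name.
mate_of() {
  # Strip the R1 suffix from the end only, then append the R2 suffix.
  printf '%s%s\n' "${1%"$R1_SUFFIX"}" "$R2_SUFFIX"
}

for R1 in ./*"$R1_SUFFIX"; do
  [ -e "$R1" ] || continue             # nothing matched the glob
  R2=$(mate_of "$R1")
  if [ ! -e "$R2" ]; then
    echo "Missing mate for $R1, skipping" >&2
    continue
  fi
  echo "$R1 $R2"
  fastqc "$R1" "$R2"
done
```

Quoting "$R1" and "$R2" keeps the loop safe if a file name ever contains spaces.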
6. QC and trimming using fastp in a bash loop
for R1 in *_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz
do
    R2=${R1//_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz/_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz}
    R1_trim=${R1//_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz/_Trimmed_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz}
    R2_trim=${R2//_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz/_Trimmed_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz}
    sample_name=${R1%%_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz*}
    fastpReport="${sample_name}_fastpReport.html"
    fastp -i "$R1" -I "$R2" -o "$R1_trim" -O "$R2_trim" --html "$fastpReport" --qualified_quality_phred 30 --detect_adapter_for_pe --trim_poly_x --trim_poly_g
done
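One thing to watch with the naming scheme above: the trimmed outputs still end in _Arabidopsis_thaliana_RNA-Seq_1.fastq.gz, so a second run of the loop in the same directory would pick them up as inputs. The sketch below is one possible way to guard against that, not the method used on the slides. The sample_of helper is hypothetical, and the per-sample --json report name is an added assumption (without it, fastp writes fastp.json by default, which each iteration would overwrite).

```shell
#!/bin/bash
RAW_R1_SUFFIX="_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz"
RAW_R2_SUFFIX="_Arabidopsis_thaliana_RNA-Seq_2.fastq.gz"

# Hypothetical helper: print the sample prefix of a raw Read 1 file,
# or print nothing if the file is already a trimmed fastp output.
sample_of() {
  case "$1" in
    *_Trimmed_*) return 0 ;;
  esac
  printf '%s\n' "${1%"$RAW_R1_SUFFIX"}"
}

for R1 in ./*"$RAW_R1_SUFFIX"; do
  [ -e "$R1" ] || continue
  sample=$(sample_of "$R1")
  [ -n "$sample" ] || continue         # skip outputs of a previous fastp run
  fastp \
    -i "$R1" -I "${sample}${RAW_R2_SUFFIX}" \
    -o "${sample}_Trimmed${RAW_R1_SUFFIX}" \
    -O "${sample}_Trimmed${RAW_R2_SUFFIX}" \
    --html "${sample}_fastpReport.html" \
    --json "${sample}_fastpReport.json" \
    --qualified_quality_phred 30 \
    --detect_adapter_for_pe --trim_poly_x --trim_poly_g
done
```

The output names match the ones produced by the loop above, so the downstream kallisto step does not need to change.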
7. Kallisto index (reference cDNA from Ensembl Plants)
kallisto index -i Arabidopsis_thaliana.TAIR10.cdna.all.fa.gz_kallisto_index Arabidopsis_thaliana.TAIR10.cdna.all.fa.gz
Kallisto count
for R1 in *Trimmed_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz
do
    R2=${R1//1.fastq.gz/2.fastq.gz}
    Output=${R1//Trimmed_Arabidopsis_thaliana_RNA-Seq_1.fastq.gz/Count}
    kallisto quant --threads 8 -i Arabidopsis_thaliana.TAIR10.cdna.all.fa.gz_kallisto_index -o "$Output" "$R1" "$R2"
done
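Each kallisto quant run writes an abundance.tsv file (one header line, then one row per target) into its output directory. As a quick sanity check after the loop above, the following sketch counts the rows in every *_Count directory. The n_targets helper is hypothetical, and the *_Count pattern assumes the output naming used above.

```shell
#!/bin/bash
# Hypothetical helper: print the number of quantified targets in one
# kallisto output directory (rows of abundance.tsv, excluding the header).
n_targets() {
  tsv="$1/abundance.tsv"
  if [ ! -f "$tsv" ]; then
    echo "missing $tsv" >&2
    return 1
  fi
  awk 'NR > 1 { n++ } END { print n + 0 }' "$tsv"
}

# Report one line per sample: directory name, tab, number of targets.
for dir in ./*_Count; do
  [ -d "$dir" ] || continue            # nothing matched the glob
  printf '%s\t%s\n' "$dir" "$(n_targets "$dir")"
done
```

Every sample quantified against the same index should report the same number of targets; a missing file or a different count points to a run that failed or was interrupted.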