Genome-wide Association Study (GWAS) Analysis Guide in TASSEL Software (GUI)
REZA DYSTA SATRIA
Translated from Indonesian to English - www.onlinedoctranslator.com
SNP Quality Control
SNP quality control (QC) is the process of evaluating and cleaning Single Nucleotide
Polymorphism (SNP) data before genetic analysis. A SNP is a genetic variation in which
one nucleotide is replaced by another at a specific position in the genome. TASSEL
reads genotype data in the HapMap format, a tab-delimited text format with one row per
SNP and one column per sample (taxon), which is convenient for SNP quality checks.
Once a HapMap file is loaded, TASSEL users can use it to assess population
relationships, genomic structure, and patterns of association between genetic
polymorphisms and phenotypic traits in their samples.
This guide walks through the analysis with a HapMap (.hmp) file. The example file used
throughout is "mdp_genotype.hmp.txt".
Select the "HapMap" format, check the "Sort Positions" box, and click "OK".
The loaded dataset then appears in TASSEL, as shown in the accompanying screenshot.
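Outside TASSEL, the same file layout can be inspected in R. The following is a minimal sketch, assuming the usual HapMap layout of 11 descriptive columns followed by one column per taxon. The two-SNP table and the taxon names TAXON_A and TAXON_B are hypothetical stand-ins for "mdp_genotype.hmp.txt".

```r
# Hypothetical two-SNP HapMap table; a real file would be read with
# read.delim("mdp_genotype.hmp.txt", ...) using the same arguments.
hmp_text <- paste(
  "rs#\talleles\tchrom\tpos\tstrand\tassembly#\tcenter\tprotLSID\tassayLSID\tpanelLSID\tQCcode\tTAXON_A\tTAXON_B",
  "snp2\tA/G\t1\t200\t+\tNA\tNA\tNA\tNA\tNA\tNA\tA\tR",
  "snp1\tC/T\t1\t100\t+\tNA\tNA\tNA\tNA\tNA\tNA\tC\tN",
  sep = "\n"
)

# colClasses = "character" stops R from reading a column of "T" calls as TRUE.
hmp <- read.delim(text = hmp_text, check.names = FALSE,
                  colClasses = "character", comment.char = "")
hmp$chrom <- as.integer(hmp$chrom)
hmp$pos <- as.integer(hmp$pos)

# Equivalent in spirit to "Sort Positions": order sites by chromosome and position.
hmp <- hmp[order(hmp$chrom, hmp$pos), ]

# Genotype calls start after the 11 descriptive columns.
geno <- as.matrix(hmp[, -(1:11)])
rownames(geno) <- hmp[["rs#"]]
geno
```

Here "N" denotes a missing call and "R" is the IUPAC code for an A/G heterozygote.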
The "Geno Summary" tool reports information about the genotypes at each locus in the
sample. Open the "Data" menu and click "Geno Summary". The output includes the
following tables:
1. Open mdp_genotype_OverallSummary
This table reports:
A. Number of Taxa: the total number of taxa (samples)
B. Number of Sites: the number of sites (SNP loci) in the dataset
C. Sites x Taxa: the number of sites multiplied by the number of taxa, i.e. the total number of genotype calls
D. Number Not Missing: the number of genotype calls that are present
E. Proportion Not Missing: the proportion of genotype calls that are present
F. Number Missing: the number of missing genotype calls
G. Proportion Missing: the proportion of genotype calls that are missing
H. Number of Gametes: the total number of gametes
I. Gametes Not Missing: the number of gametes that are not missing
J. Proportion Gametes Not Missing: the proportion of gametes that are not missing
K. Gametes Missing: the number of missing gametes
L. Proportion Gametes Missing: the proportion of gametes that are missing
M. Number Heterozygous: the number of heterozygous genotype calls
N. Proportion Heterozygous: the proportion of genotype calls that are heterozygous
O. Average Minor Allele Frequency: the minor allele frequency at each locus (site),
averaged over all sites in the dataset.
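The quantities listed above can be reproduced by hand on a small example. The sketch below uses a hypothetical 3-site by 4-taxon matrix of single-letter calls ("N" for missing, IUPAC codes for heterozygotes). Dividing the heterozygous count by the non-missing calls is an assumption here and may differ from the denominator TASSEL uses.

```r
# Hypothetical toy genotypes: rows = sites, columns = taxa.
geno <- matrix(
  c("A", "A", "G", "N",
    "C", "T", "Y", "C",
    "G", "G", "G", "G"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), c("t1", "t2", "t3", "t4"))
)
# IUPAC heterozygote codes and the two alleles each one stands for.
het_codes <- c(R = "AG", Y = "CT", S = "CG", W = "AT", K = "GT", M = "AC")

n_taxa <- ncol(geno)
n_sites <- nrow(geno)
sites_x_taxa <- n_sites * n_taxa
n_missing <- sum(geno == "N")
prop_missing <- n_missing / sites_x_taxa
n_het <- sum(geno %in% names(het_codes))
prop_het <- n_het / (sites_x_taxa - n_missing)  # assumed denominator

# Minor allele frequency at one site: expand each call into two alleles,
# then take the frequency of the second most common allele.
site_maf <- function(calls) {
  calls <- calls[calls != "N"]
  alleles <- unlist(lapply(calls, function(g) {
    if (g %in% names(het_codes)) strsplit(het_codes[[g]], "")[[1]] else c(g, g)
  }))
  freq <- sort(table(alleles), decreasing = TRUE)
  if (length(freq) < 2) 0 else as.numeric(freq[2]) / sum(freq)
}
maf <- apply(geno, 1, site_maf)
avg_maf <- mean(maf)
```

In this toy matrix, site s3 is monomorphic, so its minor allele frequency is 0 and it lowers the average.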
2. Open mdp_genotype_SiteSummary
Reviewing this table is a step that should not be skipped in a GWAS analysis.
Select "Chart" in the "Results" menu to draw a histogram of the distribution of a
chosen column of the site summary.
The histograms that can be displayed are shown in the accompanying screenshots.
- Missing data imputation
To obtain more valid and consistent results, missing genotype values in the dataset
should be replaced with imputed values. Select "mdp_genotype", open the "Impute"
menu, and choose "LD KNNI Imputation". A pop-up window with the imputation settings
appears.
The default settings are sufficient to impute the missing values. Click "OK" to run
the imputation. When the process finishes, an updated (imputed) dataset is
produced.
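To give a feel for nearest-neighbour imputation, here is a deliberately simplified sketch. It is a plain k-nearest-neighbour vote over all sites, not TASSEL's LD KNNi method, which also restricts the comparison to sites in linkage disequilibrium with the site being imputed. The function name knn_impute, the 0/1/2 coding and the toy data are all hypothetical.

```r
# Hypothetical toy genotypes: rows = taxa, columns = sites,
# coded 0/1/2 (copies of one allele); NA = missing call.
geno <- rbind(
  t1 = c(0, 0, 2, 2, NA),
  t2 = c(0, 0, 2, 2, 2),
  t3 = c(0, 0, 2, 2, 2),
  t4 = c(2, 2, 0, 0, 0),
  t5 = c(2, 2, 0, 0, 0)
)

knn_impute <- function(geno, k = 2) {
  out <- geno
  for (i in seq_len(nrow(geno))) {
    for (j in which(is.na(geno[i, ]))) {
      # Candidate donors: other taxa that have a call at site j.
      donors <- setdiff(which(!is.na(geno[, j])), i)
      # Distance = mean absolute difference over sites called in both taxa.
      d <- sapply(donors, function(m) mean(abs(geno[i, ] - geno[m, ]), na.rm = TRUE))
      nearest <- donors[order(d)][seq_len(min(k, length(donors)))]
      # Fill the gap with the most common call among the nearest donors.
      votes <- table(geno[nearest, j])
      out[i, j] <- as.numeric(names(votes)[which.max(votes)])
    }
  }
  out
}

imputed <- knn_impute(geno, k = 2)
```

Taxon t1 matches t2 and t3 at every called site, so its missing call at the fifth site is filled with their shared value.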
The imputed dataset gives higher-quality results in later analyses. To further
improve its reliability, filter it with "Filter Genotype Table Sites", using the
settings shown in the accompanying screenshot.
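Site filtering of this kind can be sketched in a few lines of R. The thresholds min_count and min_maf below are hypothetical placeholders, not the settings used in this guide, and the 0/1/2 coding of the toy data is likewise an assumption.

```r
# Hypothetical toy genotypes: rows = taxa, columns = sites,
# coded 0/1/2 (copies of one allele); NA = missing call.
geno <- cbind(
  s1 = c(0, 0, 2, 2, 1, 0),
  s2 = c(0, 0, 0, 0, 0, 0),      # monomorphic site
  s3 = c(2, NA, NA, NA, 0, NA),  # mostly missing
  s4 = c(0, 0, 0, 0, 0, 1)       # rare allele
)

min_count <- 4   # hypothetical: minimum number of non-missing calls per site
min_maf <- 0.05  # hypothetical: minimum minor allele frequency per site

call_count <- colSums(!is.na(geno))
allele_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(allele_freq, 1 - allele_freq)

keep <- call_count >= min_count & maf >= min_maf
filtered <- geno[, keep, drop = FALSE]
colnames(filtered)
```

With these placeholder thresholds, s2 is dropped for being monomorphic and s3 for having too few calls.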
Filtering produces a new dataset named "mdp_genotype_KNNimp_Filtered_QD". As
before, run "Geno Summary" on this dataset. To inspect the imputed and filtered
dataset, draw histograms in the same way as before by selecting
"mdp_genotype_KNNimp_Filtered_QD_SiteSummary" and choosing "Chart" in the
"Results" menu. Some of the resulting histograms are shown in the accompanying
screenshots.
This dataset is used in the following steps. Save it in "Hapmap" format to a folder
of your choice so that it can be reused.
GLM analysis
GLM (General Linear Model) analysis is a statistical method that models the
relationship between a response variable (the trait) and predictor variables
(markers and covariates). This step requires the filtered genotype file created in
the "SNP Quality Control" section together with the "mdp_traits" file.
Select the "Filtered_QCount" dataset and run "PCA" from the "Relatedness" submenu of
the "Analysis" menu. Accept the settings in the pop-up window and click "OK".
Next, select the datasets to combine. In this example they are the original
"Filtered_QCount" file, the new "PC_Filtered_QCount" file produced by PCA, and the
"mdp_traits" file.
Open the "Data" menu and select "Intersect Join".
This produces a new joined dataset, "PC_Filtered_QCount + Filtered_QCount +
mdp_traits", which is used in the analysis that follows.
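The PCA and join steps can be mimicked in R on toy data. This is a sketch under stated assumptions: genotypes are coded 0/1/2, the trait values are invented, and the intersect join is taken to mean keeping only taxa present in every table. It relies on base R's prcomp() and merge() and does not reproduce TASSEL's own PCA options.

```r
# Hypothetical toy genotypes: rows = taxa, columns = sites, coded 0/1/2.
geno <- rbind(
  t1 = c(0, 0, 2, 2, 1),
  t2 = c(0, 1, 2, 2, 2),
  t3 = c(0, 0, 2, 1, 2),
  t4 = c(2, 2, 0, 0, 0),
  t5 = c(2, 2, 1, 0, 0)
)
colnames(geno) <- paste0("s", 1:5)

# Principal components of the centred genotype matrix.
pca <- prcomp(geno, center = TRUE, scale. = FALSE)
pcs <- data.frame(Taxa = rownames(geno), pca$x[, 1:2])
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Hypothetical trait table; t6 has no genotype, t1 and t5 have no trait value.
traits <- data.frame(Taxa = c("t2", "t3", "t4", "t6"),
                     EarHT = c(55.1, 60.3, 48.7, 52.0))
geno_df <- data.frame(Taxa = rownames(geno), geno)

# Intersect join: merge() keeps only taxa found in every table.
joined <- Reduce(function(a, b) merge(a, b, by = "Taxa"),
                 list(pcs, geno_df, traits))
```

Only t2, t3 and t4 survive the join, because they are the only taxa with principal components, genotypes and a trait value.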
Select the joined dataset, open the "Analysis" menu, choose "Association", and click
"GLM". Set the options as shown in the accompanying screenshot.
Create a new folder to store the output, for example a folder named "GLM stats",
then click "OK".
GLM produces two output datasets. To view the results, select an output dataset and
choose "Manhattan Plot" in the "Results" menu.
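The idea behind the GLM step can be illustrated with a single-marker linear model in base R, where each marker is tested in turn with a principal component as a covariate. This is a simplified sketch on simulated data, assuming additive 0/1/2 marker coding. It is not TASSEL's implementation, and the names m1 to m3 and pc1 are hypothetical.

```r
set.seed(42)
n <- 100

# Simulated covariate (one principal component) and three markers coded 0/1/2.
pc1 <- rnorm(n)
markers <- matrix(rbinom(n * 3, size = 2, prob = 0.3), nrow = n,
                  dimnames = list(NULL, c("m1", "m2", "m3")))

# Simulated trait: only marker m1 has a true effect.
trait <- 2 * markers[, "m1"] + pc1 + rnorm(n)

# Fit trait ~ marker + pc1 for each marker and keep the marker p-value.
marker_p <- sapply(colnames(markers), function(m) {
  dat <- data.frame(trait = trait, marker = markers[, m], pc1 = pc1)
  fit <- lm(trait ~ marker + pc1, data = dat)
  summary(fit)$coefficients["marker", "Pr(>|t|)"]
})

# -log10(p) is the quantity shown on the y-axis of a Manhattan plot.
round(-log10(marker_p), 2)
```

Including pc1 in the model is what lets the test adjust for population structure. Dropping it would leave that structure confounded with the marker effects.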
MLM (PCA + Kinship)
MLM (Mixed Linear Model) with PCA (Principal Component Analysis) and kinship is a
statistical approach for testing the association between genotype (genetic data)
and phenotype (observed characteristics of an organism). It is used to account for
population structure (through the principal components) and relatedness among
samples (through the kinship matrix) in genomic association analysis. Open the
imputed and filtered genotype file saved in "HapMap" format together with the
"mdp_traits" file.
Run PCA on the "Filtered_QCount" dataset from the "Analysis" menu. Then run the
"Kinship" method on the same dataset.
This produces a new dataset named "Centered_IBS_Filtered_QCount".
Next, select the three datasets to be joined (in this example "Filtered_QCount",
"mdp_traits" and "PC_Filtered_QCount") and apply "Intersect Join" to them. Then
select both the joined dataset and the kinship matrix and run "MLM" from the
"Analysis" menu.
When "MLM" finishes, select the dataset
"MLM_statistics_for_Filtered_QCount + mdp_traits + PC_Filtered_Qcount" and view it
with "Manhattan Plot" in the "Results" menu.
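A centred kinship matrix of the kind named "Centered_IBS" above can be sketched as follows. The formula used here, centring each site by twice its allele frequency and scaling by 2 * sum(p * (1 - p)), is a common textbook form and is an assumption. TASSEL's exact scaling may differ, and the toy genotypes are hypothetical.

```r
# Hypothetical toy genotypes: rows = taxa, columns = sites, coded 0/1/2.
geno <- rbind(
  t1 = c(0, 0, 2, 2, 1),
  t2 = c(0, 1, 2, 2, 2),
  t3 = c(0, 0, 2, 1, 2),
  t4 = c(2, 2, 0, 0, 0),
  t5 = c(2, 2, 1, 0, 0)
)

# Allele frequency at each site.
p <- colMeans(geno) / 2

# Centre each site by its expected genotype value (2p).
W <- sweep(geno, 2, 2 * p)

# Taxa-by-taxa relationship matrix (assumed scaling).
K <- tcrossprod(W) / (2 * sum(p * (1 - p)))
round(K, 2)
```

Positive off-diagonal entries mark pairs of taxa that are genetically more alike than average. In a mixed model this matrix describes the covariance of the random effects.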
PLOT OF GWAS RESULTS IN R STUDIO
Open RStudio, create a new script, and set the working directory with "Set Working
Directory" in the "Session" menu. Choose the folder that contains the TASSEL output.
In this project, that folder is "GLM stats".
- Make sure the qqman and dplyr packages are installed in RStudio. If they are not,
install them with "Install Packages..." in the "Tools" menu.
- Code inside the script
library(qqman)
library(dplyr)

# Import TASSEL results (tab-delimited statistics file exported from TASSEL)
TASSEL_MLM_Out <- read.table("Tasel out2.txt", header = TRUE, sep = "\t")

# List the traits present in the file
head(unique(TASSEL_MLM_Out$Trait))

# Note: each plot is for one trait, so the trait name must be specified.
# First trait as an example (i.e., EarHT); rows without a p-value are dropped.
Trait1 <- TASSEL_MLM_Out %>% filter(Trait == "EarHT", !is.na(p))

# Bonferroni correction threshold: 0.05 divided by the number of markers
nmrk <- nrow(Trait1)
(GWAS_Bonn_corr_threshold <- -log10(0.05 / nmrk))

# Manhattan plot for the selected trait.
# Column names must match the header of the TASSEL file; adjust them if yours differ.
manhattan(
  Trait1,
  chr = "Chr",
  bp = "Pos",
  snp = "Marker",
  p = "p",
  col = c("red", "blue"),
  annotateTop = TRUE,
  genomewideline = GWAS_Bonn_corr_threshold,
  suggestiveline = FALSE
)

# QQ plot
qq(Trait1$p)

# Manhattan and QQ plots arranged in 1 row and 2 columns
old_par <- par(mfrow = c(1, 2))
manhattan(
  Trait1,
  chr = "Chr",
  bp = "Pos",
  snp = "Marker",
  p = "p",
  col = c("red", "blue"),
  annotateTop = TRUE,
  genomewideline = GWAS_Bonn_corr_threshold,
  suggestiveline = FALSE,
  main = "EarHT" # trait name
)
qq(Trait1$p, main = "EarHT")
par(old_par) # restore the previous plot layout
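As a worked example of the Bonferroni threshold computed in the script, suppose a trait was tested at 3000 markers. The marker count and the p-values below are hypothetical and chosen only to make the arithmetic easy to follow.

```r
n_markers <- 3000  # hypothetical number of tested markers
alpha <- 0.05

# Bonferroni-corrected significance level and its -log10 value.
p_cutoff <- alpha / n_markers     # about 1.67e-05
threshold <- -log10(p_cutoff)     # about 4.78

# Hypothetical p-values for four markers.
p_values <- c(1e-8, 2e-5, 1e-4, 0.03)

# A marker is significant when its -log10(p) lies above the threshold line.
significant <- -log10(p_values) > threshold
data.frame(p = p_values, neg_log10_p = -log10(p_values), significant)
```

Only the first marker clears the line. A p-value of 2e-5 looks small but is still above the corrected cutoff of about 1.67e-05.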
-Results of analysis
