Transcription Factor-DNA Binding Prediction
Tahmina Ahmed
Prosunjit Biswas
Iffat Sharmin Chowdhury
Badri Sampath




Motivation
• Label unlabeled DNA sequences with a model built from
  the labeled DNA sequences, and in doing so gain
  experience with real-world Machine Learning problems.




Approaches
• K-mer based
     Fixed length K-mer
     K-mer with Mismatches
     Using Regular Expression
• PWM based
     MEME and MAST
• Combined Model
    Unite both models
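
The two simplest K-mer variants listed above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names (kmer_counts, mismatch_count) and the use of Hamming distance for "mismatches" are hypothetical choices, not taken from the slides.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def mismatch_count(seq, kmer, max_mismatches=1):
    """Count windows of seq that differ from kmer in at most max_mismatches positions."""
    k = len(kmer)
    return sum(
        hamming(seq[i:i + k], kmer) <= max_mismatches
        for i in range(len(seq) - k + 1)
    )

if __name__ == "__main__":
    print(kmer_counts("ACGTACGT", 4))
    print(mismatch_count("ACGTACGT", "ACGA", max_mismatches=1))
```

Each count (or a 0/1 presence flag derived from it) could then serve as one attribute of a sequence.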




K-mer Approach Based on Regular Expression
Motivation
  2-mers are the K-mers that occur most often in the sequences,
  so this approach concentrates on 2-mers.

Strategy
  - For any two 2-mers X and Y, generate the regular expressions
  X(.*)Y and Y(.*)X.
  - Use these regular expressions as candidate attributes.
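
A minimal sketch of this strategy, assuming the alphabet is A, C, G, T and that X and Y are distinct (the slides do not say whether X = Y is allowed). The names two_mer_regexes and regex_features are hypothetical; each pattern becomes one binary attribute.

```python
import re
from itertools import product

ALPHABET = "ACGT"  # assumption: plain DNA alphabet, no ambiguity codes

def two_mer_regexes():
    """Return X(.*)Y for every ordered pair of distinct 2-mers.

    Ordered pairs cover both X(.*)Y and Y(.*)X for each unordered pair.
    """
    two_mers = ["".join(p) for p in product(ALPHABET, repeat=2)]
    return [f"{x}(.*){y}" for x in two_mers for y in two_mers if x != y]

def regex_features(seq, patterns):
    """Binary attribute vector: 1 if the pattern matches anywhere in seq."""
    return [1 if re.search(p, seq) else 0 for p in patterns]

if __name__ == "__main__":
    patterns = two_mer_regexes()
    features = regex_features("ACGGTT", patterns)
    print(len(patterns), sum(features))
```

With 16 possible 2-mers this yields 16 * 15 = 240 candidate attributes per sequence, which is why a later attribute selection step matters.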
Classifier Selection




                        Fig : Nine classifiers applied to the TF data sets
Algorithms are numbered as follows -
      (1) Logistic (2) SMO (3) NaiveBayes (4) BayesianLogisticRegression (5) Kstar (6) Bagging
      (7) LogitBoost (8) RandomForest (9) J48
Summary -
     * 9 classifiers were applied to 10 data sets; results for 3 of the data sets are shown
     * choosing a single best classifier is not a trivial task
     * the same classifier behaves differently on different data sets
Change in Accuracy due to Different Classifiers




 Fig : The performance of different classifiers (Logistic, J48, RandomForest, NaiveBayes) on the TF_3 data set
 Fig : The performance of different classifiers (Logistic, J48, RandomForest, NaiveBayes) on the TF_5 data set




Summary -
       * the choice of classifier has a large effect on accuracy
       * classifiers therefore have to be chosen with care

Change in Accuracy due to Different K-mer Length




            Fig : The performance of K-mers of different lengths (4-mer, 5-mer, 6-mer) on the TF_3 data set


Summary -
    * K-mer length also affects accuracy
    * finding the single best length is not trivial


Attribute Space Selection




        Fig : The performance for different numbers of selected k-mers on the TF_4 data set


Summary -
    * the number of attributes considered also affects accuracy
    * accuracy increases as more attributes are considered, but beyond a
   saturation point it decreases.
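
One way to vary the size of the attribute space is to rank candidate k-mers and keep only the top n. The sketch below is an assumption, not the selection method used in the slides: it scores each k-mer by how differently it occurs in bound (positive) and unbound (negative) sequences, and the names presence_counts and top_kmers are hypothetical.

```python
from collections import Counter

def presence_counts(seqs, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts

def top_kmers(pos, neg, k, n):
    """Keep the n k-mers with the largest gap between the fraction of
    positive and the fraction of negative sequences that contain them.

    Both pos and neg must be non-empty lists of sequences.
    """
    p, q = presence_counts(pos, k), presence_counts(neg, k)
    score = {
        m: abs(p[m] / len(pos) - q[m] / len(neg))
        for m in set(p) | set(q)
    }
    # Sort by descending score, breaking ties alphabetically for stable output.
    return sorted(score, key=lambda m: (-score[m], m))[:n]

if __name__ == "__main__":
    bound = ["AAAC", "AAAG"]
    unbound = ["CCCT", "GGCC"]
    print(top_kmers(bound, unbound, k=3, n=1))
```

Sweeping n and re-measuring accuracy for each value would show where the saturation point mentioned above lies for a given data set.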


PWM based Analysis on Accuracy (TF_1 data set)




Fig : J48, minW 6 - maxW 15, no. of sites 10
Fig : J48, minW 6 - maxW 15, no. of motifs 5
Summary -
      * accuracy increases with more motifs when the no. of sites is fixed
      * accuracy increases with more sites when the no. of motifs is fixed
      * what happens when we increase both?


PWM based Analysis




                            Fig : Accuracy varies with no. of motifs and no. of sites


* 1st bar shows the no. of sites
* 2nd bar shows the no. of motifs
* 3rd bar shows the accuracy
* the point is that accuracy decreases when we increase both the no. of motifs and the no. of sites.
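
For readers unfamiliar with position weight matrices, the sketch below shows how a PWM can be built from aligned binding sites and used to score a sequence. It is a simplified stand-in, not how MEME or MAST work internally; the pseudocount, the uniform background of 0.25, and the names build_pwm and best_score are assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"  # assumption: sequences contain only these four bases

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for j in range(width):
        column = [s[j] for s in sites]
        pwm.append({
            b: math.log2((column.count(b) + pseudocount) / total / background)
            for b in ALPHABET
        })
    return pwm

def best_score(seq, pwm):
    """Return (score, offset) of the highest-scoring window of seq.

    seq must be at least as long as the PWM is wide.
    """
    w = len(pwm)
    scores = [
        (sum(pwm[j][seq[i + j]] for j in range(w)), i)
        for i in range(len(seq) - w + 1)
    ]
    return max(scores)

if __name__ == "__main__":
    pwm = build_pwm(["ACG", "ACG", "ACT"])
    print(best_score("TTACGTT", pwm))
```

The best score per motif could then be used as a numeric attribute for a classifier such as J48, with one attribute per discovered motif.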
Extra Work for TF_20


  K-mer + PWM
      -> Sequences identified by both models -> The New Model for TF-20
      -> Sequences identified differently -> Biased 2-mer Model -> Newly Labeled Sequences -> The New Model for TF-20

              Fig : Flow diagram of Building New Model for TF-20


Summary -
    * we have done some extra work for TF_20
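
The flow diagram can be read as: keep the labels on which the K-mer and PWM models agree, and pass the sequences they label differently to a third (biased 2-mer) model. The sketch below follows that reading, which is itself an interpretation of the diagram; split_by_agreement, relabel, and the tie_breaker callable are hypothetical names.

```python
def split_by_agreement(kmer_labels, pwm_labels):
    """Partition sequence ids by whether the two models assign the same label."""
    agreed, disputed = {}, []
    for seq_id, label in kmer_labels.items():
        if pwm_labels.get(seq_id) == label:
            agreed[seq_id] = label
        else:
            disputed.append(seq_id)
    return agreed, disputed

def relabel(agreed, disputed, tie_breaker):
    """Label the disputed sequences with a third model and merge the results.

    tie_breaker is any callable mapping a sequence id to a label; here it
    stands in for the biased 2-mer model.
    """
    merged = dict(agreed)
    merged.update({seq_id: tie_breaker(seq_id) for seq_id in disputed})
    return merged

if __name__ == "__main__":
    kmer = {"s1": 1, "s2": 0, "s3": 1}
    pwm = {"s1": 1, "s2": 1, "s3": 1}
    agreed, disputed = split_by_agreement(kmer, pwm)
    print(relabel(agreed, disputed, tie_breaker=lambda seq_id: 0))
```

The merged labels would then serve as the training set for the new TF_20 model.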
AUC based on the Feedback (bonus model)




                    Fig : AUC of 10 data sets based on last submission


* accuracy improved over the first submission
* the PWM-based model did not give good results
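
For reference, AUC can be computed directly from labels and classifier scores as the fraction of (positive, negative) pairs in which the positive sequence receives the higher score. The function below is a generic sketch of that definition, not the evaluation script behind the figure; the name auc and the 0/1 label encoding are assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels are 0/1; ties between a positive and a negative count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

if __name__ == "__main__":
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of bound and unbound sequences.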



Participation
                         Background Study               Working with Tools       Working with Models  Parameter Tuning  Automation
Badri Sampath            DNA, RNA, protein, motif       AlignAce, MEME, MAST     PWM                  K-mer             Arff Writer, Mast output writer
Iffat Sharmin Chowdhury  Protein, Motif, Transcription  Weka, AlignAce, ScanAce  K-mer                PWM               Script for FASTA, Weka
Prosunjit Biswas         DNA, Transcription, K-mer      MEME, MAST               K-mer                PWM               Script for RE, for new model
Tahmina Ahmed            MEME, MAST, PWM                MEME, MAST, Weka         PWM                  K-mer             Script for MEME, MAST




Acknowledgment




Questions?
