GENDER DETECTION IN BLOGS
Presented By (Team No. 32)
Nitish Jain (201301227)
Ganesh Borle (201505587)
Vamshikrishna Reddy (201202177)
Mentored By
Lokesh Walase
IRE [CSE474]
The Big Picture
ABSTRACT
● Through the sands of time, textual content has remained a
prominent feature of internet media, especially blogs.
● But a lot of content brings a lot of responsibility.
● The internet cannot take responsibility for all of that content;
the author should.
● Thus, author profiling and attribution become an important
task, and we try to capture one aspect of it, i.e. gender.
The Question
Given the text of a blog, can we identify whether
the writer is male or female?
WHO IS THE AUTHOR?
OUR APPROACH
THE APPROACH
● We take advantage of the linguistic features of the
blog and create a feature file.
● Various classifiers are then trained on this feature file, and a
model is prepared for each classifier.
● An ensemble is applied to these models, and the input
document is classified as written by a male or a female.
WORKFLOW
The Dataset
● Koppel's blog dataset
● Contains about 19 thousand documents
● Each document contains the text of about 35 blog posts
in XML format.
[Dataset Link: http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm]
PARSING
● Language used : Python
● Each blog entry is stored in XML format
<Blog>
<date>....... </date>
<post>
….
</post>
...
</Blog>
● Each blog's filename contains the name and gender
of the author
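A minimal parsing sketch, assuming the layout shown above. The helper names (`extract_posts`, `gender_from_filename`) are hypothetical, the sample string is made up, and the dot-separated filename pattern with the gender in the second field is an assumption, not something stated on the slide. A regular expression is used here instead of a strict XML parser because scraped blog text may contain characters that are not valid XML.

```python
import re

# Matches the body of each <post>...</post> element, across line breaks.
POST_RE = re.compile(r"<post>(.*?)</post>", re.DOTALL | re.IGNORECASE)


def extract_posts(xml_text):
    """Return the stripped text of every <post> element in one blog file."""
    return [body.strip() for body in POST_RE.findall(xml_text)]


def gender_from_filename(filename):
    """Read the gender label from a filename.

    Assumes (hypothetically) dot-separated fields with the gender in the
    second position, e.g. '12345.female.xml'.
    """
    fields = filename.split(".")
    if len(fields) < 2:
        raise ValueError("no gender field in filename: " + filename)
    gender = fields[1].lower()
    if gender not in ("male", "female"):
        raise ValueError("unexpected gender field: " + gender)
    return gender


# Made-up sample in the layout shown on the slide.
sample = "<Blog><date>01,May,2004</date><post> Hello world </post></Blog>"
print(extract_posts(sample))                      # ['Hello world']
print(gender_from_filename("12345.female.xml"))   # female
```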
The Feature Extraction
FEATURES
For our task of gender identification, we use
the following linguistic features:
● Character Based Features
● Word Based Features
● Syntactic Features
● Structural Features
● Function Words
● POS Start Probability
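As a rough illustration of how a few of these feature groups could be turned into numbers, here is a sketch using only the Python standard library. The specific features, their names, and the short function-word list are assumptions chosen for illustration; the slides do not list the exact features used. Syntactic features and POS start probabilities need a part-of-speech tagger and are left out.

```python
import re
import string

# A tiny illustrative function-word list; a real list would be much longer.
FUNCTION_WORDS = ("the", "a", "an", "of", "and", "to", "in", "i", "you", "it")


def extract_features(text):
    """Map one blog text to a dict of numeric features."""
    n_chars = max(len(text), 1)
    words = re.findall(r"[A-Za-z']+", text.lower())
    n_words = max(len(words), 1)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    features = {
        # Character-based: share of letters, digits and punctuation.
        "letter_ratio": sum(c.isalpha() for c in text) / n_chars,
        "digit_ratio": sum(c.isdigit() for c in text) / n_chars,
        "punct_ratio": sum(c in string.punctuation for c in text) / n_chars,
        # Word-based: average word length and vocabulary richness.
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "type_token_ratio": len(set(words)) / n_words,
        # Structural: average sentence length in words.
        "words_per_sentence": len(words) / max(len(sentences), 1),
    }
    # Function words: relative frequency of each listed word.
    for fw in FUNCTION_WORDS:
        features["fw_" + fw] = words.count(fw) / n_words
    return features
```

Each blog then becomes one row of the feature file, with the gender from the filename as its label.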
The Classification
THE CLASSIFICATION TASK
For the task of classification, we tried several classification
algorithms and arrived at a model that uses an ensemble of the
following classifiers:
● Random Forest Classifier
● Neural Networks Classifier
● Adaboost Tree Classifier
● Gradient Boosting Classifier
● Bagging Classifier
THE CLASSIFICATION TASK
For each of the classifiers:
● We trained it on subsets of the features to see how
accuracy varies with the features used.
● We applied 10-fold cross-validation to measure the accuracy.
To measure the accuracy of the ensemble, we took the
majority class among the predictions of the individual classifiers.
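The 10-fold procedure can be sketched as follows. This is a plain-Python illustration, not the code used in the project: `k_fold_indices` and `cross_val_accuracy` are hypothetical names, the `fit` and `predict` callables stand in for any classifier, and the folds are contiguous, so the data would normally be shuffled first.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k roughly equal contiguous folds.

    Assumes n_samples >= k so that no fold is empty.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=10):
    """Average test accuracy over k folds.

    fit(X_train, y_train) returns a model;
    predict(model, X_test) returns one label per test row.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)
```

Every document is used for testing exactly once and for training in the other nine folds.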
RANDOM FOREST CLASSIFIER
● A meta-estimator that fits a number
of decision tree classifiers on various
sub-samples of the dataset and
averages their predictions.
● Using the Random Forest classifier, we
achieved an accuracy of
69.79%
NEURAL NETWORKS CLASSIFIER
● Consists of multiple layers of nodes,
with each layer fully connected to the
next; each node is a neuron with a
non-linear activation function.
● Uses a supervised learning technique
called backpropagation to train the
network.
● Using the Neural Network classifier,
we achieved an accuracy
of 69.51%
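To make the "fully connected layers with non-linear nodes" idea concrete, here is a small forward-pass sketch. The network shape, the weights, and the name `p_female` are made up for illustration; the slides do not give the architecture used, and training by backpropagation is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: every input feeds every output node."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp_forward(x, layers):
    """Run x through a list of (weights, biases, activation) layers."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


# Hypothetical toy network: 2 inputs -> 2 tanh hidden nodes -> 1 sigmoid output.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1], math.tanh),
    ([[1.0, -1.0]], [0.0], sigmoid),
]
# The sigmoid output lies strictly between 0 and 1 and can be read as a
# class probability (which class it refers to is a labelling choice).
p_female = mlp_forward([1.0, 2.0], layers)[0]
```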
ADABOOST TREE CLASSIFIER
● A meta-estimator that begins by
fitting a classifier on the original
dataset and then fits additional
classifiers on the same dataset, with
the weights of misclassified instances
increased so that later classifiers
focus on the harder cases.
● Using the AdaBoost tree classifier, we
achieved an accuracy of
69.57%
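One round of the re-weighting behind AdaBoost can be sketched as below. This follows the classic binary formulation with labels +1 and -1; it is shown for intuition only and is not necessarily the exact variant used in the project. The function name is hypothetical, and the input weights are assumed to sum to 1.

```python
import math


def adaboost_reweight(weights, y_true, y_pred):
    """One AdaBoost round: return (alpha, new_weights).

    Labels are +1 / -1. Misclassified samples gain weight, so the next
    weak classifier concentrates on them.
    """
    # Weighted error of the current weak classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)     # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)   # vote strength of this classifier
    # t * p is +1 for a correct prediction and -1 for a mistake.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]
```

After the update, the misclassified samples together carry half of the total weight, which is what forces the next classifier to pay attention to them.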
GRADIENT BOOSTING CLASSIFIER
● Builds the model in a forward stage-wise
fashion.
● At each stage, a new weak learner is
added to compensate for the
shortcomings of the existing weak
learners; these shortcomings are
identified by the gradients of the
loss function.
● Using the Gradient Boosting classifier,
we achieved an accuracy
of 70.81%
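The "shortcomings identified by the gradients" can be illustrated for binary classification with log loss, where the negative gradient with respect to the current raw score is simply the label minus the predicted probability. The sketch below assumes 0/1 labels and log loss; the function names and the learning rate are illustrative, and fitting the weak learner to the residuals is not shown.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def pseudo_residuals(y, scores):
    """Negative gradient of log loss with respect to the current scores.

    y holds 0/1 labels; scores holds the ensemble's current raw outputs.
    """
    return [yi - sigmoid(fi) for yi, fi in zip(y, scores)]


def boosting_step(scores, stage_predictions, learning_rate=0.1):
    """Add the new weak learner's (shrunken) predictions to the running scores."""
    return [f + learning_rate * h for f, h in zip(scores, stage_predictions)]


# Start from a score of 0 (probability 0.5) for two documents.
y = [1, 0]
scores = [0.0, 0.0]
residuals = pseudo_residuals(y, scores)   # [0.5, -0.5]
# Pretend the next weak learner fits the residuals perfectly.
scores = boosting_step(scores, residuals, learning_rate=1.0)
```

Each stage fits a weak learner to these residuals, so the examples the current model gets most wrong pull hardest on the next learner.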
BAGGING CLASSIFIER
● A meta-estimator that fits base
classifiers, each on a random subset of
the dataset, and then aggregates their
individual predictions.
● Using the Bagging classifier,
we achieved an accuracy
of 70.03%
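A bare-bones sketch of bagging, assuming bootstrap sampling (drawing with replacement) and majority-vote aggregation. The helper names are hypothetical, and the `fit` and `predict` callables stand in for whatever base classifier is used.

```python
import random
from collections import Counter


def bootstrap_sample(X, y, rng):
    """Draw len(X) rows with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]


def bagging_fit(fit, X, y, n_estimators=10, seed=0):
    """Train n_estimators base models, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit(*bootstrap_sample(X, y, rng)) for _ in range(n_estimators)]


def bagging_predict(predict, models, X):
    """Aggregate the base models' predictions by per-row majority vote."""
    all_preds = [predict(m, X) for m in models]
    return [Counter(col).most_common(1)[0][0] for col in zip(*all_preds)]
```

Because each base model sees a slightly different sample, their individual errors tend to differ, and the vote averages some of them out.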
THE ENSEMBLE
● An ensemble takes the outputs of the
other classifiers and applies majority
voting to them to determine the final
output.
● Using the ensemble model on the
classifiers discussed above, we
achieved an accuracy of
71.10%
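Majority voting over the five classifiers can be sketched in a few lines. The predictions below are made up for illustration; with five voters and two classes a tie cannot occur.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label lists (one list per classifier) by per-document majority."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_classifier)]


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical predictions from five classifiers on three documents.
predictions = [
    ["male", "female", "female"],   # random forest
    ["male", "male", "female"],     # neural network
    ["female", "female", "female"], # AdaBoost
    ["male", "female", "male"],     # gradient boosting
    ["male", "female", "female"],   # bagging
]
ensemble = majority_vote(predictions)   # ['male', 'female', 'female']
```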
FINAL RESULTS
THE FINAL RESULTS
● Using the ensemble, we were
able to increase our accuracy
by nearly 1% in each case, irrespective
of the performance of the individual
classifiers.
● The maximum accuracy observed
during the experiments was
73.19%, achieved by the
ensemble model.
The Maximum Accuracy Achieved: 73.188406%
USEFUL LINKS
● Github - https://github.com/nitishjain2007/Gender_Identification
● Youtube - https://www.youtube.com/watch?v=T04BJ6cIeTs
● Slideshare - http://bit.ly/1Q8UiCe
● Website - http://nitishjain2007.github.io/Gender_Identification/
● Dropbox - http://bit.ly/1Xx0ppL
REFERENCES
● http://u.cs.biu.ac.il/~koppel/papers/male-female-llc-final.pdf
● http://www.aaai.org/ocs/index.php/ICWSM/09/paper/viewFile/208/537
● http://www.cs.columbia.edu/nlp/papers/2011/acl2011age.pdf
● http://www.ccse.kfupm.edu.sa/~ahmadsm/coe589-121/cheng2011-gender-identification.pdf
Thanks!
Any questions?
