U S I N G M A C H I N E L E A R N I N G
F R A M E W O R K
GENDER VOICE
RECOGNITION
MALLA REDDY ENGINEERING COLLEGE AND MANAGEMENT SCIENCES
M D N O U S H A D
1 9 U J 1 A 0 5 3 6
TEAM MEMBERS
T H A R U N G O U D
1 9 U J 1 A 0 5 0 3
T H I P P A N I C H A N D U
2 0 U J 5 A 0 5 0 6
M U T H Y A M
B H A S K A R
1 9 U J 1 A 0 5 4 4
P R O J E C T G U I D E : M M A H I P A L R E D D Y
1 . A I M O F T H E P R O J E C T
2 . A B S T R A C T
3 . I N T R O D U C T I O N
4 . L I T E R A T U R E S U R V E Y
5 . E X I S T I N G S Y S T E M
6 . P R O P O S E D S Y S T E M
7 . A L G O R I T H M
8 . R E Q U I R E M E N T S
9 . U M L D I A G R A M S
1 0 . A D V A N T A G E S
1 1 . D I S A D V A N T A G E S
1 2 . O U T P U T
1 3 . R E F E R E N C E S
1 4 . C O N C L U S I O N
1 5 . F U T U R E E N H A N C E M E N T S
CONTENT
AIM OF THE PROJECT
The project aims to develop an effective
gender recognition by human voice using
fast fourier transformation based machine
learning framework for enhanced accuracy
as compared to existing MLP.i
ABSTRACT
Gender Recognition (GR) means recognizing the gender of the person whether the person is men or
women. It is a significant task for human beings, as many communal functions precariously rely upon
correct gender awareness. Automatic human GR by machines has currently received symbolic attention
in computer vision community. Features like face, 3D body shape, gait (manner of walking), footwear,
fundamental frequency of voice etc are used for gender recognition. The ability to do automatic
recognition of human gender is essential for several systems that process or exploit human-source
information.
INTRODUCTION
• Gender Recognition (GR) means recognizing the gender of
the person whether the person is man or woman.
• Features like face, 3D body shape, gait (manner of walking),
footwear, fundamental frequency of voice etc are used for
gender recognition.
• In our project, a Logistic Regression has been described to
recognize voice gender.
• Our model achieves 96.74% accuracy on the test data set.
LITERATURE SURVEY
Yadav et al, considered standard data set for gender identification using machine
learning approach. Various approaches are considered to achieve greater accuracy and
low error
Altalbe et al, proposes a deep learning method based on long-term short-term memory
(LSTM) that can be used with preprocessing, segmentation, and retrieval of audio
signals from the dataset. The precision, recall, accuracy and F-measure of LSTM is
96.54%, 96.15%, 98.56% and 0.96% respectively
Sánchez-Hevia et al, deals with joint gender identification and age group classification
from speech, aimed at improving the functionalities of Interactive Voice Response
Systems. Deep Neural Networks are used, because they have recently demonstrated
discriminative and representation capabilities over a wide range of applications
Existing System
• Speech classification and processing and gender based
recognition and classification have been used for long
period of time
• In Existing System of Gender voice Recognition, There
are some Models like using MLP, Random forest etc
• These model achieves around 91% accuracy on the test
data set
Proposed System
• Our System uses Logistic Regression Algorithm.
• And we used FFT - Fast Fourier Transform algorithm to
cancel the surrounding noises.
• We added noise cancellation to our project to improve
the accuracy of the model.
• Our model achieves 96.74% accuracy on the test data
set.
ALGORITHM
Fast Fourier Series
A fast Fourier transform (FFT) is an algorithm that computes the discrete Fourier
transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal
from its original domain (often time or space) to a representation in the frequency
domain and vice versa. The DFT is obtained by decomposing a sequence of values into
components of different frequencies.
• Logistic Regression
• Logistic regression predicts the probability of an outcome that can only have two values (i.e., a dichotomy). The prediction is based on the use of one or several predictors
(numerical and categorical). A linear regression is not appropriate for predicting the value of a binary variable for two reasons:
 A linear regression will predict values outside the acceptable range (e.g., predicting probabilities)
 outside the range 0 to 1)
 Since the dichotomous experiments can only have one of two possible values for each experiment, the residuals will not be normally distributed about the predicted line.
.
ALGORITHM(CONT...)
Requirements
Hardware Requirements
Floppy Drive : 1.44 Mb.
System : Intel I-3, 5, 7 Processor.
Monitor : 14’ Colour Monitor.
Hard Disk : 500 GB.
Ram : 2Gb.
Software Requirements
Coding Language : Python.
Operating system : Windows 7,8,10 Ultimate, Linux, Mac.
Software Environment : Anaconda,
Front-End : Python.
UML DIAGRAMS
Class diagram
Use case Diagram
UML DIAGRAMS
Sequence diagrams
UML DIAGRAMS
Component diagram
Activity Diagram
Advantages
 The ability to do automatic recognition of human gender is
essential for several systems that process or exploit human-
source information. Common examples are
information retrieval,
Human computer, or
Human-Robot intercommunication.
 Gender recognition has great importance in
Forensics,
Games,
Business intelligence,
visual surveillance.
Disadvantages
 There are limitations to speech recognition software. It
does not always work across all operating systems. Noisy
environments, accents and multiple speakers may degrade
results. Also, regular voice recognition software can lack
integration with other key services.
 There are also accuracy issues with most speech
recognition software. Even the most accurate speech-to-
text software solutions are only about 80%-85% accurate.
Output
Output
1. Chen, X., 2022. Design of Political Online Teaching Based on Artificial Speech Recognition and Deep Learning.
Computational Intelligence and Neuroscience, 2022.
2. Jayne, C., Chang, V., Bailey, J., Xu, Q.A. (2022). Automatic Accent and Gender Recognition of Regional UK Speakers. In:
Iliadis, L., Jayne, C., Tefas, A., Pimenidis, E. (eds) Engineering Applications of Neural Networks. EANN 2022.
Communications in Computer and Information Science, vol 1600. Springer, Cham. https://doi.org/10.1007/978-3-031-08223-
8_6
3. Wani, T.M. et al. (2021). Multilanguage Speech-Based Gender Classification Using Time-Frequency Features and SVM
Classifier. In: , et al. Advances in Robotics, Automation and Data Analytics. iCITES 2020. Advances in Intelligent Systems and
Computing, vol 1350. Springer, Cham. https://doi.org/10.1007/978-3-030-70917-4_1
4. Yadav, C.S., Yadav, M., Yadav, P.S.S., Kumar, R., Yadav, S., Yadav, K.S. (2021). Effect of Normalisation for Gender
Identification. In: K V, S., Rao, K. (eds) Smart Sensors Measurements and Instrumentation. Lecture Notes in Electrical
Engineering, vol 750. Springer, Singapore. https://doi.org/10.1007/978-981-16-0336-5_13
5. Jha, T., Kavya, R., Christopher, J. et al. Machine learning techniques for speech emotion recognition using paralinguistic
acoustic features. Int J Speech Technol 25, 707–725 (2022). https://doi.org/10.1007/s10772-022-09985-6
.
REFERENCES
CONCLUSION
the geThe model is found to be accurate for determining the gender using the
voice and speech samples of the humans. The proposed model is less resource
intensive and also has the ability to train faster with high accuracy due to use
of SVM as the core. The results obtained by employing the combination of
Principal component analysis and Support vector machine is close to the
result obtained by using ML-deep learning. Also, the model can be trained by
using smaller dataset unlike the case of the deep learning.
speech samples of the humans. The proposed model is less resource intensive and also has the ability to train faster with high accuracy due to use of SVM as the core. The results obtained by employing the combination of Principal
component analysis and Support vector machine is close to the result obtained by using ML-deep learning. Also, the model caand also has the ability to train faster with high accuracy due to use of SVM as the core. The results
 Emotion analysis
 Sentiment analysis
 Speech synthesis
 Speech recognition
 Customer segmentation
Future Enhancements
 Recommendation of products to customer in online
shopping
Thank You

Gender voice recognition.pptx

  • 1.
    U S IN G M A C H I N E L E A R N I N G F R A M E W O R K GENDER VOICE RECOGNITION MALLA REDDY ENGINEERING COLLEGE AND MANAGEMENT SCIENCES
  • 2.
    M D NO U S H A D 1 9 U J 1 A 0 5 3 6 TEAM MEMBERS T H A R U N G O U D 1 9 U J 1 A 0 5 0 3 T H I P P A N I C H A N D U 2 0 U J 5 A 0 5 0 6 M U T H Y A M B H A S K A R 1 9 U J 1 A 0 5 4 4 P R O J E C T G U I D E : M M A H I P A L R E D D Y
  • 3.
    1 . AI M O F T H E P R O J E C T 2 . A B S T R A C T 3 . I N T R O D U C T I O N 4 . L I T E R A T U R E S U R V E Y 5 . E X I S T I N G S Y S T E M 6 . P R O P O S E D S Y S T E M 7 . A L G O R I T H M 8 . R E Q U I R E M E N T S 9 . U M L D I A G R A M S 1 0 . A D V A N T A G E S 1 1 . D I S A D V A N T A G E S 1 2 . O U T P U T 1 3 . R E F E R E N C E S 1 4 . C O N C L U S I O N 1 5 . F U T U R E E N H A N C E M E N T S CONTENT
  • 4.
    AIM OF THEPROJECT The project aims to develop an effective gender recognition by human voice using fast fourier transformation based machine learning framework for enhanced accuracy as compared to existing MLP.i
  • 5.
    ABSTRACT Gender Recognition (GR)means recognizing the gender of the person whether the person is men or women. It is a significant task for human beings, as many communal functions precariously rely upon correct gender awareness. Automatic human GR by machines has currently received symbolic attention in computer vision community. Features like face, 3D body shape, gait (manner of walking), footwear, fundamental frequency of voice etc are used for gender recognition. The ability to do automatic recognition of human gender is essential for several systems that process or exploit human-source information.
  • 6.
    INTRODUCTION • Gender Recognition(GR) means recognizing the gender of the person whether the person is man or woman. • Features like face, 3D body shape, gait (manner of walking), footwear, fundamental frequency of voice etc are used for gender recognition. • In our project, a Logistic Regression has been described to recognize voice gender. • Our model achieves 96.74% accuracy on the test data set.
  • 7.
    LITERATURE SURVEY Yadav etal, considered standard data set for gender identification using machine learning approach. Various approaches are considered to achieve greater accuracy and low error Altalbe et al, proposes a deep learning method based on long-term short-term memory (LSTM) that can be used with preprocessing, segmentation, and retrieval of audio signals from the dataset. The precision, recall, accuracy and F-measure of LSTM is 96.54%, 96.15%, 98.56% and 0.96% respectively Sánchez-Hevia et al, deals with joint gender identification and age group classification from speech, aimed at improving the functionalities of Interactive Voice Response Systems. Deep Neural Networks are used, because they have recently demonstrated discriminative and representation capabilities over a wide range of applications
  • 8.
    Existing System • Speechclassification and processing and gender based recognition and classification have been used for long period of time • In Existing System of Gender voice Recognition, There are some Models like using MLP, Random forest etc • These model achieves around 91% accuracy on the test data set
  • 9.
    Proposed System • OurSystem uses Logistic Regression Algorithm. • And we used FFT - Fast Fourier Transform algorithm to cancel the surrounding noises. • We added noise cancellation to our project to improve the accuracy of the model. • Our model achieves 96.74% accuracy on the test data set.
  • 10.
    ALGORITHM Fast Fourier Series Afast Fourier transform (FFT) is an algorithm that computes the discrete Fourier transform (DFT) of a sequence, or its inverse (IDFT). Fourier analysis converts a signal from its original domain (often time or space) to a representation in the frequency domain and vice versa. The DFT is obtained by decomposing a sequence of values into components of different frequencies.
  • 11.
    • Logistic Regression •Logistic regression predicts the probability of an outcome that can only have two values (i.e., a dichotomy). The prediction is based on the use of one or several predictors (numerical and categorical). A linear regression is not appropriate for predicting the value of a binary variable for two reasons:  A linear regression will predict values outside the acceptable range (e.g., predicting probabilities)  outside the range 0 to 1)  Since the dichotomous experiments can only have one of two possible values for each experiment, the residuals will not be normally distributed about the predicted line. . ALGORITHM(CONT...)
  • 12.
    Requirements Hardware Requirements Floppy Drive: 1.44 Mb. System : Intel I-3, 5, 7 Processor. Monitor : 14’ Colour Monitor. Hard Disk : 500 GB. Ram : 2Gb. Software Requirements Coding Language : Python. Operating system : Windows 7,8,10 Ultimate, Linux, Mac. Software Environment : Anaconda, Front-End : Python.
  • 13.
  • 14.
  • 15.
  • 16.
    Advantages  The abilityto do automatic recognition of human gender is essential for several systems that process or exploit human- source information. Common examples are information retrieval, Human computer, or Human-Robot intercommunication.  Gender recognition has great importance in Forensics, Games, Business intelligence, visual surveillance.
  • 17.
    Disadvantages  There arelimitations to speech recognition software. It does not always work across all operating systems. Noisy environments, accents and multiple speakers may degrade results. Also, regular voice recognition software can lack integration with other key services.  There are also accuracy issues with most speech recognition software. Even the most accurate speech-to- text software solutions are only about 80%-85% accurate.
  • 18.
  • 19.
  • 20.
    1. Chen, X.,2022. Design of Political Online Teaching Based on Artificial Speech Recognition and Deep Learning. Computational Intelligence and Neuroscience, 2022. 2. Jayne, C., Chang, V., Bailey, J., Xu, Q.A. (2022). Automatic Accent and Gender Recognition of Regional UK Speakers. In: Iliadis, L., Jayne, C., Tefas, A., Pimenidis, E. (eds) Engineering Applications of Neural Networks. EANN 2022. Communications in Computer and Information Science, vol 1600. Springer, Cham. https://doi.org/10.1007/978-3-031-08223- 8_6 3. Wani, T.M. et al. (2021). Multilanguage Speech-Based Gender Classification Using Time-Frequency Features and SVM Classifier. In: , et al. Advances in Robotics, Automation and Data Analytics. iCITES 2020. Advances in Intelligent Systems and Computing, vol 1350. Springer, Cham. https://doi.org/10.1007/978-3-030-70917-4_1 4. Yadav, C.S., Yadav, M., Yadav, P.S.S., Kumar, R., Yadav, S., Yadav, K.S. (2021). Effect of Normalisation for Gender Identification. In: K V, S., Rao, K. (eds) Smart Sensors Measurements and Instrumentation. Lecture Notes in Electrical Engineering, vol 750. Springer, Singapore. https://doi.org/10.1007/978-981-16-0336-5_13 5. Jha, T., Kavya, R., Christopher, J. et al. Machine learning techniques for speech emotion recognition using paralinguistic acoustic features. Int J Speech Technol 25, 707–725 (2022). https://doi.org/10.1007/s10772-022-09985-6 . REFERENCES
  • 21.
    CONCLUSION the geThe modelis found to be accurate for determining the gender using the voice and speech samples of the humans. The proposed model is less resource intensive and also has the ability to train faster with high accuracy due to use of SVM as the core. The results obtained by employing the combination of Principal component analysis and Support vector machine is close to the result obtained by using ML-deep learning. Also, the model can be trained by using smaller dataset unlike the case of the deep learning. speech samples of the humans. The proposed model is less resource intensive and also has the ability to train faster with high accuracy due to use of SVM as the core. The results obtained by employing the combination of Principal component analysis and Support vector machine is close to the result obtained by using ML-deep learning. Also, the model caand also has the ability to train faster with high accuracy due to use of SVM as the core. The results
  • 22.
     Emotion analysis Sentiment analysis  Speech synthesis  Speech recognition  Customer segmentation Future Enhancements  Recommendation of products to customer in online shopping
  • 23.