International Journal of Scientific Research and Engineering Development-– Volume 2 Issue 5, September 2019
Available at www.ijsred.com
ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 466
Voice Recognition System Based on Machine
Learning Algorithm
Thorat Sanjay*,Raskar Sonali**, Pawar Dhanshree***,
Jadhav Ashwini****,Pawar Akash*****
*(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk
Email: sanjaythorat79@gmail.com)
**(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk
Email: sonaliraskar2016@gmail.com)
***(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk
Email: dhanshreepawar2016@gmail.com)
****(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk
Email:ashwinijadhav1090@gmail.com)
*****(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk
Email: aakashpawar3299@gmail.com)
----------------------------------------************************----------------------------------
Abstract:
Speaker acknowledgment is a system used to consequently perceive a speaker from a chronicle of their
voice or discourse articulation. Speaker acknowledgment innovation has improved over ongoing years and
has turned out to be modest and dependable technique for individual distinguishing proof and check.
Research in the field of speaker acknowledgment has now traversed more than five decades and has
demonstrated productive outcomes, anyway there isn't much work finished concerning South African
indigenous dialects. This paper introduces the improvement of a programmed speaker acknowledgment
framework that fuses characterization and acknowledgment of Sepedi home language speakers. Four
classifier models, to be specific, Support Vector Machines, K-Nearest Neighbors, Multilayer Perceptrons
(MLP) and Random Forest (RF), are prepared utilizing WEKA information mining device. Auto-WEKA
is applied to decide the best classifier model together with its best hyper-parameters.
Keywords — Convolutional Neural Network, K-Nearest Neighbors, Face Recognition, Voice
Recognition.
----------------------------------------************************----------------------------------
I. INTRODUCTION
As users store their information to various services
across the Internet, it can be accessible by
unauthorized people. So future safety is the main
topic in cloud computing. To provide security we
require proper authentication technique using face
and voice recognition. Discourse sign contains
various degrees of data. Discourse sign can be
utilized for discourse acknowledgment, speaker
acknowledgment or voice order acknowledgment
framework. Speaker acknowledgment is utilized for
some, discourse preparing applications particularly
security and confirmation
PROJECT AIMS AND OBJECTIVES
A.Project Aim
A total face and voice acknowledgment framework
incorporates face discovery voice acknowledgment
face pre-preparing and face acknowledgment forms.
RESEARCH ARTICLE OPEN ACCESS
International Journal of Scientific Research and Engineering Development-– Volume 2 Issue 5, September 2019
Available at www.ijsred.com
ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 467
A total face and voice acknowledgment framework
incorporates face identification voice
acknowledgment face pre-handling and face
acknowledgment forms. In this way it is important
to draw out the face area from the face discovery
procedure and autonomous the face from the
foundation design which gives the premise to the
consequent extraction of the face distinction
highlights. The ongoing ascent of the face
dependent on the profundity of learning
identification techniques and voice recognize for
the authentication. The late ascent of the face
dependent on the profundity of learning location
strategies and voice acknowledgment for the
validation.
1.Objectives
To detect the face and voice recognition for
recognize purpose to identify the authorized user.
2.Scope
High exactness face acknowledgment models have
been accounted for in logical examines by monster
innovation organizations and research foundations.
II. LITERATURE SURVEY
Machine Learning applied to speaker verification of
fake voice recordings.
For measurable applications, speaker confirmation
comprises of assessing regardless of whether the
voice of a presume matches the proof sound
chronicle. An answer in light of AI for speaker
confirmation of sounds with phony inflection.
Neural network based speaker classification and
verification systems with enhanced features.
A neural system structure for content ward speaker
characterization what's more, confirmation .It
incorporates speaker discovery, i.e, location if there
is speaker in the sound, speaker recognizable proof,
i.e, distinguishing whose voice it is, speaker check
or then again verification i.e, checking someone’s
voice.
A survey on speaker on speaker recognition with
various features extraction and classification
techniques.
Speech processing is more important day by day for
providing immense security. Verification purpose
speech is widely used. Speaker recognition is a
process which can verify the person who is
speaking. The main aim is to identify the speaker
by extraction , characterization and recognition of
information contained in speech signal.
I. Proposed System
When comparing the differences between different
biostatistics, we can see that the cost of face
detection is low, the acceptance from user is easy,
and the acquisition of information is easy. Face
detection is the use of computer vision applications
and related algorithms, from the pictures to and
faces, voice and then analysis of the identity .In
addition, future analysis of the acquired face, voice
may conduct some additional attributes of the
individual.
The proposed system outcome is to recognise
authorized person for using the face and voice
recognition.
Today both biometrics voice and face recognition
software options are being implemented in both
private and professional premises. Since they both
have so many benefits to offer, these biometrics
software technologies are often used together or
independently to provide better security. Biometric
security devices are becoming more popular
because of the sound technology and high accuracy
they have to offer
1. Voice Recognition.
The principle target of speaker acknowledgment is
to change over the acoustic sound sign into PC
dependable structure. Speaker acknowledgment
frameworks include two stages, for example,
preparing and testing. In preparing procedure accept
the contribution as discourse sign and highlight
extraction is finished utilizing highlight extraction
strategy. Highlight vectors speaking to the voice
qualities of the speaker and are utilized for structure
reference model. Real acknowledgment undertaking
is in trying stage. In testing stage speaker voice is
International Journal of Scientific Research and Engineering Development-– Volume 2 Issue 5, September 2019
Available at www.ijsred.com
ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 468
coordinating with reference model utilizing some
coordinating procedure. After degree of
coordinating choice is finished.
2. Face Recognition.
A facial acknowledgment framework is an
innovation fit for distinguishing or confirming an
individual from an advanced picture or a video
outline from a video source. There are different
strategies in which facial acknowledgment
frameworks work, yet as a rule, they work by
looking at chosen facial highlights from given
picture with countenances inside a database. It is
additionally depicted as a Biometric Artificial
Intelligence based application that can interestingly
recognize an individual by examining examples
dependent on the individual's facial surfaces and
shape.
Proposed System
Biometric distinguishing proof is an obligatory
apparatus to verify advanced data for different
mechanical, government, business, and security
applications. Face acknowledgment is an
unmistakable issue and comes up short on an
interesting arrangement relevant to all
circumstances. Face acknowledgment isn't
successful in distinguishing people in conditions,
when an individual is utilizing glasses, caps or has a
facial hair and so forth. Elective innovations like
Iris and retinal sweep biometric procedures need
modern gear, which isn't monetarily reasonable for
all applications. Voice acknowledgment techniques
have low precision and are influenced by
circumstances where an adjustment in an
individual's voice because of sickness like virus
render outright recognizable proof off base. This
paper proposes a biometric strategy actualizing
numerous strategies i.e., both face and voice
acknowledgment method as a successful ID device.
4.Algorithm Used In Proposed System.
A] K-Nearest Neighbors
In example acknowledgment, the k-closest
neighbors calculation (k-NN) is a non-parametric
strategy utilized for grouping and relapse. In the
two cases, the info comprises of the k nearest
preparing models in the element space. The yield
relies upon whether k-NN is utilized for order or
relapse:
Both for order and relapse, a valuable system can
be to dole out loads to the commitments of the
neighbors, so that the closer neighbors contribute
more to the normal than the more inaccessible ones.
For instance, a typical weighting plan comprises in
giving each neighbor a load of 1/d, where d is the
separation to the neighbor.K-NN is a sort of
occurrence based learning, or sluggish realizing,
where the capacity is just approximated locally and
all calculation is conceded until grouping.
B] Convolutional Neural Network
CNNs are regularized variants of multilayer
perceptrons. Multilayer perceptrons typically mean
completely associated systems, that is, every neuron
in one layer is associated with all neurons in the
following layer. The "completely connectedness" of
these systems makes them inclined to overfitting
information. Common methods for regularization
incorporate including some type of size estimation
of loads to the misfortune work. Be that as it may,
CNNs adopt an alternate strategy towards
regularization: they exploit the various leveled
design in information and gather increasingly
complex examples utilizing littler and less difficult
examples. Thusly, on the size of connectedness and
unpredictability, CNNs are on the lower outrageous.
They are otherwise called move invariant or space
invariant counterfeit neural systems (SIANN), in
light of their common loads design and
interpretation invariance qualities.
II. Application
A] For security purpose.
B] For Authentication purpose.
C] To secure machine to machine networks.
D] For identification and recognition of person.
International Journal of Scientific Research and Engineering Development-– Volume 2 Issue 5, September 2019
Available at www.ijsred.com
ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 469
III. Conclusion
The general procedure of changing over
information in reasonable arrangement incorporates
different steps, for example, making unmistakable
portrayal of information anticipating future
outcomes and so forth. To set up such a framework,
a graphic language was structured, through which
the client can depict examination task effectively.
An information base was set up so that our
framework could choose, and make prediction.
Choice tree the most suitable calculation for
anticipating results dependent on information.
Various tale strategies were actualized to empower
the programmed achievement of the entire
information examination process. Intensive this
means we comprehended the whole procedure of
information analytics which incorporates
anticipating the future dangers and results on the
present information.
VI.REFERENCES
[1]Das, T.; Nahar, K.M. A voice identifcation
system using hidden Markov model. Indian J. Sci.
Technol. 2016, 9, 4.
[2] Makary, M.A.; Daniel, M. Medical error, The
third leading cause of death in the US. BMJ 2016,
353.
[3] Damacharla, P.; Dhakal, P.; Stumbo, S.; Javaid,
A.Y.; Ganapathy, S.; Malek, D.A.; Hodge, D.C.;
Devabhaktuni, V.E_ects of voice-based synthetic
assistant on performance of emergency care
provider in training. Int. J. Artif. Intell. Educ. 2018.
[4] Damacharla, P.; Javaid, A.Y.; Gallimore, J.J.;
Devabhaktuni, V.K. Common metrics to
benchmarkhuman-machine teams (HMT): A review.
IEEE Access 2018,6, 38637{38655.
[5] Hinton, G.; Deng, L.; Yu, D.; Dahl, G.;
Mohamed, A.R.; Jaitly, N.; Senior, A.;Vanhoucke,
V.; Nguyen, P.; Kingsbury, B.; et al. Deep neural
networks for acoustic modeling in speech
recognition: The shared views of four research
groups. IEEESignal. Process. Mag. 2012, 29,
82{97}.
[6] Cutajar, M.; Gatt, E.; Grech, I.; Casha, O.;
Micallef, J. Comparative study of automatic speech
recognition techniques. IET Signal. Process. 2013,
7, 25{46.
[7] Fernndez-Delgado, M.; Cernadas, E.; Barro, S.;
Amorim, D. Do we need hundreds of classifiers to
solve real-world classification problems. J. Mach.
Learn. Res. 2014,
15, 3133{3181.
[8] Weinberg, M.; Alipanahi, B.; Frey, B.J. Are
random forests truly the best classifiers J. Machine
Learning. Res. 2016, 17, 3837{3841.

IJSRED-V2I5P43

  • 1.
    International Journal ofScientific Research and Engineering Development-– Volume 2 Issue 5, September 2019 Available at www.ijsred.com ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 466 Voice Recognition System Based on Machine Learning Algorithm Thorat Sanjay*,Raskar Sonali**, Pawar Dhanshree***, Jadhav Ashwini****,Pawar Akash***** *(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk Email: sanjaythorat79@gmail.com) **(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk Email: sonaliraskar2016@gmail.com) ***(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk Email: dhanshreepawar2016@gmail.com) ****(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk Email:ashwinijadhav1090@gmail.com) *****(Department of Information Technology, SPPU/ SVPM College Of Engineering , Malegaon bk Email: aakashpawar3299@gmail.com) ----------------------------------------************************---------------------------------- Abstract: Speaker acknowledgment is a system used to consequently perceive a speaker from a chronicle of their voice or discourse articulation. Speaker acknowledgment innovation has improved over ongoing years and has turned out to be modest and dependable technique for individual distinguishing proof and check. Research in the field of speaker acknowledgment has now traversed more than five decades and has demonstrated productive outcomes, anyway there isn't much work finished concerning South African indigenous dialects. This paper introduces the improvement of a programmed speaker acknowledgment framework that fuses characterization and acknowledgment of Sepedi home language speakers. Four classifier models, to be specific, Support Vector Machines, K-Nearest Neighbors, Multilayer Perceptrons (MLP) and Random Forest (RF), are prepared utilizing WEKA information mining device. Auto-WEKA is applied to decide the best classifier model together with its best hyper-parameters. Keywords — Convolutional Neural Network, K-Nearest Neighbors, Face Recognition, Voice Recognition. ----------------------------------------************************---------------------------------- I. INTRODUCTION As users store their information to various services across the Internet, it can be accessible by unauthorized people. So future safety is the main topic in cloud computing. To provide security we require proper authentication technique using face and voice recognition. Discourse sign contains various degrees of data. Discourse sign can be utilized for discourse acknowledgment, speaker acknowledgment or voice order acknowledgment framework. Speaker acknowledgment is utilized for some, discourse preparing applications particularly security and confirmation PROJECT AIMS AND OBJECTIVES A.Project Aim A total face and voice acknowledgment framework incorporates face discovery voice acknowledgment face pre-preparing and face acknowledgment forms. RESEARCH ARTICLE OPEN ACCESS
  • 2.
    International Journal ofScientific Research and Engineering Development-– Volume 2 Issue 5, September 2019 Available at www.ijsred.com ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 467 A total face and voice acknowledgment framework incorporates face identification voice acknowledgment face pre-handling and face acknowledgment forms. In this way it is important to draw out the face area from the face discovery procedure and autonomous the face from the foundation design which gives the premise to the consequent extraction of the face distinction highlights. The ongoing ascent of the face dependent on the profundity of learning identification techniques and voice recognize for the authentication. The late ascent of the face dependent on the profundity of learning location strategies and voice acknowledgment for the validation. 1.Objectives To detect the face and voice recognition for recognize purpose to identify the authorized user. 2.Scope High exactness face acknowledgment models have been accounted for in logical examines by monster innovation organizations and research foundations. II. LITERATURE SURVEY Machine Learning applied to speaker verification of fake voice recordings. For measurable applications, speaker confirmation comprises of assessing regardless of whether the voice of a presume matches the proof sound chronicle. An answer in light of AI for speaker confirmation of sounds with phony inflection. Neural network based speaker classification and verification systems with enhanced features. A neural system structure for content ward speaker characterization what's more, confirmation .It incorporates speaker discovery, i.e, location if there is speaker in the sound, speaker recognizable proof, i.e, distinguishing whose voice it is, speaker check or then again verification i.e, checking someone’s voice. A survey on speaker on speaker recognition with various features extraction and classification techniques. Speech processing is more important day by day for providing immense security. Verification purpose speech is widely used. Speaker recognition is a process which can verify the person who is speaking. The main aim is to identify the speaker by extraction , characterization and recognition of information contained in speech signal. I. Proposed System When comparing the differences between different biostatistics, we can see that the cost of face detection is low, the acceptance from user is easy, and the acquisition of information is easy. Face detection is the use of computer vision applications and related algorithms, from the pictures to and faces, voice and then analysis of the identity .In addition, future analysis of the acquired face, voice may conduct some additional attributes of the individual. The proposed system outcome is to recognise authorized person for using the face and voice recognition. Today both biometrics voice and face recognition software options are being implemented in both private and professional premises. Since they both have so many benefits to offer, these biometrics software technologies are often used together or independently to provide better security. Biometric security devices are becoming more popular because of the sound technology and high accuracy they have to offer 1. Voice Recognition. The principle target of speaker acknowledgment is to change over the acoustic sound sign into PC dependable structure. Speaker acknowledgment frameworks include two stages, for example, preparing and testing. In preparing procedure accept the contribution as discourse sign and highlight extraction is finished utilizing highlight extraction strategy. Highlight vectors speaking to the voice qualities of the speaker and are utilized for structure reference model. Real acknowledgment undertaking is in trying stage. In testing stage speaker voice is
  • 3.
    International Journal ofScientific Research and Engineering Development-– Volume 2 Issue 5, September 2019 Available at www.ijsred.com ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 468 coordinating with reference model utilizing some coordinating procedure. After degree of coordinating choice is finished. 2. Face Recognition. A facial acknowledgment framework is an innovation fit for distinguishing or confirming an individual from an advanced picture or a video outline from a video source. There are different strategies in which facial acknowledgment frameworks work, yet as a rule, they work by looking at chosen facial highlights from given picture with countenances inside a database. It is additionally depicted as a Biometric Artificial Intelligence based application that can interestingly recognize an individual by examining examples dependent on the individual's facial surfaces and shape. Proposed System Biometric distinguishing proof is an obligatory apparatus to verify advanced data for different mechanical, government, business, and security applications. Face acknowledgment is an unmistakable issue and comes up short on an interesting arrangement relevant to all circumstances. Face acknowledgment isn't successful in distinguishing people in conditions, when an individual is utilizing glasses, caps or has a facial hair and so forth. Elective innovations like Iris and retinal sweep biometric procedures need modern gear, which isn't monetarily reasonable for all applications. Voice acknowledgment techniques have low precision and are influenced by circumstances where an adjustment in an individual's voice because of sickness like virus render outright recognizable proof off base. This paper proposes a biometric strategy actualizing numerous strategies i.e., both face and voice acknowledgment method as a successful ID device. 4.Algorithm Used In Proposed System. A] K-Nearest Neighbors In example acknowledgment, the k-closest neighbors calculation (k-NN) is a non-parametric strategy utilized for grouping and relapse. In the two cases, the info comprises of the k nearest preparing models in the element space. The yield relies upon whether k-NN is utilized for order or relapse: Both for order and relapse, a valuable system can be to dole out loads to the commitments of the neighbors, so that the closer neighbors contribute more to the normal than the more inaccessible ones. For instance, a typical weighting plan comprises in giving each neighbor a load of 1/d, where d is the separation to the neighbor.K-NN is a sort of occurrence based learning, or sluggish realizing, where the capacity is just approximated locally and all calculation is conceded until grouping. B] Convolutional Neural Network CNNs are regularized variants of multilayer perceptrons. Multilayer perceptrons typically mean completely associated systems, that is, every neuron in one layer is associated with all neurons in the following layer. The "completely connectedness" of these systems makes them inclined to overfitting information. Common methods for regularization incorporate including some type of size estimation of loads to the misfortune work. Be that as it may, CNNs adopt an alternate strategy towards regularization: they exploit the various leveled design in information and gather increasingly complex examples utilizing littler and less difficult examples. Thusly, on the size of connectedness and unpredictability, CNNs are on the lower outrageous. They are otherwise called move invariant or space invariant counterfeit neural systems (SIANN), in light of their common loads design and interpretation invariance qualities. II. Application A] For security purpose. B] For Authentication purpose. C] To secure machine to machine networks. D] For identification and recognition of person.
  • 4.
    International Journal ofScientific Research and Engineering Development-– Volume 2 Issue 5, September 2019 Available at www.ijsred.com ISSN : 2581-7175 ©IJSRED: All Rights are Reserved Page 469 III. Conclusion The general procedure of changing over information in reasonable arrangement incorporates different steps, for example, making unmistakable portrayal of information anticipating future outcomes and so forth. To set up such a framework, a graphic language was structured, through which the client can depict examination task effectively. An information base was set up so that our framework could choose, and make prediction. Choice tree the most suitable calculation for anticipating results dependent on information. Various tale strategies were actualized to empower the programmed achievement of the entire information examination process. Intensive this means we comprehended the whole procedure of information analytics which incorporates anticipating the future dangers and results on the present information. VI.REFERENCES [1]Das, T.; Nahar, K.M. A voice identifcation system using hidden Markov model. Indian J. Sci. Technol. 2016, 9, 4. [2] Makary, M.A.; Daniel, M. Medical error, The third leading cause of death in the US. BMJ 2016, 353. [3] Damacharla, P.; Dhakal, P.; Stumbo, S.; Javaid, A.Y.; Ganapathy, S.; Malek, D.A.; Hodge, D.C.; Devabhaktuni, V.E_ects of voice-based synthetic assistant on performance of emergency care provider in training. Int. J. Artif. Intell. Educ. 2018. [4] Damacharla, P.; Javaid, A.Y.; Gallimore, J.J.; Devabhaktuni, V.K. Common metrics to benchmarkhuman-machine teams (HMT): A review. IEEE Access 2018,6, 38637{38655. [5] Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.R.; Jaitly, N.; Senior, A.;Vanhoucke, V.; Nguyen, P.; Kingsbury, B.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEESignal. Process. Mag. 2012, 29, 82{97}. [6] Cutajar, M.; Gatt, E.; Grech, I.; Casha, O.; Micallef, J. Comparative study of automatic speech recognition techniques. IET Signal. Process. 2013, 7, 25{46. [7] Fernndez-Delgado, M.; Cernadas, E.; Barro, S.; Amorim, D. Do we need hundreds of classifiers to solve real-world classification problems. J. Mach. Learn. Res. 2014, 15, 3133{3181. [8] Weinberg, M.; Alipanahi, B.; Frey, B.J. Are random forests truly the best classifiers J. Machine Learning. Res. 2016, 17, 3837{3841.