VIVEKANAND EDUCATION SOCIETY’S INSTITUTE OF
TECHNOLOGY
Department of Computer Engineering
Project Report on
Resume and CV Summarization using NLP
In partial fulfillment of the Final Year, Bachelor of Engineering (B.E.) Degree in
Computer Engineering at the University of Mumbai Academic Year 2020-2021.
Submitted by
Anjali Asrani D17B 05
Sneha Indulkar D17B 23
Kaif Siddique D17B 63
Project Mentor
Mrs. Vidya Zope
(2020-2021)
Certificate
This is to certify that Anjali Asrani, Sneha Indulkar and Kaif Siddique of Final
Year Computer Engineering studying under the University of Mumbai have
satisfactorily completed the mini project on “Resume and CV Summarization
using NLP” as a part of their coursework of Mini Project for Semester-VIII under
the guidance of their mentor Mrs. Vidya Zope in the year 2020-2021.
This mini project report entitled Resume and CV Summarization using NLP by
Anjali Asrani, Sneha Indulkar and Kaif Siddique is approved for the degree of
____________.
Programme Outcomes Grade
PO1,PO2,PO3,PO4,PO5,PO6,PO7,
PO8, PO9, PO10, PO11, PO12
PSO1, PSO2
Date:
Project Guide : Mrs. Vidya Zope
MINI PROJECT REPORT APPROVAL FOR B. E
(COMPUTER ENGINEERING)
This mini project report entitled Resume and CV Summarization using NLP by
Anjali Asrani, Sneha Indulkar and Kaif Siddique is approved for the degree of
B.E Computer Engg.
Internal Examiner
---------------------------------------------
External Examiner
---------------------------------------------
Head of the Department
-----------------------------------------------
Principal
-----------------------------------------------
Date:
Place:
Declaration
We declare that this written submission represents our ideas in our own words and
where others' ideas or words have been included, we have adequately cited and
referenced the original sources. We also declare that we have adhered to all
principles of academic honesty and integrity and have not misrepresented or
fabricated or falsified any idea/data/fact/source in our submission. We understand
that any violation of the above will be cause for disciplinary action by the Institute
and can also evoke penal action from the sources which have thus not been
properly cited or from whom proper permission has not been taken when needed.
______________________________ ______________________________
Anjali Asrani 05 Sneha Indulkar 23
______________________________
Kaif Siddique 63
Date :
ACKNOWLEDGEMENT
We are thankful to our college, Vivekanand Education Society’s Institute of Technology,
for considering our project and extending help at every stage of our work of
collecting information regarding the project.
It gives us immense pleasure to express our deep and sincere gratitude to Assistant
Professor Mrs. Vidya Zope (Project Guide) for her kind help and valuable advice during the
development of the project synopsis and for her guidance and suggestions.
We are deeply indebted to the Head of the Computer Department, Dr. (Mrs.) Nupur Giri,
and our Principal, Dr. (Mrs.) J.M. Nair, for giving us this valuable opportunity to do this
project.
We express our hearty thanks to them for their assistance, without which it would have
been difficult to finish this project synopsis and project review successfully.
We convey our deep sense of gratitude to all teaching and non-teaching staff for their
constant encouragement, support and selfless help throughout the project work. It is a great
pleasure to acknowledge the help and suggestions which we received from the Department of
Computer Engineering.
We wish to express our profound thanks to all those who helped us in gathering
information about the project. Our families too have provided moral support and
encouragement throughout.
Computer Engineering Department
COURSE OUTCOMES FOR B.E PROJECT
Learners will be able to:
Course Outcome Description of the Course Outcome
CO 1 Do literature survey/industrial visit and identify the
problem of the selected project topic.
CO 2 Apply basic engineering fundamentals in the domain of
practical applications for problem identification,
formulation and solution.
CO 3 Attempt and design a problem solution with the right approach
to complex problems.
CO 4 Cultivate the habit of working in a team.
CO 5 Correlate the theoretical and experimental/simulation
results and draw the proper inferences.
CO 6 Demonstrate the knowledge, skills and attitudes of a
professional engineer and prepare a report as per the standard
guidelines.
Abstract
This project proposes a model that extracts important information from the semi-structured
text of a curriculum vitae (CV) or resume and ranks it according to the preferences and
requirements of the hiring company. To achieve this, the process is divided into three
segments. The first segments the CV/Resume by the topic of each part, the second extracts
structured data from the unstructured text, and the third evaluates the structured data with a
decision tree algorithm and trains the system. Structured data extraction begins by converting
the CV/Resume to text and segmenting it. Once the data is in structured form, decision tree
techniques classify the input into categories based on qualifications, experience, etc.
CHAPTER 1
INTRODUCTION
____________________________________________________________________________
1.1 Introduction to the project
After completing education, the next phase in a person’s life is usually a job, although
many people start working before completing their formal education. When searching for a job,
the most important document representing an applicant is the Curriculum Vitae (CV) or Resume.
In this era of technology, job searching has become both smarter and easier. However, there are
usually far more applicants than needed for a single job, and it is difficult for an employer to
select candidates based only on their CVs/Resumes. To address this, some companies provide
specific formats for their applicants to make the process a little easier. Even then, the process
remains tedious and, in most cases, error-prone.
Every organization has to deal with folders full of resumes. Going through these
resumes is tiring and very time consuming. It would help greatly to have a model that processes
the resumes and not only reports how many and which of them meet the requirements, but also
produces a summarized version of each resume. These concise resumes can be of great use to a
hiring company in its selection procedure, allowing it to reject straight away those applications
that are not suitable for the job description.
1.2 Problem Definition
Large companies and recruitment agencies receive, process and manage hundreds of
resumes from job applicants. Besides, many people publish their resumes on the web. Dealing with
so many resumes at once is time consuming, since all a company or agency needs is the handful of
valuable resumes that represent candidates specialized in the fields it is looking for. The resumes
arrive in their original formats (pdf, docm etc.) and are unstructured. Their unstructured format,
with arbitrary templates and fonts, makes them difficult to process. These resumes
can be automatically retrieved and processed by a resume information extraction system. Extracted
information such as name, phone/mobile number, e-mail id, qualification, experience, skill sets etc.
can be stored as structured information in a database.
1.3 Scope of project
Although this work details one of the most feasible ways to evaluate a CV/Resume, the
domain was restricted to the CVs/Resumes of engineering students only, and the amount of sample
data versus test data was relatively small. In addition, CVs/Resumes with widely varied layout
designs are out of the scope of this report. In future, the same methodology can be applied to
CVs/Resumes from other job domains, or the whole study can be repeated on a much larger scale.
1.4 CV / Resume Analyzing Process
In the past, CVs/Resumes submitted by job seekers were manually analyzed and
judged by employers, and this method is still followed today. However, as big
companies often need to deal with hundreds of CVs/Resumes every day, handling them one by one
has become very problematic and time consuming.
As a result, many companies started to provide specific formats or forms that job seekers
fill in with the required information, so that the CV/Resume can be analyzed by machine
using simple pattern recognition and keyword searching. While this method reduces the workload
for employers, it significantly increases the work for applicants, who need to
maintain a different format for each job they apply for. It also tends to reduce
creativity and flexibility in presenting skills and qualifications in a CV/Resume.
1.5 Natural Language Processing Approach
With all these pros and cons in mind, there has always been an attempt to find an automated
method that offers the best of both worlds, where employers can select qualified candidates
in a short time and applicants can demonstrate their creativity while maintaining
just one format for applying to different organizations. Innovation in the field of Natural Language
Processing [4] along with Machine Learning [5] has been really helpful here. The ability to
understand unstructured written language and extract important information from it to teach the
machine is exactly what is needed to analyze written documents such as resumes the way a
human being would.
1.6 Technologies to be used
1. Software requirement :
i. Jupyter Notebook
ii. Google Colab
2. Technology used
i. NLP
ii. Python
iii. spaCy
CHAPTER 2
LITERATURE SURVEY
______________________________________________________________________________
2.1 Research Paper Referred
● Satyaki Sanyal et al. [1] parse information from a resume using natural language
processing, find the keywords, cluster the resumes into sectors based on those keywords, and
finally show the most relevant resumes to the employer based on keyword matching. First, the
user uploads a resume to the web platform. The parser extracts all the necessary information
from the resume and auto-fills a form for the user to proofread. Once the user confirms, the
resume is saved into their NoSQL database, ready to be shown to employers. The user also
gets the resume in both JSON and PDF formats.
● Dr. K. Satheesh, A. Jahnavi et al. [2] note that screening resumes in bulk is a
challenging task and that recruiters or hiring managers waste a lot of valuable time
searching through each and every resume. Resumes are often populated with irrelevant and
unnecessary information. Parsing thousands of resumes manually therefore consumes a lot of
time and energy, which makes the hiring process expensive. In this paper the process of
screening resumes is automated using advanced Natural Language Processing techniques. Their
model helps recruiters screen resumes against a job description quickly. It makes the
hiring process easy and efficient by automatically extracting the required entities from the
resumes with the spaCy NER model and then generating a graph displaying the score of
each resume. Based on the scores, the recruiter can choose the required candidates
without rummaging through piles of resumes from unqualified candidates.
CHAPTER 3
CONCEPTUAL SYSTEM DESIGN
______________________________________________________________________________
3.1 System diagram
Resumes arrive in their original formats (pdf, docm etc.) and are unstructured.
Their unstructured format, with arbitrary templates and fonts, makes them difficult to
process. These resumes can be automatically retrieved and processed by a resume information
extraction system. Extracted information such as name, phone/mobile number, e-mail id,
qualification, experience, skill sets etc. can be stored as structured information in a database.
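As a rough illustration of turning free text into a structured record, the sketch below pulls an e-mail id and a phone number out of a line of text with regular expressions. The patterns, the function name extract_contact and the sample line are assumptions made for this example and are not taken from the project code.

```python
import re

# Hypothetical patterns for illustration; real resumes need broader rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")


def extract_contact(text):
    """Return a small structured record with the first e-mail and phone found."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


# Made-up sample line, not a real applicant.
sample = "Jane Doe, 12 Example Road. Mobile: +1 555 010 9999 E-mail: jane.doe@example.com"
record = extract_contact(sample)
print(record)
```

A record like this could then be written to a database row, with one column per extracted field.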
Considered sample resume or CV
Segment 1 : (Name and details)
Md. Sakib Zaman Flat-A3, 127 West Kafrul, Agargaon, Taltola, Dhaka 1207 Mobile:
+8801912397694 E-mail: sakib2033@gmail.com
Segment 2 : (Working experience)
Employment Status Currently working as a Student Tutor/Teaching Assistant at
Department of Computer Science & Engineering, BRAC University from January 2017 Currently
working as a Student Trainer at Competitive Programming Training Session organized by
Department of Computer Science & Engineering, BRAC University and BRAC University ACM
Students Chapter from August 2016 Currently working as a Student Mentor at First Year Advising
Team, BRAC University Former Intern Software Engineer at Projukti Next from 2 nd May 2017 to
31st May 2017
Segment 3 : (Educational qualifications)
Educational Qualification Final year student of Computer Science and Engineering, BRAC
University, Dhaka CGPA: 3.74 in scale of 4.0 (till April, 2017) H.S.C. (2013) from Notre Dame
College, Dhaka GPA: 5.0 in scale of 5.0 S.S.C (2011) from Sher-E-Bangla Nagar Govt. Boys High
School, Dhaka GPA: 5.0 in scale of 5.0
Segment 4 : (Technical skills)
Field dependent Technical Skills Programming Languages: Java, C, C++, C# Operating
Systems: Windows, Linux Database System: MySQL Web: HTML5, CSS3
Segment 5 : (Awards & achievements)
Achievements in Competitions and Programming 1st Runner-up in BRAC
University Intra University Programming Contest,
Segment 6 : (Projects)
Field dependent: Projects Currently working on an Online File Server System for
Educational Institutions Library Management System for Data Structure course Hospital
Management System for Data Structure course Cineplex Management System for Database course
Dhaka City Management System for Software Engineering course
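The segmentation shown above can be approximated by looking for short heading lines and grouping the following text under them. This is only a minimal sketch: the keyword table, the six-word heading limit and the function names are assumptions for illustration, and the report does not state how the segments were actually detected.

```python
# Hypothetical heading keywords; a real system would need a larger list.
SECTION_KEYWORDS = {
    "experience": ("employment", "experience"),
    "education": ("education", "qualification"),
    "skills": ("skills",),
    "awards": ("achievements", "awards"),
    "projects": ("projects",),
}


def detect_section(line):
    """Return the section a short heading line announces, or None."""
    lowered = line.lower()
    if len(lowered.split()) > 6:  # long lines are treated as body text
        return None
    for section, keywords in SECTION_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return section
    return None


def segment_resume(text):
    """Group resume lines under the most recent heading seen."""
    segments = {"details": []}  # text before the first heading
    current = "details"
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        section = detect_section(line)
        if section is not None:
            current = section
            segments.setdefault(current, [])
        else:
            segments[current].append(line)
    return segments


# Made-up resume text with one heading per line.
resume_text = """Jane Doe
Mobile: +1 555 010 9999
Employment Status
Teaching Assistant, Example University, from January 2017
Educational Qualification
Final year student of Computer Science and Engineering
Technical Skills
Java, C, C++
"""
segments = segment_resume(resume_text)
print(segments)
```

The sketch relies on headings sitting on their own lines, which is one reason resumes with unusual layouts are harder to handle.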
3.2 Flowchart
We use Python’s spaCy module [3] to train the NER model. spaCy’s models are statistical, and
every “decision” they make (for example, which part-of-speech tag to assign, or whether a word
is a named entity) is a prediction. This prediction is based on the examples the model has seen
during training.
During training, the model is shown the unlabelled text and makes a prediction. Because we
know the correct answer, we can give the model feedback on its prediction in the form of an error
gradient of the loss function, which measures the difference between the model's prediction and
the expected output in the training example. The greater the difference, the larger the gradient
and the updates to our model.
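The feedback step can be pictured with a toy example. The sketch below is not spaCy's internal code: it assumes a simple softmax classifier over three hypothetical entity labels and a cross-entropy loss, for which the gradient with respect to the scores is the predicted probability minus the expected one. It only shows that a worse prediction produces a larger gradient.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def error_gradient(scores, gold_index):
    """Gradient of the cross-entropy loss w.r.t. the scores: predicted minus expected."""
    probs = softmax(scores)
    return [p - (1.0 if i == gold_index else 0.0) for i, p in enumerate(probs)]


def update(scores, gold_index, learning_rate=0.5):
    """One gradient-descent step applied directly to the scores."""
    grad = error_gradient(scores, gold_index)
    return [s - learning_rate * g for s, g in zip(scores, grad)]


labels = ["O", "SKILL", "DEGREE"]  # hypothetical entity labels
gold = labels.index("SKILL")

confident_wrong = [3.0, 0.0, 0.0]  # model strongly prefers the wrong label
nearly_right = [0.0, 3.0, 0.0]     # model already prefers the gold label

big = error_gradient(confident_wrong, gold)
small = error_gradient(nearly_right, gold)
print(abs(big[gold]), abs(small[gold]))
```

In a real model the same gradient is propagated back to the network weights rather than applied to the scores themselves.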
CHAPTER 4
IMPLEMENTATION AND EVALUATION
____________________________________________________________________________
4.1 Resume summarization using NER:
Data Preparation:
Our first task is to create manually annotated training data to train the model. For this we
use an online annotation tool called Dataturks, which automatically parses the documents and
allows us to annotate the required entities.
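Annotations exported from such a tool usually have to be converted into the (text, entities) pairs commonly used for spaCy NER training. The sketch below assumes the export is one JSON object per line with a "content" field and an "annotation" list whose items hold "label" and "points" (with an inclusive "end" offset). That layout, the sample record and the function name convert_record are assumptions and should be checked against the actual export.

```python
import json


def convert_record(line):
    """Convert one annotated JSON line into a (text, {"entities": [...]}) pair."""
    data = json.loads(line)
    text = data["content"]
    entities = []
    for annotation in data.get("annotation") or []:
        labels = annotation["label"]
        if not isinstance(labels, list):
            labels = [labels]
        for point in annotation["points"]:
            for label in labels:
                # Assumed: "end" is inclusive in the export, so add 1 to get
                # the exclusive end offset that Python slicing uses.
                entities.append((point["start"], point["end"] + 1, label))
    return text, {"entities": entities}


# Made-up record in the assumed export layout.
sample_line = json.dumps({
    "content": "Jane Doe knows Python",
    "annotation": [
        {"label": ["Name"], "points": [{"start": 0, "end": 7, "text": "Jane Doe"}]},
        {"label": ["Skills"], "points": [{"start": 15, "end": 20, "text": "Python"}]},
    ],
})
text, annotations = convert_record(sample_line)
print(text, annotations)
```

Slicing the text with each (start, end) pair is a quick way to confirm that the offsets line up with the annotated words.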
PDF to Text:
Our aim for this project is to build an end-to-end tool which takes in a document
and gives out the expected result, in this case the category and the summary. Since the majority
of resumes are submitted in PDF format, we added a preprocessing step that converts PDF to
text. We used pdfminer under Python for this task. Note that pdfminer extracts the text embedded
in a PDF rather than performing Optical Character Recognition, so scanned, image-only resumes
would need a separate OCR step.
Data Cleaning :
1. Unnecessary separators: Many resumes contained separators such as a string of ’-’
characters, which were removed [3].
2. Punctuation and stop words: Punctuation and stop words did not add any value
to the analysis, so they were removed.
3. Erroneous formatting: Some records had highly erroneous formatting that got in the
way of cleaning and analysis. These records were dropped.
4. Personal details: Details such as email ids, phone numbers and dates add noise and
hardly any value to the classification, so they were removed as well.
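The cleaning steps above can be sketched with the standard library alone. The stop-word list, the regular expressions and the function name clean_resume_text are assumptions for illustration; dates and badly formatted records are not handled here.

```python
import re
import string

# A deliberately tiny, hypothetical stop-word list for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at", "to", "for", "is"}

EMAIL_RE = re.compile(r"\S+@\S+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{8,}\d")
SEPARATOR_RE = re.compile(r"[-_=*]{3,}")

# Translation table that maps every punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_resume_text(text):
    """Apply the cleaning steps in order and return a normalised string."""
    text = EMAIL_RE.sub(" ", text)      # personal details: email ids
    text = PHONE_RE.sub(" ", text)      # personal details: phone numbers
    text = SEPARATOR_RE.sub(" ", text)  # separators such as "------"
    text = text.translate(PUNCT_TABLE)  # punctuation
    words = [w for w in text.lower().split() if w not in STOP_WORDS]
    return " ".join(words)


raw = "Skills ------ Python, Java and SQL. Contact: jane.doe@example.com, +1 555 010 9999"
cleaned = clean_resume_text(raw)
print(cleaned)
```

One side effect worth knowing: stripping all punctuation this way also reduces skill names such as "C++" and "C#" to "c", so skills may need to be protected before this step.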
Class Labels and Indexing : For ease of further processing, the class labels were mapped
to numeric constants [3]. This made it easier for the model to predict the class.
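Mapping labels to numeric constants can be as simple as the sketch below. The label names are hypothetical, since the report does not list the actual categories.

```python
# Hypothetical class labels; the report does not list the actual categories.
class_labels = ["Education", "Experience", "Skills", "Education", "Skills"]

# Sort the unique labels so the mapping is stable across runs.
label_to_index = {label: i for i, label in enumerate(sorted(set(class_labels)))}
index_to_label = {i: label for label, i in label_to_index.items()}

encoded = [label_to_index[label] for label in class_labels]
decoded = [index_to_label[i] for i in encoded]
print(encoded)
```

Keeping the reverse mapping around lets numeric predictions be turned back into readable category names.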
4.2 Results and Evaluation:
Figure: Train data set output
Figure: Non-train resume output
CHAPTER 6
CONCLUSION AND FUTURE SCOPE
______________________________________________________________________________
6.1 Conclusion
Our application helps recruiters screen resumes more efficiently, thereby reducing
the cost of hiring. It helps the organization find suitable candidates, and helps candidates
get placed in an organization that appreciates their skill set and ability.
6.2 Future Scope
The application can be extended to other domains such as Telecom, Healthcare,
E-commerce and public sector jobs.
CHAPTER 7
REFERENCES
______________________________________________________________________________
● [1] Resume Parser with Natural Language Processing, International Journal of
Engineering Science and Computing, February 2017.
https://www.researchgate.net/publication/313851778_Resume_Parser_with_Natural_Language_Processing
● [2] Resume Ranking based on Job Description using SpaCy NER model, International
Research Journal of Engineering and Technology (IRJET), Volume 07, Issue 05, May 2020,
e-ISSN: 2395-0056, p-ISSN: 2395-0072.
https://www.irjet.net/archives/V7/i5/IRJET-V7I516.pdf
● [3] Article: A Review of Named Entity Recognition (NER) Using Automatic
Summarization of Resumes.
https://towardsdatascience.com/a-review-of-named-entity-recognition-ner-using-automatic-summarization-of-resumes-5248a75de175
Arduino based vehicle speed tracker projectRased Khan
 
Top 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering ScientistTop 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering Scientistgettygaming1
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdfAhmedHussein950959
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-IVigneshvaranMech
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdfKamal Acharya
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxMd. Shahidul Islam Prodhan
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxViniHema
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdfKamal Acharya
 
LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptssuser9bd3ba
 

Recently uploaded (20)

WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
 
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical EngineeringIntroduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-5 Notes for II-II Mechanical Engineering
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
weather web application report.pdf
weather web application report.pdfweather web application report.pdf
weather web application report.pdf
 
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical EngineeringIntroduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
Introduction to Machine Learning Unit-4 Notes for II-II Mechanical Engineering
 
Laundry management system project report.pdf
Laundry management system project report.pdfLaundry management system project report.pdf
Laundry management system project report.pdf
 
Introduction to Casting Processes in Manufacturing
Introduction to Casting Processes in ManufacturingIntroduction to Casting Processes in Manufacturing
Introduction to Casting Processes in Manufacturing
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
shape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptxshape functions of 1D and 2 D rectangular elements.pptx
shape functions of 1D and 2 D rectangular elements.pptx
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Arduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectArduino based vehicle speed tracker project
Arduino based vehicle speed tracker project
 
Top 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering ScientistTop 13 Famous Civil Engineering Scientist
Top 13 Famous Civil Engineering Scientist
 
ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
 
Courier management system project report.pdf
Courier management system project report.pdfCourier management system project report.pdf
Courier management system project report.pdf
 
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptxCloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
Cloud-Computing_CSE311_Computer-Networking CSE GUB BD - Shahidul.pptx
 
power quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptxpower quality voltage fluctuation UNIT - I.pptx
power quality voltage fluctuation UNIT - I.pptx
 
Automobile Management System Project Report.pdf
Automobile Management System Project Report.pdfAutomobile Management System Project Report.pdf
Automobile Management System Project Report.pdf
 
LIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.pptLIGA(E)11111111111111111111111111111111111111111.ppt
LIGA(E)11111111111111111111111111111111111111111.ppt
 

Resume and CV Summarization using NLP Report

  • 1. VIVEKANAND EDUCATION SOCIETY’S INSTITUTE OF TECHNOLOGY Department of Computer Engineering Project Report on Resume and CV Summarization using NLP In partial fulfillment of the Final Year, Bachelor of Engineering (B.E.) Degree in Computer Engineering at the University of Mumbai Academic Year 2020-2021. Submitted by Anjali Asrani D17B 05, Sneha Indulkar D17B 23, Kaif Siddique D17B 63. Project Mentor Mrs. Vidya Zope (2020-2021)
  • 2. VIVEKANAND EDUCATION SOCIETY’S INSTITUTE OF TECHNOLOGY Department of Computer Engineering Certificate This is to certify that Anjali Asrani, Sneha Indulkar and Kaif Siddique of Final Year Computer Engineering studying under the University of Mumbai have satisfactorily completed the mini project on “Resume and CV Summarization using NLP” as a part of their coursework of Mini Project for Semester-VIII under the guidance of their mentor Mrs. Vidya Zope in the year 2020-2021. This mini project report entitled “Resume and CV Summarization using NLP” by Anjali Asrani, Sneha Indulkar and Kaif Siddique is approved for the degree of ____________. Programme Outcomes: PO1, PO2, PO3, PO4, PO5, PO6, PO7, PO8, PO9, PO10, PO11, PO12, PSO1, PSO2. Grade: Date: Project Guide: Mrs. Vidya Zope
  • 3. MINI PROJECT REPORT APPROVAL FOR B.E. (COMPUTER ENGINEERING) This mini project report entitled Resume and CV Summarization using NLP by Anjali Asrani, Sneha Indulkar and Kaif Siddique is approved for the degree of B.E. Computer Engg. Internal Examiner: External Examiner: Head of the Department: Principal: Date: Place:
  • 4. Declaration We declare that this written submission represents our ideas in our own words and where others' ideas or words have been included, we have adequately cited and referenced the original sources. We also declare that we have adhered to all principles of academic honesty and integrity and have not misrepresented, fabricated or falsified any idea/data/fact/source in our submission. We understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed. Anjali Asrani 05, Sneha Indulkar 23, Kaif Siddique 63. Date:
  • 5. ACKNOWLEDGEMENT We are thankful to our college, Vivekanand Education Society’s Institute of Technology, for considering our project and extending help at every stage of our work of collecting information for the project. It gives us immense pleasure to express our deep and sincere gratitude to Assistant Professor Mrs. Vidya Zope (Project Guide) for her kind help and valuable advice during the development of the project synopsis and for her guidance and suggestions. We are deeply indebted to the Head of the Computer Department, Dr. (Mrs.) Nupur Giri, and our Principal, Dr. (Mrs.) J.M. Nair, for giving us this valuable opportunity to do this project. We express our hearty thanks to them for their assistance, without which it would have been difficult to finish this project synopsis and project review successfully. We convey our deep sense of gratitude to all teaching and non-teaching staff for their constant encouragement, support and selfless help throughout the project work. It is a great pleasure to acknowledge the help and suggestions which we received from the Department of Computer Engineering. We wish to express our profound thanks to all those who helped us in gathering information about the project. Our families too have provided moral support and encouragement on several occasions.
  • 6. Computer Engineering Department COURSE OUTCOMES FOR B.E. PROJECT Learners will be able to: CO 1: Do a literature survey/industrial visit and identify the problem of the selected project topic. CO 2: Apply basic engineering fundamentals in the domain of practical applications for problem identification, formulation and solution. CO 3: Attempt and design a problem solution with the right approach to complex problems. CO 4: Cultivate the habit of working in a team. CO 5: Correlate the theoretical and experimental/simulation results and draw the proper inferences. CO 6: Demonstrate the knowledge, skills and attitudes of a professional engineer and prepare a report as per the standard guidelines.
  • 7. Abstract This project proposes a model for extracting important information from the semi-structured text of a curriculum vitae or resume and ranking it according to the preferences and requirements of the associated company. To achieve this goal, the process has been divided into three basic stages. The first stage segments the CV/resume based on the topic of each part, the second stage extracts structured data from the unstructured text, and the final stage evaluates the structured data with a decision tree algorithm and trains the system. The structured data extraction is done by converting the CV/resume to text and segmenting it. After the conversion to structured data, decision tree techniques are used to classify the input into different categories based on qualifications, experience etc.
  • 8. CHAPTER 1 INTRODUCTION 1.1 Introduction to the project After completing their education, most people move on to a job, although many start working before finishing their formal education. When searching for a job, the most important document representing an applicant is the curriculum vitae (CV) or resume. Technology has made job searching both smarter and easier. However, a single job often attracts a large number of applicants, and it is hard for an employer to select candidates based only on their CVs/resumes. To ease this problem, some companies provide specific formats for their applicants. Even then, the process remains tedious and, in most cases, error-prone. Every organization has to deal with folders full of resumes, and going through them is tiring and very time consuming. It would help greatly to have a model that processes the resumes and reports not only how many and which of them meet the requirements, but also gives a summarized version of each resume. These concise resumes can be of great use to a hiring company in its selection procedure, allowing it to reject straight away those applications that are not suitable for the job description. 1.2 Problem Definition Large companies and recruitment agencies receive, process and manage hundreds of resumes from job applicants. Besides, many people publish their resumes on the web. Dealing with so many resumes at once is time consuming, since all the company or agency needs is the set of resumes from candidates specialized in the fields it is looking for. The resumes that one receives in their original format (pdf, docm etc.) are unstructured.
The unstructured format of these resumes, with random templates and fonts, makes them difficult to process. These resumes can be automatically retrieved and processed by a resume information extraction system. Extracted information such as name, phone/mobile numbers, e-mail IDs, qualifications, experience, skill sets etc. can be stored as structured information in a database.
  • 9. 1.3 Scope of project Although this work details one of the most feasible ways to evaluate a CV/resume, the domain was restricted to the CVs/resumes of engineering students, and the amount of sample data relative to the amount of test data was small. In addition, CVs/resumes with widely varied layout designs are outside the scope of this report. In future, the methodology can be applied to CVs/resumes from other job sectors, or the whole study can be repeated on a much larger scale. 1.4 CV / Resume Analyzing Process In the past, CVs/resumes submitted by job seekers were analyzed and judged manually by employers, and this method is still followed today. However, as big companies often need to deal with hundreds of CVs/resumes every day, handling them one by one has become problematic and time consuming. As a result, many companies started to provide specific formats or forms which job seekers fill in with the required information, after which the CV/resume is analyzed by machine using simple pattern recognition and keyword searching. While this method reduced the workload for employers, it significantly increased the work for applicants, who need to maintain a different format for each job they apply for. It also tends to reduce the creativity and flexibility with which skills and qualifications can be presented in a CV/resume.
1.5 Natural Language Processing Approach With all the pros and cons in mind, there has long been an effort to find an automated method that offers the best of both worlds, where employers can easily select qualified candidates in a short time and applicants can demonstrate their creativity while maintaining just one format to apply to different organizations. Innovation in the field of Natural Language Processing [4], along with Machine Learning [5], has been very helpful here. The ability to understand unstructured written language and extract important information from it to teach the machine is exactly what is needed to analyze written documents such as resumes the way a human being would.
  • 10. 1.6 Technologies to be used 1. Software requirements: i. Jupyter Notebook ii. Colab 2. Technologies used: i. NLP ii. Python iii. spaCy
  • 11. CHAPTER 2 LITERATURE SURVEY 2.1 Research Papers Referred ● Satyaki Sanyal et al. [1] parse information from a resume using natural language processing, find the keywords, cluster the resumes into sectors based on those keywords, and lastly show the most relevant resumes to the employer based on keyword matching. First, the user uploads a resume to the web platform. The parser extracts all the necessary information from the resume and auto-fills a form for the user to proofread. Once the user confirms, the resume is saved into a NoSQL database, ready to be shown to employers. The user also gets their resume in both JSON and PDF format. ● Dr. K. Satheesh, A. Jahnavi et al. [2] observe that screening resumes in bulk is a challenging task and that recruiters or hiring managers waste a lot of valuable time searching through every resume. Resumes are often populated with irrelevant and unnecessary information, so parsing thousands of resumes manually consumes a lot of time and energy and makes the hiring process expensive. In that paper, the process of screening resumes is automated using Natural Language Processing, a field related to Machine Learning. Their model helps recruiters screen resumes against a job description quickly. It makes the hiring process easier and more efficient by extracting the required entities from the resumes automatically with the spaCy NER model and then generating a graph displaying the score of each resume. Based on the scores, the recruiter can choose the required candidates without rummaging through piles of resumes from unqualified candidates.
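The keyword matching and scoring idea described in the surveyed papers can be illustrated with a minimal sketch. This is not the implementation from [1] or [2]; the function names, the tokenizer and the scoring rule (fraction of job-description keywords found in the resume) are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of word-like tokens."""
    return set(re.findall(r"[a-z][a-z0-9+#]*", text.lower()))

def score_resume(resume_text, job_description):
    """Fraction of job-description keywords that also occur in the resume."""
    wanted = tokenize(job_description)
    if not wanted:
        return 0.0
    return len(wanted & tokenize(resume_text)) / len(wanted)

def rank_resumes(resumes, job_description):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [(name, score_resume(text, job_description))
              for name, text in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical resumes and job description, for illustration only.
resumes = {
    "candidate_a": "Java and MySQL developer with Linux experience",
    "candidate_b": "Graphic designer skilled in illustration",
}
ranking = rank_resumes(resumes, "java mysql linux")
```

A real system would weight keywords and use extracted entities rather than raw tokens; the sketch only shows how a score per resume lets a recruiter sort candidates.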
  • 12. CHAPTER 3 CONCEPTUAL SYSTEM DESIGN 3.1 System diagram The resumes that one receives in their original format (pdf, docm etc.) are unstructured. The unstructured format of these resumes, with random templates and fonts, makes them difficult to process. These resumes can be automatically retrieved and processed by a resume information extraction system. Extracted information such as name, phone/mobile numbers, e-mail IDs, qualifications, experience, skill sets etc. can be stored as structured information in a database. Sample resume or CV considered: Segment 1: (Name and details) Md. Sakib Zaman; Flat-A3, 127 West Kafrul, Agargaon, Taltola, Dhaka 1207; Mobile: +8801912397694; E-mail: sakib2033@gmail.com
  • 13. Segment 2: (Working experience) Employment Status: Currently working as a Student Tutor/Teaching Assistant at Department of Computer Science & Engineering, BRAC University from January 2017; Currently working as a Student Trainer at Competitive Programming Training Session organized by Department of Computer Science & Engineering, BRAC University and BRAC University ACM Students Chapter from August 2016; Currently working as a Student Mentor at First Year Advising Team, BRAC University; Former Intern Software Engineer at Projukti Next from 2nd May 2017 to 31st May 2017. Segment 3: (Educational qualifications) Educational Qualification: Final year student of Computer Science and Engineering, BRAC University, Dhaka, CGPA: 3.74 in scale of 4.0 (till April, 2017); H.S.C. (2013) from Notre Dame College, Dhaka, GPA: 5.0 in scale of 5.0; S.S.C. (2011) from Sher-E-Bangla Nagar Govt. Boys High School, Dhaka, GPA: 5.0 in scale of 5.0. Segment 4: (Technical skills) Field dependent Technical Skills: Programming Languages: Java, C, C++, C#; Operating Systems: Windows, Linux; Database System: MySQL; Web: HTML5, CSS3. Segment 5: (Awards & achievements) Achievements in Competitions and Programming: 1st Runner-up in BRAC University Intra University Programming Contest, Segment 6: (Projects) Field dependent: Projects: Currently working on an Online File Server System for Educational Institutions; Library Management System for Data Structure course; Hospital Management System for Data Structure course; Cineplex Management System for Database course; Dhaka City Management System for Software Engineering course
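The segmentation shown in the sample resume, together with the extraction of contact fields into structured form, can be sketched as follows. This is a minimal illustration and not the project's actual code: the heading keywords are taken from the sample resume, while the dictionary keys, the regular expressions and the sample text (a made-up candidate) are assumptions.

```python
import re

# Hypothetical mapping from segment name to the heading text that opens it.
SECTION_HEADINGS = {
    "experience": "Employment Status",
    "education": "Educational Qualification",
    "skills": "Technical Skills",
    "achievements": "Achievements",
    "projects": "Projects",
}

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d -]{7,}\d")

def segment_resume(text):
    """Split resume text into named segments at the first occurrence of each heading."""
    found = []
    for name, heading in SECTION_HEADINGS.items():
        pos = text.find(heading)
        if pos != -1:
            found.append((pos, name, heading))
    found.sort()
    # Everything before the first heading is treated as the name-and-details segment.
    first = found[0][0] if found else len(text)
    segments = {"details": text[:first].strip()}
    for i, (pos, name, heading) in enumerate(found):
        end = found[i + 1][0] if i + 1 < len(found) else len(text)
        segments[name] = text[pos + len(heading):end].strip()
    return segments

def extract_contact(details):
    """Pull the first e-mail address and phone number out of the details segment."""
    email = EMAIL_RE.search(details)
    phone = PHONE_RE.search(details)
    return {
        "email": email.group() if email else None,
        "phone": phone.group() if phone else None,
    }

# Made-up resume text, for illustration only.
sample = (
    "Jane Doe\nMobile: +1-555-010-0199\nE-mail: jane.doe@example.com\n"
    "Employment Status\nIntern at Example Corp\n"
    "Educational Qualification\nB.E. Computer Engineering\n"
    "Technical Skills\nPython, Java"
)
segments = segment_resume(sample)
contact = extract_contact(segments["details"])
```

Matching on fixed heading strings only works for resumes that use those headings, which is consistent with the report's note that widely varied layouts are out of scope.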
  • 14. 3.2 Flowchart We use Python’s spaCy module [3] for training the NER model. spaCy’s models are statistical, and every “decision” they make (for example, which part-of-speech tag to assign, or whether a word is a named entity) is a prediction. This prediction is based on the examples the model has seen during training. During training, the model is shown the unlabelled text and makes a prediction. Because we know the correct answer, we can give the model feedback on its prediction in the form of an error gradient of the loss function, which measures the difference between the model's prediction and the expected output in the training example. The greater the difference, the more significant the gradient and the updates to our model.
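The relationship between prediction error and gradient size can be seen in a toy example. The sketch below is not spaCy's training code; it assumes a softmax classifier over three made-up labels with a cross-entropy loss, for which the gradient with respect to the scores is the predicted probabilities minus the one-hot correct answer.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def error_gradient(scores, gold_index):
    """Gradient of the cross-entropy loss with respect to the scores."""
    probs = softmax(scores)
    return [p - (1.0 if i == gold_index else 0.0) for i, p in enumerate(probs)]

def gradient_size(grad):
    """Euclidean norm of the gradient vector."""
    return math.sqrt(sum(g * g for g in grad))

# Hypothetical label set; the correct label for the token is SKILL.
LABELS = ["NAME", "SKILL", "O"]
gold = LABELS.index("SKILL")

nearly_right = error_gradient([0.0, 4.0, 0.0], gold)  # model favours SKILL
badly_wrong = error_gradient([4.0, 0.0, 0.0], gold)   # model favours NAME
```

Here `gradient_size(badly_wrong)` is much larger than `gradient_size(nearly_right)`, so the wrong prediction triggers the larger update.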
  • 15. CHAPTER 4 IMPLEMENTATION AND EVALUATION 4.1 Resume summarization using NER: Data Preparation: Our first task is to create manually annotated training data to train the model. For this we use an online annotation tool called Dataturks, which parses the documents and allows us to annotate the required entities. PDF to Text: Our aim for this project is an end-to-end tool which takes in a document and gives out the expected result, in this case the category and the summary. Since the majority of resumes are submitted in PDF format, we added a preprocessing step that converts PDF to text. We used pdfminer under Python for this task; note that pdfminer extracts the text embedded in the PDF rather than performing Optical Character Recognition on page images.
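Annotations exported from the annotation tool have to be converted into the (text, entities) pairs commonly used for spaCy NER training. The sketch below assumes a JSON-lines export in which each record has a `content` string and an `annotation` list whose items carry a `label` and `points` with inclusive `start`/`end` character offsets; this layout and the function name are assumptions and should be checked against the actual export.

```python
import json

def dataturks_to_spacy(jsonl_text):
    """Convert JSON-lines annotation records into (text, {"entities": [...]}) pairs."""
    training_data = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        text = record["content"]
        entities = []
        for annotation in record.get("annotation") or []:
            labels = annotation["label"]
            if not isinstance(labels, list):
                labels = [labels]
            for point in annotation["points"]:
                for label in labels:
                    # Assumed: the export's "end" is inclusive, while
                    # spaCy-style spans are end-exclusive, hence the + 1.
                    entities.append((point["start"], point["end"] + 1, label))
        training_data.append((text, {"entities": entities}))
    return training_data

# Made-up annotated record, for illustration only.
sample = json.dumps({
    "content": "Jane Doe knows Python",
    "annotation": [
        {"label": ["Name"], "points": [{"start": 0, "end": 7, "text": "Jane Doe"}]},
        {"label": ["Skills"], "points": [{"start": 15, "end": 20, "text": "Python"}]},
    ],
})
data = dataturks_to_spacy(sample)
```

Each resulting span can be checked by slicing the text, for example `text[15:21]` should give back the annotated skill.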
  • 16. Data Cleaning: 1. Unnecessary separators: Many resumes had separators such as a string of ’-’ characters, which were removed [3]. 2. Punctuation and Stop Words: Punctuation and stop words did not add any value to the analysis, so they were removed. 3. Erroneous Formatting: Some records had highly erroneous formatting which got in the way of our cleaning/analysis, so these records were discarded. 4. Personal details: Details like e-mail IDs, phone numbers, dates etc. add nothing but noise to the analysis and hardly any value to the classification, so they were removed. Class Labels and Indexing: [3] For ease of further processing, the class labels were mapped to numeric constants. This made it easier for the model to predict.
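The cleaning steps and the label indexing above can be sketched in a few lines. This is a minimal illustration, not the project's code: the stop-word list is a tiny hypothetical sample, the regular expressions are assumptions, and date removal and the discarding of badly formatted records are left out.

```python
import re
import string

# Tiny hypothetical stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "at", "to", "for", "on", "with", "is", "as"}

SEPARATOR_RE = re.compile(r"[-_=*~]{3,}")    # runs such as "-----"
EMAIL_RE = re.compile(r"\S+@\S+")            # anything that looks like an e-mail
PHONE_RE = re.compile(r"\+?\d[\d -]{7,}\d")  # long digit runs with spaces/hyphens
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})

def clean_resume(text):
    """Remove separators, personal details, punctuation and stop words."""
    text = SEPARATOR_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = PHONE_RE.sub(" ", text)
    text = text.translate(PUNCT_TABLE)
    words = [w for w in text.split() if w.lower() not in STOP_WORDS]
    return " ".join(words)

def index_labels(labels):
    """Map each distinct class label to a numeric constant."""
    return {label: index for index, label in enumerate(sorted(set(labels)))}

# Made-up inputs, for illustration only.
cleaned = clean_resume(
    "Skills ----- Java, C and MySQL. Mail: a.b@example.com Phone: +1 555 010 0199"
)
label_to_id = index_labels(["Engineering", "Design", "Engineering"])
```

Stripping all punctuation is a blunt choice: it would also turn a skill such as "C++" into "C", so a production version might protect known skill names before this step.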
  • 17. 4.2 Results and Evaluation:
  • 18. Train data set output
  • 19. Non train resume output:
  • 20. CHAPTER 6 CONCLUSION AND FUTURE SCOPE 6.1 Conclusion Our application helps recruiters screen resumes more efficiently, thereby reducing the cost of hiring. It provides the organization with potential candidates, and helps candidates be placed in an organization which appreciates their skill set and ability. 6.2 Future Scope The application can be extended to other domains such as telecom, healthcare, e-commerce and public sector jobs.
  • 21. CHAPTER 7 REFERENCES
● [1] Resume Parser with Natural Language Processing. International Journal of Engineering Science and Computing, February 2017. https://www.researchgate.net/publication/313851778_Resume_Parser_with_Natural_Language_Processing
● [2] Resume Ranking based on Job Description using SpaCy NER model. International Research Journal of Engineering and Technology (IRJET), Volume: 07 Issue: 05, May 2020, e-ISSN: 2395-0056, p-ISSN: 2395-0072. https://www.irjet.net/archives/V7/i5/IRJET-V7I516.pdf
● [3] Article: A Review of Named Entity Recognition (NER) Using Automatic Summarization of Resumes. https://towardsdatascience.com/a-review-of-named-entity-recognition-ner-using-automatic-summarization-of-resumes-5248a75de175