SlideShare a Scribd company logo
1 of 4
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2055
Automated Essay Evaluation using Natural Language Processing
Chahat Sharma1, Akash Bishnoi2, Akshay Kr. Sachan3, Aman Verma4
1Assistant Professor, Dept. of Computer Science Engineering, Inderprastha Engineering College,
Uttar Pradesh, India
2,3,4 Dept. of Computer Science & Engineering, Inderprastha Engineering College, Uttar Pradesh, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Automated Essay Scoring (AES) is a great
research area to analyze the human expertise. AES is one of
the most challenging activities in Natural Language
Processing (NLP). It makes use ofNLPandMachineLearning
(ML) techniques to predict the score and match with the
human like grading system. The very first model was PEG
(Project Essay Grader) propped by Ellis Page in 1960s.Since
then, there have been multiple systems that look at
providing either a holistic score to the essay or to score
individual attributes of the essay. Examples of a few online
systems include Grammarly, Paper-Rater, ETS e-rater and
IntelliMetric by Vantage. AES relies sufficient number of
essay prompts in order to createa gradingmodel whichlater
is used to evaluate the prompts. With the increasingnumber
of people attempting several exams like GRE, TOEFL, IELTS
etc., it’ll become quite difficultfortheinstitutestograde each
paper besides the difficulty for humans to focus with a
consistent mindset. In this scenario, a person finds it very
difficult to grade numerous essays every day within time
bounds. This paper aims to overcome and solve the
problems faced by human expertsbyprovidingthemwith an
interface that can perform their work with the same
accuracy. For this, features including Bag of words models,
numerical features like sentence count, word count,
vocabulary size, perplexity etc. were extracted to grade the
essay and achieve maximum accuracy. For this,datasetfrom
Hewlett Foundation was picked to select the best features
set, by comparing the accuracy of every possible set. The
essay set consist of 4 essay prompts with 12000+ essays. So,
to evaluate a large number of essays,suchassessmentsseem
expensive and time-consuming task. Even if essays graded
by human graders are biased, we as a student feel grateful to
a system like that, and that is what motivated us to work on
this.
Key Words: Essays, E-Rater, Graduate Management
Admission Test, IntelliMetric, Machine Learning, Natural
Language Processing, Project Essay Grader.
1. INTRODUCTION
This paper aims to build a machine learning system for
automatic scoring of essays written by students. The basic
idea is to search for features which can model the attributes
like language fluency, vocabulary, structure, organization,
content etc. Such a system can have a high utility in many
places. A linear regression model is built with polynomial
basis function to predict the score of a given essay. The
subsequent sections explain the input data, features
extraction, detailedapproach,results,andfuturescopeof the
work. Its objective is to classify a large set of textual entities
into a small number of discrete categories, corresponding to
the possible grades—for example, the numbers 1 to 10.
Therefore, it can be considered a problem of statistical
classification. The impacts that computer have on our
writings have been in use for 40 years now. Even the most
basics of computers like processing of words, thesaurus etc.
is of great help to authors in updating their writing material.
The research has revealed that computers have the capacity
to function as a more effective cognitive tool. Revision and
feedback are important parts of the writing material.
Students in general require input from their teachers for
mastering the art of writing. However,respondingtostudent
papers can be a burden for teachers. Particularly giving
feedback to a large group of students on frequent writing
assignments can be pretty hectic from teacher’s point of
view. Therefore, creation of such a system that can be
accurate in both providing the feedback and grading the
performance with consuming a lot of time is required.
Computerized scoring has many weak points. In AEEF the
scores would be much more detail oriented than the ratings
provided by two human graders. ThemethodusedbyHamp-
Lyons stated the lack of man to man communication as well
as the sense in which the writer rates the essays. Similarly,
Page stated that the computers could not assess an essay as
human grader do because computer systems are
programmed and lack the intensity of human emotions and
therefore it will not be able to appreciate the context.
Another criticism is the construct objections. That is,
computer can give importance to unimportant factors while
rating or providing the score to the user i.e., focusing on
traditional aspects rather than orthodox ones.
2. Literature Survey
2.1 Project Essay Grader (PEG)
Project Essay Grade(PEG) isone of the earliest implemented
automated essay grading system. It was developed by Ellis
Page in 1966 and others upon the request of the college
board. This scoring system was developed to make essay
scoring more practical and effective and it relies on style
analysis of surface linguistic featuresofatextblock.PEGdoes
not use NLP approach but instead uses proxy measures to
predicttheintrinsicqualityandcomputerapproximations.As
no Natural Language Processing (NLP) is used so it does not
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2056
take lexical content in account. Proxies referstothestructure
of essay which consist of average word length, essay length,
number of semicolons or commas, counts of preposition,
parts of speech and so on. Proxies are calculated by set of
training essays and thentransformedtobeusedinastandard
multiple regression along with the essays graded by human
to calculate the regression coefficients.
One of the best things about PEG is that it’s predicted
scores are in close agreement to those of human raters.
Second is, this system is computationally tractable which
means it can track the errors madebytheusers.PEGscoring
system contains two stages the training stage and scoring
stage. In training stage, it is trained on sample of essays and
in the scoring stage it makes use of proxies to score the
essays. PEG purely relies on a statistical approach based on
the assumption that the quality of essays is reflected by the
measurable proxies. PEG has been criticized becauseitdoes
not include semantic features of the essays and only
focusing on the structure. Since PEG uses structure of essay
to score it was easy to cheat by increasing the length of the
essay. It has been modified on several aspects in 1990s.
It achieved a correlation score of 0.87 with human raters.
A correlation score/coefficient is a statistical relationship
between two variables.
2.2 Intelligent Essay Assessors (IEA)
IEA uses Latent Semantic Analysis (LSA) which is a
computational model of human knowledge representation.
It is also a method for extracting semantic similarity of
words and passages from text. Both aspects are presented
elsewhere [6].
It is based on statistical analysis of large amount of text
(typically thousands to millions of words). Previous
attempts to develop automated essay scoring models have
primary focus on style of writing. Focus on content have
always remained secondary priority. However,LSAfocused
on conceptual content, the knowledge conveyedinanessay.
Note: IEA does not make use of NLP techniques.
In their study [3], they compared the performance against
two trained ETS graders. They pick two questions, for
question one, the correlation betweentwograderswas0.86
while correlation of LSA with ETS grader was also 0.86. For
questions two, the correlation was 0.87 and 0.86
respectively. Thus, LSA approach was able to performatthe
same reliability level as the trained ETS graders.
The biggest advantage of IEA is that, it can flag those essays
that are off topic, so that it becomes easy for graderstograde.
2.3 E-Rater
ETS developed e-rater has been used for scoring
essays since Feb’1999, developed under the team lead of Jill
Burstein. E-rater has been continuously upgrading with
newer versions. It generatesadvisoryflagwhenitencounters
anomalous essay & writing samples such as exceedinglength,
repetition material & off topic response. Scores are reported
low & summarized in flag message. E-rater version 12.1 uses
10 features to evaluate the essay such as grammar,
mechanics, style, usage, organization, development, word
choice, word length, positive features, differential word use.
E-rater follows Multiple Linear Regression (MLR)
methodology in which the data is split into two sets- training
dataset & validation dataset. Training dataset is used to build
scoring models and evaluation dataset for evaluating the
essay.
Estimated featured weightswithhuman scorearemaximized
on the basis of least-square estimation. The final e-rater
scores are scaled to match the distributionalmean&standard
deviation of e-rater scores & those of human scores. In
addition, with Multiple Linear Regression (MLR), Support
Vector Machine (SVM), Random Forest (RF), and K- Nearest
Neighbors (k‐NN) can also be used.
Fig 1. Comparision between various Models
The results show that SVM performs better than MLR model
in evaluation of human scores. Overall SVM yields highest
among all desired methods. It concludes that MLR do notfully
give the useful content in the feature variable for prediction.
More sophisticated models need to be developed to improve
ratings. On the other hand, machine learning models result
may not be straight forward as theMLRmodel.Theadvantage
of the MLR model is that basis for the score is seen in the
weight that each feature receives. The SVM algorithm, which
employs a set of decision surfaces to separate the essays
optimally, works best for our particular datasets.
2.4 IntelliMetric
GraduateManagementAdmissionCouncil (GMAC®)haslong
been benefitted from advances in automated essay scoring.
It started with ETS® e-rater®, but in January 2006, ACT,
Inc. became responsible for GMAT test development and
scoring. ACT included IntelliMetric Essay scoring system of
vantage learning as part of their initial proposal. It has
following approach- Two humans are assigned the work of
rating a certain number of prompts.If theirreviewsdiffer by
more than one score point on a scale of 0 to 6, a third rater
is appointed to adjudicates scores. Once a sufficientnumber
of prompts are hand scored, a scoring model is developed
and later used for evaluation the prompts.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2057
Vantage used various mathematical models:
Simple word counts- It used a collection of well-developed
literature using Bayes Theorem in text classification. The
concept is to identify the words or phrases the are more
closely associated with essay score.
Probabilistic Modelling- An ACT 2004 evaluation of GMAT
AWA prompts founds that 87% candidates obtained scoresof
3,4 or 5. This model randomly draws a score from AWA
frequency distribution.
3. METHODOLOGY USED
Following features have been included in the model:
3.1 Words Count
For any essay prompt, it is the most important feature for
grading an essay. As the number of words increases, the
scoring pattern also increases. Above conclusion is obtained
after plotting graphs of words count vs score for our dataset.
Fig 2. Score vs Total Words for set 1 and set 4
3.2 Sentences Count
An essay is not a single phrase, it consists of many lines
supporting writers’ points of views. Again, similar trend was
recorded, concluding that it also contributes in scoring
scheme.
Fig 3. Score vs Sentence count for set 1 and set 4
3.3 Spellings
How good a writer’s writing skills also depends on the
correct spellings of the words used. This feature also shows a
mix trend i.e. positive as well as negative while plotted versus
score.
Fig 4. Score vs Correct spelling for set 1 and set 4
2.4 Unique Words
Repetition of words lacks indicates that writer lacks
vocabulary. Here, we ignored the stopwords available in
NLTK package and words with length less than 3 to get the
unique words counts.
Fig 5. Score vs Unique words for set 1 and set 4
3.5 Grammar Usage
This feature holds the largest weightage after above features
when essays are graded manually. In AEG, grammar error
doesn’t show a variation trend over which a comment can be
made. However, we have included it, as it helped in achieving
better correlation value irrespective of its much contribution
in modelling.
3.6 Words Choice
It is a well-known feature whatever you speak or write, good
quality of words presents you better to the second person.
We have used NLTK vaden lexicon to predict the sentiment of
the words used in the essay (stopwords and words with
length less than 3 are ignored).
3.7 Perplexity
Perplexity [8] is a measurement of how well a probability
distribution or probability model predicts a sample. It may be
used to compare probability models. A low perplexity
indicates the probability distributionisgoodatpredictingthe
samples.
4. RESULT AND DISCUSSION
The dataset [9] has been extracted from Kaggle.com, it
consists of data that was used in the competition organized
by Hewlett Foundation. The dataset has 4 essay prompts and
there are very large number of essays (12000+). Hence, the
prompts are divided into 8 essay sets based on their prompts.
Our project mainly works on PEG technique. Important
features are extracted from the datasets like total word
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2058
count, sentence count, paragraph, correct spellings, partsof
speech, perplexity, words choice (positive and neutral
words) using word sentiments. Individual Features were
plotted against score to identifythetrend.All thementioned
features showed a positive trend versus score.
We had trained the selected features using below models to
find the correlation between the features and the scores
graded by human experts. We got reliable results. The
predicted scores are in close agreement with the actualscore
as shown in table below.
Fig 7. Various Models Correlation Matrix
Fig 8. Meaning of Correlation Score
5. CONCLUSIONS
In sum, we were able to successfully implement a Lasso
linear regression model as shown in Table1,using bothtrivial
and nontrivial essay features to vastly improve upon our
baseline model. While featureslikewordcountappeartohave
the most correlated relationship with score from a graphical
standpoint, we believe that a featuresuchasperplexity,which
actually takes a language model into account, would in the
long run be a superior predictor. Namely, we would ideally
extend our self-implemented perplexityfunctionality tothen-
gram case, rather than simply using unigrams. With this
added capability, we believe our model could achieve even
greater Spearman correlation scores. Other features that we
believe could improve the effectiveness of the model include
parse trees. Parse trees are ordered trees that represent the
syntactic structure of a phrase. This linguistic model is, much
like perplexity, based on content rather than the “metadata”
that are provided by many trivial features. As such, it may
prove effective in contributing to the model a more in-depth
analysis of the context and construction of sentences,
pointing to writing styles that may correlate to higher grades.
Finally, we would like to take the prompts of the essays into
account. This could be a significant feature for our model,
because depending on the type of essay being written— e.g.
persuasive, narrative, summary—the organization of the
essay could vary, which would then affect how we create our
models and which features become more important.
There is certainly room for improvement on our model—
namely, the features we just mentioned,aswellasmany more
we have not discussed. However, given the time, resources
and scope for this project, we were very pleased with our
results. None of us had ever performed NLP before, but we
now look forward to apply more statistical methodology to
such problems in the future!
6. REFERENCES
[1] Peter W. Foltz, New Mexico State University, Darrell
Laham, Knowledge Analysis Technologies
[2] S. Dikli, “Automated Essay Scoring”, Florida State
University, Tallahassee (PEG)
[3] Thomas K. Landauer, University of Colorado, “The
Intelligent Essay Assessor: Applications to Educational
Technology”, Volume 1, Number 2, October 1999.
[4] V. Salvatore, F. Neri, and A. Cucchiarelli, "An overviewof
current research on automated essay grading.", Journal
of Information Technology Education: Research 2.1
(2003): 319-330, 2003
[5] L. Hamp Lyons, “The Scope of writing assessment”,
Elsevier Science Inc, 1075-2935/02 © 2002.
[6] Deerwester, Dumais, Furnas, Landauer, & Harshman,
1990; Landauer & Dumais, 1997; Landauer, Foltz, &
Laham, 1998
[7] M. Rudner, V. Garcia, & C. Welch, “An Evaluation of the
IntelliMetric™ Essay Scoring System Lawrence”, The
Journal of Technology, Learning, and Assessment,
Volume 4, Number 4 · March 2006
[8] https://en.wikipedia.org/wiki/Perplexity
[9] https://www.kaggle.com/c/asap-aes

More Related Content

What's hot

Semi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetSemi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetIJCSEA Journal
 
Single document keywords extraction in Bahasa Indonesia using phrase chunking
Single document keywords extraction in Bahasa Indonesia using phrase chunkingSingle document keywords extraction in Bahasa Indonesia using phrase chunking
Single document keywords extraction in Bahasa Indonesia using phrase chunkingTELKOMNIKA JOURNAL
 
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET Journal
 
IRJET- Automated Exam Question Generator using Genetic Algorithm
IRJET-  	  Automated Exam Question Generator using Genetic AlgorithmIRJET-  	  Automated Exam Question Generator using Genetic Algorithm
IRJET- Automated Exam Question Generator using Genetic AlgorithmIRJET Journal
 
Requirementv4
Requirementv4Requirementv4
Requirementv4stat
 
Coverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesCoverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesMohamed Reda
 
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...AIRCC Publishing Corporation
 
Quantification of Portrayal Concepts using tf-idf Weighting
Quantification of Portrayal Concepts using tf-idf WeightingQuantification of Portrayal Concepts using tf-idf Weighting
Quantification of Portrayal Concepts using tf-idf Weightingijistjournal
 
A Survey on Using Artificial Intelligence Techniques in the Software Developm...
A Survey on Using Artificial Intelligence Techniques in the Software Developm...A Survey on Using Artificial Intelligence Techniques in the Software Developm...
A Survey on Using Artificial Intelligence Techniques in the Software Developm...IJERA Editor
 
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...Do Personality Profiles Differ in the Pakistani Software Industry and Academi...
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...Waqas Tariq
 
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...drboon
 
Presentation: Tool Support for Essential Use Cases to Better Capture Software...
Presentation: Tool Support for Essential Use Cases to Better Capture Software...Presentation: Tool Support for Essential Use Cases to Better Capture Software...
Presentation: Tool Support for Essential Use Cases to Better Capture Software...Naelah AlAgeel
 
Santosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu
 
Master CV v2
Master CV v2Master CV v2
Master CV v2Frans vs
 
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...IRJET Journal
 
an efficient approach for co extracting opinion targets based in online revie...
an efficient approach for co extracting opinion targets based in online revie...an efficient approach for co extracting opinion targets based in online revie...
an efficient approach for co extracting opinion targets based in online revie...INFOGAIN PUBLICATION
 
Advanced Question Paper Generator using Fuzzy Logic
Advanced Question Paper Generator using Fuzzy LogicAdvanced Question Paper Generator using Fuzzy Logic
Advanced Question Paper Generator using Fuzzy LogicIRJET Journal
 
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm
Enhanced Retrieval of Web Pages using Improved Page Rank AlgorithmEnhanced Retrieval of Web Pages using Improved Page Rank Algorithm
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithmijnlc
 

What's hot (20)

Semi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term SetSemi Automated Text Categorization Using Demonstration Based Term Set
Semi Automated Text Categorization Using Demonstration Based Term Set
 
Single document keywords extraction in Bahasa Indonesia using phrase chunking
Single document keywords extraction in Bahasa Indonesia using phrase chunkingSingle document keywords extraction in Bahasa Indonesia using phrase chunking
Single document keywords extraction in Bahasa Indonesia using phrase chunking
 
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
IRJET- Opinion Mining using Supervised and Unsupervised Machine Learning Appr...
 
IRJET- Automated Exam Question Generator using Genetic Algorithm
IRJET-  	  Automated Exam Question Generator using Genetic AlgorithmIRJET-  	  Automated Exam Question Generator using Genetic Algorithm
IRJET- Automated Exam Question Generator using Genetic Algorithm
 
Requirementv4
Requirementv4Requirementv4
Requirementv4
 
Coverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-QueriesCoverage-Criteria-for-Testing-SQL-Queries
Coverage-Criteria-for-Testing-SQL-Queries
 
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
 
Viva
VivaViva
Viva
 
Quantification of Portrayal Concepts using tf-idf Weighting
Quantification of Portrayal Concepts using tf-idf WeightingQuantification of Portrayal Concepts using tf-idf Weighting
Quantification of Portrayal Concepts using tf-idf Weighting
 
A Survey on Using Artificial Intelligence Techniques in the Software Developm...
A Survey on Using Artificial Intelligence Techniques in the Software Developm...A Survey on Using Artificial Intelligence Techniques in the Software Developm...
A Survey on Using Artificial Intelligence Techniques in the Software Developm...
 
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...Do Personality Profiles Differ in the Pakistani Software Industry and Academi...
Do Personality Profiles Differ in the Pakistani Software Industry and Academi...
 
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...
Industry Supported Semiconductor Test Engineering Academic Survey and Round-T...
 
Presentation: Tool Support for Essential Use Cases to Better Capture Software...
Presentation: Tool Support for Essential Use Cases to Better Capture Software...Presentation: Tool Support for Essential Use Cases to Better Capture Software...
Presentation: Tool Support for Essential Use Cases to Better Capture Software...
 
Santosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSESantosh Sahu_MTech_CSE
Santosh Sahu_MTech_CSE
 
Master CV v2
Master CV v2Master CV v2
Master CV v2
 
Mush cv E.E
Mush cv E.EMush cv E.E
Mush cv E.E
 
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
IRJET -Survey on Named Entity Recognition using Syntactic Parsing for Hindi L...
 
an efficient approach for co extracting opinion targets based in online revie...
an efficient approach for co extracting opinion targets based in online revie...an efficient approach for co extracting opinion targets based in online revie...
an efficient approach for co extracting opinion targets based in online revie...
 
Advanced Question Paper Generator using Fuzzy Logic
Advanced Question Paper Generator using Fuzzy LogicAdvanced Question Paper Generator using Fuzzy Logic
Advanced Question Paper Generator using Fuzzy Logic
 
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm
Enhanced Retrieval of Web Pages using Improved Page Rank AlgorithmEnhanced Retrieval of Web Pages using Improved Page Rank Algorithm
Enhanced Retrieval of Web Pages using Improved Page Rank Algorithm
 

Similar to Automated essay evaluation using natural language processing and machine learning techniques

IRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
IRJET- An Automated Approach to Conduct Pune University’s In-Sem ExaminationIRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
IRJET- An Automated Approach to Conduct Pune University’s In-Sem ExaminationIRJET Journal
 
Automated Evaluation of Essays and Short Answers.pdf
Automated Evaluation of Essays and Short Answers.pdfAutomated Evaluation of Essays and Short Answers.pdf
Automated Evaluation of Essays and Short Answers.pdfKathryn Patel
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep LearningIRJET Journal
 
Automated Essay Grading using Features Selection
Automated Essay Grading using Features SelectionAutomated Essay Grading using Features Selection
Automated Essay Grading using Features SelectionIRJET Journal
 
Assisting Tool For Essay Grading For Turkish Language Instructors
Assisting Tool For Essay Grading For Turkish Language InstructorsAssisting Tool For Essay Grading For Turkish Language Instructors
Assisting Tool For Essay Grading For Turkish Language InstructorsLeslie Schulte
 
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...ijcsit
 
Automatic Text Summarization: A Critical Review
Automatic Text Summarization: A Critical ReviewAutomatic Text Summarization: A Critical Review
Automatic Text Summarization: A Critical ReviewIRJET Journal
 
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer EvaluationMachine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer Evaluationijnlc
 
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...IRJET Journal
 
Analysis of sms feedback and online feedback using sentiment analysis for ass...
Analysis of sms feedback and online feedback using sentiment analysis for ass...Analysis of sms feedback and online feedback using sentiment analysis for ass...
Analysis of sms feedback and online feedback using sentiment analysis for ass...eSAT Journals
 
IRJET - Online Assignment Plagiarism Checking using Data Mining and NLP
IRJET -  	  Online Assignment Plagiarism Checking using Data Mining and NLPIRJET -  	  Online Assignment Plagiarism Checking using Data Mining and NLP
IRJET - Online Assignment Plagiarism Checking using Data Mining and NLPIRJET Journal
 
IRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text DocumentIRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text DocumentIRJET Journal
 
IRJET- Automatic Generation of Question Paper using Blooms Taxonomy
IRJET- Automatic Generation of Question Paper using Blooms TaxonomyIRJET- Automatic Generation of Question Paper using Blooms Taxonomy
IRJET- Automatic Generation of Question Paper using Blooms TaxonomyIRJET Journal
 
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET Journal
 
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET Journal
 
Online Examination and Evaluation System
Online Examination and Evaluation SystemOnline Examination and Evaluation System
Online Examination and Evaluation SystemIRJET Journal
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisGina Rizzo
 
Cyprus_paper-final_D.O.Santos_et_al
Cyprus_paper-final_D.O.Santos_et_alCyprus_paper-final_D.O.Santos_et_al
Cyprus_paper-final_D.O.Santos_et_alVictor Santos
 
IRJET- Modeling Student’s Vocabulary Knowledge with Natural
IRJET-  	  Modeling Student’s Vocabulary Knowledge with NaturalIRJET-  	  Modeling Student’s Vocabulary Knowledge with Natural
IRJET- Modeling Student’s Vocabulary Knowledge with NaturalIRJET Journal
 

Similar to Automated essay evaluation using natural language processing and machine learning techniques (20)

IRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
IRJET- An Automated Approach to Conduct Pune University’s In-Sem ExaminationIRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
IRJET- An Automated Approach to Conduct Pune University’s In-Sem Examination
 
Automated Evaluation of Essays and Short Answers.pdf
Automated Evaluation of Essays and Short Answers.pdfAutomated Evaluation of Essays and Short Answers.pdf
Automated Evaluation of Essays and Short Answers.pdf
 
IRJET - Automated Essay Grading System using Deep Learning
IRJET -  	  Automated Essay Grading System using Deep LearningIRJET -  	  Automated Essay Grading System using Deep Learning
IRJET - Automated Essay Grading System using Deep Learning
 
A017640107
A017640107A017640107
A017640107
 
Automated Essay Grading using Features Selection
Automated Essay Grading using Features SelectionAutomated Essay Grading using Features Selection
Automated Essay Grading using Features Selection
 
Assisting Tool For Essay Grading For Turkish Language Instructors
Assisting Tool For Essay Grading For Turkish Language InstructorsAssisting Tool For Essay Grading For Turkish Language Instructors
Assisting Tool For Essay Grading For Turkish Language Instructors
 
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
DEVELOPING A FRAMEWORK FOR ONLINE PRACTICE EXAMINATION AND AUTOMATED SCORE GE...
 
Automatic Text Summarization: A Critical Review
Automatic Text Summarization: A Critical ReviewAutomatic Text Summarization: A Critical Review
Automatic Text Summarization: A Critical Review
 
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer EvaluationMachine Learning Techniques with Ontology for Subjective Answer Evaluation
Machine Learning Techniques with Ontology for Subjective Answer Evaluation
 
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...
IRJET - Analysis of Student Feedback on Faculty Teaching using Sentiment Anal...
 
Analysis of sms feedback and online feedback using sentiment analysis for ass...
Analysis of sms feedback and online feedback using sentiment analysis for ass...Analysis of sms feedback and online feedback using sentiment analysis for ass...
Analysis of sms feedback and online feedback using sentiment analysis for ass...
 
IRJET - Online Assignment Plagiarism Checking using Data Mining and NLP
IRJET -  	  Online Assignment Plagiarism Checking using Data Mining and NLPIRJET -  	  Online Assignment Plagiarism Checking using Data Mining and NLP
IRJET - Online Assignment Plagiarism Checking using Data Mining and NLP
 
IRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text DocumentIRJET- Automatic Recapitulation of Text Document
IRJET- Automatic Recapitulation of Text Document
 
IRJET- Automatic Generation of Question Paper using Blooms Taxonomy
IRJET- Automatic Generation of Question Paper using Blooms TaxonomyIRJET- Automatic Generation of Question Paper using Blooms Taxonomy
IRJET- Automatic Generation of Question Paper using Blooms Taxonomy
 
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
 
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
IRJET- Personalized E-Learning using Learner’s Capability Score (LCS)
 
Online Examination and Evaluation System
Online Examination and Evaluation SystemOnline Examination and Evaluation System
Online Examination and Evaluation System
 
Automated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic AnalysisAutomated Essay Scoring Using Generalized Latent Semantic Analysis
Automated Essay Scoring Using Generalized Latent Semantic Analysis
 
Cyprus_paper-final_D.O.Santos_et_al
Cyprus_paper-final_D.O.Santos_et_alCyprus_paper-final_D.O.Santos_et_al
Cyprus_paper-final_D.O.Santos_et_al
 
IRJET- Modeling Student’s Vocabulary Knowledge with Natural
IRJET-  	  Modeling Student’s Vocabulary Knowledge with NaturalIRJET-  	  Modeling Student’s Vocabulary Knowledge with Natural
IRJET- Modeling Student’s Vocabulary Knowledge with Natural
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escortsranjana rawat
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 

Recently uploaded (20)

OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Isha Call 7001035870 Meet With Nagpur Escorts
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 

Automated essay evaluation using natural language processing and machine learning techniques

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2055 Automated Essay Evaluation using Natural Language Processing Chahat Sharma1, Akash Bishnoi2, Akshay Kr. Sachan3, Aman Verma4 1Assistant Professor, Dept. of Computer Science Engineering, Inderprastha Engineering College, Uttar Pradesh, India 2,3,4 Dept. of Computer Science & Engineering, Inderprastha Engineering College, Uttar Pradesh, India ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - Automated Essay Scoring (AES) is a great research area to analyze the human expertise. AES is one of the most challenging activities in Natural Language Processing (NLP). It makes use ofNLPandMachineLearning (ML) techniques to predict the score and match with the human like grading system. The very first model was PEG (Project Essay Grader) propped by Ellis Page in 1960s.Since then, there have been multiple systems that look at providing either a holistic score to the essay or to score individual attributes of the essay. Examples of a few online systems include Grammarly, Paper-Rater, ETS e-rater and IntelliMetric by Vantage. AES relies sufficient number of essay prompts in order to createa gradingmodel whichlater is used to evaluate the prompts. With the increasingnumber of people attempting several exams like GRE, TOEFL, IELTS etc., it’ll become quite difficultfortheinstitutestograde each paper besides the difficulty for humans to focus with a consistent mindset. In this scenario, a person finds it very difficult to grade numerous essays every day within time bounds. This paper aims to overcome and solve the problems faced by human expertsbyprovidingthemwith an interface that can perform their work with the same accuracy. For this, features including Bag of words models, numerical features like sentence count, word count, vocabulary size, perplexity etc. were extracted to grade the essay and achieve maximum accuracy. For this,datasetfrom Hewlett Foundation was picked to select the best features set, by comparing the accuracy of every possible set. The essay set consist of 4 essay prompts with 12000+ essays. So, to evaluate a large number of essays,suchassessmentsseem expensive and time-consuming task. Even if essays graded by human graders are biased, we as a student feel grateful to a system like that, and that is what motivated us to work on this. Key Words: Essays, E-Rater, Graduate Management Admission Test, IntelliMetric, Machine Learning, Natural Language Processing, Project Essay Grader. 1. INTRODUCTION This paper aims to build a machine learning system for automatic scoring of essays written by students. The basic idea is to search for features which can model the attributes like language fluency, vocabulary, structure, organization, content etc. Such a system can have a high utility in many places. A linear regression model is built with polynomial basis function to predict the score of a given essay. The subsequent sections explain the input data, features extraction, detailedapproach,results,andfuturescopeof the work. Its objective is to classify a large set of textual entities into a small number of discrete categories, corresponding to the possible grades—for example, the numbers 1 to 10. Therefore, it can be considered a problem of statistical classification. The impacts that computer have on our writings have been in use for 40 years now. Even the most basics of computers like processing of words, thesaurus etc. is of great help to authors in updating their writing material. The research has revealed that computers have the capacity to function as a more effective cognitive tool. Revision and feedback are important parts of the writing material. Students in general require input from their teachers for mastering the art of writing. However,respondingtostudent papers can be a burden for teachers. Particularly giving feedback to a large group of students on frequent writing assignments can be pretty hectic from teacher’s point of view. Therefore, creation of such a system that can be accurate in both providing the feedback and grading the performance with consuming a lot of time is required. Computerized scoring has many weak points. In AEEF the scores would be much more detail oriented than the ratings provided by two human graders. ThemethodusedbyHamp- Lyons stated the lack of man to man communication as well as the sense in which the writer rates the essays. Similarly, Page stated that the computers could not assess an essay as human grader do because computer systems are programmed and lack the intensity of human emotions and therefore it will not be able to appreciate the context. Another criticism is the construct objections. That is, computer can give importance to unimportant factors while rating or providing the score to the user i.e., focusing on traditional aspects rather than orthodox ones. 2. Literature Survey 2.1 Project Essay Grader (PEG) Project Essay Grade(PEG) isone of the earliest implemented automated essay grading system. It was developed by Ellis Page in 1966 and others upon the request of the college board. This scoring system was developed to make essay scoring more practical and effective and it relies on style analysis of surface linguistic featuresofatextblock.PEGdoes not use NLP approach but instead uses proxy measures to predicttheintrinsicqualityandcomputerapproximations.As no Natural Language Processing (NLP) is used so it does not
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2056 take lexical content in account. Proxies referstothestructure of essay which consist of average word length, essay length, number of semicolons or commas, counts of preposition, parts of speech and so on. Proxies are calculated by set of training essays and thentransformedtobeusedinastandard multiple regression along with the essays graded by human to calculate the regression coefficients. One of the best things about PEG is that it’s predicted scores are in close agreement to those of human raters. Second is, this system is computationally tractable which means it can track the errors madebytheusers.PEGscoring system contains two stages the training stage and scoring stage. In training stage, it is trained on sample of essays and in the scoring stage it makes use of proxies to score the essays. PEG purely relies on a statistical approach based on the assumption that the quality of essays is reflected by the measurable proxies. PEG has been criticized becauseitdoes not include semantic features of the essays and only focusing on the structure. Since PEG uses structure of essay to score it was easy to cheat by increasing the length of the essay. It has been modified on several aspects in 1990s. It achieved a correlation score of 0.87 with human raters. A correlation score/coefficient is a statistical relationship between two variables. 2.2 Intelligent Essay Assessors (IEA) IEA uses Latent Semantic Analysis (LSA) which is a computational model of human knowledge representation. It is also a method for extracting semantic similarity of words and passages from text. Both aspects are presented elsewhere [6]. It is based on statistical analysis of large amount of text (typically thousands to millions of words). Previous attempts to develop automated essay scoring models have primary focus on style of writing. Focus on content have always remained secondary priority. However,LSAfocused on conceptual content, the knowledge conveyedinanessay. Note: IEA does not make use of NLP techniques. In their study [3], they compared the performance against two trained ETS graders. They pick two questions, for question one, the correlation betweentwograderswas0.86 while correlation of LSA with ETS grader was also 0.86. For questions two, the correlation was 0.87 and 0.86 respectively. Thus, LSA approach was able to performatthe same reliability level as the trained ETS graders. The biggest advantage of IEA is that, it can flag those essays that are off topic, so that it becomes easy for graderstograde. 2.3 E-Rater ETS developed e-rater has been used for scoring essays since Feb’1999, developed under the team lead of Jill Burstein. E-rater has been continuously upgrading with newer versions. It generatesadvisoryflagwhenitencounters anomalous essay & writing samples such as exceedinglength, repetition material & off topic response. Scores are reported low & summarized in flag message. E-rater version 12.1 uses 10 features to evaluate the essay such as grammar, mechanics, style, usage, organization, development, word choice, word length, positive features, differential word use. E-rater follows Multiple Linear Regression (MLR) methodology in which the data is split into two sets- training dataset & validation dataset. Training dataset is used to build scoring models and evaluation dataset for evaluating the essay. Estimated featured weightswithhuman scorearemaximized on the basis of least-square estimation. The final e-rater scores are scaled to match the distributionalmean&standard deviation of e-rater scores & those of human scores. In addition, with Multiple Linear Regression (MLR), Support Vector Machine (SVM), Random Forest (RF), and K- Nearest Neighbors (k‐NN) can also be used. Fig 1. Comparision between various Models The results show that SVM performs better than MLR model in evaluation of human scores. Overall SVM yields highest among all desired methods. It concludes that MLR do notfully give the useful content in the feature variable for prediction. More sophisticated models need to be developed to improve ratings. On the other hand, machine learning models result may not be straight forward as theMLRmodel.Theadvantage of the MLR model is that basis for the score is seen in the weight that each feature receives. The SVM algorithm, which employs a set of decision surfaces to separate the essays optimally, works best for our particular datasets. 2.4 IntelliMetric GraduateManagementAdmissionCouncil (GMAC®)haslong been benefitted from advances in automated essay scoring. It started with ETS® e-rater®, but in January 2006, ACT, Inc. became responsible for GMAT test development and scoring. ACT included IntelliMetric Essay scoring system of vantage learning as part of their initial proposal. It has following approach- Two humans are assigned the work of rating a certain number of prompts.If theirreviewsdiffer by more than one score point on a scale of 0 to 6, a third rater is appointed to adjudicates scores. Once a sufficientnumber of prompts are hand scored, a scoring model is developed and later used for evaluation the prompts.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2057 Vantage used various mathematical models: Simple word counts- It used a collection of well-developed literature using Bayes Theorem in text classification. The concept is to identify the words or phrases the are more closely associated with essay score. Probabilistic Modelling- An ACT 2004 evaluation of GMAT AWA prompts founds that 87% candidates obtained scoresof 3,4 or 5. This model randomly draws a score from AWA frequency distribution. 3. METHODOLOGY USED Following features have been included in the model: 3.1 Words Count For any essay prompt, it is the most important feature for grading an essay. As the number of words increases, the scoring pattern also increases. Above conclusion is obtained after plotting graphs of words count vs score for our dataset. Fig 2. Score vs Total Words for set 1 and set 4 3.2 Sentences Count An essay is not a single phrase, it consists of many lines supporting writers’ points of views. Again, similar trend was recorded, concluding that it also contributes in scoring scheme. Fig 3. Score vs Sentence count for set 1 and set 4 3.3 Spellings How good a writer’s writing skills also depends on the correct spellings of the words used. This feature also shows a mix trend i.e. positive as well as negative while plotted versus score. Fig 4. Score vs Correct spelling for set 1 and set 4 2.4 Unique Words Repetition of words lacks indicates that writer lacks vocabulary. Here, we ignored the stopwords available in NLTK package and words with length less than 3 to get the unique words counts. Fig 5. Score vs Unique words for set 1 and set 4 3.5 Grammar Usage This feature holds the largest weightage after above features when essays are graded manually. In AEG, grammar error doesn’t show a variation trend over which a comment can be made. However, we have included it, as it helped in achieving better correlation value irrespective of its much contribution in modelling. 3.6 Words Choice It is a well-known feature whatever you speak or write, good quality of words presents you better to the second person. We have used NLTK vaden lexicon to predict the sentiment of the words used in the essay (stopwords and words with length less than 3 are ignored). 3.7 Perplexity Perplexity [8] is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models. A low perplexity indicates the probability distributionisgoodatpredictingthe samples. 4. RESULT AND DISCUSSION The dataset [9] has been extracted from Kaggle.com, it consists of data that was used in the competition organized by Hewlett Foundation. The dataset has 4 essay prompts and there are very large number of essays (12000+). Hence, the prompts are divided into 8 essay sets based on their prompts. Our project mainly works on PEG technique. Important features are extracted from the datasets like total word
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 05 | May 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2058 count, sentence count, paragraph, correct spellings, partsof speech, perplexity, words choice (positive and neutral words) using word sentiments. Individual Features were plotted against score to identifythetrend.All thementioned features showed a positive trend versus score. We had trained the selected features using below models to find the correlation between the features and the scores graded by human experts. We got reliable results. The predicted scores are in close agreement with the actualscore as shown in table below. Fig 7. Various Models Correlation Matrix Fig 8. Meaning of Correlation Score 5. CONCLUSIONS In sum, we were able to successfully implement a Lasso linear regression model as shown in Table1,using bothtrivial and nontrivial essay features to vastly improve upon our baseline model. While featureslikewordcountappeartohave the most correlated relationship with score from a graphical standpoint, we believe that a featuresuchasperplexity,which actually takes a language model into account, would in the long run be a superior predictor. Namely, we would ideally extend our self-implemented perplexityfunctionality tothen- gram case, rather than simply using unigrams. With this added capability, we believe our model could achieve even greater Spearman correlation scores. Other features that we believe could improve the effectiveness of the model include parse trees. Parse trees are ordered trees that represent the syntactic structure of a phrase. This linguistic model is, much like perplexity, based on content rather than the “metadata” that are provided by many trivial features. As such, it may prove effective in contributing to the model a more in-depth analysis of the context and construction of sentences, pointing to writing styles that may correlate to higher grades. Finally, we would like to take the prompts of the essays into account. This could be a significant feature for our model, because depending on the type of essay being written— e.g. persuasive, narrative, summary—the organization of the essay could vary, which would then affect how we create our models and which features become more important. There is certainly room for improvement on our model— namely, the features we just mentioned,aswellasmany more we have not discussed. However, given the time, resources and scope for this project, we were very pleased with our results. None of us had ever performed NLP before, but we now look forward to apply more statistical methodology to such problems in the future! 6. REFERENCES [1] Peter W. Foltz, New Mexico State University, Darrell Laham, Knowledge Analysis Technologies [2] S. Dikli, “Automated Essay Scoring”, Florida State University, Tallahassee (PEG) [3] Thomas K. Landauer, University of Colorado, “The Intelligent Essay Assessor: Applications to Educational Technology”, Volume 1, Number 2, October 1999. [4] V. Salvatore, F. Neri, and A. Cucchiarelli, "An overviewof current research on automated essay grading.", Journal of Information Technology Education: Research 2.1 (2003): 319-330, 2003 [5] L. Hamp Lyons, “The Scope of writing assessment”, Elsevier Science Inc, 1075-2935/02 © 2002. [6] Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990; Landauer & Dumais, 1997; Landauer, Foltz, & Laham, 1998 [7] M. Rudner, V. Garcia, & C. Welch, “An Evaluation of the IntelliMetric™ Essay Scoring System Lawrence”, The Journal of Technology, Learning, and Assessment, Volume 4, Number 4 · March 2006 [8] https://en.wikipedia.org/wiki/Perplexity [9] https://www.kaggle.com/c/asap-aes