Aspiring Minds
www.aspiringminds.com
Grading Programs using
Machine Learning
Varun Aggarwal
Presented at KDD, 2014
Programming Assessments: Existing solutions
• Manual evaluation: Can’t scale; not standardized
• Test-case based evaluation:
    • High false positives: hard-coded answers, inadvertent errors
    • High false negatives: correct but inefficient code
• Similarity metrics between control flow graphs or syntax trees:
    • Need to handle multiple correct implementations, which the approach does not accommodate even in theory
    • No mapping from the metric to objective feedback
Automatic grading of programs: Why?
• Grading is widely performed; automating it will save professors and TAs a lot of time.
• Companies can recruit efficiently.
• MOOCs need automated open-response assessment to be truly effective. True scaling of such systems has not yet been achieved.
Our Approach
Automata – Automatic program evaluation engine
• Machine learning based scoring engine: a model to predict the logical correctness of a program, given the control and data dependencies it possesses.
• Evaluation of programming best practices: lint-styled rule-based system to detect programs not following programming best practices.
• Asymptotic complexity evaluation: measures the run-time of the code for various input sizes and empirically derives the complexity.
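One way to read "empirically derives the complexity" is as a curve fit over measured run-times. The sketch below is an assumption, not a description of how Automata does it: it fits a straight line to log(time) against log(input size) and reports the slope as the polynomial exponent. The function name `estimate_exponent` and the synthetic timings are hypothetical.

```python
import math


def estimate_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Synthetic timings (invented for illustration, not measured).
sizes = [1000, 2000, 4000, 8000]
quadratic_times = [2e-9 * n ** 2 for n in sizes]
linear_times = [5e-8 * n for n in sizes]

quadratic_slope = estimate_exponent(sizes, quadratic_times)
linear_slope = estimate_exponent(sizes, linear_times)
```

A slope near 1 suggests linear time and a slope near 2 suggests quadratic time. A log-log fit of this kind cannot cleanly separate n log n from n, so a production system would likely need a finer model.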
Why do programming modules give a better test-shortlist rate?
• Programming has more predictive power than Logical ability in identifying good performers.
• Because Logical has lower predictive power, a higher cut-off has to be applied to it than to Programming to get the same organizational efficiency.
• The higher a person's Programming capability, the lower the requirement on the Logical score.
• If a person scores below a given threshold on Programming, even a higher Logical ability does not help.
ML based scoring
[Figure: scoring pipeline in seven numbered steps. Elements: Understanding the human process; Evaluation Rubric; Problem and Language independent Features; Graded programs; Machine learning model; Ungraded programs; Predicted grades.]
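The pipeline (graded programs, features, model, predicted grades) can be sketched end to end on toy data. The slides do not name the model, so ridge regression here is an assumption; the feature vectors, grades, and the names `fit_ridge` and `predict` are invented for illustration.

```python
def fit_ridge(X, y, lam=1e-6):
    """Solve (X^T X + lam * I) w = X^T y by Gauss-Jordan elimination."""
    d = len(X[0])
    # Augmented matrix [X^T X + lam * I | X^T y].
    M = []
    for i in range(d):
        row = [sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
               for j in range(d)]
        row.append(sum(r[i] * t for r, t in zip(X, y)))
        M.append(row)
    for col in range(d):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, d), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(d):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][d] / M[i][i] for i in range(d)]


def predict(w, x):
    """Linear score for one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Toy data, invented for illustration.
# Columns: [bias, has nested loop, has print inside nested loop].
graded_features = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]]
human_grades = [1, 3, 3, 5]

weights = fit_ridge(graded_features, human_grades)
ungraded_program = [1, 1, 1]
predicted_grade = predict(weights, ungraded_program)
```

Because the features are counts of constructs and not problem-specific strings, the same code path would apply to any problem or language for which the features can be extracted.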
OBJECTIVE
To print the pattern of integers
1
2 3
3 4 5
4 5 6 7
An implementation
void print_1(int N){
    for(i = 1; i <= N; i++){
        print newline;
        count = i;
        for(j = 0; j < i; j++){
            print count;
            count++;
        }
    }
}
What does a grader look for?
1. Are there loops? Are there print statements?
2. Is there a nested-loop structure?
3. Is the conditional in the inner loop dependent on
- a variable modified in the outer loop?
- a variable used in the conditional of the outer loop?
Grammar for expressing features
• Simple features
• Keywords and Tokens (Counts):
• Tokens like for, if, return, break; function calls like printf, strrev, strcat; declarations like int, char
• Operators like various arithmetic, logical, relational operators used
• Character constants like ‘0’, ‘ ’, ‘65’, ‘96’
• Capturing logical constructs (Interactions)
• Control flow structure
• Data-dependencies
• Data-dependencies in context of control-flow
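The simple features above amount to counting keywords and operators in the submission. The sketch below is one hedged way to do that with a regular-expression tokenizer; `KEYWORDS`, `simple_features`, and the runnable C rendering of the pattern program are assumptions made for the example, not part of the original system.

```python
import re
from collections import Counter

KEYWORDS = {"for", "while", "if", "else", "return", "break", "int", "char"}
STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')
# Multi-character operators come first so "++" is not read as two "+".
TOKEN_RE = re.compile(
    r"[A-Za-z_]\w*|\d+|\+\+|--|<=|>=|==|!=|&&|\|\||[-+*/%<>=!]"
)


def simple_features(source):
    """Count keywords and operators; identifiers and numbers are skipped."""
    source = STRING_RE.sub('""', source)  # ignore string literal contents
    counts = Counter()
    for tok in TOKEN_RE.findall(source):
        is_word = tok[0].isalnum() or tok[0] == "_"
        if tok in KEYWORDS or not is_word:
            counts[tok] += 1
    return counts


# Hypothetical C rendering of the pattern-printing program.
PATTERN_PROGRAM = """
void print_1(int N) {
    int i, j, count;
    for (i = 1; i <= N; i++) {
        printf("\\n");
        count = i;
        for (j = 0; j < i; j++) {
            printf("%d ", count);
            count++;
        }
    }
}
"""

features = simple_features(PATTERN_PROGRAM)
```

Counts like these say nothing about how the constructs are arranged, which is what the interaction features (control flow and data dependencies) are meant to capture.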
CONTROL FEATURES – COUNTS
Counts of control-related keywords/tokens
Ex. count(for) = 2
count(for-in-for) = 1
count(while) = 0
Control-context of these keywords
- The print command as loop(loop(print))
TARGET PROGRAM
void print(int N){
    for(i = 1; i <= N; i++){
        print newline;
        count = i;
        for(j = 0; j < i; j++){
            print count; count++;
        }
    }
}
CONTROL FLOW GRAPH
[Figure: control flow graph of the target program. Nodes: i = 1, i <= N, i++, count = i, j = 0, j < i, print(count), count++, j++, END. Scopes: Parent scope, Loop 1, Loop 2.]
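The control counts can be reproduced with a small scanner that tracks which loops are open at each token. This is a minimal sketch under stated assumptions: loop bodies are always braced, the input is a hypothetical runnable C rendering of the target program (so the print command appears as `printf`), and `control_features` is an illustrative name.

```python
import re
from collections import Counter

# String literals are matched first so their contents are never tokenized.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*|\d+|\S')
LOOPS = ("for", "while")
NOT_CALLS = {"if", "switch", "return", "sizeof"}


def control_features(source):
    tokens = TOKEN_RE.findall(source)
    counts = Counter()
    stack = []       # one entry per open "{": "for", "while" or "block"
    pending = None   # loop keyword whose "{" has not been seen yet
    paren_depth = 0
    for pos, tok in enumerate(tokens):
        nxt = tokens[pos + 1] if pos + 1 < len(tokens) else ""
        if tok in LOOPS:
            counts[tok] += 1
            enclosing = [s for s in stack if s in LOOPS]
            if enclosing:
                counts[f"{tok}-in-{enclosing[-1]}"] += 1
            pending = tok
        elif tok == "(":
            paren_depth += 1
        elif tok == ")":
            paren_depth -= 1
        elif tok == ";" and paren_depth == 0:
            pending = None  # unbraced body or do-while: not tracked
        elif tok == "{":
            stack.append(pending or "block")
            pending = None
        elif tok == "}":
            stack.pop()
        elif (nxt == "(" and stack and re.match(r"[A-Za-z_]", tok)
              and tok not in NOT_CALLS):
            # A call inside a function body: wrap it once per open loop.
            context = tok
            for entry in reversed(stack):
                if entry in LOOPS:
                    context = f"loop({context})"
            counts[context] += 1
    return counts


TARGET_PROGRAM = """
void print(int N) {
    int i, j, count;
    for (i = 1; i <= N; i++) {
        printf("\\n");
        count = i;
        for (j = 0; j < i; j++) {
            printf("%d ", count);
            count++;
        }
    }
}
"""

control = control_features(TARGET_PROGRAM)
```

On this input the scanner yields count(for) = 2, count(for-in-for) = 1 and count(while) = 0, and it records the inner print call under the context loop(loop(printf)).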
DATA OPERATION FEATURES IN CONTROL-CONTEXT
Counts of data-related tokens in context of the control structure
Ex. count(block1 : loop(loop(++))) = 2
count(block1 : loop(loop_cond(<))) = 1
Capture the control-context of data-dependencies in groups of expressions
Ex. i++ and j < i:
- var(i) related to var(j): appearing in a loop(loop_cond)
- var(i) previously incremented: appearing in a loop
- The relation and the increment happen in the same block
[Figure: CONTROL FLOW INFORMATION ANNOTATED IN A-D-D GRAPH. Statement nodes (i = 1, i <= N, i++, count = i, count = 0, j = 0, j < i, print(count), count++, j++), each labelled with its scope: Parent scope, Loop 1 or Loop 2.]
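The two example counts can be reproduced by walking a hand-built loop tree and labelling each operator with its control context. This is a sketch under assumptions: the `Loop` class and `data_features` are hypothetical, the tree is written by hand, not parsed, and the "block1 :" prefix from the slide is omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Loop:
    init: list    # operators in the initialiser, e.g. ["="]
    cond: list    # operators in the loop condition, e.g. ["<"]
    update: list  # operators in the update expression, e.g. ["++"]
    body: list    # operator strings and nested Loop objects


def wrap(label, depth):
    """Wrap a label in one loop(...) per enclosing loop."""
    for _ in range(depth):
        label = f"loop({label})"
    return label


def data_features(items, depth=0, counts=None):
    counts = Counter() if counts is None else counts
    for item in items:
        if isinstance(item, Loop):
            for op in item.init:      # runs once, in the enclosing scope
                counts[wrap(op, depth)] += 1
            for op in item.cond:      # condition of this loop
                counts[wrap(f"loop_cond({op})", depth)] += 1
            for op in item.update:    # runs on every iteration
                counts[wrap(op, depth + 1)] += 1
            data_features(item.body, depth + 1, counts)
        else:
            counts[wrap(item, depth)] += 1
    return counts


# Target program: the inner body holds count++, the outer body count = i.
inner = Loop(init=["="], cond=["<"], update=["++"], body=["++"])
outer = Loop(init=["="], cond=["<="], update=["++"], body=["=", inner])
data_counts = data_features([outer])
```

Here j++ and count++ both land in loop(loop(++)), giving the count of 2, while j < i is the only entry in loop(loop_cond(<)).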
Case study
• Deployed Automata in the recruitment process of a major product-based company
• Analyzed the performance improvement from using Automata over a selection criterion based on test-case pass
• 22.6% of candidates who were not being shortlisted through test-case pass were now shortlisted using Automata.
Experimental Results
Sort Problem
Doing it the one-class way!
PROBLEM   All features       Basic features
          Mean     Min25     Mean     Min25
1         0.57     0.61      0.52     0.56
2         0.80     0.83      0.72     0.75
3         0.75     0.81      0.59     0.73
4         0.81     0.81      0.75     0.75
5         0.68     0.69      0.55     0.61
Beats the test-case score on all problems but one
How good is the final ML-based score?
Validation Correlation >= 0.79
Matches Inter-rater Correlation between two human raters
PROBLEM   # of features   Cross-val correl   Train correl   Validation correl   Test Case Score
1         80              0.61               0.85           0.79                0.54
2         68              0.77               0.93           0.91                0.80
3         193             0.91               0.98           0.90                0.64
4         66              0.90               0.94           0.90                0.80
5         87              0.81               0.92           0.84                0.84
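The correlations in the table compare model scores with human grades. The slides do not say which coefficient was used; the sketch below assumes Pearson's r, and the grades in it are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented example: rubric grades from a human and scores from a model.
human_grades = [1, 2, 3, 4, 5]
model_scores = [1.4, 1.9, 3.2, 3.6, 4.8]
agreement = pearson(human_grades, model_scores)
```

The same function applied to the grades of two human raters would give the inter-rater correlation that the validation figure is compared against.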
Can we get insight?
Features for the FindDigit problem analyzed. Given a multi-digit number and a digit, one has to find the number of times the digit appears in the number.
• The most contributing feature for the FindDigit problem:
int findDigit(int N, int digit){
    …
    …
    LOOP (N != <constant value>){
        …
        N = N / <constant value>
        …
    }
}
Yes, we can!
• The most contributing feature for the FindDigit problem, and a submission that exhibits it:
int findDigit(int N, int digit){
    …
    LOOP (N != <constant value>){
        …
        N = N / <constant value>
        …
    }
    …
}
int findDigit(int N, int digit){
    ...
    while(N != 0){
        d = N%10;
        if(d == digit)
            ...
        N = N / 10;
    }
}
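The template (a loop on N != constant whose body divides N by a constant) can be checked against source text with a rough pattern match. This is only a sketch: the real feature is derived from control and data dependencies, not regular expressions, and `has_divide_loop` and both sample submissions are hypothetical.

```python
import re

HEADER_RE = re.compile(r"while\s*\(\s*(\w+)\s*!=\s*\d+\s*\)\s*\{")


def block_after(source, open_pos):
    """Return the text between the "{" at open_pos and its matching "}"."""
    depth = 0
    for k in range(open_pos, len(source)):
        if source[k] == "{":
            depth += 1
        elif source[k] == "}":
            depth -= 1
            if depth == 0:
                return source[open_pos + 1:k]
    return source[open_pos + 1:]


def has_divide_loop(source):
    """True if a while (v != const) loop divides v by a constant."""
    for match in HEADER_RE.finditer(source):
        var = match.group(1)
        body = block_after(source, match.end() - 1)
        if re.search(rf"\b{var}\s*(?:=\s*{var}\s*/|/=)\s*\d+", body):
            return True
    return False


# Hypothetical submissions written for this example.
WITH_FEATURE = """
int findDigit(int N, int digit) {
    int count = 0;
    while (N != 0) {
        int d = N % 10;
        if (d == digit) {
            count++;
        }
        N = N / 10;
    }
    return count;
}
"""
WITHOUT_FEATURE = "int f(int N) { while (N != 0) { N = N - 1; } return N; }"
```

A match on `WITH_FEATURE` and no match on `WITHOUT_FEATURE` mirrors the intuition on the slide: the loop only counts as evidence of a correct approach when the loop variable is also divided inside it.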
Evaluation Rubric
Score  Interpretation
5      Completely correct and efficient
       An efficient implementation of the problem using the right control structures and data-dependencies.
4      Correct with some silly errors
       Correct control structures and closely matching data-dependencies. Some silly mistakes cause the code to fail test cases.
3      Inconsistent logical structures
       The right control structures exist, with few correct data-dependencies.
2      Emerging basic structures
       Appropriate keywords and tokens present, showing some understanding of the problem.
1      Gibberish code
       Seemingly unrelated to the problem at hand.
Automata – Sample report
[Figure: sample report showing the problem summary, the candidate's source code, test case pass/fail information, feedback on programming best practices, and the asymptotic complexity of the candidate's solution.]
Do our fancy features help?
Control and Data dependency features add around 0.15 correlation points above token information.
PROBLEM   Type of feature      # of features   Cross-val correl   Train correl   Validation correl
1         All, w/o test case   35              0.57               0.72           0.56
1         Basic                60              0.62               0.87           0.41
2         All, w/o test case   80              0.81               0.99           0.80
2         Basic                26              0.59               0.72           0.67
3         All, w/o test case   190             0.87               0.97           0.90
3         Basic                26              0.74               0.89           0.74
4         All, w/o test case   134             0.85               0.91           0.82
4         Basic                35              0.83               0.88           0.69
5         All, w/o test case   166             0.66               0.81           0.64
5         Basic                40              0.61               0.78           0.61
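The size of the gain can be checked directly from the validation column of the table. The short calculation below uses only the numbers shown above; the variable names are illustrative.

```python
# Validation correlation per problem: (all features w/o test case, basic).
validation = {
    1: (0.56, 0.41),
    2: (0.80, 0.67),
    3: (0.90, 0.74),
    4: (0.82, 0.69),
    5: (0.64, 0.61),
}

# Per-problem gain from adding control and data dependency features.
gaps = {problem: round(full - basic, 2)
        for problem, (full, basic) in validation.items()}
mean_gap = sum(gaps.values()) / len(gaps)
```

The per-problem gains are 0.15, 0.13, 0.16, 0.13 and 0.03, for a mean of 0.12. Four of the five problems sit close to the quoted figure of around 0.15, and problem 5 pulls the average down.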
Conclusion
• We propose the first machine learning based approach to automatically grade programs
• An innovative feature grammar is proposed which matches human intuition of grading programs.
• Models built for sample problems show promising results.
• We propose and demonstrate machine learning techniques that lower the need for human-graded data to build models.

More Related Content

What's hot

Selection procedure in samsung
Selection procedure in samsungSelection procedure in samsung
Selection procedure in samsung
Harsh Vardhan
 
Software testing
Software testingSoftware testing
Software testing
thaneofife
 
IKMTest-ResultMartinOKello4
IKMTest-ResultMartinOKello4IKMTest-ResultMartinOKello4
IKMTest-ResultMartinOKello4
Martin Okello
 
Unit testing
Unit testingUnit testing
Unit testing
medsherb
 
Internship 3 months
Internship 3 monthsInternship 3 months
Internship 3 months
Iuliana Gultur
 
Training and development
Training and developmentTraining and development
Training and development
Snehali Bagchi
 
MBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
MBA WiZard – 2013: India’s Biggest Contest for MBA AspirantsMBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
MBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
TCYonline
 
Machine Translation Quality Estimation
Machine Translation Quality EstimationMachine Translation Quality Estimation
Machine Translation Quality Estimation
Harsha Vardhan
 
Linways assessment codeways!
Linways assessment  codeways!  Linways assessment  codeways!
Linways assessment codeways!
Bastin Thomas
 
Equivalence class testing
Equivalence  class testingEquivalence  class testing
Equivalence class testing
Mani Kanth
 
Hiring process of Amazon
Hiring process of AmazonHiring process of Amazon
Hiring process of Amazon
Aircto
 

What's hot (11)

Selection procedure in samsung
Selection procedure in samsungSelection procedure in samsung
Selection procedure in samsung
 
Software testing
Software testingSoftware testing
Software testing
 
IKMTest-ResultMartinOKello4
IKMTest-ResultMartinOKello4IKMTest-ResultMartinOKello4
IKMTest-ResultMartinOKello4
 
Unit testing
Unit testingUnit testing
Unit testing
 
Internship 3 months
Internship 3 monthsInternship 3 months
Internship 3 months
 
Training and development
Training and developmentTraining and development
Training and development
 
MBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
MBA WiZard – 2013: India’s Biggest Contest for MBA AspirantsMBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
MBA WiZard – 2013: India’s Biggest Contest for MBA Aspirants
 
Machine Translation Quality Estimation
Machine Translation Quality EstimationMachine Translation Quality Estimation
Machine Translation Quality Estimation
 
Linways assessment codeways!
Linways assessment  codeways!  Linways assessment  codeways!
Linways assessment codeways!
 
Equivalence class testing
Equivalence  class testingEquivalence  class testing
Equivalence class testing
 
Hiring process of Amazon
Hiring process of AmazonHiring process of Amazon
Hiring process of Amazon
 

Viewers also liked

Aspiring Minds | Svar
Aspiring Minds | SvarAspiring Minds | Svar
Aspiring Minds | Svar
Aspiring Minds
 
Aspiring Minds | AM Situations
Aspiring Minds | AM SituationsAspiring Minds | AM Situations
Aspiring Minds | AM Situations
Aspiring Minds
 
Campus New Proposal.
Campus New Proposal.Campus New Proposal.
Campus New Proposal.
Sayed Ali
 
Am cat workshop part 1
Am cat workshop part 1Am cat workshop part 1
Am cat workshop part 1
vanatteveldt
 
Amcat Certificate
Amcat CertificateAmcat Certificate
Amcat Certificate
HIMANSHU SAXENA
 
Campus Performace Report
Campus Performace ReportCampus Performace Report
Campus Performace Report
Sayed Ali
 
Prediction of Salary From Profiles
Prediction of Salary From ProfilesPrediction of Salary From Profiles
Prediction of Salary From Profiles
Sohom Ghosh
 
16720032294774_Sirallapu_Anitha_corpReport
16720032294774_Sirallapu_Anitha_corpReport16720032294774_Sirallapu_Anitha_corpReport
16720032294774_Sirallapu_Anitha_corpReport
anitha sirallapu
 
Aspiring Minds | Labor market insights
Aspiring Minds | Labor market insightsAspiring Minds | Labor market insights
Aspiring Minds | Labor market insights
Aspiring Minds
 
Recruitments & assessment industry
Recruitments & assessment industryRecruitments & assessment industry
Recruitments & assessment industry
Rahul Koul
 
Institute Performance Solutions
Institute Performance Solutions Institute Performance Solutions
Institute Performance Solutions
Youth4work.com
 
About Youth4work - Integrated Talent Solutions
About Youth4work - Integrated Talent SolutionsAbout Youth4work - Integrated Talent Solutions
About Youth4work - Integrated Talent Solutions
Youth4work.com
 
Youth4work Marketing & Advertising Solutions
Youth4work Marketing & Advertising SolutionsYouth4work Marketing & Advertising Solutions
Youth4work Marketing & Advertising Solutions
Youth4work.com
 
Amcat & nac
Amcat & nac Amcat & nac
Campus Hiring Made Easy
Campus Hiring Made Easy Campus Hiring Made Easy
Campus Hiring Made Easy
Youth4work.com
 
Humanika Consulting presentation 2012b english
Humanika Consulting presentation 2012b englishHumanika Consulting presentation 2012b english
Humanika Consulting presentation 2012b english
Seta Wicaksana
 
Amcat 2
Amcat 2Amcat 2

Viewers also liked (17)

Aspiring Minds | Svar
Aspiring Minds | SvarAspiring Minds | Svar
Aspiring Minds | Svar
 
Aspiring Minds | AM Situations
Aspiring Minds | AM SituationsAspiring Minds | AM Situations
Aspiring Minds | AM Situations
 
Campus New Proposal.
Campus New Proposal.Campus New Proposal.
Campus New Proposal.
 
Am cat workshop part 1
Am cat workshop part 1Am cat workshop part 1
Am cat workshop part 1
 
Amcat Certificate
Amcat CertificateAmcat Certificate
Amcat Certificate
 
Campus Performace Report
Campus Performace ReportCampus Performace Report
Campus Performace Report
 
Prediction of Salary From Profiles
Prediction of Salary From ProfilesPrediction of Salary From Profiles
Prediction of Salary From Profiles
 
16720032294774_Sirallapu_Anitha_corpReport
16720032294774_Sirallapu_Anitha_corpReport16720032294774_Sirallapu_Anitha_corpReport
16720032294774_Sirallapu_Anitha_corpReport
 
Aspiring Minds | Labor market insights
Aspiring Minds | Labor market insightsAspiring Minds | Labor market insights
Aspiring Minds | Labor market insights
 
Recruitments & assessment industry
Recruitments & assessment industryRecruitments & assessment industry
Recruitments & assessment industry
 
Institute Performance Solutions
Institute Performance Solutions Institute Performance Solutions
Institute Performance Solutions
 
About Youth4work - Integrated Talent Solutions
About Youth4work - Integrated Talent SolutionsAbout Youth4work - Integrated Talent Solutions
About Youth4work - Integrated Talent Solutions
 
Youth4work Marketing & Advertising Solutions
Youth4work Marketing & Advertising SolutionsYouth4work Marketing & Advertising Solutions
Youth4work Marketing & Advertising Solutions
 
Amcat & nac
Amcat & nac Amcat & nac
Amcat & nac
 
Campus Hiring Made Easy
Campus Hiring Made Easy Campus Hiring Made Easy
Campus Hiring Made Easy
 
Humanika Consulting presentation 2012b english
Humanika Consulting presentation 2012b englishHumanika Consulting presentation 2012b english
Humanika Consulting presentation 2012b english
 
Amcat 2
Amcat 2Amcat 2
Amcat 2
 

Similar to Aspiring Minds | Automata

Using formal methods in Industrial Software Development
Using formal methods in Industrial Software DevelopmentUsing formal methods in Industrial Software Development
Using formal methods in Industrial Software Development
Robert van Lieshout
 
Staroletov testing TDD BDD MBT
Staroletov testing TDD BDD MBTStaroletov testing TDD BDD MBT
Staroletov testing TDD BDD MBT
Sergey Staroletov
 
Testing concepts [3] - Software Testing Techniques (CIS640)
Testing concepts [3] - Software Testing Techniques (CIS640)Testing concepts [3] - Software Testing Techniques (CIS640)
Testing concepts [3] - Software Testing Techniques (CIS640)
Venkatesh Prasad Ranganath
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)
praveena p
 
Foutse_Khomh.pptx
Foutse_Khomh.pptxFoutse_Khomh.pptx
Foutse_Khomh.pptx
Foutse Khomh
 
Words that Matter
Words that MatterWords that Matter
Introduction to White box testing
Introduction to White box testingIntroduction to White box testing
Introduction to White box testing
Aliaa Monier Ismaail
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
Databricks
 
Different Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application TestingDifferent Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application Testing
Rachel Davis
 
testing
testingtesting
testing
Rashmi Deoli
 
Defying Logic - Business Logic Testing with Automation
Defying Logic - Business Logic Testing with AutomationDefying Logic - Business Logic Testing with Automation
Defying Logic - Business Logic Testing with Automation
Rafal Los
 
Symposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence ArtificielleSymposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence Artificielle
PMI-Montréal
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
caa28steve
 
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
QADay
 
Integrating AI in software quality in absence of a well-defined requirements
Integrating AI in software quality in absence of a well-defined requirementsIntegrating AI in software quality in absence of a well-defined requirements
Integrating AI in software quality in absence of a well-defined requirements
Nagarro
 
DP Project Report
DP Project ReportDP Project Report
DP Project Report
Chawal Ukesh
 
Coding - SDLC Model
Coding - SDLC ModelCoding - SDLC Model
Software Testing interview - Q&A and tips
Software Testing interview - Q&A and tipsSoftware Testing interview - Q&A and tips
Software Testing interview - Q&A and tips
Pankaj Dubey
 
An analytical approach to effective risk based test planning
An analytical approach to effective risk based test planning An analytical approach to effective risk based test planning
An analytical approach to effective risk based test planning
Joe Kevens
 
Lessons Learned When Automating
Lessons Learned When AutomatingLessons Learned When Automating
Lessons Learned When Automating
Alan Richardson
 

Similar to Aspiring Minds | Automata (20)

Using formal methods in Industrial Software Development
Using formal methods in Industrial Software DevelopmentUsing formal methods in Industrial Software Development
Using formal methods in Industrial Software Development
 
Staroletov testing TDD BDD MBT
Staroletov testing TDD BDD MBTStaroletov testing TDD BDD MBT
Staroletov testing TDD BDD MBT
 
Testing concepts [3] - Software Testing Techniques (CIS640)
Testing concepts [3] - Software Testing Techniques (CIS640)Testing concepts [3] - Software Testing Techniques (CIS640)
Testing concepts [3] - Software Testing Techniques (CIS640)
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)
 
Foutse_Khomh.pptx
Foutse_Khomh.pptxFoutse_Khomh.pptx
Foutse_Khomh.pptx
 
Words that Matter
Words that MatterWords that Matter
Words that Matter
 
Introduction to White box testing
Introduction to White box testingIntroduction to White box testing
Introduction to White box testing
 
Experimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles BakerExperimental Design for Distributed Machine Learning with Myles Baker
Experimental Design for Distributed Machine Learning with Myles Baker
 
Different Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application TestingDifferent Methodologies For Testing Web Application Testing
Different Methodologies For Testing Web Application Testing
 
testing
testingtesting
testing
 
Defying Logic - Business Logic Testing with Automation
Defying Logic - Business Logic Testing with AutomationDefying Logic - Business Logic Testing with Automation
Defying Logic - Business Logic Testing with Automation
 
Symposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence ArtificielleSymposium 2019 : Gestion de projet en Intelligence Artificielle
Symposium 2019 : Gestion de projet en Intelligence Artificielle
 
presentation.pdf
presentation.pdfpresentation.pdf
presentation.pdf
 
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
ЄРМЕК КАДИРБАЄВ & АЛЕКС РИБКІН «How we train QAEs to join automation» Online ...
 
Integrating AI in software quality in absence of a well-defined requirements
Integrating AI in software quality in absence of a well-defined requirementsIntegrating AI in software quality in absence of a well-defined requirements
Integrating AI in software quality in absence of a well-defined requirements
 
DP Project Report
DP Project ReportDP Project Report
DP Project Report
 
Coding - SDLC Model
Coding - SDLC ModelCoding - SDLC Model
Coding - SDLC Model
 
Software Testing interview - Q&A and tips
Software Testing interview - Q&A and tipsSoftware Testing interview - Q&A and tips
Software Testing interview - Q&A and tips
 
An analytical approach to effective risk based test planning
An analytical approach to effective risk based test planning An analytical approach to effective risk based test planning
An analytical approach to effective risk based test planning
 
Lessons Learned When Automating
Lessons Learned When AutomatingLessons Learned When Automating
Lessons Learned When Automating
 

Recently uploaded

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
Wouter Lemaire
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
Shinana2
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
LucaBarbaro3
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
Pravash Chandra Das
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
alexjohnson7307
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 

Recently uploaded (20)

Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
dbms calicut university B. sc Cs 4th sem.pdf
dbms  calicut university B. sc Cs 4th sem.pdfdbms  calicut university B. sc Cs 4th sem.pdf
dbms calicut university B. sc Cs 4th sem.pdf
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying AheadDigital Marketing Trends in 2024 | Guide for Staying Ahead
Digital Marketing Trends in 2024 | Guide for Staying Ahead
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Operating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptxOperating System Used by Users in day-to-day life.pptx
Operating System Used by Users in day-to-day life.pptx
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
leewayhertz.com-AI in predictive maintenance Use cases technologies benefits ...
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 

Aspiring Minds | Automata

  • 1. Aspiring Minds www.aspiringminds.com Grading Programs using Machine Learning Varun Aggarwal Presented at KDD, 2014
  • 2. Programming Assessments: Existing solutions • Manual evaluation: Can’t scale; not standardized • Test-case based evaluation: • High false-positives – hard code, inadvertent errors • High false-negatives – correct code but not efficient • Similarity metric between control flow graphs, syntax trees: • Need to handle multiple correct implementations – theoretically doesn’t fit in • No mapping of metric to an objective feedback
  • 3. Automatic grading of programs: Why?
      • Grading programs is widely performed; automating it will save professors and TAs a lot of time.
      • Companies can recruit efficiently.
      • MOOCs need automated open-response assessments to be truly effective. True scaling of such systems has not yet been achieved.
  • 4. Our Approach
      Automata: automatic program evaluation engine
      • Machine learning based scoring engine: a model to predict the logical correctness of a program, given the control and data dependencies it possesses.
      • Evaluation of programming best practices: a lint-styled, rule-based system to detect programs not following programming best practices.
      • Asymptotic complexity evaluation: measures the run time of the code for various input sizes and empirically derives the complexity.
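The slide says only that the complexity module measures run time at several input sizes and derives the complexity empirically. One simple way to do that is a doubling test, sketched below. This is not Automata's implementation: the names `doubling_order`, `estimate_order`, `quadratic_steps` and `linear_steps` are hypothetical, and step counts stand in for measured run time so the example is deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical workloads: each returns the number of basic steps it
   executed, used here as a deterministic stand-in for run time. */
unsigned long long quadratic_steps(size_t n) {
    unsigned long long steps = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            steps++;
    return steps;
}

unsigned long long linear_steps(size_t n) {
    unsigned long long steps = 0;
    for (size_t i = 0; i < n; i++)
        steps++;
    return steps;
}

/* Estimate the polynomial order k in cost ~ n^k from the cost at n and
   at 2n. Doubling n multiplies the cost by about 2^k, so k is the number
   of times the ratio can be halved before it drops to roughly 1. */
int doubling_order(double cost_n, double cost_2n) {
    assert(cost_n > 0.0 && cost_2n > 0.0);
    double ratio = cost_2n / cost_n;
    int order = 0;
    /* 1.4142135 ~ sqrt(2): the midpoint between 2^0 and 2^1 on a log scale.
       The cap on order guards against a non-finite ratio. */
    while (ratio > 1.4142135 && order < 64) {
        ratio /= 2.0;
        order++;
    }
    return order;
}

/* Run a workload at n and 2n and classify its growth. */
int estimate_order(unsigned long long (*workload)(size_t), size_t n) {
    assert(workload != NULL && n > 0);
    return doubling_order((double)workload(n), (double)workload(2 * n));
}
```

With these definitions, `estimate_order(quadratic_steps, 200)` yields 2 and `estimate_order(linear_steps, 1000)` yields 1. With real timings the ratio is noisy, so a practical version would average several runs and several input sizes; logarithmic factors are not distinguished by this test.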
  • 5. Why do programming modules give a better test-shortlist rate?
      • Programming has more predictive power in identifying good performers than Logical ability.
      • Because Logical has lower predictive power, a higher cut-off has to be applied to it than to Programming to get the same organizational efficiency.
      • The higher a person's Programming capability, the lower the requirement on the Logical score.
      • If a person scores below a given threshold on Programming, even a higher Logical ability does not help.
  • 6. ML based scoring
      [Figure: process diagram with numbered steps 1 to 7. Labels: Understanding the human process; Evaluation Rubric; Problem and Language independent Features; Graded programs; Machine learning model; Ungraded programs; Predicted grades.]
  • 10. OBJECTIVE: To print the pattern of integers
          1
          2 3
          3 4 5
          4 5 6 7
      An implementation:
          #include <stdio.h>

          void print_1(int N) {
              int i, j, count;
              for (i = 1; i <= N; i++) {
                  printf("\n");             /* start a new row */
                  count = i;                /* row i starts at the value i */
                  for (j = 0; j < i; j++) {
                      printf("%d ", count);
                      count++;              /* must be inside the inner loop */
                  }
              }
          }
      What does a grader look for?
      1. Are there loops? Are there print statements?
      2. Is there a nested-loop structure?
      3. Is the conditional in the inner loop dependent on
          - a variable modified in the outer loop?
          - a variable used in the conditional of the outer loop?
  • 11. Grammar for expressing features
      • Simple features
          • Keywords and tokens (counts):
              • Tokens like for, if, return, break; function calls like printf, strrev, strcat; declarations like int, char
              • Operators: the various arithmetic, logical and relational operators used
              • Character constants like ‘0’, ‘ ’, ‘65’, ‘96’
      • Capturing logical constructs (interactions)
          • Control flow structure
          • Data dependencies
          • Data dependencies in the context of control flow
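The simple count features can be illustrated with a small whole-word token counter. This is a minimal sketch, not the feature extractor used in the talk: `count_token` and `sample_program` are hypothetical names, comments and string literals are not skipped, and operators such as `<` versus `<=` would need a real tokenizer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A sample program held as text (hypothetical input). */
const char *const sample_program =
    "void print_1(int N){"
    "  int i, j, count;"
    "  for(i = 1; i <= N; i++){"
    "    count = i;"
    "    for(j = 0; j < i; j++){"
    "      printf(\"%d \", count);"
    "      count++;"
    "    }"
    "  }"
    "}";

/* True if c can be part of a C identifier. */
static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* Count whole-word occurrences of an identifier-like token (keyword,
   function name or type name) in src. The word-boundary checks stop
   "int" from matching inside "print_1" or "printf". */
size_t count_token(const char *src, const char *tok) {
    assert(src != NULL && tok != NULL && tok[0] != '\0');
    size_t len = strlen(tok);
    size_t count = 0;
    for (const char *p = strstr(src, tok); p != NULL; p = strstr(p + 1, tok)) {
        int starts_word = (p == src) || !is_ident_char(p[-1]);
        int ends_word = !is_ident_char(p[len]);
        if (starts_word && ends_word)
            count++;
    }
    return count;
}
```

For the sample text, `count_token(sample_program, "for")` gives 2 and `count_token(sample_program, "while")` gives 0.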
  • 12. CONTROL FEATURES: COUNTS
      • Counts of control-related keywords/tokens
          Ex. count(for) = 2
              count(for-in-for) = 1
              count(while) = 0
      • Control context of these keywords: the print command as loop(loop(print))
      TARGET PROGRAM
          #include <stdio.h>

          void print(int N) {
              int i, j, count;
              for (i = 1; i <= N; i++) {        /* Loop 1 */
                  printf("\n");
                  count = i;
                  for (j = 0; j < i; j++) {     /* Loop 2 */
                      printf("%d ", count);
                      count++;
                  }
              }
          }
      CONTROL FLOW GRAPH
      [Figure: control flow graph of the target program. Nodes: i = 1, i <= N, i++, j < i, count = i, j = 0, print(count), count++, j++, END. Regions labelled Parent scope, Loop 1 and Loop 2.]
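A control-context count such as loop(loop(print)) or for-in-for can be approximated by tracking how many loop bodies enclose each token. The sketch below is a heuristic under stated assumptions, not the method from the talk (which works on a control flow graph): every `for`/`while` body must be wrapped in braces, `do`-`while` is not handled, and braces inside comments or string literals would confuse it. `count_at_loop_depth` and `sample_program` are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_BLOCK_DEPTH 64

/* A sample program held as text (hypothetical input). */
const char *const sample_program =
    "void print_1(int N){"
    "  int i, j, count;"
    "  for(i = 1; i <= N; i++){"
    "    puts(\"\");"
    "    count = i;"
    "    for(j = 0; j < i; j++){"
    "      printf(\"%d \", count);"
    "      count++;"
    "    }"
    "  }"
    "}";

static int is_ident_char(char c) {
    return isalnum((unsigned char)c) || c == '_';
}

/* True if the word [start, start + len) equals the string s. */
static int word_is(const char *start, size_t len, const char *s) {
    return strlen(s) == len && strncmp(start, s, len) == 0;
}

/* Count whole-word occurrences of tok that are enclosed by exactly
   `depth` braced for/while bodies. */
size_t count_at_loop_depth(const char *src, const char *tok, int depth) {
    assert(src != NULL && tok != NULL && tok[0] != '\0');
    int block_is_loop[MAX_BLOCK_DEPTH]; /* one flag per open '{' */
    int blocks = 0;       /* number of currently open braces */
    int loop_depth = 0;   /* open braces that are loop bodies */
    int paren_depth = 0;  /* so ';' inside a for header is ignored */
    int pending_loop = 0; /* saw for/while, waiting for its '{' */
    size_t count = 0;

    const char *p = src;
    while (*p != '\0') {
        if (is_ident_char(*p)) {
            /* Consume one whole identifier, keyword or number. */
            const char *q = p;
            while (is_ident_char(*q))
                q++;
            size_t len = (size_t)(q - p);
            if (word_is(p, len, tok) && loop_depth == depth)
                count++;
            if (word_is(p, len, "for") || word_is(p, len, "while"))
                pending_loop = 1;
            p = q;
            continue;
        }
        switch (*p) {
        case '(': paren_depth++; break;
        case ')': if (paren_depth > 0) paren_depth--; break;
        case ';': if (paren_depth == 0) pending_loop = 0; break;
        case '{':
            assert(blocks < MAX_BLOCK_DEPTH);
            block_is_loop[blocks++] = pending_loop;
            loop_depth += pending_loop;
            pending_loop = 0;
            break;
        case '}':
            if (blocks > 0)
                loop_depth -= block_is_loop[--blocks];
            break;
        default: break;
        }
        p++;
    }
    return count;
}
```

On the sample text, `count_at_loop_depth(sample_program, "printf", 2)` plays the role of count(loop(loop(print))) and `count_at_loop_depth(sample_program, "for", 1)` the role of count(for-in-for); both give 1.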
  • 13. DATA OPERATION FEATURES IN CONTROL-CONTEXT
      • Counts of data-related tokens in the context of the control structure
          Ex. count(block1:loop(loop(++))) = 2
              count(block1:loop(loop_cond(<))) = 1
      • Capture the control context of data dependencies in groups of expressions
          Ex. j < i: var(i) related to var(j), appearing in a loop(loop_cond)
              i++: var(i) previously incremented, appearing in a loop
              The relation and the increment happen in the same block
      CONTROL FLOW INFORMATION ANNOTATED IN A-D-D GRAPH
      [Figure: graph of the program's expressions (i = 1, i <= N, i++, j = 0, j < i, j++, count = i, count = 0, count++, print(count)), each annotated with its control context: Parent scope, Loop 1 or Loop 2.]
  • 14. Case study
      • Deployed Automata in the recruitment process of a major product-based company
      • Analyzed the performance improvement of using Automata over a selection criterion based on test-case passes
      • 22.6% of candidates who were not being shortlisted through test-case pass were now shortlisted using Automata.
  • 16. Doing it the one-class way!
      PROBLEM   All features      Basic features
                Mean    Min25     Mean    Min25
      1         0.57    0.61      0.52    0.56
      2         0.80    0.83      0.72    0.75
      3         0.75    0.81      0.59    0.73
      4         0.81    0.81      0.75    0.75
      5         0.68    0.69      0.55    0.61
      Betters the test-case score in all problems but one
  • 17. How good is the final ML-based score?
      Validation correlation >= 0.79; matches the inter-rater correlation between two human raters
      PROBLEM   # of features   Cross-val correl   Train correl   Validation correl   Test Case Score
      1         80              0.61               0.85           0.79                0.54
      2         68              0.77               0.93           0.91                0.80
      3         193             0.91               0.98           0.90                0.64
      4         66              0.90               0.94           0.90                0.80
      5         87              0.81               0.92           0.84                0.84
  • 18. Can we get insight?
      • Features for the FindDigit problem analyzed. Given a multi-digit number and a digit, one has to find the number of times the digit appears in the number.
      • The most contributing feature for the FindDigit problem:
          int findDigit(int N, int digit){
            …
            …
            LOOP (N != <constant value>){
              …
              N = N / <constant value>
              …
            }
          }
  • 19. Yes, we can!
      • The most contributing feature for the FindDigit problem:
          int findDigit(int N, int digit){
            …
            LOOP (N != <constant value>){
              …
              N = N / <constant value>
              …
            }
            …
          }

          int findDigit(int N, int digit){
            ...
            while(N != 0){
              d = N%10;
              if(d == digit)
                ...
              N = N / 10;
            }
          }
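For reference, here is one complete program that contains the loop pattern highlighted on the slide. It is a sketch of a plausible solution, not the reference solution from the talk: the counter variable, the input checks and the behaviour for N == 0 are assumptions.

```c
#include <assert.h>

/* Count how many times `digit` (0-9) appears in the decimal form of N.
   Assumes N >= 0. Because the loop follows the slide's shape,
   N == 0 returns 0: the loop body never runs. */
int findDigit(int N, int digit) {
    assert(N >= 0 && digit >= 0 && digit <= 9);
    int count = 0;              /* assumed accumulator, elided on the slide */
    while (N != 0) {            /* LOOP (N != <constant value>) */
        int d = N % 10;         /* last decimal digit of N */
        if (d == digit)
            count++;
        N = N / 10;             /* N = N / <constant value> */
    }
    return count;
}
```

For example, `findDigit(12121, 1)` returns 3 and `findDigit(1001, 0)` returns 2.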
  • 20. Evaluation Rubric
      Score: Interpretation
      • 5: Completely correct and efficient. An efficient implementation of the problem using the right control structures and data dependencies.
      • 4: Correct with some silly errors. Correct control structures and closely matching data dependencies. Some silly mistakes cause the code to fail test cases.
      • 3: Inconsistent logical structures. The right control structures start to exist, with few correct data dependencies.
      • 2: Emerging basic structures. Appropriate keywords and tokens are present, showing some understanding of the problem.
      • 1: Gibberish code. Seemingly unrelated to the problem at hand.
  • 21. Automata: Sample report
      [Figure: sample report with callouts: Problem summary; Candidate’s source code; Test case pass/fail information; Asymptotic complexity of the candidate’s solution; Feedback on programming best practices.]
  • 22. Do our fancy features help?
      Control and data dependency features add around 0.15 correlation points above token information.
      PROBLEM   Type of feature      # of features   Cross-val correl   Train correl   Validation correl
      1         All, w/o test case   35              0.57               0.72           0.56
                Basic                60              0.62               0.87           0.41
      2         All, w/o test case   80              0.81               0.99           0.80
                Basic                26              0.59               0.72           0.67
      3         All, w/o test case   190             0.87               0.97           0.90
                Basic                26              0.74               0.89           0.74
      4         All, w/o test case   134             0.85               0.91           0.82
                Basic                35              0.83               0.88           0.69
      5         All, w/o test case   166             0.66               0.81           0.64
                Basic                40              0.61               0.78           0.61
  • 23. Conclusion
      • We propose the first machine learning based approach to automatically grade programs.
      • An innovative feature grammar is proposed which matches human intuition of grading programs.
      • Models built for sample problems show promising results.
      • We propose and demonstrate machine learning techniques to lower the need for human-graded data to build models.