PREDICTING OUTCOME OF
LEGAL CASES
Ankita Singh Nilutpal Goswami
Agenda
1. Domain
2. Objective
3. Data Extraction
4. EDA
5. Architecture
6. Model Development
7. Summary and Findings
8. Q & A
Legal Systems of the World
Source – Citation (http://saint-claire.org/)
Hierarchy of Indian Judiciary
Sources of Law
§ Constitution
§ Legislation
• Ordinary
• Delegated
• Ordinance
§ Judicial Precedent
§ Customs
Background
Indian Judicial System
o Largest judicial machinery, based on the biggest constitution
o The Constitution of India has 448 articles in 25 parts, 12 schedules, 5 appendices and 98 amendments
o The Indian Penal Code (IPC) defines various crimes/offences and prescribes the punishment
o The Criminal Procedure Code (CrPC) defines the mandatory procedures to be carried out in pursuing a case
o 24 High Courts and over 600 district courts
o Nearly 5 lakh cases are filed daily in Indian courts
o Approximately 4.5 lakh cases are put up before the courts daily
o About 2.5 lakh cases are disposed of daily
Challenging Facts
CASES PENDING            CIVIL CASES   CRIMINAL CASES   TOTAL CASES   PERCENTAGE
> 10 years                   597,166        1,691,515     2,288,681        8.28%
Between 5 to 10 years      1,244,117        3,212,377     4,456,494       16.13%
Between 2 to 5 years       2,542,925        5,394,015     7,936,940       28.73%
< 2 years                  3,946,341        8,997,935    12,944,276       46.85%
Total Pending Cases        8,330,549       19,295,842    27,626,391
[Pie chart: share of pending cases by pendency period]
Source – National Judicial Data Grid (as on September 18th 2018)
• Case disposal rates (August 2018)
  o 10 years – 1.5%
  o All cases – 3.8%
• Cases filed daily: ~5 to 8 lakh
• Cases pending registration: ~7.5 lakh
• 15 judges for every 1 million people
• 22.2 million undertrials; undertrials outnumber the convicts
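The percentage column of the pending-cases table can be rechecked from the civil and criminal counts. The sketch below is only an illustration of that arithmetic; the function and variable names are assumptions, not part of the original deck.

```python
# Pending-case counts copied from the table above: (civil, criminal).
pending = {
    "> 10 years": (597_166, 1_691_515),
    "Between 5 to 10 years": (1_244_117, 3_212_377),
    "Between 2 to 5 years": (2_542_925, 5_394_015),
    "< 2 years": (3_946_341, 8_997_935),
}

def pending_shares(table):
    """Return (grand_total, {bucket: percent of all pending cases})."""
    totals = {bucket: civil + criminal for bucket, (civil, criminal) in table.items()}
    grand_total = sum(totals.values())
    shares = {bucket: round(100 * n / grand_total, 2) for bucket, n in totals.items()}
    return grand_total, shares

grand_total, shares = pending_shares(pending)
for bucket, pct in shares.items():
    print(f"{bucket:>22}: {pct:5.2f}%")
print("Total pending cases:", grand_total)
```

The two oldest buckets together hold roughly a quarter of all pending cases, which is the backlog the deck is concerned with.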
Objective
• Faster processing of legal issues / cases
• “Judgement” data sourcing and understanding of the details
• Evaluating predictions based on various machine learning models
• Develop social value by streamlining the judicial case intake
Data
[Figure: sample judgement document snapshot]
o Case documents analyzed – 120
o Data extraction mechanism – manual
o Unique fields extracted – 58
o Total number of final observations – 202
Original Features
• Nature of Disposal
• Case Type
• Court Number
• Court Name
• Judge
• Judge Gender
• Judgement Date
• Total Number of Sections
• Section 1 thru Section 10
• FIR Number/Year
• Police station
• Investigating officer
• Case Number
• Year
• Complainant
• Total Accused
• Accused #
• Accused Name
• Accused Gender
• Accused Age
• Accused Confessed? (plea)
• Date of first hearing
• Complainant advocate
• Prosecution advocate
• Advocate Defendant
• Number of Prosecution witnesses
• Names of prosecution witnesses
• PW's Examined?
• Number of hostile witnesses
• Defense witnesses
• Charge sheet
• Points for consideration
• Exhibits on behalf of prosecution (P series)
• Number of exhibits considered
• Exhibits on behalf of court (C series)
• Exhibits on behalf of accused (D series)
• Total Number of Material Objects
• Charges proved
• Charges not proved
• Issues Proved
• Issues Not Proved
• Accused released on bail
• Accused committed to prison
• Sentence of Imprisonment granted
• Fine with Imprisonment (Rs)
• Term Served in Prison (days)
• Set off (if any)
• Judgement
• Citations
Data
• Source – publicly available judgement documents
• Case documents analyzed – 120
• Data extraction mechanism – manual
• Unique fields extracted – 58
• Consistent features identified – 15 (judgement decision is the target variable)
• Total number of final observations – 202

#   Feature Name  Description                                                      Datatype     Value
1   ipc_420       Binary indicator to confirm if the case is filed under IPC 420   Categorical  Yes=1, No=0
2   ipc_120b      Binary indicator to confirm if the case is filed under IPC 120b  Categorical  Yes=1, No=0
3   ipc_471       Binary indicator to confirm if the case is filed under IPC 471   Categorical  Yes=1, No=0
4   ipc_468       Binary indicator to confirm if the case is filed under IPC 468   Categorical  Yes=1, No=0
5   ipc_34        Binary indicator to confirm if the case is filed under IPC 34    Categorical  Yes=1, No=0
6   jud_gender    Gender of the judge presiding over the case                      Categorical  Male=0, Female=1
7   jud_date      Date when the judgement was delivered                            Date         Date
8   tot_sec       Total number of sections filed for the case                      Numeric      Number
9   case_no       Unique number of the case                                        Categorical  Multiple Factors
10  comp          Complainant name *                                               String       Name
11  tot_accu      Total number of accused presented in the case                    Numeric      Number
12  accu_gender   Gender of the individual accused                                 Categorical  Male=0, Female=1
13  accu_no       Sequence number of the accused                                   Categorical  Multiple Factors
14  accu_age      Age of the accused                                               Numeric      Number
15  judgement     Judgement given in the case                                      Categorical  Guilty=1, Not Guilty=0
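As an illustration of how one row of this feature table might be represented before modelling, here is a minimal sketch. The `AccusedRecord` class, the `PREDICTORS` list, and the example values are hypothetical, and leaving out the identifier-like fields (jud_date, case_no, comp, accu_no) is an assumption made for the example, not something the deck states.

```python
from dataclasses import dataclass

@dataclass
class AccusedRecord:
    """One observation: one accused person in one case (hypothetical container)."""
    ipc_420: int      # 1 if the case is filed under IPC 420, else 0
    ipc_120b: int     # same Yes=1 / No=0 coding for the other IPC sections
    ipc_471: int
    ipc_468: int
    ipc_34: int
    jud_gender: int   # Male=0, Female=1
    tot_sec: int      # total number of sections filed for the case
    tot_accu: int     # total number of accused in the case
    accu_gender: int  # Male=0, Female=1
    accu_age: float   # age of the accused
    judgement: int    # target: Guilty=1, Not Guilty=0

# Assumption: only these columns are used as model inputs in this sketch.
PREDICTORS = ["ipc_420", "ipc_120b", "ipc_471", "ipc_468", "ipc_34",
              "jud_gender", "tot_sec", "tot_accu", "accu_gender", "accu_age"]

def to_xy(record):
    """Split a record into a numeric feature vector and its target label."""
    x = [float(getattr(record, name)) for name in PREDICTORS]
    return x, record.judgement

# Made-up example row, not taken from the case data.
example = AccusedRecord(ipc_420=1, ipc_120b=1, ipc_471=0, ipc_468=0, ipc_34=1,
                        jud_gender=0, tot_sec=3, tot_accu=2, accu_gender=0,
                        accu_age=41, judgement=0)
x, y = to_xy(example)
print(x, y)
```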
Feature Importance
Exploratory Data Analysis
Guilty – 20 Non-Guilty - 182
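With 20 Guilty and 182 Non-Guilty observations, a rule that always answers Non-Guilty is already correct for about 90% of the rows. The short sketch below makes that baseline explicit so that reported accuracies can be read against it; the function name is an assumption for illustration.

```python
def majority_baseline(counts):
    """Accuracy of always predicting the most frequent class."""
    total = sum(counts.values())
    return max(counts.values()) / total

# Class counts as reported on the slide above.
class_counts = {"Guilty": 20, "Non-Guilty": 182}
total = sum(class_counts.values())
baseline = majority_baseline(class_counts)
print(f"{total} observations, majority-class baseline accuracy = {baseline:.1%}")
```

Because of this imbalance, precision and recall for the minority class are usually more informative than accuracy alone.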
[Figure: density plot]
[Figure: IPC sections frequency]
[Figure: correlation matrix]
Architecture
Development Methodology
INITIAL MODEL DEVELOPMENT STEPS
1. Data Collection
2. Feed data to model
3. Prediction
POST IMPLEMENTATION STEPS
• Feedback
Model Development
• Logistic Regression
• K-Nearest Neighbor
• Random Forest
• Support Vector Machine
Logistic Regression
Pseudo R-square = 45.4%: relative to the intercept-only model, the full model improves the fit by 45.4%.
The likelihood ratio test rejects the null hypothesis that all betas are zero, so at least one beta is nonzero.
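The two quantities above can both be computed from the log-likelihoods of the intercept-only and full models. The sketch below assumes McFadden's definition of pseudo R-square, which the slide does not name, and the log-likelihood values are hypothetical, chosen only so that the result is 45.4%.

```python
def mcfadden_r2(ll_null, ll_full):
    """McFadden pseudo R-square: 1 - LL(full) / LL(intercept-only)."""
    return 1.0 - ll_full / ll_null

def lr_statistic(ll_null, ll_full):
    """Likelihood ratio statistic: 2 * (LL(full) - LL(intercept-only))."""
    return 2.0 * (ll_full - ll_null)

# Hypothetical log-likelihoods, not taken from the fitted model.
ll_null, ll_full = -65.0, -35.49
r2 = mcfadden_r2(ll_null, ll_full)
lr = lr_statistic(ll_null, ll_full)
print(f"pseudo R-square = {r2:.3f}, LR statistic = {lr:.2f}")
```

Under the null hypothesis that all betas are zero, the LR statistic is compared against a chi-square distribution with as many degrees of freedom as there are predictors; a large value leads to rejection.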
Accuracy
• Training Sample – 92.9%
• Validation Sample – 88.5%
[Figure: logistic regression variable importance]
K-Nearest Neighbor
Cross-Validation
10-fold cross-validation gave the best value at k=7.
From the results, accuracy and kappa decrease after k=5.
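Choosing k by 10-fold cross-validation can be sketched in a few lines. This is a plain-Python illustration on synthetic two-cluster data, not the tooling or the data used for the deck; all names and values are assumptions.

```python
import random
from collections import Counter

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    nearest = sorted(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cv_accuracy(data, k, folds=10, seed=0):
    """Mean accuracy of k-NN across `folds` cross-validation folds."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    scores = []
    for f in range(folds):
        test = rows[f::folds]                                      # every folds-th row is held out
        train = [r for i, r in enumerate(rows) if i % folds != f]  # the rest is used for training
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds

# Synthetic, well-separated clusters; purely illustrative.
rng = random.Random(42)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(60)] + \
       [([rng.gauss(3, 1), rng.gauss(3, 1)], 1) for _ in range(60)]

results = {k: cv_accuracy(data, k) for k in (1, 3, 5, 7, 9)}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

Odd values of k are used so that a two-class vote cannot tie.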
The model was further tuned using twoClassSummary with classProbs set to TRUE.
The tuned model has a better accuracy of 93.44%.
Random Forest
Model parameters:
• ntree = 250, as the OOB error hardly changes after 250 trees
• mtry = 3; initially we took sqrt(total_no_of_features)
• nodesize = 3, taken as 1% of the total observations (202 observations)
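The starting values for mtry and nodesize follow from simple arithmetic. The sketch below assumes 14 predictors (the 15 consistent features minus the target), rounding the square root down and rounding 1% of the observations up; those rounding choices are assumptions that happen to reproduce the values on the slide.

```python
import math

def rf_starting_params(n_features, n_obs):
    """Rule-of-thumb starting values: mtry ~ sqrt(features), nodesize ~ 1% of observations."""
    mtry = max(1, math.floor(math.sqrt(n_features)))  # rounding down is an assumption
    nodesize = max(1, math.ceil(0.01 * n_obs))        # rounding up is an assumption
    return {"mtry": mtry, "nodesize": nodesize}

params = rf_starting_params(n_features=14, n_obs=202)
print(params)
```

These are only starting points; the values would still be tuned by cross-validation.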
Cross-validation with parameter tuning over mtry = 2, 3 and 4
The tuned model has an accuracy of 93.55%
Support Vector Machine
• The model found 41 support vectors with a gamma value of 0.017 and a cost of 1
• SVM model accuracy: 90.02%
• 10-fold cross-validation identified the best values as gamma = 0.1 and cost = 1
• The tuned model has an accuracy of 95.04%
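To give a feel for what the gamma parameter controls, the sketch below evaluates a radial basis function (RBF) kernel at the two gamma values mentioned. The slides do not name the kernel, so the RBF form is an assumption, and the two feature vectors are made up.

```python
import math

def rbf_kernel(x, y, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Two hypothetical feature vectors (not from the case data).
a = [1.0, 0.0, 35.0]
b = [0.0, 1.0, 52.0]

for gamma in (0.017, 0.1):
    print(f"gamma={gamma}: K(a, b) = {rbf_kernel(a, b, gamma):.6f}")
```

A larger gamma makes the similarity fall off faster with distance, so each support vector influences a smaller neighbourhood; the cost parameter separately controls how heavily misclassified training points are penalised.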
Summary - Model Performance
Model                     Accuracy   Precision   Recall
Decision Trees (Gini)     82%        82%         97%
K-Nearest Neighbor        93%        93%         100%
Logistic Regression       88%        96%         91%
Naïve Bayes               75%        76%         95%
Random Forest             94%        93%         100%
Support Vector Machines   95%        94%         99%
Observation
o Across all the models assessed, Support Vector Machine provides the best accuracy, with competitive precision and recall.
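For reference, accuracy, precision, and recall are all derived from the four cells of a confusion matrix. The sketch below uses hypothetical counts; the slides do not state which class is treated as positive or show the underlying confusion matrices.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of predicted positives that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of actual positives that are found
    return accuracy, precision, recall

# Hypothetical counts for illustration only.
accuracy, precision, recall = classification_metrics(tp=90, fp=6, fn=1, tn=3)
print(f"accuracy={accuracy:.2%} precision={precision:.2%} recall={recall:.2%}")
```

On skewed data, a model can reach high accuracy and recall for the majority class while rarely identifying the minority class, so the per-class breakdown matters.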
Summary and Findings
• Support Vector Machine provides the best accuracy
• Better precision and recall values were obtained from SVM and Gradient Boosting
• The sampled data is skewed towards Non-Guilty cases (89 : 11 in favor of Non-Guilty)
• The model has been developed on IPC 420 cases found across multiple District / High Courts
• The predictions obtained were mostly Non-Guilty
Plan Ahead
MOBILE APPLICATION
Q&A
Thanks

Predicting outcome of legal case using machine learning algorithms, by Ankita Singh (Service Delivery Specialist at IBM India) and Nilutpal Goswami (Senior Manager at Capgemini), at CYPHER 2018

  • 1. PREDICTING OUTCOME OF LEGAL CASES Ankita Singh Nilutpal Goswami
  • 2. Ankita Singh Nilutpal Goswami Agenda Domain Objective Data Extraction EDA Architecture Model Development Q & ASummary and Findings 1 2 3 4 5 6 7 8
  • 3. Ankita Singh Nilutpal Goswami Legal Systems of the World Source – Citation (http://saint-claire.org/)
  • 4. Ankita Singh Nilutpal Goswami Hierarchy of Indian Judiciary Sources of Law § Constitution § Legislation • Ordinary • Delegated • Ordinance § Judicial Precedent § Customs
  • 5. Ankita Singh Nilutpal Goswami Background Indian Judicial System o Largest judicial machinery based on the biggest constitution o Constitution of India have 448 articles in 25 parts, 12 schedules, 5 appendices and 98 amendments o Indian Penal Code (IPC) defines various crimes/offences and prescribes the punishment o Criminal Procedure Code (CrPC) defines the mandatory procedures to be carried out pursuing a case o 24 High Courts and over 600 district courts o Nearly 5 lakhs cases are filled daily in Indian Courts o Approximately 4.5 lakhs of cases are put up before the courts daily o About 2.5 lakhs of cases are disposed off daily
  • 6. Ankita Singh Nilutpal Goswami Challenging Facts CASES PENDING CIVIL CASES CRIMINAL CASES TOTAL CASES PERCENTAGE > 10 years 597,166 1,691,515 2,288,681 8.28% Between 5 to 10 years 1,244,117 3,212,377 4,456,494 16.13% Between 2 to 5 years 2,542,925 5,394,015 7,936,940 28.73% < 2 years 3,946,341 8,997,935 12,944,276 46.85% Total Pending Cases 8,330,549 19,295,842 27,626,391 > 10 years 8% Between 5 to 10 years 16% Between 2 to 5 years 29% < 2 years 47% Other 76% Source – National Judicial Data Grid (as on September 18th 2018) v Case Disposal Rates (August 2018) § 10 years – 1.5 % § All cases – 3.8 % v Cases filed daily ~ 5- 8 Lakhs v Cases pending registration ~ 7.5 Lakhs v Has 15 judges for every 1 million of people v 22.2 million undertrials – undertrials outnumber the convicts
  • 7. Ankita Singh Nilutpal Goswami Faster processing of legal issues / cases “Judgement” data sourcing and understanding of the details Evaluating predictions based on various machine learning model Develop social value by means of streamlining the judicial case intake Objective
  • 8. Ankita Singh Nilutpal Goswami Sample Judgement document snapshot o Case Documents Analyzed – 120 o Data extraction mechanism – manual o Unique fields extracted – 58 o Total number of final observations - 202 Data • Nature of Disposal • Case Type • Court Number • Court Name • Judge • Judge Gender • Judgement Date • Total Number of Sections • Section 1 thru Section 10 • FIR Number/Year • Police station • Investigating officer • Case Number • Year • Complainant • Total Accused • Accused # • Accused Name • Accused Gender • Accused Age • Accused Confessed? (plea) • Date Of first Hearing • Complainant advocate • Prosecution advocate • Advocate Defendant • Number of Prosecution witnesses • Names of prosecution witnesses • PW's Examined? • Number of hostile witnesses • Defense witnesses • Charge sheet • Points for consideration • Exhibits on behalf of prosecution P series • Number of exhibits considered • Exhibits on behalf of court Cseries • Exhibits on behalf of accused Dseries • Total Number of Material Objects • Charges proved • Charges not proved • Issues Proved • Issues Not Proved • Accused released on bail • Accused committed to prison • Sentence of Imprisonment granted • Fine with Imprisonment (Rs) • Term Served in Prison(days) • Set off (if any) • Judgement • Citations Original Features
  • 9. Ankita Singh Nilutpal Goswami • Source – Publicly available judgement documents • Case Documents Analyzed – 120 • Data extraction mechanism – manual • Unique fields extracted – 58 • Consistent features identified -15 (Judgement decision is the Target Variable) • Total number of final observations - 202 Data # Feature Name Description Datatype Value 1 ipc_420 Binary indicator to confirm if the case is filed under IPC 420 Categorical Yes=1, No=0 2 ipc_120b Binary indicator to confirm if the case is filed under IPC 120b Categorical Yes=1, No=0 3 ipc_471 Binary indicator to confirm if the case is filed under IPC 471 Categorical Yes=1, No=0 4 ipc_468 Binary indicator to confirm if the case is filed under IPC 468 Categorical Yes=1, No=0 5 ipc_34 Binary indicator to confirm if the case is filed under IPC 34 Categorical Yes=1, No=0 6 jud_gender Gender of the judge presiding over the case Categorical Male=0, Female=1 7 jud_date Date when judgement was meted Date Date 8 tot_sec Total number of sections filed for the case Numeric Number 9 case_no Unique number of the case Categorical Multiple Factors 10 comp Complainant name * String Name 11 tot_accu Total number of accused presented in the case Numeric Number 12 accu_gender Gender of the individual accused Categorical Male=0, Female=1 13 accu_no Sequence number of the accused Categorical Multiple Factors 14 accu_age Age of the accused Numeric Number 15 judgement Judgement given in the case Categorical Guilty=1, Not Guilty=0
  • 10. Feature Importance
  • 11. Exploratory Data Analysis: Guilty – 20, Non-Guilty – 182
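The class counts on this slide imply a majority-class baseline, which gives context for the accuracies reported on later slides. A minimal sketch of that arithmetic, using only the two counts given:

```python
# Class counts taken from the EDA slide.
guilty, not_guilty = 20, 182
total = guilty + not_guilty

# Accuracy of a trivial model that always predicts the majority class.
baseline_accuracy = max(guilty, not_guilty) / total

print(f"observations: {total}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.1%}")
```

A model that always answers Non-Guilty would already be right about nine times out of ten on this data, so accuracy alone says little about how well the Guilty class is detected.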
  • 12. Exploratory Data Analysis: Density Plot
  • 13. Exploratory Data Analysis: IPC sections frequency
  • 14. Exploratory Data Analysis: Correlation Matrix
  • 15. Architecture
  • 16. Development Methodology. Initial model development steps: (1) Data Collection, (2) Feed data to model, (3) Prediction. Post-implementation steps: Feedback.
  • 17. Model Development: Logistic Regression, K-Nearest Neighbor, Random Forest, Support Vector Machine
  • 18. Model Development
  • 19. Logistic Regression. Pseudo R-square: 45.4%, i.e. the full model improves on the intercept-only model by 45.4%. The log-likelihood ratio test rejects the null hypothesis that all betas are zero, so at least one beta is nonzero.
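The slide does not say which pseudo R-square was used. Assuming it is McFadden's, the statistic and the likelihood-ratio statistic can both be computed from the two log-likelihoods, as sketched below. The log-likelihood values are hypothetical, chosen only so that the result matches the 45.4% on the slide; they are not the authors' figures.

```python
def mcfadden_r2(ll_full, ll_null):
    """McFadden's pseudo R-square: 1 - LL(full) / LL(intercept-only)."""
    return 1.0 - ll_full / ll_null

def likelihood_ratio_stat(ll_full, ll_null):
    """LR statistic, chi-square distributed under H0 (all betas zero)
    with degrees of freedom equal to the number of predictors."""
    return 2.0 * (ll_full - ll_null)

# Hypothetical log-likelihoods (assumption, for illustration only).
ll_null = -65.0    # intercept-only model
ll_full = -35.49   # full model

pseudo_r2 = mcfadden_r2(ll_full, ll_null)
lr_stat = likelihood_ratio_stat(ll_full, ll_null)
print(f"pseudo R-square: {pseudo_r2:.3f}, LR statistic: {lr_stat:.2f}")
```

A large LR statistic relative to the chi-square critical value is what leads to rejecting the hypothesis that every coefficient is zero.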
  • 20. Logistic Regression. Accuracy: Training Sample – 92.9%, Validation Sample – 88.5%. Chart: Variable Importance.
  • 21. K-Nearest Neighbor. Cross-Validation: 10-fold cross-validation gave the best value at k=7. From the results, Accuracy and Kappa decrease after k=5.
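The selection of k by 10-fold cross-validation can be sketched in plain Python. This is not the authors' code, and the data below is synthetic (two well-separated toy clusters), so the chosen k has no bearing on the k=7 reported on the slide; the helper names are hypothetical.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_val_accuracy(data, k, folds=10, seed=0):
    """Mean accuracy of k-NN over `folds` held-out folds."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    scores = []
    for i in range(folds):
        test = rows[i::folds]  # every folds-th row is held out
        train = [r for j, r in enumerate(rows) if j % folds != i]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / folds

# Synthetic toy data, NOT the case data from the slides.
rng = random.Random(1)
toy = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(60)] + \
      [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(60)]

# Evaluate a few odd values of k and keep the most accurate one.
results = {k: cross_val_accuracy(toy, k) for k in (3, 5, 7, 9)}
best_k = max(results, key=results.get)
```

Odd values of k avoid tied votes in a two-class problem such as Guilty versus Non-Guilty.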
  • 22. K-Nearest Neighbor. The model was further tuned by using twoClassSummary and setting classProbs to TRUE. The tuned model has a better accuracy of 93.44%.
  • 23. Random Forest. Model parameters: ntree = 250, as the OOB error hardly changes after 250 trees; mtry = 3, initially taken as sqrt(total_no_of_features); nodesize = 3, roughly 1% of the total observations (202 observations).
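The starting values for mtry and nodesize follow from simple arithmetic. The sketch below reproduces them under two assumptions the slide does not state: that 14 predictors were used (the 15 consistent features minus the target), and that the square root is rounded down while the 1% figure is rounded up.

```python
import math

n_observations = 202   # from the slide
n_predictors = 14      # assumption: 15 consistent features minus the target

# Assumed rounding: floor for sqrt(p), ceiling for 1% of the observations.
mtry = math.floor(math.sqrt(n_predictors))
nodesize = math.ceil(0.01 * n_observations)

print(f"mtry = {mtry}, nodesize = {nodesize}")
```

If identifier-like columns such as case_no or comp were dropped before modelling, the predictor count would be lower, but any count from 9 to 15 still gives mtry = 3 under this rounding.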
  • 24. Random Forest. Cross-validation with parameter tuning over mtry = 2, 3 and 4. The tuned model has an accuracy of 93.55%.
  • 25. Support Vector Machine. The model found 41 support vectors with a gamma value of 0.017 and a cost of 1. SVM model accuracy: 90.02%.
  • 26. Support Vector Machine. 10-fold cross-validation identified the best values as gamma = 0.1 and cost = 1. The tuned model has an accuracy of 95.04%.
  • 27. Summary - Model Performance
    Observation
    o From the assessment of all the models, Support Vector Machine provides the highest accuracy (95%), with precision and recall close to the best values.
    Model | Accuracy (%) | Precision (%) | Recall (%)
    Decision Trees (Gini) | 82 | 82 | 97
    K-Nearest Neighbor | 93 | 93 | 100
    Logistic Regression | 88 | 96 | 91
    Naïve Bayes | 75 | 76 | 95
    Random Forest | 94 | 93 | 100
    Support Vector Machines | 95 | 94 | 99
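Accuracy, precision and recall all derive from a confusion matrix. The sketch below shows the standard formulas and applies them to a trivial predictor that labels every one of the 202 observations Non-Guilty. The slides do not state which class is treated as positive; treating Non-Guilty as positive is an assumption made here for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Always-predict-Non-Guilty baseline on the slide's class counts
# (182 Non-Guilty, 20 Guilty), with Non-Guilty as the assumed positive class.
accuracy, precision, recall = classification_metrics(tp=182, fp=20, fn=0, tn=0)
print(f"accuracy={accuracy:.1%} precision={precision:.1%} recall={recall:.1%}")
```

Under that assumption the trivial baseline already reaches about 90% accuracy and precision with 100% recall, so recall values near 100% in the table are partly a consequence of the class skew.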
  • 28. Summary and Findings
    o Support Vector Machine provides the best accuracy
    o Better Precision and Recall values were obtained from SVM and Gradient Boosting
    o The random sample of data is skewed towards Non-Guilty cases (89:11 in favor of Non-Guilty)
    o The model has been developed on IPC 420 cases found across multiple District / High Courts
    o Predictions obtained were mostly Non-Guilty
  • 29. Plan Ahead: Mobile Application
  • 30. Q&A. Thanks