Prediction of Credit Default by Continuous Optimization
1. 4th International Summer School
Achievements and Applications of Contemporary
Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 5-16, 2009
Prediction of Credit Default
by Continuous Optimization
Gerhard-Wilhelm Weber *
Efsun Kürüm, Kasırga Yıldırak
Institute of Applied Mathematics
Middle East Technical University, Ankara, Turkey
* Faculty of Economics, Management and Law, University of Siegen, Germany
Center for Research on Optimization and Control, University of Aveiro, Portugal
2. Outline
• Main Problem from Credit Default
• Logistic Regression and Performance Evaluation
• Cut-Off Values and Thresholds
• Classification and Optimization
• Nonlinear Regression
• Numerical Results
• Outlook and Conclusion
3. Main Problem from Credit Default
Whether a credit application should be approved or rejected.
Solution
Learning about the default probability of the applicant.
5. Logistic Regression
\[
\log \frac{P(Y = 1 \mid X = x_l)}{P(Y = 0 \mid X = x_l)} = \beta_0 + \beta_1 x_{l1} + \beta_2 x_{l2} + \cdots + \beta_p x_{lp} \qquad (l = 1, 2, \ldots, N)
\]
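A minimal sketch of how the log-odds model above turns into a default probability. The coefficient values and the two financial ratios are hypothetical and serve only as an illustration; the function name `default_probability` is not from the slides.

```python
import math

def default_probability(x, beta):
    """P(Y = 1 | X = x) implied by the log-odds model.

    beta[0] is the intercept beta_0; beta[1:] pair with the features in x.
    """
    log_odds = beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients and financial ratios, for illustration only.
beta = [-2.0, 1.5, -0.8]
firm = [0.4, 1.0]

p = default_probability(firm, beta)
# Taking the log-odds of p recovers beta_0 + beta_1 * x_1 + beta_2 * x_2.
log_odds_back = math.log(p / (1.0 - p))
```

Inverting the logit in this way is what lets a fitted model answer the approve-or-reject question in terms of a probability.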
6. Goal
Our study is based on one of the Basel II criteria, which recommends that a bank divide corporate firms into 8 rating grades, one of them being the default class.
We have two problems to solve here:
• To distinguish the defaults from the non-defaults.
• To put the non-default firms in order of credit quality and classify them into (sub)classes.
7. Data
The data have been collected by a bank from firms operating in the manufacturing sector in Turkey.
They cover the period between 2001 and 2006.
Originally, there are 54 qualitative variables and 36 quantitative variables.
Data on the quantitative variables are based on the balance sheets submitted by the firms' accountants.
Essentially, they are the well-known financial ratios.
The data set covers 3150 firms, of which 92 are in the state of default.
As the number of defaults is small, in order to overcome possible statistical problems, we downsize the number to 551, keeping all the default cases in the set.
8. We evaluate performance of the model
[Figure: non-default cases and default cases plotted against the test result value, with a cut-off value; ROC curve with axes TPF (sensitivity) and FPF (1-specificity).]
9. Model outcome versus truth
                     truth: d                         truth: n
model outcome dı     True Positive Fraction (TPF)     False Positive Fraction (FPF)
model outcome nı     False Negative Fraction (FNF)    True Negative Fraction (TNF)
total                1                                1
10. Definitions
• sensitivity (TPF) := P( Dı | D)
• specificity := P( NDı | ND )
• 1-specificity (FPF) := P( Dı | ND )
• points (TPF, FPF) constitute the ROC curve
• c := cut-off value
• c takes values between −∞ and ∞
• TPF(c) := P( z>c | D )
• FPF(c) := P( z>c | ND )
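The definitions above can be sketched empirically as follows. The scores z are hypothetical (higher meaning riskier), and the function names are illustrative; the fractions are simple sample estimates of P(z > c | D) and P(z > c | ND).

```python
def tpf_fpf(scores_default, scores_nondefault, c):
    """Empirical TPF(c) = P(z > c | D) and FPF(c) = P(z > c | ND)."""
    tpf = sum(z > c for z in scores_default) / len(scores_default)
    fpf = sum(z > c for z in scores_nondefault) / len(scores_nondefault)
    return tpf, fpf

def roc_points(scores_default, scores_nondefault):
    """ROC points, stored as (FPF, TPF) pairs, one per observed cut-off c."""
    cutoffs = sorted(set(scores_default) | set(scores_nondefault))
    points = [(1.0, 1.0)]  # c -> -infinity: every firm is flagged as default
    for c in cutoffs:
        tpf, fpf = tpf_fpf(scores_default, scores_nondefault, c)
        points.append((fpf, tpf))
    return points  # ends at (0.0, 0.0), reached at the largest observed score

# Hypothetical scores, for illustration only.
z_default = [0.9, 0.8, 0.6, 0.4]
z_nondefault = [0.7, 0.5, 0.3, 0.2, 0.1]

tpf, fpf = tpf_fpf(z_default, z_nondefault, c=0.55)
points = roc_points(z_default, z_nondefault)
```

Sweeping c from low to high moves the point from (1, 1) to (0, 0), which traces the whole ROC curve.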
12. normal-deviate axes
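[Figure: ROC curve (TPF against FPF) redrawn on normal-deviate axes, Normal Deviate (TPF) against Normal Deviate (FPF), with the points t and c marked.]
\[
FPF(c_i) := \Phi(c_i), \qquad TPF(c_i) := \Phi(a + b \cdot c_i), \qquad a := \frac{\mu_n - \mu_s}{\sigma_s}, \qquad b := \frac{\sigma_n}{\sigma_s}
\]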
13. Classification
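Ex.: cut-off values
[Figure: actually non-default cases and actually default cases along the cut-off axis c, which is partitioned from −∞ to ∞ into class I, class II, class III, class IV and class V.]
To assess the discriminative power of such a model, we calculate the Area Under the (ROC) Curve:
\[
AUC := \int_{-\infty}^{\infty} \Phi(a + b \cdot c) \, d\Phi(c).
\]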
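As a quick numerical check of the AUC integral, the sketch below evaluates it with the trapezoidal rule and compares the result with the standard binormal identity AUC = Φ(a / √(1 + b²)). That identity is not stated in the slides, and the values of a and b are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Phi is STD_NORMAL.cdf, its density is STD_NORMAL.pdf

def auc_numeric(a, b, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoidal approximation of the integral of Phi(a + b*c) dPhi(c).

    dPhi(c) = phi(c) dc; the range is truncated to [lo, hi] because the
    normal density is negligible outside it.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        c = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * STD_NORMAL.cdf(a + b * c) * STD_NORMAL.pdf(c)
    return total * h

def auc_closed_form(a, b):
    """Binormal identity: AUC = Phi(a / sqrt(1 + b^2))."""
    return STD_NORMAL.cdf(a / math.sqrt(1.0 + b * b))

a, b = 1.2, 0.9  # hypothetical parameter values
auc_trapezoid = auc_numeric(a, b)
auc_exact = auc_closed_form(a, b)
```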
15. Optimization in Credit Default
Problem:
To obtain simultaneously the thresholds and the parameters a and b
that maximize AUC,
while balancing the size of the classes (regularization)
and guaranteeing good accuracy.
17. Optimization Problem
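\[
\max_{a,\, b,\, \tau} \quad \alpha_1 \int_0^1 \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr) \, dt \; - \; \alpha_2 \sum_{i=0}^{R-1} \Bigl( \frac{\gamma_i}{n} - (t_{i+1} - t_i) \Bigr)^2
\]
subject to
\[
\int_{t_i}^{t_{i+1}} \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr) \, dt \; \ge \; \delta_i > 0 \quad (i = 0, 1, \ldots, R-1) \quad \Rightarrow \quad t_{i+1} > t_i,
\]
\[
\tau := (t_1, t_2, \ldots, t_{R-1})^T, \qquad t_0 = 0, \quad t_R = 1.
\]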
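A minimal sketch of how the objective and the constraints of this problem could be evaluated for a given (a, b, τ). The slides do not define γ_i and n; here γ_i is read as the number of firms in class i and n as the total number of firms, which is an assumption. All numbers are hypothetical, and the midpoint rule is only one possible quadrature.

```python
from statistics import NormalDist

N01 = NormalDist()

def roc(t, a, b):
    """Binormal ROC value Phi(a + b * Phi^{-1}(t)) for 0 < t < 1."""
    return N01.cdf(a + b * N01.inv_cdf(t))

def integral_roc(lo, hi, a, b, steps=2000):
    """Midpoint-rule integral of the ROC curve over [lo, hi].

    Midpoints avoid evaluating Phi^{-1} at exactly 0 or 1.
    """
    h = (hi - lo) / steps
    return h * sum(roc(lo + (k + 0.5) * h, a, b) for k in range(steps))

def objective(a, b, t, gamma, n, alpha1, alpha2):
    """alpha1 * AUC - alpha2 * sum_i (gamma_i / n - (t_{i+1} - t_i))^2."""
    auc = integral_roc(0.0, 1.0, a, b)
    balance = sum((g / n - (t[i + 1] - t[i])) ** 2 for i, g in enumerate(gamma))
    return alpha1 * auc - alpha2 * balance

def constraints_hold(a, b, t, delta):
    """True if the ROC integral over [t_i, t_{i+1}] is >= delta_i for every i."""
    return all(integral_roc(t[i], t[i + 1], a, b) >= d for i, d in enumerate(delta))

# Hypothetical instance with R = 3 classes.
a, b = 1.2, 0.9
t = [0.0, 0.2, 0.6, 1.0]   # t_0 = 0, thresholds t_1 and t_2, t_R = 1
gamma = [20, 40, 40]       # assumed: firms per class
n = 100                    # assumed: total number of firms
delta = [0.05, 0.2, 0.3]   # required accuracy per class

value = objective(a, b, t, gamma, n, alpha1=1.0, alpha2=1.0)
feasible = constraints_hold(a, b, t, delta)
```

In this instance the class widths match γ_i / n exactly, so the regularization term vanishes and the objective reduces to α1 times the AUC.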
18. Over the ROC Curve
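[Figure: ROC curve with TPF plotted against FPF; the region under the curve is AUC and the region over it is 1-AUC; the thresholds t0, t1, t2, t3, t4, t5 are marked on the FPF axis.]
\[
AOC := \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr) \, dt
\]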
19. New Version of the Optimization Problem
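\[
\min_{a,\, b,\, \tau} \quad \alpha_2 \sum_{i=0}^{R-1} \Bigl( \frac{\gamma_i}{n} - (t_{i+1} - t_i) \Bigr)^2 + \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr) \, dt
\]
subject to
\[
\int_{t_j}^{t_{j+1}} \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr) \, dt \; \le \; t_{j+1} - t_j - \delta_j \quad (j = 0, 1, \ldots, R-1)
\]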
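The step from the maximization problem to this minimization form can be spelled out as follows, using only the definitions already given. Writing ROC(t) := Φ(a + b ⋅ Φ^{-1}(t)) as a shorthand (the name ROC(t) is introduced here only for brevity):

\[
\int_{t_j}^{t_{j+1}} \bigl(1 - \mathrm{ROC}(t)\bigr) \, dt = (t_{j+1} - t_j) - \int_{t_j}^{t_{j+1}} \mathrm{ROC}(t) \, dt ,
\]
so that
\[
\int_{t_j}^{t_{j+1}} \mathrm{ROC}(t) \, dt \ge \delta_j
\quad \Longleftrightarrow \quad
\int_{t_j}^{t_{j+1}} \bigl(1 - \mathrm{ROC}(t)\bigr) \, dt \le t_{j+1} - t_j - \delta_j .
\]
With t_0 = 0 and t_R = 1 the same identity gives AOC = 1 - AUC, hence
\[
\alpha_1 \cdot AUC - \alpha_2 \sum_{i=0}^{R-1} \Bigl( \frac{\gamma_i}{n} - (t_{i+1} - t_i) \Bigr)^2
= \alpha_1 - \Bigl[ \alpha_2 \sum_{i=0}^{R-1} \Bigl( \frac{\gamma_i}{n} - (t_{i+1} - t_i) \Bigr)^2 + \alpha_1 \cdot AOC \Bigr] ,
\]
and maximizing the left-hand side is the same as minimizing the bracket, since the constant α1 does not affect the minimizer.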
20. Regression in Credit Default
Optimization problem:
To obtain simultaneously the thresholds and the parameters a and b
that maximize AUC,
while balancing the size of the classes (regularization)
and guaranteeing good accuracy
discretization of integral
nonlinear regression problem
21. Discretization of the Integral
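Riemann-Stieltjes integral
\[
AUC = \int_{-\infty}^{\infty} \Phi(a + b \cdot c) \, d\Phi(c)
\]
Riemann integral
\[
AUC = \int_0^1 \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr) \, dt
\]
Discretization
\[
AUC \approx \sum_{k=1}^{R} \Phi\bigl(a + b \cdot \Phi^{-1}(t_k)\bigr) \cdot \Delta t_k
\]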
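The discretization can be sketched as a plain Riemann sum. The slides do not say where in each subinterval t_k is taken; the sketch assumes the midpoint, so that Φ^{-1} is never evaluated at 0 or 1. The values of a, b and the grid are hypothetical, and the closed form used for comparison is the standard binormal identity, not a formula from the slides.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

def auc_riemann(a, b, grid):
    """Sum over k of Phi(a + b * Phi^{-1}(t_k)) * dt_k on a partition of [0, 1].

    grid holds the partition points 0 = s_0 < s_1 < ... < s_R = 1;
    t_k is assumed to be the midpoint of [s_{k-1}, s_k].
    """
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        t_k = 0.5 * (left + right)
        total += N01.cdf(a + b * N01.inv_cdf(t_k)) * (right - left)
    return total

a, b = 1.2, 0.9                       # hypothetical parameter values
R = 1000
grid = [k / R for k in range(R + 1)]  # uniform partition, dt_k = 1 / R

approx = auc_riemann(a, b, grid)
exact = N01.cdf(a / math.sqrt(1.0 + b * b))  # binormal closed form for comparison
```

Once the integral is a finite sum in a, b and the grid points, the problem takes the shape of a nonlinear regression.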
22. Optimization Problem with Penalty Parameters
If any one of these constraints is violated, we introduce penalty parameters.
As a penalty parameter is increased, the iterates are forced
towards the feasible set of the optimization problem.
\[
\Pi_\Theta(a, b, \tau) := \alpha_2 \sum_{i=0}^{R-1} \Bigl( \frac{\gamma_i}{n} - (t_{i+1} - t_i) \Bigr)^2 + \alpha_1 \int_0^1 \bigl(1 - \Phi(a + b \cdot \Phi^{-1}(t))\bigr) \, dt + \alpha_3 \sum_{j=0}^{R-1} \theta_j \cdot \underbrace{\Bigl( \delta_j - \int_{t_j}^{t_{j+1}} \Phi\bigl(a + b \cdot \Phi^{-1}(t)\bigr) \, dt \Bigr)}_{=:\ \Psi_j(a, b, \tau)}
\]
\[
\Theta := (\theta_0, \theta_1, \ldots, \theta_{R-1})^T, \qquad \theta_j \ge 0 \quad (j = 0, 1, \ldots, R-1)
\]
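A minimal sketch of the penalized function, under two assumptions: the over-the-curve term enters with a positive sign, matching the minimization problem, and γ_i / n is the share of firms in class i. All numbers are hypothetical; δ_0 is deliberately chosen too demanding so that Ψ_0 > 0.

```python
from statistics import NormalDist

N01 = NormalDist()

def integral_roc(lo, hi, a, b, steps=2000):
    """Midpoint-rule integral of Phi(a + b * Phi^{-1}(t)) over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(N01.cdf(a + b * N01.inv_cdf(lo + (k + 0.5) * h))
                   for k in range(steps))

def psi(j, a, b, t, delta):
    """Psi_j = delta_j - ROC integral over [t_j, t_{j+1}]; positive if violated."""
    return delta[j] - integral_roc(t[j], t[j + 1], a, b)

def penalty_function(a, b, t, gamma, n, delta, theta, alpha1, alpha2, alpha3):
    """alpha2 * class balance + alpha1 * AOC + alpha3 * sum_j theta_j * Psi_j."""
    balance = sum((g / n - (t[i + 1] - t[i])) ** 2 for i, g in enumerate(gamma))
    aoc = 1.0 - integral_roc(0.0, 1.0, a, b)
    penalty = sum(theta[j] * psi(j, a, b, t, delta) for j in range(len(delta)))
    return alpha2 * balance + alpha1 * aoc + alpha3 * penalty

# Hypothetical instance; the first constraint cannot hold because the
# ROC integral over [0, 0.2] is at most 0.2.
a, b = 1.2, 0.9
t = [0.0, 0.2, 0.6, 1.0]
gamma, n = [20, 40, 40], 100
delta = [0.3, 0.2, 0.3]

violation = psi(0, a, b, t, delta)
values = [penalty_function(a, b, t, gamma, n, delta, [th, 0.0, 0.0], 1.0, 1.0, 1.0)
          for th in (0.0, 1.0, 5.0)]
```

Raising θ_0 makes the violated constraint more expensive, which is what pushes the iterates back towards the feasible set.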
25. Nonlinear Regression
\[
\min_{\beta} \; f(\beta) = \sum_{j=1}^{N} \bigl( d_j - g(x_j, \beta) \bigr)^2 =: \sum_{j=1}^{N} f_j^2(\beta)
\]
\[
F(\beta) := \bigl( f_1(\beta), \ldots, f_N(\beta) \bigr)^T
\]
\[
\min_{\beta} \; f(\beta) = F^T(\beta) F(\beta)
\]
26. Nonlinear Regression
\[
\beta_{k+1} := \beta_k + q_k
\]
• Gauss-Newton method:
\[
\nabla F(\beta) \nabla^T F(\beta) \, q = -\nabla F(\beta) F(\beta)
\]
• Levenberg-Marquardt method (λ ≥ 0):
\[
\bigl( \nabla F(\beta) \nabla^T F(\beta) + \lambda I_p \bigr) \, q = -\nabla F(\beta) F(\beta)
\]
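The Levenberg-Marquardt update above can be sketched for a two-parameter problem, where the linear system is 2x2 and can be solved by Cramer's rule. The model g(x, β) = β0 ⋅ exp(β1 ⋅ x), the data and the rule for adapting λ are hypothetical choices for illustration, not the credit-default model of the slides.

```python
import math

def model(x, beta):
    """Hypothetical model g(x, beta) = beta_0 * exp(beta_1 * x)."""
    return beta[0] * math.exp(beta[1] * x)

def residuals(xs, ds, beta):
    """F(beta), with components f_j = d_j - g(x_j, beta)."""
    return [d - model(x, beta) for x, d in zip(xs, ds)]

def jacobian_rows(xs, beta):
    """Row j holds the partial derivatives of f_j w.r.t. beta_0 and beta_1."""
    return [(-math.exp(beta[1] * x), -beta[0] * x * math.exp(beta[1] * x))
            for x in xs]

def lm_step(xs, ds, beta, lam):
    """Solve (grad F grad^T F + lam * I) q = -grad F F for p = 2."""
    f = residuals(xs, ds, beta)
    jac = jacobian_rows(xs, beta)
    a11 = sum(r[0] * r[0] for r in jac) + lam
    a12 = sum(r[0] * r[1] for r in jac)
    a22 = sum(r[1] * r[1] for r in jac) + lam
    g1 = -sum(r[0] * fj for r, fj in zip(jac, f))
    g2 = -sum(r[1] * fj for r, fj in zip(jac, f))
    det = a11 * a22 - a12 * a12
    return ((g1 * a22 - a12 * g2) / det, (a11 * g2 - a12 * g1) / det)

def cost(xs, ds, beta):
    """f(beta) = F^T(beta) F(beta)."""
    return sum(fj * fj for fj in residuals(xs, ds, beta))

def fit(xs, ds, beta, lam=1e-2, iterations=100):
    """Iterate beta_{k+1} = beta_k + q_k, accepting only steps that lower the cost."""
    for _ in range(iterations):
        q = lm_step(xs, ds, beta, lam)
        candidate = (beta[0] + q[0], beta[1] + q[1])
        if cost(xs, ds, candidate) < cost(xs, ds, beta):
            beta, lam = candidate, lam / 10.0  # accept, move towards Gauss-Newton
        else:
            lam *= 10.0                        # reject, damp the step more strongly
    return beta

# Hypothetical noise-free data generated with beta = (2.0, 0.5).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ds = [2.0 * math.exp(0.5 * x) for x in xs]
beta_hat = fit(xs, ds, (1.0, 0.1))
```

With λ = 0 the step reduces to the Gauss-Newton step; a larger λ shortens the step and keeps the system well conditioned.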
28. Nonlinear Regression
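alternative solution
\[
\min_{t,\, q} \; t
\]
subject to
\[
\bigl\| \bigl( \nabla F(\beta) \nabla^T F(\beta) + \lambda I_p \bigr) q - \bigl( -\nabla F(\beta) F(\beta) \bigr) \bigr\|_2 \le t, \qquad t \ge 0,
\]
\[
\| L q \|_2 \le M
\]
conic quadratic programming
interior point methods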
30. Numerical Results
Accuracy Error in Each Class
I II III IV V VI VII VIII
0.0000 0.0000 0.0000 0.0001 0.0001 0.0010 0.0010 0.0075
0.0000 0.0000 0.0000 0.0001 0.0001 0.0010 0.0018 0.0094
0.0000 0.0000 0.0000 0.0000 0.0001 0.0002 0.0018 0.0059
0.0000 0.0000 0.0000 0.0001 0.0001 0.0006 0.0018 0.0075
Number of Firms in Each Class
I II III IV V VI VII VIII
4 56 27 133 115 102 129 61
2 42 52 120 119 111 120 61
4 43 40 129 114 116 120 61
4 56 24 136 106 129 111 61
Number of firms in each class at the beginning: 10, 26, 58, 106, 134, 121, 111, 61