The document describes a decision support system that uses a hybrid of decision tree and neural network algorithms. The system was developed to handle both loan-granting decisions and clinical decisions for eye-disease diagnosis. It uses decision trees to segment customers or diseases into clusters and then feeds the resulting rules into a neural network to better predict the risk of loan default or the presence of eye disease. The system was analyzed and designed using object-oriented methods and implemented in the C programming language with a MATLAB engine. It achieved an 88% success rate in evaluations.
This document proposes a new approach to reducing the computational time of credit scoring models, based on support vector machines (SVM) and stratified sampling. The approach incorporates F-score feature selection into the SVM and builds the credit scoring model from a sample of the dataset rather than the whole dataset. The new method is shown to achieve accuracy competitive with other methods while significantly reducing computational time by optimizing the feature set and reducing the dataset size.
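A minimal sketch of this idea, assuming scikit-learn and synthetic data; the ANOVA F-statistic from `f_classif` stands in for the paper's F-score, and the 10% sample fraction and SVM settings are illustrative choices, not the paper's:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a credit dataset: 20,000 applicants, 10% "bad".
X, y = make_classification(n_samples=20000, n_features=30, weights=[0.9, 0.1], random_state=0)

# Stratified sample: keep only 10% of the data, preserving the good/bad ratio.
X_sample, _, y_sample, _ = train_test_split(X, y, train_size=0.1, stratify=y, random_state=0)

# F-score-style filter (ANOVA F-statistic) selecting 10 features, then an RBF SVM.
model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=10), SVC(kernel="rbf"))
model.fit(X_sample, y_sample)
print("training-set accuracy:", model.score(X_sample, y_sample))
```

Because the sample is stratified, the class ratio of the full dataset is preserved, which is what lets the smaller training set remain representative while the SVM trains far faster.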
Default Probability Prediction using Artificial Neural Networks in R Programming (Vineet Ojha)
The objective of the project is to analyze the ability of the developed Artificial Neural Network model to forecast the credit risk profile of retail banking loan consumers and credit card customers. From a theoretical point of view, the project presents a literature review on the detailed workings and applications of Artificial Neural Networks for credit risk management. Practically, the aim of the project is to present a model for estimating the Probability of Default using an Artificial Neural Network, so as to capture the benefits of non-linear models.
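The project itself is written in R; purely as an illustration, and in Python rather than the project's R, a minimal probability-of-default network of this kind might look like the following, with synthetic data and assumed layer sizes:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic borrower features; roughly 8% of customers default.
X, y = make_classification(n_samples=5000, n_features=12, weights=[0.92, 0.08], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

# Small feed-forward network; layer sizes are an assumption, not the project's.
pd_model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=1),
)
pd_model.fit(X_tr, y_tr)

# predict_proba for the positive class is the estimated Probability of Default.
pd_estimates = pd_model.predict_proba(X_te)[:, 1]
print("mean estimated PD on the test set:", pd_estimates.mean())
```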
An Explanation Framework for Interpretable Credit Scoring (gerogepatton)
With the recent boosted enthusiasm in Artificial Intelligence (AI) and Financial Technology (FinTech), applications such as credit scoring have gained substantial academic interest. However, despite the ever-growing achievements, the biggest obstacle in most AI systems is their lack of interpretability. This deficiency of transparency limits their application in different domains, including credit scoring. Credit scoring systems help financial experts make better decisions regarding whether or not to accept a loan application, so that loans with a high probability of default are not accepted. Apart from the noisy and highly imbalanced data challenges faced by such credit scoring models, recent regulations such as the 'right to explanation' introduced by the General Data Protection Regulation (GDPR) and the Equal Credit Opportunity Act (ECOA) have added the need for model interpretability, to ensure that algorithmic decisions are understandable and coherent. A recently introduced concept is eXplainable AI (XAI), which focuses on making black-box models more interpretable. In this work, we present a credit scoring model that is both accurate and interpretable. For classification, state-of-the-art performance on the Home Equity Line of Credit (HELOC) and Lending Club (LC) datasets is achieved using the Extreme Gradient Boosting (XGBoost) model. The model is then further enhanced with a 360-degree explanation framework, which provides the different explanations (i.e. global, local feature-based and local instance-based) that are required by different people in different situations. Evaluation through the use of functionally-grounded, application-grounded and human-grounded analysis shows that the explanations provided are simple and consistent as well as correct, effective, easy to understand, sufficiently detailed and trustworthy.
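As a hedged sketch of the classification step only (the 360-degree explanation framework is the paper's own contribution and is not reproduced here), XGBoost on synthetic tabular data with a gain-based global feature ranking might look like this:

```python
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for HELOC/Lending Club-style tabular credit data.
X, y = make_classification(n_samples=10000, n_features=20, weights=[0.8, 0.2], random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=2)

# Hyperparameters are illustrative assumptions, not the paper's tuned values.
clf = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1, eval_metric="logloss")
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))

# One simple form of global explanation: rank features by gain-based importance.
importance = clf.get_booster().get_score(importance_type="gain")
for feat, gain in sorted(importance.items(), key=lambda kv: -kv[1])[:5]:
    print(feat, round(gain, 3))
```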
CRESUS: A TOOL TO SUPPORT COLLABORATIVE REQUIREMENTS ELICITATION THROUGH ENHA... (cscpconf)
Communicating an organisation's requirements in a semantically consistent and understandable manner and then reflecting the potential impact of those requirements on the IT infrastructure presents a major challenge among stakeholders. Initial research findings indicate a desire among business executives for a tool that allows them to communicate organisational changes using natural language and a simulation of the IT infrastructure that supports those changes. Building on a detailed analysis and evaluation of these findings, the innovative CRESUS tool was designed and implemented. The purpose of this research was to investigate to what extent CRESUS both aids communication in the development of a shared understanding and supports collaborative requirements elicitation to bring about organisational, and associated IT infrastructural, change. This paper presents promising results that show how such a tool can facilitate collaborative requirements elicitation through increased communication around organisational change and the IT infrastructure.
DEVELOPING PREDICTION MODEL OF LOAN RISK IN BANKS USING DATA MINING (mlaij)
Nowadays there are many risks related to bank loans, both for the bank and for those who receive the loans, and analyzing that risk first requires understanding what risk means. In addition, the number of transactions in the banking sector is growing rapidly, huge data volumes representing customer behavior are available, and the risks around loans have increased. Data mining is one of the most motivating and vital areas of research, aiming to extract information from the tremendous amounts of accumulated data. This paper presents a new model for classifying loan risk in the banking sector using data mining. The model has been built using data from the banking sector to predict the status of loans. Three algorithms have been used to build the proposed model: J48, BayesNet and NaiveBayes. The model has been implemented and tested using the Weka application. The results have been discussed and a full comparison between the algorithms was conducted; J48 was selected as the best algorithm based on accuracy.
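The paper's model is built in Weka; as a rough scikit-learn analogue under stated assumptions (J48 is Weka's C4.5 implementation, approximated here by an entropy-criterion decision tree; BayesNet has no built-in scikit-learn counterpart and is omitted), a comparable accuracy comparison might be:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the banking-sector loan data.
X, y = make_classification(n_samples=3000, n_features=15, random_state=3)

models = [
    # J48 is Weka's C4.5; an entropy-criterion tree is only a rough analogue.
    ("decision tree (J48-like)", DecisionTreeClassifier(criterion="entropy", random_state=3)),
    ("naive Bayes", GaussianNB()),
]
for name, clf in models:
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```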
Implementing data-driven decision support system based on independent educati... (IJECEIAES)
Decision makers in the educational field always seek new technologies and tools that provide solid, fast answers to support the decision-making process. They need a platform that utilizes students’ academic data and turns it into knowledge for making the right strategic decisions. In this paper, a roadmap for implementing a data-driven decision support system (DSS) is presented, based on an educational data mart. The independent data mart is implemented on students’ degrees in 8 subjects at a private school (AlIskandaria Primary School in Basrah province, Iraq). The DSS implementation roadmap starts from pre-processing the paper-based data source and ends with providing three categories of online analytical processing (OLAP) queries (multidimensional OLAP, desktop OLAP and web OLAP). A key performance indicator (KPI) is implemented as an essential part of the educational DSS to measure school performance. The static evaluation method shows that the proposed DSS meets the privacy, security and performance requirements, with no errors found after inspecting the DSS knowledge base. The evaluation shows that a data-driven DSS based on an independent data mart with KPIs and OLAP is one of the best platforms to support short- to long-term academic decisions.
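As a small illustration (in Python/pandas, which the paper does not use; the table contents, subject names and the 70-mark pass threshold are assumptions) of the kind of OLAP-style slice and KPI such a data mart supports:

```python
import pandas as pd

# Toy data mart of students' degrees (marks); contents are made up.
marks = pd.DataFrame({
    "student": ["a", "a", "b", "b", "c", "c"],
    "subject": ["math", "science", "math", "science", "math", "science"],
    "term":    [1, 1, 1, 1, 2, 2],
    "degree":  [78, 85, 62, 58, 91, 74],
})

# OLAP-style slice: average degree by subject and term.
print(marks.pivot_table(values="degree", index="subject", columns="term", aggfunc="mean"))

# Simple KPI: share of marks at or above an (assumed) 70-mark pass threshold.
print(f"pass-rate KPI: {(marks['degree'] >= 70).mean():.0%}")
```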
The advantages of a connected device can be explored through the different categories of needs, by trialling a range of solutions and considering a framework of manageable steps.
Dr. Kamran Sartipi has extensive experience in research and innovation across several fields including software engineering, data analytics, information security, and healthcare informatics. He has published over 100 papers and books on topics such as software system analysis, architecture recovery, decision support systems, and security and privacy in distributed systems. Currently, he is leading two large research projects involving intelligent middleware security, user behavior pattern discovery, and knowledge extraction from medical data across multiple data centers.
A Review on Credit Card Default Modelling using Data Science (YogeshIJTSRD)
In the last few years, credit cards have become one of the major consumer lending products in the U.S. as well as several other developed nations, representing roughly 30% of total consumer lending (USD 3.6 tn in 2016). Credit cards issued by banks hold the majority of the market share, with approximately 70% of the total outstanding balance. Banks' credit card charge-offs have stabilized after the financial crisis at around 3% of the total outstanding balance. However, there are still differences in credit card charge-off levels between competitors. Harsh Nautiyal, Ayush Jyala and Dishank Bhandari, "A Review on Credit Card Default Modelling using Data Science", published in International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Special Issue: International Conference on Advances in Engineering, Science and Technology 2021, May 2021. URL: https://www.ijtsrd.com/papers/ijtsrd42461.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/42461/a-review-on-credit-card-default-modelling-using-data-science/harsh-nautiyal
Identification of important features and data mining classification technique... (IJECEIAES)
Employee absenteeism at work costs organizations billions a year. Predicting employees’ absenteeism and the reasons behind their absence helps organizations reduce expenses and increase productivity. Data mining turns the vast volume of human resources data into information that can help in decision-making and prediction. Although the selection of features is a critical step in data mining for enhancing the efficiency of the final prediction, it is not yet known which method of feature selection is better. Therefore, this paper compares the performance of three well-known feature selection methods in absenteeism prediction: relief-based feature selection, correlation-based feature selection and information-gain feature selection. In addition, this paper aims to find the best combination of feature selection method and data mining technique for enhancing absenteeism prediction accuracy. Seven classification techniques were used as the prediction model, and a cross-validation approach was utilized to assess the applied prediction models for more realistic and reliable results. The dataset used was built at a courier company in Brazil from records of absenteeism at work. In the experimental results, correlation-based feature selection surpasses the other methods across the performance measurements. Furthermore, the bagging classifier was the best-performing data mining technique when features were selected using correlation-based feature selection, with an accuracy rate of 92%.
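A minimal sketch of this kind of comparison, assuming scikit-learn and synthetic data; only two of the three selectors are shown (the F-test as a correlation-style filter and mutual information for information gain), since ReliefF requires a third-party package such as skrebate:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the absenteeism records.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=6, random_state=4)

selectors = [
    ("correlation-style filter (F-test)", f_classif),
    ("information gain (mutual information)", mutual_info_classif),
]
for name, score_fn in selectors:
    pipe = make_pipeline(SelectKBest(score_fn, k=8), BaggingClassifier(random_state=4))
    acc = cross_val_score(pipe, X, y, cv=10).mean()
    print(f"{name} + bagging: mean accuracy {acc:.3f}")
```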
The document proposes using several machine learning algorithms, including multilayer perceptron neural networks, logistic regression, support vector machines, AdaBoostM1, and Hidden Layer Learning Vector Quantization (HLVQ-C), to improve personal credit scoring accuracy. The algorithms were tested on a large dataset from a Portuguese bank containing over 400,000 entries. HLVQ-C achieved the most accurate results, outperforming traditional linear methods. The document introduces a "usefulness" measure to evaluate classifiers based on earnings from correctly denying credit to risky applicants and losses from misclassifications.
ARABIC TEXT MINING AND ROUGH SET THEORY FOR DECISION SUPPORT SYSTEM (Hassanein Alwan)
The present work implements a classification system for Arabic complaints. The system is built on decision support systems, text mining and rough set theory, together with a specialized semantic correlation approach, referred to as the ASN model, that creates a semantic structure over all of a document's words. A particular semantic weight represents the true significance of each term in the text. Feature selection is semantic and based on two threshold values: the first is the maximum weight in a document, and the second is the characteristic with the greatest semantic similarity to the first. Classification in the testing stage relies on the dependency-degree concept of rough set theory, treating the characteristics of each class produced by the training stage as condition rules, which are therefore placed in the lower approximation set. The outputs of this classification system are adequate, its performance is quite convincing, and the system is assessed using precision measurements.
CRESUS-T: A COLLABORATIVE REQUIREMENTS ELICITATION SUPPORT TOOL (ijseajournal)
Communicating an organisation's requirements in a semantically consistent and understandable manner and then reflecting the potential impact of those requirements on the IT infrastructure presents a major challenge among stakeholders. Initial research findings indicate a desire among business executives for a tool that allows them to communicate organisational changes using natural language and a model of the IT infrastructure that supports those changes. Building on a detailed analysis and evaluation of these findings, the innovative CRESUS-T support tool was designed and implemented. The purpose of this research was to investigate to what extent CRESUS-T both aids communication in the development of a shared understanding and supports collaborative requirements elicitation to bring about organisational, and associated IT infrastructural, change. In order to determine the extent to which shared understanding was fostered, the support tool was evaluated in a case study of a business process for the roll-out of the IT software image at a third-level educational institution. Statistical analysis showed that the CRESUS-T support tool fostered shared understanding in the case study, through increased communication. Shared understanding is also manifested in the creation of two knowledge representation artefacts, namely a requirements model and the IT infrastructure model. The CRESUS-T support tool will be useful to requirements engineers and business analysts who have to gather requirements asynchronously.
Biometric Identification and Authentication Providence using Fingerprint for ... (IJECEIAES)
The rise in recent cloud computing security incidents makes securing data a key challenge. To address this problem, this paper presents an integration of mobile and cloud computing: mobile biometric authentication in cloud computing. Biometric authentication is used to enhance security, since mobile cloud computing is popular among mobile users. The paper examines how mobile cloud computing (MCC) addresses the security issue with a fingerprint biometric authentication model. From the fingerprint biometric, a secret code is generated using an entropy value, which enables a person to request access to the data on a desktop computer. When the person requests access from the authorized user via Bluetooth on a mobile device, the authorized user grants access through the fingerprint secret code. Finally, the fingerprint is verified against the database on the desktop computer; if it matches, the requesting person can access the computer.
Clustering Prediction Techniques in Defining and Predicting Customers Defecti... (IJECEIAES)
With the growth of the e-commerce sector, customers have more choices, a fact which encourages them to divide their purchases amongst several e-commerce sites and compare competitors' products, yet this increases the risk of churning. A review of the literature on customer churn models reveals that no prior research had considered both partial and total defection in non-contractual online environments; instead, studies focused on either total or partial defection. This study proposes a customer churn prediction model in an e-commerce context, wherein a clustering phase is based on the integration of the k-means method and the Length-Recency-Frequency-Monetary (LRFM) model. This phase is employed to define churn, followed by a multi-class prediction phase based on three classification techniques (simple decision tree, artificial neural networks and decision tree ensemble), in which the dependent variable classifies a particular customer as continuing loyal buying patterns (non-churned), a partial defector (partially-churned), or a total defector (totally-churned). Macro-averaging measures, including average accuracy and macro-averages of precision, recall and F1, are used to evaluate classifier performance under 10-fold cross-validation. Using real data from an online store, the results show the efficiency of the decision tree ensemble model over the other models in identifying both future partial and total defection.
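A minimal sketch of the two-phase idea under stated assumptions (synthetic LRFM values, a random forest standing in for the decision tree ensemble, and cluster labels used directly as churn segments):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
# Columns: Length, Recency, Frequency, Monetary (synthetic, standardized).
lrfm = rng.normal(size=(1500, 4))

# Phase 1: cluster customers on LRFM; the clusters would then be interpreted
# as non-churned / partially-churned / totally-churned segments.
segments = KMeans(n_clusters=3, n_init=10, random_state=5).fit_predict(lrfm)

# Phase 2: multi-class prediction of the segment, evaluated with 10-fold CV
# as in the paper; a random forest stands in for the decision tree ensemble.
clf = RandomForestClassifier(random_state=5)
print("mean CV accuracy:", cross_val_score(clf, lrfm, segments, cv=10).mean())
```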
Identical Users in Different Social Media Provides Uniform Network Structure ... (IJMTST Journal)
The primary aim of this project is to secure user login and data sharing across social networks such as Gmail and Facebook, and to identify anonymous users of these networks. If the original user is not active on the network, but their friends or an anonymous user know their login details, it becomes possible to misuse their chats. This project seeks to defeat anonymous users who use the network without the original user's knowledge, for example unauthorized users logging in to chat or to share pictures and videos. To overcome this problem, users first register their details together with a secured question and answer. Because an anonymous user can delete chats or data, the secured questions are used to recover the unauthorized user's chat history and sharing details, along with their IP address or MAC address. In this way, the project provides an approach to keep anonymous users from misusing the original user's login.
An overview of information extraction techniques for legal document analysis ... (IJECEIAES)
In the Indian legal system, the courts publish their legal proceedings every month for future reference by legal experts and the public. Extensive manual labor and time are required to analyze and process the information stored in these lengthy, complex legal documents. Automatic legal document processing is the solution to the drawbacks of manual processing and will be very helpful to the common person in better understanding the legal domain. In this paper, we explore recent advances in the field of legal text processing and provide a comparative analysis of the approaches used. We divide the approaches into three classes: NLP-based, deep learning-based, and KBP-based. We put special emphasis on the KBP approach, as we strongly believe it can handle the complexities of the legal domain well. We finally discuss some possible future research directions for legal document analysis and processing.
A Novel Approach of Data Driven Analytics for Personalized Healthcare through... (IJMTST Journal)
Although big data technologies may appear overhyped, they hold extraordinary potential in the pharmaceutical space: if development happens in an integrated environment, in combination with other modeling strategies, it will ensure continuous improvement of in-silico medicine and prompt positive clinical adoption. The proposed research is intended to investigate the real issues involved, with the specific goal of achieving an effective integration of big data analytics and effective modeling in healthcare.
IRJET- Prediction of Credit Risks in Lending Bank Loans (IRJET Journal)
This document discusses machine learning models for predicting credit risk in bank loan applications. It begins with an introduction to credit risk assessment and types of loans. Then, it describes how machine learning can be used to more accurately evaluate borrowers' ability to repay loans based on important variables like borrower characteristics, loan details, and repayment status. The document proposes using artificial neural network and support vector machine models to classify borrowers as good or bad credit risks based on these variables. It evaluates the accuracy of support vector machines and boosted decision tree models for the task of credit risk prediction.
The document summarizes a panel discussion at the North Carolina Federal Advanced Technologies Symposium on May 9, 2013 about human/social sciences and advanced analytics. The panel was hosted by various government and academic organizations and discussed IntePoint's work on the DARPA ACSES project integrating multiple social science theories into an agent-based modeling simulation. IntePoint demonstrated how this simulation could be combined with infrastructure analysis to evaluate plans of action. IntePoint has over 10 years of experience transitioning research into usable modeling capabilities through projects for the US government.
Evidence Data Preprocessing for Forensic and Legal Analytics (CSCJournals)
The document discusses best practices for preprocessing evidentiary data from legal cases or forensic investigations for use in analytical experiments. It outlines key steps such as identifying the analytical aim or problem based on the case scope or investigation protocol, and understanding the case data through assessment and exploration of its format, features, quality, and potential issues. Challenges of working with common text-based case data, such as emails and social media posts, are also discussed. The goal is to clean and transform raw data into a format suitable for machine learning or other advanced analytical techniques while maintaining integrity and relevance to the case.
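As a minimal illustration of that cleaning-and-transforming step (the cleaning rules, sample messages and vectorizer settings here are assumptions, not the document's):

```python
import re
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up text evidence; real inputs would be emails, posts, chat logs, etc.
raw_messages = [
    "RE: RE: invoice #4411 - pls send TODAY!!",
    "Meeting moved to 3pm, see attached agenda.",
]

def clean(text: str) -> str:
    text = text.lower()
    text = re.sub(r"\bre:\s*", "", text)        # drop reply prefixes
    text = re.sub(r"[^a-z0-9\s#]", " ", text)   # strip punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

cleaned = [clean(m) for m in raw_messages]
features = TfidfVectorizer().fit_transform(cleaned)  # numeric features for ML
print(cleaned)
print("feature matrix shape:", features.shape)
```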
IRJET- Fraud Detection Algorithms for a Credit Card (IRJET Journal)
This document discusses algorithms for detecting credit card fraud. It compares the performance of two algorithms: random forest and K-nearest neighbors (KNN). Random forest uses decision trees to classify transactions as normal or fraudulent based on attributes of past transactions. KNN compares new transactions to historical ones based on attributes. The document tests these algorithms on a real-world credit card transaction dataset. It finds that random forest obtains good results on smaller datasets but has issues with imbalanced data. The authors' future work will focus on addressing these issues and improving the random forest algorithm.
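A hedged sketch of such a comparison on synthetic, heavily imbalanced data (the 1% fraud rate, model settings, and the use of `class_weight="balanced"` as an imbalance mitigation are assumptions, not the document's setup):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, heavily imbalanced transactions: about 1% fraud.
X, y = make_classification(n_samples=10000, n_features=20, weights=[0.99, 0.01], random_state=6)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=6)

models = [
    ("random forest", RandomForestClassifier(class_weight="balanced", random_state=6)),
    ("KNN", KNeighborsClassifier(n_neighbors=5)),
]
for name, clf in models:
    clf.fit(X_tr, y_tr)
    # F1 on the fraud class is more informative than accuracy on imbalanced data.
    print(f"{name}: fraud-class F1 = {f1_score(y_te, clf.predict(X_te)):.3f}")
```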
Improving Credit Card Fraud Detection: Using Machine Learning to Profile and ... (Melissa Moody)
Researchers Navin Kasa, Andrew Dahbura, and Charishma Ravoori undertook a capstone project—part of the UVA Data Science Institute Master of Science in Data Science program—that addresses credit card fraud detection through a semi-supervised approach, in which clusters of account profiles are created and used for modeling classifiers.
RECOMMENDATION GENERATION JUSTIFIED FOR INFORMATION ACCESS ASSISTANCE SERVICE... (ijcsit)
Recommendation systems typically provide only the recommendations themselves, without a justification. Yet a justification allows the user to decide whether or not to accept the recommendation, and it improves both user satisfaction and the relevance of the recommended item. The IAAS recommendation system, which uses advisories to make recommendations, does not provide a justification for its recommendations. In this article, our task is therefore to help IAAS users justify their recommendations. To this end, we surveyed architectures and approaches for justifying recommendations in order to identify ones suitable for the IAAS context. Our analysis shows that none of these approaches uses notices (an IAAS mechanism) to justify recommendations, so existing architectures cannot be used in the IAAS context. We therefore developed a new IAAS architecture that treats item filtering separately from the extraction of the justification that accompanies the item during recommendation generation (Figure 7), and we improved the notices by adding users' reviews of the items. A user's notice comprises the Documentary Unit (DU), the user Group (G), the Justification (J) and the weight (a), noted A = (DU, G, J, a).
This document discusses the application of machine learning algorithms for fraud detection in the banking sector. It proposes using Classification and Regression Tree (CART), AdaBoost, LogitBoost, and Bagging algorithms to classify banking data and detect fraud. An experiment is conducted to analyze the performance of these algorithms on a banking data set. The results show that the Bagging algorithm has the lowest misclassification rate, indicating it performs better than the other algorithms at classifying banking data for fraud detection. In conclusion, the Bagging algorithm is deemed the best performing of the meta-learning algorithms analyzed for fraud detection in banking data.
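A minimal sketch of that meta-learning comparison, assuming scikit-learn and synthetic banking data; LogitBoost has no built-in scikit-learn implementation and is omitted:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the banking dataset; 5% positive (fraud) class.
X, y = make_classification(n_samples=5000, n_features=15, weights=[0.95, 0.05], random_state=7)

models = {
    "CART": DecisionTreeClassifier(random_state=7),
    "AdaBoost": AdaBoostClassifier(random_state=7),
    "Bagging": BaggingClassifier(random_state=7),
}
for name, clf in models.items():
    # Misclassification rate = 1 - cross-validated accuracy.
    err = 1 - cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: misclassification rate {err:.4f}")
```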
Detecting health insurance fraud using analytics (Nitin Verma)
Healthcare fraud, abuse and waste costs the industry nearly $80 billion per year. Predictive analytics and data mining techniques can help payers more closely monitor for fraudulent billing activities. These techniques analyze relationships within large amounts of data to predict and detect fraudulent claims, saving millions of dollars traditionally lost to healthcare fraud.
There are essential security considerations in the systems used by semiconductor companies such as TI. Along with other semiconductor companies, TI has recognized that IT security is crucial throughout web application developers' system development life cycle (SDLC). The challenges faced by TI web developers were consolidated via questionnaires, starting with how risk management and secure coding can be reinforced in the SDLC, and how to achieve IT security, project management (PM) and SDLC initiatives by developing a prototype, which was then evaluated against the aforementioned goals. This study aimed to put NIST strategies into practice by integrating risk management checkpoints into the SDLC; to enforce secure coding using a static code analysis tool by developing a prototype application mapped to IT security goals, project management and SDLC initiatives; and to evaluate the impact of the proposed solution. The paper discusses how SecureTI was able to satisfy IT security requirements in the SDLC and PM phases.
This document provides a summary of a market research report on enterprise information portals commissioned by Adobe. It identifies the top 14 portal vendors, including Glyphica. The report finds the portal market is growing rapidly and portals are becoming an important part of knowledge management strategies. It provides recommendations for Adobe's portal strategy, including developing partners and strengthening PDF capabilities for portals. The summary also highlights key findings on portal adoption drivers, components, and the size and growth of the total portal market.
This document discusses different types of management information systems and decision support systems. It describes management information systems as providing managers with information to support decision making and monitor daily operations. Decision support systems are organized collections of tools used to support problem-specific decision making. Group support systems add collaboration functionality, while executive support systems are tailored specifically for senior executives. Key types of systems include management information systems, decision support systems, group support systems, and executive support systems.
This document provides an overview of decision support systems and business intelligence. It defines key concepts like decision support frameworks, the types of decisions that systems support, and the evolution of business intelligence tools. The document also explains how decision support systems and business intelligence are related through their architectures and goals of improving access to data and decision making.
OLAP (Online Analytical Processing) is a technology that uses a multidimensional view of aggregated data to provide quicker access to strategic information and help with decision making. It has four main characteristics: using multidimensional data analysis techniques, providing advanced database support, offering easy-to-use end user interfaces, and supporting client/server architecture. A key aspect is representing data in a multidimensional structure that allows for consolidation and aggregation of data at different levels.
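A tiny illustration of consolidation at different aggregation levels, the core multidimensional idea described above (in Python/pandas; the table contents and the city-to-region hierarchy are assumptions):

```python
import pandas as pd

# Made-up sales facts with a simple city -> region dimension hierarchy.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "city":   ["leeds", "york", "bristol", "bath"],
    "amount": [120, 80, 200, 50],
})

print(sales.groupby(["region", "city"])["amount"].sum())  # finest level
print(sales.groupby("region")["amount"].sum())            # rolled up one level
print("grand total:", sales["amount"].sum())              # fully consolidated
```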
Industrial Marketing Research, Marketing Intelligence & Decision Support System (msrim)
The document summarizes key aspects of industrial marketing research including:
1) It relies more on secondary data, exploratory research, and expert opinion compared to consumer marketing research which focuses more on primary data collection.
2) The scope includes market potential development, market share analysis, sales analysis, forecasting, competitor analysis, and benchmarking.
3) The marketing research process involves identifying the problem, developing a research design, collecting and analyzing data, and presenting findings.
Decision support systems, group decision support systems,expert systems-manag...clincy cleetus
Topics covered: the concept of decision making; the decision-making process (intelligence, design and choice phases); types of decisions; meaning and definition of decision support systems (DSS); evolution of DSS; characteristics of DSS; decision support and repetitiveness of decisions; objectives and importance of DSS; classification of DSS; components of DSS; functions of DSS; development of DSS; support for different phases of decision making; benefits and risks of DSS; group decision support systems; GDSS software; GDSS benefits and risks; expert systems; differences between DSS and ES; and a comparison between DSS and ES.
This document discusses decision support systems (DSS) and online analytical processing (OLAP). It defines DSS as interactive computer systems that help managers make decisions, using tools like analytical models, databases, and modeling processes. OLAP enables examining and manipulating large amounts of consolidated data from different perspectives. Both DSS and OLAP support analysis of operational data, markets, sales, and customers to help with decisions around pricing, forecasting, and risk.
The document discusses decision support systems (DSS), which help executives make better decisions by using historical and current data from internal and external sources. DSS combine large amounts of data with analytical models and tools to provide better information for decision making. The document also describes group decision support systems (GDSS), which are electronic meeting systems that facilitate group collaboration to solve problems. Finally, the document defines intelligent systems as systems that can learn from experiences to improve performance and decision making.
Decision Support System - Management Information SystemNijaz N
Refers to a class of systems that supports the process of decision making without always giving a decision itself.
Decision Support Systems supply computerized support for the decision making process.
Knowledge Management System & TechnologyElijah Ezendu
Knowledge management systems (KMS) aim to support knowledge generation, codification, and transfer in organizations. Various technologies can provide value-adding capabilities to boost and entrench knowledge management, including information technology, communication technology, and media technology. While information technology alone is not knowledge management, different technologies can fulfill deliverables that support knowledge management processes within an organization. Properly identifying an organization's required and applicable knowledge management activities facilitates effective mapping of knowledge management processes, which then determines a fitting knowledge management system.
A decision support system (DSS) is a computer-based information system that supports business or organizational decision-making activities. A DSS can provide suggestions or solutions to help decision makers, and allows modification of suggestions before validation. DSS can be classified based on their relationship with the user as passive, active or cooperative, and based on their scope as enterprise-wide or desktop. The objectives of a DSS are to increase effectiveness of decision making and improve directors' effectiveness. A DSS has components like inputs, user knowledge, outputs, and decisions.
Decision support systems (DSS) are a class of computerized systems that help organizational decision-making. A DSS compiles useful information from data, documents, and business models to help decision-makers identify and solve problems. It has three key functions: capturing past information, data processing, and data retrieval. A DSS has three core components - a database management system, model-based management system, and dialog generation/management system. There are different types of DSS that aid decision-making through various methods like data, models, knowledge, or documents.
Decision support systems and business intelligenceShwetabh Jaiswal
The document discusses decision support systems and business intelligence. It describes how business environments have become more complex, requiring faster and better decision-making supported by computerized systems. Business intelligence involves transforming raw data into useful information to enable strategic, tactical and operational insights. Decision support systems couple individual expertise with computer capabilities to improve decision quality for semi-structured problems.
Data is raw facts and events that are recorded, information is processed data that is meaningful and relevant, and intelligence emerges from information that has been analyzed and from which conclusions have been drawn. Management information systems process data into useful information reports and dashboards to help managers make effective decisions. There are three main categories of information technology - functional IT that supports tasks, network IT that enables collaboration, and enterprise IT that structures interactions across the organization.
Supervised and unsupervised data mining approaches in loan default prediction IJECEIAES
Given the paramount importance of data mining in organizations and the possible contribution of a data-driven customer classification recommender system for loan-extending financial institutions, the study applied supervised and unsupervised data mining approaches to derive the best classifier of loan default. A total of 900 instances with determined attributes and class labels were used for the training and cross-validation processes, while prediction used 100 new instances without class labels. In the training phase, J48 with a confidence factor of 50% attained the highest classification accuracy among the J48 variants (76.85%), k-nearest neighbors with k=3 (k-NN 3) the highest among the IBk variants (78.38%), naive Bayes a classification accuracy of 76.65%, and logistic regression 77.31%. k-NN 3 and logistic regression had the highest classification accuracy, F-measures, and kappa statistics. Implementation of these algorithms on the test set yielded 48 non-defaulters and 52 defaulters for k-NN 3, and 44 non-defaulters and 56 defaulters under logistic regression. Implications are discussed in the paper.
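As a hedged illustration of this comparison methodology, the sketch below uses scikit-learn analogues of the named algorithms (a decision tree standing in for Weka's J48, KNeighborsClassifier for IBk) on synthetic data; the study's actual 900-instance dataset and Weka settings are not reproduced:

```python
# Illustrative sketch of the 10-fold cross-validation comparison described
# above, with scikit-learn stand-ins for the Weka classifiers.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=900, n_features=10, random_state=0)
models = {
    "J48-like tree": DecisionTreeClassifier(random_state=0),
    "k-NN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "naive Bayes": GaussianNB(),
    "logistic": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.4f}")
```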
Loan Default Prediction Using Machine Learning TechniquesIRJET Journal
This document discusses using machine learning techniques to predict loan defaults. It begins with an abstract that outlines using data collection, cleaning, and performance assessment to predict loan defaulters. It then discusses implementing models like decision trees and KNN (K-nearest neighbors) for classification and regression. The document evaluates the performance of these models on loan default prediction and concludes the KNN model performs better. It proposes using both models and comparing their accuracy to improve prediction performance and minimize risk.
Dynamic Rule Base Construction and Maintenance Scheme for Disease Predictionijsrd.com
Business and healthcare applications are tuned to automatically detect and react to events generated from local and remote sources. Event detection refers to an action taken in response to an activity. Association rule mining techniques are used to detect activities from data sets. Events are divided into two types: external events and internal events. External events are generated on remote machines and deliver data across distributed systems. Internal events are delivered and derived by the system itself. The gap between the actual event and the event notification should be minimized. Event derivation should also scale to a large number of complex rules. Attacks and their severity are identified from event derivation systems. Transactional databases and external data sources are used in the event detection process. The new event discovery process is designed to support uncertain data environments. Uncertain derivation of events is performed on uncertain data values. Relevance estimation is a more challenging task under uncertain event analysis. Selectability and sampling mechanisms are used to improve derivation accuracy. Selectability filters out events that are irrelevant to derivation by any rule. A selectability algorithm is applied to extract new event derivations. A Bayesian network representation is used to derive new events given the arrival of an uncertain event and to compute its probability. A sampling algorithm is used for efficient approximation of new event derivation. A medical decision support system is designed with the event detection model. The system adopts the new rule mapping mechanism for disease analysis. Rule base construction and maintenance operations are handled by the system. Rule probability estimation is carried out using the Apriori algorithm. The rule derivation process is optimized for the domain-specific model.
Applying Convolutional-GRU for Term Deposit Likelihood PredictionVandanaSharma356
Banks normally offer two kinds of deposit accounts: demand deposits like current/savings accounts and term deposits like fixed or recurring deposits. For maximizing profit from both the bank's and the customer's perspective, term deposits can accelerate the uplift of finance fields. This paper focuses on the likelihood of term deposit subscription by customers. Bank campaign efforts and customer detail analysis can influence term deposit subscription chances. This paper presents an automated system that predicts term deposit investment possibilities in advance. It proposes a deep learning based hybrid model that stacks Convolutional layers and Recurrent Neural Network (RNN) layers as the predictive model. For the RNN, a Gated Recurrent Unit (GRU) is employed. The proposed predictive model is then compared with benchmark classifiers such as k-Nearest Neighbor (k-NN), Decision Tree classifier (DT), and Multi-Layer Perceptron classifier (MLP). The experimental study concludes that the proposed model attains an accuracy of 89.59% and an MSE of 0.1041, outperforming the other baseline models.
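A minimal sketch of a stacked Convolutional-GRU predictor of the kind described, written with Keras as an assumed framework; the input shape, layer sizes, and the treatment of tabular campaign features as a length-16 sequence are illustrative assumptions, not the paper's configuration:

```python
# Hedged sketch of a Conv + GRU stack for term-deposit likelihood prediction.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16, 1)),           # 16 features treated as a sequence
    tf.keras.layers.Conv1D(32, 3, activation="relu"),
    tf.keras.layers.Conv1D(32, 3, activation="relu"),
    tf.keras.layers.GRU(64),                        # recurrent layer on conv features
    tf.keras.layers.Dense(1, activation="sigmoid"), # P(term-deposit subscription)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy", "mse"])          # paper reports accuracy and MSE
model.summary()
```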
A potential objective of every financial organization is to retain existing customers and attract new prospective customers for the long term. The economic behaviour of a customer and the nature of the organization are controlled by a prescribed form called Know Your Customer (KYC) in manual banking. Depositor customers in some sectors (businesses dealing in jewellery/gold, arms, money exchange, etc.) carry high risk; those in some sectors (transport operators, auto dealers, religious institutions) carry medium risk; and those in the remaining sectors (retail, corporate, service, farming, etc.) carry low risk. Presently, counterparty credit risk can be broadly categorized under quantitative and qualitative factors. Although there are many existing systems for customer retention as well as customer attrition in banks, these methods lack a clear and defined approach for disbursing loans in the business sector. In this paper, we have used records of business customers of a retail commercial bank covering the rural and urban areas of Tangail city, Bangladesh, to analyse the major transactional determinants of customers and to build a predictive model of prospective sectors in retail banking. To achieve this, a data mining approach is adopted for analysing the challenging issues, where a pruned decision tree classification technique has been used to develop the model, whose performance was then tested against Weka results. The paper thus builds a model to predict prospective business sectors in retail banking.
Given the scenario, your role, and the information provided by theMatthewTennant613
Given the scenario, your role, and the information provided by the key players involved, it is time for you to make a decision. If you are finished reviewing this scenario, close this window and return to this week's You Decide tab, in Canvas, to complete the activity for this scenario. You can return and review this scenario again at any time.
The local healthcare market's dynamics are changing and evolving due to the increased competition between healthcare providers and systems. Things are looking up for Community Memorial as it tries to transition to a value-based care delivery system! But one day, just before lunch, you get a call from Bill Jacobs, the human resource director at Commercial Intertech (CI), the largest employer in the community. Bill says, "I wanted you to hear it from me first. We signed a contract yesterday with MegaPlan Health. It will be the managed care organization for all 4,500 of our employees and their families. About 9,000 patients total. I'm sure that you will want to get a contract with MegaPlan as soon as possible. I noticed that your hospital is not in its preferred provider network (PPN), and I am pretty sure that you will want to be so that our employees can continue using the facility." By the time you thank Bill for the heads up, the acid is already churning in your stomach. In the hospital world, MegaPlan is known for cutthroat tactics, negotiating steep discounts with hospitals, and fighting every claim the hospital makes. Commercial Intertech has every right to contract with any health insurance provider it likes, but now you have a problem. If you cannot get a decent contract with MegaPlan and become part of its PPN, many local patients may bypass your hospital and go to the closest PPN facility. Delivered to your office this afternoon, by no coincidence, is a contract proposal from MegaPlan. It calls for the hospital to provide a 35% discount from charges to MegaPlan and all of its members. And it includes service preauthorization requirements that will make life very difficult for your business office. You know from experience that the hospital loses money whenever the discount from charges exceeds 20%.
My Role: You are the hospital CFO, trying to solve the managed care problem.
Key Players:
Dr. John Evans, Chief of Staff
Katrina Eaton, CEO
Linda Freed, Business Office Manager
Nancy Stritmatter, CNO
Technology in Education - Research Article
Educational data mining using cluster analysis and decision tree technique: A case study
Snježana Križanić
Abstract
Data mining refers to the application of data analysis techniques with the aim of extracting hidden knowledge from data by performing the tasks of pattern recognition and predictive modeling. This article describes the application of data mining techniques on educational data of a higher education institution in Croatia. Data used for the analysis are event logs downloaded from an e-learning environment of a real e-course. Data mining ...
Improving Credit Risk Assessment in Financial Institutions Using Deep Learnin...IRJTAE
This research paper explores the application of deep learning and explainable artificial intelligence (XAI) in the context of credit risk assessment for financial institutions. While deep learning models have shown high accuracy in predicting credit risk, their complexity has raised concerns about interpretability and regulatory compliance. This study aims to create a hybrid approach that combines the predictive power of deep learning with the transparency of XAI. Using a large dataset of credit applications and loan outcomes, the study evaluates the performance of various deep learning architectures and employs techniques like SHAP (SHapley Additive exPlanations) to provide insights into model decisions. The results demonstrate that the hybrid approach can maintain high accuracy while offering the transparency needed for interpretability and regulatory compliance.
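To make the SHAP step concrete, here is a small hedged sketch of SHAP attribution on a gradient-boosted stand-in model; the paper's deep architectures and credit dataset are not reproduced, and xgboost plus the synthetic data are assumptions:

```python
# Hedged sketch: SHAP-style per-feature attributions for a stand-in model.
import numpy as np
import shap
import xgboost

rng = np.random.default_rng(0)
X = rng.random((300, 5))                          # stand-in application features
y = (X[:, 0] + 0.5 * X[:, 3] > 0.8).astype(int)   # stand-in loan outcome

model = xgboost.XGBClassifier(n_estimators=50).fit(X, y)
explainer = shap.TreeExplainer(model)             # exact, fast SHAP for tree models
shap_values = explainer.shap_values(X[:5])
print(shap_values)  # contribution of each feature to each of the 5 predictions
```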
# Project 03 - Data Mining on Financial Data
In this project, various models like Logistic Regression, KNN, SVM, and Random Forest have been applied to three finance-related datasets in order to discover insights from the datasets. The methods are applied to the datasets in RStudio, and the corresponding outputs are shown in this paper. The applied methods are then compared with each other to identify how well each method performs on each of the datasets, and finally the better method is chosen for each dataset. The methods are explained in detail along with their advantages, with the help of information from the relevant papers. Every method applied to the datasets has been checked for whether it follows the data mining methodologies. Data mining methodologies like CRISP-DM, KDD and SEMMA are also explained in detail in this paper. The process flow of the data mining methodologies is explained, and it is verified that the process flow has been followed while applying each method to the datasets. Data mining is now considered a major factor in the risk management process of financial institutions. Even though various data mining tools exist in the market, this paper allows readers to understand how an algorithm works on a dataset and how to justify an algorithm's predictions.
Predicting user behavior using data profiling and hidden Markov modelIJECEIAES
Mental health disorders affect many aspects of patients' lives, including emotions, cognition, and especially behaviors. E-health technology helps to collect a wealth of information in a non-invasive manner, which represents a promising opportunity to construct health behavior markers. Combining such user behavior data can provide a more comprehensive and contextual view than questionnaire data. With behavioral data, we can train machine learning models to understand the data pattern and also use prediction algorithms to infer the next state of a person's behavior. The remaining challenges for this issue are how to apply mathematical formulations to textual datasets and how to find metadata that helps to identify a person's life pattern and predict the next state of their behavior. The main idea of this work is to use a hidden Markov model (HMM) to predict user behavior from social media applications by analyzing and detecting states and symbols from the user behavior dataset. To achieve this goal, we need to analyze and detect the states and symbols from the user behavior dataset, then convert the textual data to mathematical and numerical matrices. Finally, we apply the HMM to predict the hidden user behavior states. We tested our program and found that the log-likelihood was higher (better) when the model fit the data. In any case, the results of the study indicated that the program was suitable for its purpose and yielded valuable data.
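A hedged sketch of the HMM step: fitting a model to a toy sequence of behavior symbols and reporting log-likelihood and hidden states. hmmlearn is an assumed library choice (CategoricalHMM in recent versions; older releases call it MultinomialHMM):

```python
# Illustrative only: fit an HMM to toy behavior symbols and score it.
import numpy as np
from hmmlearn import hmm

symbols = np.array([[0], [1], [1], [2], [0], [2], [1], [0], [1], [2]])
model = hmm.CategoricalHMM(n_components=2, n_iter=50, random_state=0)
model.fit(symbols)                        # learn transition/emission matrices
print("log-likelihood:", model.score(symbols))
print("hidden states:", model.predict(symbols))
```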
Data mining techniques have a key role in knowledge extraction from databases to promote efficient decision making. This paper presents an approach for knowledge extraction from a sample database of school dropout students, using association rule generation and classification algorithms to demonstrate how knowledge-based development policy-making decisions can be derived from the extracted knowledge. A system architecture is proposed that uses mobile computing devices as the user interface, connecting a mass population database with cloud computing resources. The causes of education termination are investigated by analyzing the sample database in terms of attribute-value relationships, in the form of association rules, to reason about the causes based on the computed support and confidence. It is observed that if the affected family had no service holders, the dropout student had to stop his education because of financial problems. Classification is applied to group the dropout students based on their level of education.
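The support/confidence reasoning used above can be made concrete with a tiny worked example; the records below are invented for illustration and are not taken from the study's database:

```python
# Toy computation of support and confidence for the rule
# {no_service_holder} -> {dropped_financial} over made-up records.
records = [
    {"no_service_holder", "dropped_financial"},
    {"no_service_holder", "dropped_financial"},
    {"service_holder", "completed"},
    {"no_service_holder", "dropped_financial"},
    {"service_holder", "dropped_other"},
]
A, B = {"no_service_holder"}, {"dropped_financial"}
support = sum(1 for r in records if A | B <= r) / len(records)
confidence = support / (sum(1 for r in records if A <= r) / len(records))
print(f"support={support:.2f}, confidence={confidence:.2f}")
```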
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IRJET- Machine Learning Classification Algorithms for Predictive Analysis in ...IRJET Journal
This document discusses machine learning classification algorithms and their applications for predictive analysis in healthcare. It provides an overview of data mining techniques like association, classification, clustering, prediction, and sequential patterns. Specific classification algorithms discussed include Naive Bayes, Support Vector Machine, Decision Trees, K-Nearest Neighbors, Neural Networks, and Bayesian Methods. The document examines examples of these algorithms being used for disease diagnosis, prognosis, and healthcare management. It analyzes their predictive performance on datasets for conditions like breast cancer, heart disease, and ICU readmissions. Overall, the document reviews how machine learning techniques can enhance predictive accuracy for various healthcare problems.
Health Care Application using Machine Learning and Deep LearningIRJET Journal
This document presents a study on using machine learning and deep learning techniques for healthcare applications like disease prediction. It discusses algorithms like logistic regression, decision trees, random forests, SVMs and deep learning models like VGG16 applied to various disease datasets. For diabetes, heart and liver diseases, ML algorithms were used, while CNN models were used for malaria and pneumonia image datasets. Random forest achieved the highest accuracy of 84.81% for diabetes prediction, SVM had 81.57% accuracy for heart disease, and random forest was best at 83.33% for liver disease. The VGG16 model attained accuracies of 94.29% and 95.48% for malaria and pneumonia respectively. The study aims to develop an intelligent healthcare application for predicting different diseases.
SWARM OPTIMIZED MODULAR NEURAL NETWORK BASED DIAGNOSTIC SYSTEM FOR BREAST CAN...ijscai
The document describes a modular neural network approach optimized by particle swarm optimization for breast cancer diagnosis. The approach uses a modular neural network with several independent neural network experts that analyze input data individually and provide outputs that are combined by an integrator. Particle swarm optimization is used to determine optimal connections for each expert neural network during training. The optimized modular neural network is then used to classify breast cancer samples as cancerous or non-cancerous, demonstrating better diagnostic ability than traditional methods.
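A toy particle swarm sketch of the optimization idea described above; the quadratic loss stands in for an expert network's training error, and all hyperparameters are assumptions:

```python
# Illustrative PSO: each particle is a candidate weight vector; the swarm
# tracks personal bests and a global best, as in standard PSO.
import numpy as np
rng = np.random.default_rng(3)

def loss(w):                       # stand-in for a network expert's error
    return np.sum((w - 0.7) ** 2, axis=1)

n, d = 20, 5
pos = rng.uniform(-1, 1, (n, d)); vel = np.zeros((n, d))
pbest = pos.copy(); pbest_val = loss(pos)
gbest = pbest[np.argmin(pbest_val)]

for _ in range(100):
    r1, r2 = rng.random((n, d)), rng.random((n, d))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    val = loss(pos)
    improved = val < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], val[improved]
    gbest = pbest[np.argmin(pbest_val)]
print("best loss:", pbest_val.min())
```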
Intelligent data analysis for medicinal diagnosisIRJET Journal
The document describes a proposed privacy-preserving patient-centric clinical decision support system called PPCD that uses naive Bayesian classification to help doctors predict disease risks for patients in a privacy-preserving manner. PPCD allows medical diagnosis and prediction of disease risks for new patients without leaking any individual patient medical information. It utilizes historical medical information from past patients, stored privately in the cloud, to train a naive Bayesian classifier. This trained classifier can then be used to diagnose diseases for new patients based on their symptoms while preserving privacy. The system also introduces a new aggregation technique called additive homomorphic proxy aggregation to allow training of the naive Bayesian classifier without revealing individual patient medical records.
This document discusses case based reasoning and its application in data mining and databases. Case based reasoning involves solving current problems by adapting solutions from similar past problems. The author defines case based reasoning and describes the typical four-step cycle of a case base used in case based reasoning: 1) retrieval of similar past cases, 2) reuse of solutions from these similar cases, 3) revision of these solutions if needed, and 4) retention of the revised solutions as new cases. The article examines how case based reasoning, data mining techniques, and databases can be used together across various industries.
How analytical hierarchy process prioritizing internet banking influencing fa...IJECEIAES
Internet banking is a method for conducting financial transactions online that uses the internet as a platform. Customers can transact at any time and from any location. Numerous aspects of internet banking adoption have been analyzed in recent in-depth studies from the literature. This paper merges these predefined factors into a model by drawing on various ideas related to the acceptance model. The analytical hierarchy process (AHP) method was adapted for finding the most important components of the model. AHP is regarded as a fine mathematical calculation method, as it enables decision-makers to prioritize their rankings in order to satisfy various criteria. The goal of this research is to rank the elements that influence the use of online banking. Three main factors, technical information, website, and service availability, were chosen as the main factors of the model based on the literature review. Sub-factors such as ease of use, responsiveness, privacy, reliability, security, communication and efficiency were also suggested and combined into a single integrated framework. Utilizing the systematic literature review (SLR) methodology, several factors were found. As a result, the article will enhance understanding of the unique elements supporting the adoption of internet banking.
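To illustrate the AHP mechanics the paper relies on, the following minimal sketch computes a priority vector and consistency ratio from one assumed 3x3 pairwise-comparison matrix over the three main factors; the judgments are invented, not the paper's:

```python
# AHP sketch: principal-eigenvector weights plus Saaty's consistency ratio.
import numpy as np

# Assumed pairwise comparisons: technical information vs website vs
# service availability (values invented for illustration).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                            # priority vector (factor weights)

n = A.shape[0]
ci = (eigvals.real[k] - n) / (n - 1)    # consistency index
cr = ci / 0.58                          # Saaty's random index for n=3 is 0.58
print("weights:", w.round(3), "CR:", round(cr, 3))
```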
Applying Classification Technique using DID3 Algorithm to improve Decision Su...IJMER
International Journal of Modern Engineering Research (IJMER) is Peer reviewed, online Journal. It serves as an international archival forum of scholarly research related to engineering and science education.
International Journal of Modern Engineering Research (IJMER) covers all the fields of engineering and science: Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Computer Engineering, Agricultural Engineering, Aerospace Engineering, Thermodynamics, Structural Engineering, Control Engineering, Robotics, Mechatronics, Fluid Mechanics, Nanotechnology, Simulators, Web-based Learning, Remote Laboratories, Engineering Design Methods, Education Research, Students' Satisfaction and Motivation, Global Projects, and Assessment…. And many more.
Similar to Decision support system using decision tree and neural networks (20)
Abnormalities of hormones and inflammatory cytokines in women affected with p...Alexander Decker
Women with polycystic ovary syndrome (PCOS) have elevated levels of hormones like luteinizing hormone and testosterone, as well as higher levels of insulin and insulin resistance compared to healthy women. They also have increased levels of inflammatory markers like C-reactive protein, interleukin-6, and leptin. This study found these abnormalities in the hormones and inflammatory cytokines of women with PCOS ages 23-40, indicating that hormone imbalances associated with insulin resistance and elevated inflammatory markers may worsen infertility in women with PCOS.
A usability evaluation framework for b2 c e commerce websitesAlexander Decker
This document presents a framework for evaluating the usability of B2C e-commerce websites. It involves user testing methods like usability testing and interviews to identify usability problems in areas like navigation, design, purchasing processes, and customer service. The framework specifies goals for the evaluation, determines which website aspects to evaluate, and identifies target users. It then describes collecting data through user testing and analyzing the results to identify usability problems and suggest improvements.
A universal model for managing the marketing executives in nigerian banksAlexander Decker
This document discusses a study that aimed to synthesize motivation theories into a universal model for managing marketing executives in Nigerian banks. The study was guided by Maslow and McGregor's theories. A sample of 303 marketing executives was used. The results showed that managers will be most effective at motivating marketing executives if they consider individual needs and create challenging but attainable goals. The emerged model suggests managers should provide job satisfaction by tailoring assignments to abilities and monitoring performance with feedback. This addresses confusion faced by Nigerian bank managers in determining effective motivation strategies.
A unique common fixed point theorems in generalized dAlexander Decker
This document presents definitions and properties related to generalized D*-metric spaces and establishes some common fixed point theorems for contractive type mappings in these spaces. It begins by introducing D*-metric spaces and generalized D*-metric spaces, defines concepts like convergence and Cauchy sequences. It presents lemmas showing the uniqueness of limits in these spaces and the equivalence of different definitions of convergence. The goal of the paper is then stated as obtaining a unique common fixed point theorem for generalized D*-metric spaces.
A trends of salmonella and antibiotic resistanceAlexander Decker
This document provides a review of trends in Salmonella and antibiotic resistance. It begins with an introduction to Salmonella as a facultative anaerobe that causes nontyphoidal salmonellosis. The emergence of antimicrobial-resistant Salmonella is then discussed. The document proceeds to cover the historical perspective and classification of Salmonella, definitions of antimicrobials and antibiotic resistance, and mechanisms of antibiotic resistance in Salmonella including modification or destruction of antimicrobial agents, efflux pumps, modification of antibiotic targets, and decreased membrane permeability. Specific resistance mechanisms are discussed for several classes of antimicrobials.
A transformational generative approach towards understanding al-istifhamAlexander Decker
This document discusses a transformational-generative approach to understanding Al-Istifham, which refers to interrogative sentences in Arabic. It begins with an introduction to the origin and development of Arabic grammar. The paper then explains the theoretical framework of transformational-generative grammar that is used. Basic linguistic concepts and terms related to Arabic grammar are defined. The document analyzes how interrogative sentences in Arabic can be derived and transformed via tools from transformational-generative grammar, categorizing Al-Istifham into linguistic and literary questions.
A time series analysis of the determinants of savings in namibiaAlexander Decker
This document summarizes a study on the determinants of savings in Namibia from 1991 to 2012. It reviews previous literature on savings determinants in developing countries. The study uses time series analysis including unit root tests, cointegration, and error correction models to analyze the relationship between savings and variables like income, inflation, population growth, deposit rates, and financial deepening in Namibia. The results found inflation and income have a positive impact on savings, while population growth negatively impacts savings. Deposit rates and financial deepening were found to have no significant impact. The study reinforces previous work and emphasizes the importance of improving income levels to achieve higher savings rates in Namibia.
A therapy for physical and mental fitness of school childrenAlexander Decker
This document summarizes a study on the importance of exercise in maintaining physical and mental fitness for school children. It discusses how physical and mental fitness are developed through participation in regular physical exercises and cannot be achieved solely through classroom learning. The document outlines different types and components of fitness and argues that developing fitness should be a key objective of education systems. It recommends that schools ensure pupils engage in graded physical activities and exercises to support their overall development.
A theory of efficiency for managing the marketing executives in nigerian banksAlexander Decker
This document summarizes a study examining efficiency in managing marketing executives in Nigerian banks. The study was examined through the lenses of Kaizen theory (continuous improvement) and efficiency theory. A survey of 303 marketing executives from Nigerian banks found that management plays a key role in identifying and implementing efficiency improvements. The document recommends adopting a "3H grand strategy" to improve the heads, hearts, and hands of management and marketing executives by enhancing their knowledge, attitudes, and tools.
This document discusses evaluating the link budget for effective 900MHz GSM communication. It describes the basic parameters needed for a high-level link budget calculation, including transmitter power, antenna gains, path loss, and propagation models. Common propagation models for 900MHz that are described include Okumura model for urban areas and Hata model for urban, suburban, and open areas. Rain attenuation is also incorporated using the updated ITU model to improve communication during rainfall.
A synthetic review of contraceptive supplies in punjabAlexander Decker
This document discusses contraceptive use in Punjab, Pakistan. It begins by providing background on the benefits of family planning and contraceptive use for maternal and child health. It then analyzes contraceptive commodity data from Punjab, finding that use is still low despite efforts to improve access. The document concludes by emphasizing the need for strategies to bridge gaps and meet the unmet need for effective and affordable contraceptive methods and supplies in Punjab in order to improve health outcomes.
A synthesis of taylor’s and fayol’s management approaches for managing market...Alexander Decker
1) The document discusses synthesizing Taylor's scientific management approach and Fayol's process management approach to identify an effective way to manage marketing executives in Nigerian banks.
2) It reviews Taylor's emphasis on efficiency and breaking tasks into small parts, and Fayol's focus on developing general management principles.
3) The study administered a survey to 303 marketing executives in Nigerian banks to test if combining elements of Taylor and Fayol's approaches would help manage their performance through clear roles, accountability, and motivation. Statistical analysis supported combining the two approaches.
A survey paper on sequence pattern mining with incrementalAlexander Decker
This document summarizes four algorithms for sequential pattern mining: GSP, ISM, FreeSpan, and PrefixSpan. GSP is an Apriori-based algorithm that incorporates time constraints. ISM extends SPADE to incrementally update patterns after database changes. FreeSpan uses frequent items to recursively project databases and grow subsequences. PrefixSpan also uses projection but claims to not require candidate generation. It recursively projects databases based on short prefix patterns. The document concludes by stating the goal was to find an efficient scheme for extracting sequential patterns from transactional datasets.
A survey on live virtual machine migrations and its techniquesAlexander Decker
This document summarizes several techniques for live virtual machine migration in cloud computing. It discusses works that have proposed affinity-aware migration models to improve resource utilization, energy efficient migration approaches using storage migration and live VM migration, and a dynamic consolidation technique using migration control to avoid unnecessary migrations. The document also summarizes works that have designed methods to minimize migration downtime and network traffic, proposed a resource reservation framework for efficient migration of multiple VMs, and addressed real-time issues in live migration. Finally, it provides a table summarizing the techniques, tools used, and potential future work or gaps identified for each discussed work.
A survey on data mining and analysis in hadoop and mongo dbAlexander Decker
This document discusses data mining of big data using Hadoop and MongoDB. It provides an overview of Hadoop and MongoDB and their uses in big data analysis. Specifically, it proposes using Hadoop for distributed processing and MongoDB for data storage and input. The document reviews several related works that discuss big data analysis using these tools, as well as their capabilities for scalable data storage and mining. It aims to improve computational time and fault tolerance for big data analysis by mining data stored in Hadoop using MongoDB and MapReduce.
1. The document discusses several challenges for integrating media with cloud computing including media content convergence, scalability and expandability, finding appropriate applications, and reliability.
2. Media content convergence challenges include dealing with the heterogeneity of media types, services, networks, devices, and quality of service requirements as well as integrating technologies used by media providers and consumers.
3. Scalability and expandability challenges involve adapting to the increasing volume of media content and being able to support new media formats and outlets over time.
This document surveys trust architectures that leverage provenance in wireless sensor networks. It begins with background on provenance, which refers to the documented history or derivation of data. Provenance can be used to assess trust by providing metadata about how data was processed. The document then discusses challenges for using provenance to establish trust in wireless sensor networks, which have constraints on energy and computation. Finally, it provides background on trust, which is the subjective probability that a node will behave dependably. Trust architectures need to be lightweight to account for the constraints of wireless sensor networks.
This document discusses private equity investments in Kenya. It provides background on private equity and discusses trends in various regions. The objectives of the study discussed are to establish the extent of private equity adoption in Kenya, identify common forms of private equity utilized, and determine typical exit strategies. Private equity can involve venture capital, leveraged buyouts, or mezzanine financing. Exits allow recycling of capital into new opportunities. The document provides context on private equity globally and in developing markets like Africa to frame the goals of the study.
This document discusses a study that analyzes the financial health of the Indian logistics industry from 2005-2012 using Altman's Z-score model. The study finds that the average Z-score for selected logistics firms was in the healthy to very healthy range during the study period. The average Z-score increased from 2006 to 2010 when the Indian economy was hit by the global recession, indicating the overall performance of the Indian logistics industry was good. The document reviews previous literature on measuring financial performance and distress using ratios and Z-scores, and outlines the objectives and methodology used in the current study.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
This talk will cover ScyllaDB Architecture from the cluster-level view and zoom in on data distribution and internal node architecture. In the process, we will learn the secret sauce used to get ScyllaDB's high availability and superior performance. We will also touch on the upcoming changes to ScyllaDB architecture, moving to strongly consistent metadata and tablets.
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Dandelion Hashtable: beyond billion requests per second on a commodity serverAntonios Katsarakis
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state-of-the-art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server with a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor IvaniukFwdays
In this talk we will discuss DDoS protection tools and best practices, network architectures, and what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped to keep the web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
What is an RPA CoE? Session 2 – CoE RolesDianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
inQuba Webinar Mastering Customer Journey Management with Dr Graham HillLizaNolte
HERE IS YOUR WEBINAR CONTENT! 'Mastering Customer Journey Management with Dr. Graham Hill'. We hope you find the webinar recording both insightful and enjoyable.
In this webinar, we explored essential aspects of Customer Journey Management and personalization. Here’s a summary of the key insights and topics discussed:
Key Takeaways:
Understanding the Customer Journey: Dr. Hill emphasized the importance of mapping and understanding the complete customer journey to identify touchpoints and opportunities for improvement.
Personalization Strategies: We discussed how to leverage data and insights to create personalized experiences that resonate with customers.
Technology Integration: Insights were shared on how inQuba’s advanced technology can streamline customer interactions and drive operational efficiency.
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc...DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
"Scaling RAG Applications to serve millions of users", Kevin GoedeckeFwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
"NATO Hackathon Winner: AI-Powered Drug Search", Taras KlobaFwdays
This is a session that details how PostgreSQL's features and Azure AI Services can be effectively used to significantly enhance the search functionality in any application.
In this session, we'll share insights on how we used PostgreSQL to facilitate precise searches across multiple fields in our mobile application. The techniques include using LIKE and ILIKE operators and integrating a trigram-based search to handle potential misspellings, thereby increasing the search accuracy.
We'll also discuss how the azure_ai extension on PostgreSQL databases in Azure and Azure AI Services were utilized to create vectors from user input, a feature beneficial when users wish to find specific items based on text prompts. While our application's case study involves a drug search, the techniques and principles shared in this session can be adapted to improve search functionality in a wide range of applications. Join us to learn how PostgreSQL and Azure AI can be harnessed to enhance your application's search capability.
What is an RPA CoE? Session 1 – CoE VisionDianaGray10
In the first session, we will review the organization's vision and how this has an impact on the COE Structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an...Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Decision support system using decision tree and neural networks
Computer Engineering and Intelligent Systems www.iiste.org
ISSN 2222-1719 (Paper) ISSN 2222-2863 (Online)
Vol.4, No.7, 2013
Decision Support System Using Decision Tree and Neural Networks
L. G. Kabari1* and E. O. Nwachukwu2
1. Computer Science Department, Rivers State Polytechnic, Bori, PMB 20, Rivers State, Nigeria
2. Computer Science Department, University of Port Harcourt, Port Harcourt, PMB 5323, Nigeria
* E-mail of the corresponding author: ledisigiokkabari@yahoo.com
ABSTRACT
Decision making in the complex and dynamically changing environment of the present day demands new techniques of computational intelligence for building an adaptive, hybrid intelligent decision support system. In this paper, a Decision Tree-Neuro Based model was developed to handle a loan granting decision support system and a clinical decision support system (eye disease diagnosis), two important decision problems that require delicate care. The system integrates Decision Trees and Artificial Neural Networks, combining a decision tree algorithm with a multilayer feed-forward neural network trained by the backpropagation learning algorithm to build the proposed model. Different representative cases of loan applications and eye disease diagnoses were considered, based on the guidelines of different banks in Nigeria and on patient complaints, symptoms and physical eye examinations, to validate the model. Object-Oriented Analysis and Design (OO-AD) methodology was used in the development of the system, and an object-oriented programming language was used with a MATLAB engine to implement the models and classes designed in the system. The developed system gives an 88% success rate and eliminates the opacity of an ordinary neural network system.
Keywords: Decision Tree-Neuro Based Model, Backpropagation Learning Algorithm, Object-Oriented Analysis and Design, MATLAB Embedded Engine, Loan Granting, Eye Diseases Diagnosis.
1. Introduction
A decision support system (DSS) is a computer-based information system that supports business or organizational decision-making activities. DSSs serve the management, operations, and planning levels of an organization and help to make decisions, which may be rapidly changing and not easily specified in advance. Decision makers are faced with increasingly stressful environments: highly competitive, fast-paced, near real-time, overloaded with information, with data distributed throughout the enterprise, and multinational in scope. The combination of Internet-enabled speed and access and the maturation of artificial intelligence techniques has led to sophisticated aids to support decision making under these risky and uncertain conditions. These aids have the potential to improve decision making by suggesting solutions that are better than those made by the human alone.
They are increasingly available in diverse fields, from medical diagnosis to traffic control to engineering applications. The granting of loans by a financial institution (bank or home loan business) is one of the important decision problems that require delicate care. A wrong decision may lead to bank distress and even the near collapse of a national economy, which may require huge bail-outs from government to keep the economy going.
In this paper, a decision tree-neuro based model is developed for the loan granting decision support system and for eye disease diagnosis, as a hybrid of decision trees and neural networks. This hybrid approach enables us to build rules for different groups of borrowers and eye diseases separately. In the first stage, bank customers or eye diseases are segmented into clusters characterized by similar features; in the second stage, for each group, decision trees are built to obtain rules that are fed into the neural net for indicating clients expected not to repay the loan, or the presence or absence of an eye disease. The main advantage of integrating the two techniques is building models that may better predict the risk connected with granting credit to each client, with good explanations, compared with using each method separately. The developed model was analyzed and designed using appropriate tools, and the implementation was carried out using the C programming language with an embedded MATLAB engine, i.e., C code was used to call MATLAB tools at different stages.
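The two-stage pipeline just described can be sketched as follows. This is a hedged illustration in Python with scikit-learn on synthetic data, not the authors' C/MATLAB implementation: applicants are first segmented into clusters, a shallow decision tree then learns rules within each cluster, and the trees' leaf (rule) assignments are fed, together with the raw features, into a feed-forward network trained by backpropagation.

```python
# Illustrative sketch of the cluster -> decision-tree rules -> neural-net
# pipeline described above, using synthetic stand-in data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))                  # stand-in applicant features
y = (X[:, 0] + X[:, 1] ** 2 > 1).astype(int)    # stand-in default label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 1: segment applicants into clusters with similar features.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_tr)
clusters_tr = kmeans.predict(X_tr)

# Stage 2: per cluster, a shallow tree learns explainable rules.
trees = {}
for c in np.unique(clusters_tr):
    m = clusters_tr == c
    trees[c] = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr[m], y_tr[m])

def leaf_features(X, clusters):
    """Leaf index of each cluster's tree, used as a rule identifier."""
    feats = np.zeros((len(X), len(trees)))
    for c, tree in trees.items():
        mask = clusters == c
        if mask.any():
            feats[mask, c] = tree.apply(X[mask])
    return feats

# The feed-forward network (backpropagation) consumes raw features + rule ids.
Z_tr = np.hstack([X_tr, leaf_features(X_tr, clusters_tr)])
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0).fit(Z_tr, y_tr)

clusters_te = kmeans.predict(X_te)
Z_te = np.hstack([X_te, leaf_features(X_te, clusters_te)])
print("held-out accuracy:", mlp.score(Z_te, y_te))
```

The leaf index acts as a compact identifier of the decision-tree rule that fired for an applicant, which is what gives the network's predictions the explainable backbone the paper aims for.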
2.0 Literature Review
An artificial neural network (NN) is a computational structure modelled loosely on biological processes. NNs explore many competing hypotheses simultaneously using a massively parallel network composed of non-linear, relatively simple computational elements interconnected by links with variable weights. It is this interconnected set of weights that contains the knowledge generated by the NN. NNs have been successfully used for low-level cognitive tasks such as speech recognition and character recognition, and they are being explored for decision support and knowledge induction [1][2][3]. In general, NN models are specified by network topology, node characteristics, and training or learning rules. NNs are composed of a large number of simple processing units, each interacting with others via excitatory or inhibitory connections. Distributed representation over a large number of units, together with interconnectedness among processing units, provides fault tolerance. Learning is achieved through a rule that adapts connection weights in response to input patterns. Alterations in the weights associated with the connections permit adaptability to new situations [4].
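For concreteness, the weight-adaptation idea can be illustrated by training a single sigmoid unit with gradient descent; this minimal sketch uses synthetic data and an assumed learning rate, purely for illustration:

```python
# Minimal sketch of a learning rule that adapts connection weights
# in response to input patterns (single sigmoid unit, gradient descent).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)  # toy target pattern

w, b, lr = np.zeros(3), 0.0, 0.1
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # unit activation
    grad = p - y                             # error signal
    w -= lr * X.T @ grad / len(X)            # adapt connection weights
    b -= lr * grad.mean()
print("training accuracy:", ((p > 0.5) == y).mean())
```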
A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm. Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal. Another use of decision trees is as a descriptive means for calculating conditional probabilities.
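For instance, a small tree over two invented loan attributes shows both the rule display and the conditional class probabilities at a leaf (scikit-learn and the synthetic data are assumptions for illustration):

```python
# Illustrative decision tree whose leaves expose conditional probabilities.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
income = rng.uniform(20, 120, 300)            # stand-in applicant income (k)
debt = rng.uniform(0, 60, 300)                # stand-in existing debt (k)
default = (debt / income > 0.4).astype(int)   # toy "bad application" label

X = np.column_stack([income, debt])
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, default)
print(export_text(tree, feature_names=["income", "debt"]))  # readable rules
print(tree.predict_proba([[50.0, 30.0]]))     # conditional P(good), P(bad)
```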
In financial institutions, loan applications can be categorized into good applications and bad applications. Good
applications are those worthy of the loan; bad applications are those that should be rejected because of the low
probability of the applicant ever repaying the loan. An institution usually employs loan officers to make credit
decisions or recommendations. These officers are given some hard rules to guide them in evaluating the
worthiness of loan applications. After some period of time, the officers also gain their own experiential
knowledge or intuition (beyond the guidelines given by their institution) for deciding whether an application is
loan worthy. Generally, there is widespread recognition that the capability of humans to judge the worthiness of
a loan is rather poor [5]. Some of the reasons are:
(i) There is a large gray area where the decision is left to the officers, and there are cases whose outcome is not
immediately obvious;
(ii) Humans are prone to bias; for instance, a physical or emotional condition can affect the decision-making
process, and personal acquaintance with the applicant might distort the judgment;
(iii) Business data warehouses store historical data from previous applications. It is likely that there is
knowledge hidden in this data which may be useful for assisting decision making. Unfortunately, the task of
discovering useful relationships or patterns in data is difficult for humans [6], both because of the large volume
of data to be examined and because the relationships themselves are not obvious.
Given that humans are not good at evaluating loan applications, a knowledge discovery tool is needed to assist
the decision maker. Knowledge discovery provides a variety of useful tools for discovering non-obvious
relationships in historical data, while ensuring that the relationships discovered will generalize to new or future
data [7][8]. This knowledge can then be used by loan officers to assist them in rejecting or accepting applications.
Past studies show that even applying a simplistic linear discriminant technique in place of human judgment
yields a significant, although still unsatisfactory, increase in performance [5]. Treating loan application
evaluation as a classification [9] or forecasting [10] problem has also been proposed.
Techniques in the literature range from traditional statistical methods such as logistic regression [11], k-nearest
neighbor [12], classification trees [13], simple neural network models [14][15][16] and cluster analysis
[17][18][19] to combinations of methods used to obtain stronger general rules [20]. The combination approach
allows two kinds of knowledge representation to be connected and rules to be formulated for a set of typical
examples.
The use of expert systems as a means of conducting medical diagnosis and recommending successful treatments
has been a highly active research field in recent years. The development of medical expert systems that use
artificial neural networks (ANN) as the knowledge base appears to be a promising approach to diagnosis and
treatment routines. One of the major applications of medical informatics has been the implementation and use of
expert systems to predict medical diagnoses from a set of symptoms [21]. Such expert systems also serve as an
aid to medical professionals in recommending effective laboratory tests and treatments of diseases. An intelligent
computer program assisting medical diagnosis could provide easy access to a wealth of information from past
patient data. Such a resource may help hospitals reduce the excessive costs of unnecessary laboratory tests and
ineffective patient treatment, while maintaining a high quality of medical care. One major drawback of
conventional medical expert systems is their use of a static knowledge base developed from a limited number of
cases and a limited population size, demographics and geographic location. The knowledge base is inherently
not dynamic and is not routinely updated to keep up with emerging trends such as the appearance or increased
prevalence of unforeseen diagnoses. The result is that, after a given period of time, this inflexibility limits the
usefulness of the knowledge base, as it no longer reflects the current characteristics of the population at risk.
In [22], a Clinical Decision Support System (CDSS) using a hybrid of neural networks and decision trees for the
diagnosis of eye diseases was presented. Neural networks are first trained and then combined with decision trees
in order to extract the knowledge learnt in the training process. Artificial neural networks are used for the diagnosis
of selected eye diseases according to patient complaints, symptoms and physical eye examinations. The trained
network diagnoses eye diseases according to the knowledge it acquired by learning from previous cases. After
successful training, knowledge is extracted from the trained networks using decision trees in the form of
'if-then' rules. The system can be used by ophthalmologists to minimize unnecessary laboratory tests and reduce
the operational costs of treating eye diseases. The general paradigm can be applied to other categories of diseases.
3.0 Methodology
In this paper, an integration of Decision Trees and Neural Networks is used to form a hybrid called the Decision
Tree-Neuro Based Decision Support System (DTNBDSS). The design was implemented in two areas: a Loan
Granting Decision Support System (LGDSS) and a Clinical Decision Support System (Eye Disease Diagnosis).
3.1 Analysis of the Method used
When decision trees and neural networks are compared, one can see that their advantages and disadvantages are
almost complementary. For instance, the knowledge representation of a decision tree is easily understood by
humans, which is not the case for neural networks; decision trees have trouble dealing with noise in training data,
which again is not the case for neural networks; decision trees learn fast while neural networks learn relatively
slowly; and so on. Combining decision trees and a neural network so as to combine their advantages is therefore
a welcome research area.
Artificial neural networks (ANN) are very efficient at solving various kinds of problems, but their lack of
explanation capability (the black-box nature of neural networks) is one of the most important reasons why they
have not received the necessary interest in some parts of industry [23]. Even though neural networks have huge
potential, we will only get the best out of them when they are integrated with other computing techniques such
as fuzzy logic [24].
Decision trees, on the other hand, are simple to understand and interpret; people are able to understand decision
tree models after a brief explanation. However, decision-tree learners can create over-complex trees that do not
generalize the data well. This is called overfitting, and mechanisms such as pruning are necessary to avoid it.
3.2 Decision Tree Neuro-Based Model Design
The purpose of developing the decision tree neuro-based model is to fine-tune intelligent decision making in
complex computational and organizational decision processes so that it can adapt to the present-day complex
and dynamically changing environment. Financial decision making has become far more challenging; at the
same time, the ever-changing nature of our environment brings new eye diseases every day. Both trends make
decision trees alone less efficient at handling the increasing complexity of the decision-making process and the
increasing volume of data it requires.
In the design of the system model there are two major parts, as illustrated in the decision tree neuro-based
architecture in figure2. The first part is the decision tree, which handles decision making based on the
fundamental decision rules. The decision tree output then becomes the input to the neural net, which uses this
result as a launch pad for further refinement that, we believe, results in a much higher level of accuracy in
decision making.
Ordinarily, humans are the ones who select an action from a decision tree based on the suggested lines of action
in complex organizational conditions. These conditions are built into the algorithm, which reduces the
complexity and computational effort required by the system to arrive at a given result from the sample input and
the resultant output of the tree. The neural net then carries out further refinement based on the processing done
by the decision tree. The decision tree thus acts as the first processing engine of the system. In the model
presented, the neural net takes over and completes the final decision-making process from the point where the
complex decisions stop. This greatly reduces the possibility of error in decision-making processes involving
complex organizational situations, or in areas where a wrong decision leads to irredeemable, catastrophic
consequences, such as loan granting.
The procedure for designing a neural network model is a logical process. The process was not a single-pass one
but required going back to previous steps several times (see figure1). The neural net hidden layer processes the
input variables based on the algorithm and the weights and thresholds offered to the system. The computation of
the hidden layer is refined over various trials until the error is minimized. The overall design is as shown in figure2.
Figure1: Adjusting network based on a comparison of the output and the target
Figure2: Decision Tree Neuro-Based Architecture
In figure2, the root case leads to parameters as children of the decision tree. The number of parameters is based
on the features identified in the case to be resolved by the decision tree neuro-based model. Once the parameters
are identified and set as the child nodes, their possible outcomes become the next level of children in the tree.
The number of outcomes varies with the parameters, so the tree is not necessarily binary. Once the outcomes are
determined, they form the basis for the input variables to the neural net on the right-hand side of the system. The
variables are then transformed through the first and second layers, where possible and depending on the
conditions from the decision tree outcomes, and the output is generated from this handling of the variables by
the neural net. The decision tree neuro-based model is thus clearly divided into two parts, with an interlinking
interface that joins them into a single model.
3.3 Data for the System
To carry out this study, a random selection was made from the universe of clients of a bank in Nigeria: 1000
credit contracts, 500 considered good and 500 considered bad, dated from March 2009 to April 2011. All these
contracts had already matured; that is, the sample was collected after the due date of the last installment of every
contract. This is a historical database with monthly information on the utilization of the product. With this
structure, the progress of each contract could be followed and detailed whenever the client did not pay one or
more installments.
Of this data set, 500 cases were used for training and 500 for testing. Both the training and the testing data sets
contained half good and half bad applications. There are 12 variables that influence the loan decision; their
definition and recoding are given in Table1. The output of the neural network was 1 for good applications and 0
for bad applications.
Table1: Loan Granting Decision Factors
Variable code  Variable description  Variable explanation
X1   Age                 1 if the applicant's age is accepted, 0 otherwise
X2   Income              0 if income < N40,000; 1 if N40,000 ≤ income < N100,000; 2 if income ≥ N100,000
X3   Job Experience      0 if exp < 6 months; 1 if 6 months ≤ exp < 2 years; 2 if exp ≥ 2 years
X4   Account type        1 for payroll account, 2 for self account
X5   Nationality         1 for Nigerian, 0 otherwise
X6   Residency           1 if resident in Nigeria, 0 otherwise
X7   Loan Size           1 if 100,000 ≤ LS ≤ 350,000 when X4 = 1, or if 100,000 ≤ LS ≤ 250,000 when X4 = 2; 0 otherwise
X8   Place of work       1 if the company is accredited by the bank, 0 otherwise
X9   Guarantor           1 if the applicant has a guarantor, 0 otherwise
X10  Collateral          1 if collateral exists, 0 otherwise
X11  Debt balance ratio  1 for a good DBR, 0 otherwise
X12  Social security     1 if the applicant has social security, 0 otherwise
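To illustrate the recoding in Table1, the following C sketch maps a raw applicant record onto the first few input variables. The struct fields and the function name are illustrative assumptions, not the system's actual data structures.

    /* Recode a raw applicant record into the 12-variable vector of Table1. */
    typedef struct {
        int    age_accepted;   /* 1 if age within the bank's accepted range */
        double income;         /* income in Naira */
        double exp_years;      /* job experience in years */
        int    account_type;   /* 1 = payroll account, 2 = self account */
        /* ... remaining raw fields (nationality, residency, etc.) ... */
    } Applicant;

    void recode(const Applicant *a, double x[12])
    {
        x[0] = a->age_accepted;                                        /* X1 */
        x[1] = a->income < 40000 ? 0 : (a->income < 100000 ? 1 : 2);   /* X2 */
        x[2] = a->exp_years < 0.5 ? 0 : (a->exp_years < 2 ? 1 : 2);    /* X3 */
        x[3] = a->account_type;                                        /* X4 */
        /* X5..X12 are recoded analogously from Table1. */
    }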
Table2: Eye Test Decision Tree Input Data
Symptom  Description  Value
1 Recent negative examination Yes or no
2 Eye pressure normal Yes or no
3 Optic nerve shape normal Yes or no
4 Iris angle open Yes or no
5 Iris angle closed Yes or no
6 Cornea thickness normal Yes or no
7 Recent hazy/blurred vision Yes or no
8 Seeing rainbow colored circles around bright lights Yes or no
9 Severe eye pain Yes or no
10 Severe head pain Yes or no
11 Nausea/vomiting Yes or no
12 Sudden sight loss Yes or no
3.4 Decision Tree Modeling
The operation of DTs is based on the ID3 or C4.5 divide-and-conquer algorithms [25] and on search heuristics
that make the clusters at each node gradually purer by progressively reducing the disorder in the original data set.
The algorithms place the attribute with the most predictive power at the top node of the tree, and they must find
the optimum number of splits and determine where to partition the data to maximize the information gain. The
fewer the splits, the more explainable the output, as there are fewer rules to understand. Selecting the best split is
based on the degree of impurity of the child nodes. For example, a node that contains only cases of class
good_loan or class bad_loan has the smallest disorder, 0; a node that contains an equal number of cases of class
good_loan and class bad_loan has the highest disorder, 1. Disorder is measured by the well-established concepts
of entropy and information gain, which we formally introduce below.
Given a collection S containing the positive (good_loan) and negative (bad_loan) examples of some target
concept, the entropy of S relative to this Boolean classification is

    Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_-        (1)

where p_+ is the proportion of positive examples in S and p_- is the proportion of negative examples in S. If the
output variable takes on k different values, then the entropy of S relative to this k-wise classification is defined as

    Entropy(S) = -\sum_{i=1}^{k} p_i \log_2 p_i        (2)

Hence we see that both when a category is nearly (or completely) empty, and when it nearly (or completely)
contains all the examples, the score for the category gets close to zero, which models what we wanted. Note that
0 \log(0) is taken to be zero by convention.
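As a concrete rendering of equation (2), the C sketch below computes the entropy of a node from its per-class counts; the function name and the use of log2 from math.h are our own choices, not code from the system.

    #include <math.h>

    /* Entropy of a k-class distribution given per-class counts, eq. (2).
       By convention, 0*log2(0) contributes nothing to the sum. */
    double entropy(const int counts[], int k, int total)
    {
        double h = 0.0;
        for (int i = 0; i < k; i++) {
            if (counts[i] > 0) {
                double p = (double)counts[i] / total;
                h -= p * log2(p);
            }
        }
        return h;
    }

For example, a node holding 250 good_loan and 250 bad_loan cases yields entropy 1 (the highest disorder), while a pure node yields 0, matching the discussion above.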
6. Computer Engineering and Intelligent Systems www.iiste.org
ISSN 2222-1719 (Paper) ISSN 2222-2863 (Online)
Vol.4, No.7, 2013
13
Thus, if disorder is measured by entropy, the best attribute to choose for a particular node of the tree can be
determined by the following measure, which assigns a numerical value to a given attribute, A, with respect to a
set of examples, S. Note that the values of attribute A range over a set of possibilities which we call Values(A),
and that, for a particular value v from that set, we write S_v for the set of examples which have value v for
attribute A.
In tree-growing, the heuristic plays a critical role in determining both classification performance and
computational cost. Most modern decision-tree learning algorithms adopt an (im)purity-based heuristic, which
essentially measures the purity of the subsets that result from applying the splitting attribute to partition the
training data. Information gain, defined as follows, is widely used as a standard heuristic:

    Gain(S, X) = Entropy(S) - \sum_{x \in Values(X)} \frac{|S_x|}{|S|} Entropy(S_x)        (3)

where S is a set of training instances, X is an attribute and x is one of its values, S_x is the subset of S consisting
of the instances with X = x, and Entropy(S) is defined as

    Entropy(S) = -\sum_{i=1}^{C} p_i \log_2 p_i        (4)

where p_i is estimated by the percentage of instances belonging to class i in S, and C is the number of classes.
Entropy(S_x) is defined similarly.
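Building on the entropy helper sketched above, the following C fragment computes equation (3) for one attribute over an encoded data set. The array-based encoding and the bound of 16 values and classes are illustrative assumptions.

    /* Information gain of splitting S on attribute X, eq. (3): attr[i] holds
       instance i's value for X (0..nvals-1) and cls[i] its class
       (0..nclasses-1); nvals and nclasses are assumed to be at most 16. */
    double info_gain(const int attr[], const int cls[], int n,
                     int nvals, int nclasses)
    {
        int total[16] = {0};       /* class counts for S */
        int sub[16][16] = {{0}};   /* class counts for each subset S_x */
        int size[16] = {0};        /* |S_x| */

        for (int i = 0; i < n; i++) {
            total[cls[i]]++;
            sub[attr[i]][cls[i]]++;
            size[attr[i]]++;
        }
        double gain = entropy(total, nclasses, n);
        for (int v = 0; v < nvals; v++)
            if (size[v] > 0)
                gain -= (double)size[v] / n * entropy(sub[v], nclasses, size[v]);
        return gain;
    }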
3.5 Neural Networks Modeling
Artificial neural networks learn by training on past experience, using an algorithm which modifies the
interconnection weights as directed by a learning objective for a particular application. A neuron is a single
processing unit which computes the weighted sum of its inputs. The output of the network relies on the
cooperation of the individual neurons. The learnt knowledge is distributed over the trained network's weights.
Neural networks are divided into feedforward and recurrent networks. They are capable of performing tasks that
include pattern classification, function approximation, prediction or forecasting, clustering or categorization,
time series prediction, optimization, and control. Feedforward networks contain an input layer, one or more
hidden layers and an output layer. Equation (5) shows the dynamics of a feedforward network:
    y_j^{(l)} = f\left( \sum_{i=1}^{m} w_{ij} \, y_i^{(l-1)} - \theta_j \right)        (5)

where y_j^{(l)} is the output of neuron j in layer l, y_i^{(l-1)} is the output of neuron i in layer l-1 (containing m
neurons), w_{ij} is the weight associated with the connection from i to j, \theta_j is the internal threshold/bias of
the neuron, and f is the sigmoidal discriminant function f(x) = 1/(1 + e^{-x}).
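A minimal C rendering of equation (5) for a single layer might look as follows; the array layout and function name are our own choices rather than the system's code.

    #include <math.h>

    /* Sigmoidal discriminant function from equation (5). */
    static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

    /* Propagate the m outputs of layer l-1 through the n neurons of
       layer l: y_out[j] = f(sum_i w[j][i]*y_in[i] - theta[j]). */
    void forward_layer(int m, int n, const double y_in[m],
                       const double w[n][m], const double theta[n],
                       double y_out[n])
    {
        for (int j = 0; j < n; j++) {
            double net = -theta[j];          /* internal threshold/bias */
            for (int i = 0; i < m; i++)
                net += w[j][i] * y_in[i];
            y_out[j] = sigmoid(net);         /* equation (5) */
        }
    }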
Backpropagation is the most widely applied learning algorithm for neural networks. It learns the weights of a
multilayer network, given a network with a fixed set of units and interconnections. Backpropagation employs
gradient descent to minimize the squared error between the network's output values and the desired values for
those outputs. The goal of gradient descent learning is to minimize the sum of squared errors by propagating
error signals backward through the network architecture upon the presentation of training samples from the
training set. These error signals are used to calculate the weight updates, which represent the knowledge learnt
in the network. The performance of backpropagation can be improved by adding a momentum term and by
training multiple networks on the same data but with different small random initializations. In its gradient
descent search for a solution, the network moves through a weight space of errors. A limitation of gradient
descent is that it may easily get trapped in a local minimum, which may prove costly in terms of network training
and generalization performance. In the past, research has been done to improve the training performance of
neural networks, which has significance for their generalization. Symbolic or expert knowledge can be inserted
into neural networks prior to training for better training and generalization performance, as demonstrated in [26].
The generalization ability of a neural network is an important measure of its performance, as it indicates the
accuracy of the trained network when presented with data not present in the training set. A poor choice of
network architecture, i.e. of the number of neurons in the hidden layer, will result in poor generalization even
with optimal values of the weights after training. Until recently, neural networks were viewed as black boxes
because they could not explain the knowledge learnt in the training process. The extraction of rules from neural
networks shows how they arrived at a particular solution after training.
The backpropagation algorithm with supervised learning was used, which means that we provide the algorithm
with examples of the inputs and outputs we want the network to compute; the error (the difference between
actual and expected results) is then calculated. The idea is to reduce this error until the ANN learns the training data.
The training begins with random weights, and the goal is to adjust them so that the error is minimal. The
computations performed by an artificial neuron implementing the backpropagation algorithm are given as
follows [27] in equations (6)-(9):

    net_j = \sum_i w_{ij} x_i        (6)

    y_j = \frac{1}{1 + e^{-net_j}}        (7)

    E = \frac{1}{2} \sum_j (d_j - y_j)^2        (8)

    \Delta w_{ij} = \eta (d_j - y_j) y_j (1 - y_j) x_i        (9)

where x_i are the inputs, w_{ij} are the weights, y_j are the actual outputs, d_j are the expected outputs and η is
the learning rate.
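Under the same notational assumptions, the sketch below applies equations (6), (7) and (9) to a single sigmoid output neuron for one training example. It is an illustrative fragment, not the system's netinf module.

    #include <math.h>

    /* One supervised update of a single sigmoid neuron: forward pass
       (equations (6)-(7)) followed by the delta-rule weight update (9). */
    void train_step(int n, const double x[n], double w[n],
                    double d, double eta)
    {
        double net = 0.0;
        for (int i = 0; i < n; i++)              /* equation (6) */
            net += w[i] * x[i];
        double y = 1.0 / (1.0 + exp(-net));      /* equation (7) */

        double delta = (d - y) * y * (1.0 - y);  /* local error gradient */
        for (int i = 0; i < n; i++)              /* equation (9) */
            w[i] += eta * delta * x[i];
    }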
4.0 Testing and Results
Several data files for both processing modes were generated, and the network's functionality was tested
extensively with training and testing files of various sizes. The results indicated that the system was indeed able
to perform both loan application processing and eye test diagnosis as predicted. As an example, two 500-vector
data files were generated representing 500 loan applicants and 500 medical patients. The data for these cases is
given in loan1trn.mat and eye1trn.mat. An additional 500 vectors were generated for testing each mode; this
data is given in loan1tst.mat and eye1tst.mat.
Both train and test data files were processed through the tree module to generate 500-vector-long input train and
test files for the ANN inference engine of the netinf module. This data is given in loan2trn.mat, loan2tst.mat,
eye2trn.mat and eye2tst.mat. NOTE: the prefix digit 2 corresponds to an input file for the netinf module and 1
corresponds to an input file for the tree module. This naming convention does not have to be adopted; however,
it is advisable to utilize some scheme that minimizes confusion when processing the modules. The netinf module
creates, trains and tests the ANNs. Several network parameters take default values; however, these may be
edited to achieve more favorable results. The results obtained are promising. For example, ANNs were built
(loan500.mat and eye500.mat) that achieved the desired rms error rate of 0.01 for loan processing and eye test
diagnosis, respectively. This is shown in Figs. 3 and 4 below.
Figure 3: Training Curve for loan500
Figure 4: Training Curve for eye500
These networks were tested on the corresponding 500-vector test sets, and the results are shown in figures 5 and
6, respectively.
Figure 5: Error Curve for loan500
Figure 6: Error Curve for eye500
As the criterion for selection/denial and disease presence/absence was linearly separable, represented by a
threshold of 0.5, these results clearly indicate the success of the LDSS in solving the problems for the given data.
4.1 Performance of the system using Random Tests
In testing the system randomly, forty different tests from the data sets for testing the system were carried out at
random and the result compared with the actual data in file collected from the bank. This data set was used for
the Decision Tree, Neural Networks and LDSS. Table3 and Table4 for Bank A and Bank B respectively, show
the result indicating TRUE POSITIVE(TP) where system suggested grant loan and the actual grand loan, TRUE
NEGATIVE(TN) where the system suggested don’t grant and the actual is don’t grant, FALSE POSITIVE(FP)
where the system suggested grant loan but actual is don’t grant, FALSE NEGATIVE(FN) where the system
suggested don’t grant but actual is grant loan.
Table3: Result from Bank A
Decision Tree Neural Networks LDSS
TP = 5 FP = 1 Total = 6 TP = 7 FP = 1 Total = 8 TP = 9 FP = 1 Total = 10
FN = 5 TN = 9 Total = 14 FN = 3 TN = 9 Total = 12 FN = 1 TN = 9 Total = 10
Total = 10 Total = 10 Total = 20 Total = 10 Total = 10 Total = 20 Total = 10 Total = 10 Total = 20
Table4: Result from Bank B
Decision Tree Neural Networks LDSS
TP = 7 FP = 4 Total = 11 TP = 8 FP = 4 Total = 12 TP = 9 FP = 2 Total = 11
FN = 3 TN = 6 Total = 9 FN = 2 TN = 6 Total = 8 FN = 1 TN = 8 Total = 9
Total = 10 Total = 10 Total = 20 Total = 10 Total = 10 Total = 20 Total = 10 Total = 10 Total = 20
Over the forty tests (Table3 and Table4 combined), accuracy = (TP + TN)/total. For Decision Tree, accuracy =
(12 + 15)/40 = 67.5%; for Neural Networks, accuracy = (15 + 15)/40 = 75%; and for LDSS, accuracy =
(18 + 17)/40 = 87.5% ≈ 88%.
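These quantities, together with the probability of detection (PD = TP/(TP + FN)) and probability of false alarm (PF = FP/(FP + TN)) plotted in figure7, can be computed directly from the confusion matrices. The C sketch below does so for the combined LDSS counts; the struct and function names are our own.

    #include <stdio.h>

    typedef struct { int tp, fp, fn, tn; } Confusion;

    static double accuracy(Confusion c)
    { return (double)(c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn); }

    static double pd(Confusion c) { return (double)c.tp / (c.tp + c.fn); }
    static double pf(Confusion c) { return (double)c.fp / (c.fp + c.tn); }

    int main(void)
    {
        /* LDSS over both banks (Table3 + Table4): TP=18, FP=3, FN=2, TN=17. */
        Confusion ldss = { 18, 3, 2, 17 };
        printf("accuracy = %.3f, PD = %.2f, PF = %.2f\n",
               accuracy(ldss), pd(ldss), pf(ldss));
        return 0;
    }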
4.2 Performance Comparison with Decision Tree alone and Neural Networks alone
The system was developed in such a way that different cases can be executed using the Decision Tree (DT)
alone, the Neural
Networks (NN) alone, or the LDSS (the Decision Tree-Neuro Based System). The performance of the Decision
Tree-Neuro based system was compared with the performance of the Neural Networks alone and the Decision
Tree alone using receiver operating characteristic (ROC) curves derived from Table3 and Table4 in the
MATLAB software package (MATLAB R2009b). The result is shown in figure7.
[ROC plot: probability of detection (PD) versus probability of false alarm (PF) for DT, NN and LDSS]
Figure 7: Comparison of LDSS with NN and DT
Explanations: From figure7, the ROC curve for the LDSS shows that it correctly classified over 80% of
customers as bad or good without causing false alarms. It is followed by the ROC curve for the neural networks,
with 70% correct classification and no false alarms; the decision tree performs worst, with 50% detection.
Figure 7 also shows that the performance of the LDSS is in agreement with other classification software, but
better. This relative advantage of the LDSS over the others is a result of combining decision trees and neural
networks.
5.0 Conclusion
Decision making, particularly in the complex and dynamically changing environment of the present day, is a
difficult task that requires new computational intelligence techniques for building adaptive, hybrid intelligent
decision support systems. This research work has therefore proposed a Decision Tree-Neuro based Decision
Support System with 88% accuracy for decision support. The developed system can also explain why a
particular customer was selected, which matters because a customer may protest his rejection.
The work thus contributes the following to knowledge:
1. The design of a Decision Tree-Neuro Based architectural topology with which to implement a decision
support system.
2. The design of a Decision Tree-Neuro Based model that can adequately decide whether customers applying
for a loan should be granted one.
3. By changing variables, the system can be used as a clinical decision support system to diagnose eye diseases.
6.0 Recommendations
The contribution of this work is recommended:
1. To financial institutions, particularly in Nigeria, for loan granting applications. The fact that humans are
not good at evaluating loan applications, together with the prevalence of nepotism, tribalism and
corruption in Nigeria, necessitates a robust knowledge tool, such as the Decision Tree-Neuro based
decision support system, to assist banks in credit risk evaluation for the sustainability of the banks and
the Nigerian economy, bearing in mind that the loan officer may be asked to explain why certain
applicants are chosen in preference to others.
2. To medical experts (ophthalmologists) as an aid in the decision-making process and in the confirmation
of suspected cases. A non-expert will also find the work useful in areas where prompt and swift action
is required to diagnose one of the eye diseases covered by the system. Medical practitioners who
operate in areas where there is no specialist (ophthalmologist) can likewise rely on the system for
assistance.
References
[1] Gallinari, P. (1998): Predictive models for sequence modelling, application to speech and character
recognition. Adaptive Processing of Sequences and Data Structures, Lecture Notes in Computer Science,
Volume 1387, pp. 418-434.
[2] Gevaert, W., Tsenov, G. and Mladenov, V. (2010): Neural Networks for Speech Recognition. Journal of
Automatic Control, University of Belgrade, vol. 20, pp. 1-7.
[3] Matan, O., Kiang, R. K., Stenard, C. E., Boser, B., Denker, J. S., Henderson, D., Howard, R. E.,
Hubbard, W., Jackel, L. D. and Le Cun, Y. (1990): Handwritten Character Recognition Using Neural
Network Architectures. Proceedings of the 4th USPS Advanced Technology Conference, Washington
D.C., pp. 1003-1011.
[4] Ralston, A. and Reilly, E. D.(2000). Encyclopedia of Computer Science, London : Nature Pub. Group.
Ref. QA76.15 .E48 2000.
[5] Glorfeld, L.W. & Hardgrave, B.C. (1996): An improved method for developing neural networks: The
case of evaluating commercial loan creditworthiness. Computers & Operations Research, 23(10), pp.
933-944.
[6] Handzic, M. & Aurum, A. (2001): Knowledge discovery: Some empirical evidence and directions for
future research. Proceedings of the 5th International Conference on Wirtschaftsinformatik (WI'2001),
19-21 September, Augsburg, Germany.
[7] Bigus, J.P. (1996): Data mining with neural networks: Solving business problems from application
development to decision support. McGraw Hill, USA.
[8] Marakas, G.M. (1999): Decision support systems in the twenty-first century. Prentice Hall, New Jersey,
USA.
[9] Smith, K.A. (1999): Introduction to neural networks and data mining for business applications.
Eruditions Publishing, Australia.
[10] Thomas, L.C. (1998): A survey of credit and behavioural scoring: Forecasting financial risk of lending
to customers. Retrieved June 5, 2002, from www.bus.ed.ac.uk/working_papers/full_text/crc9902.pdf
[11] Steenackers, A. and Goovaerts, M.J. (1998): A credit scoring model for personal loans. Insurance:
Mathematics & Economics, 8, pp. 31-34.
[12] Henley, W.E. and Hand, D.E. (1997): Construction of a k-nearest neighbor credit-scoring system. IMA
Journal of Management Mathematics, 8, pp. 305-321.
[13] Davis, R. H., Edelman, D. B. and Gammerman, A. J. (1992): Machine Learning Algorithms for
Credit-Card Applications. IMA Journal of Mathematics Applied in Business and Industry, (4), pp. 43-51.
[14] Desai V.S., Crook J.N., Overstreet G.A. Jr(1996): On comparison of neural networks and linear scoring
models in the credit union environment. European Journal of Operational Research, 95(1), Pp: 24-37.
[15] Baesens B., Setiono R., Mues C, Vanthienen J.(2003): Using Neural Network Rule Extraction and
Decision Tables for Credit-Risk Evaluation, Management Science, Volume 49 , Issue 3, March 2003,
Pp:312 – 329.
[16] West, D.(2000): Neural Network Credit Scoring Models, Computers & Operations Research, vol. 27,
no. 11-12, pp. 1131-1152.
[17] Chi G., Hao J., Xiu Ch., Zhu Z. (2001): Cluster Analysis for Weight of Credit Risk Evaluation Index.
Systems Engineering-Theory Methodology, Applications, 10(1), Pp. 64-67.
[18] Lundy, M. (1993): Cluster Analysis in Credit Scoring. In: Credit Scoring and Credit Control. New York:
Oxford University Press.
[19] Luo, Y.Z. and Pang, S.L. (2003): Fuzzy Cluster in Credit Scoring. Proceedings of the Second
International Conference on Machine Learning and Cybernetics, Xi'an, 2-5 November 2003, pp. 2731-2736.
[20] Baesens, B., Van Gestel, T., Stepanova, M. et al. (2005): Neural Network Survival Analysis for Personal
Loan Data. Journal of the Operational Research Society, vol. 56, no. 9, pp. 1089-1098.
[21] Bakpo, F. S. and Kabari, L. G. (2011): Diagnosing Skin Diseases Using an Artificial Neural Network.
In: Artificial Neural Networks - Methodological Advances and Biomedical Applications, Kenji Suzuki
(Ed.), ISBN: 978-953-307-243-2, InTech. Available from:
http://www.intechopen.com/articles/show/title/diagnosing-skin-diseases-using-an-artificial-neural-
network
[22] Kabari, L. G and Nwachukwu, E. O. (2012): Neural Networks and Decision Trees For Eye Diseases
Diagnosis, Advances in Expert Systems, Petrica Vizureanu (Ed.), ISBN: 978-953-51-0888-7, InTech,
Available from: http://www.intechopen.com/books/advances-in-expert-systems/neural-networks-and-
decision-trees-for-eye-diseases-diagnosis
[23] Kumar, D. S., Sathyadevi, G. and Sivanesh, S. (2011): Decision Support System for Medical Diagnosis
Using Data Mining. IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 3, No. 1,
ISSN (Online): 1694-0814, www.IJCSI.org.
[24] UKessay.com (2013): Advantages and Limitations of Neural Networks. Retrieved from
http://www.ukeassay.com/essays/education on 11/06/2013.
[25] Quinlan, J. R. (1987): Simplifying Decision Trees. International Journal of Man-Machine Studies, vol.
27, pp. 221-234.
[26] Chandra R., and Omlin C. W.(2007): Knowledge Discovery Using Artificial Neural Networks For A
Conservation Biology Domain, Proceedings of the 2007 International Conference on Data Mining, Las
Vegas, USA, In Press, 2007.
[27] Haykin, S. (1999): Neural Networks: a Comprehensive Foundation. Second Edition. Prentice Hall.