Fraud in healthcare insurance claims is a significant research challenge that affects the growth of healthcare services. Healthcare fraud is committed by subscribers, companies, and providers. A decision support system is developed to automate the processing of claim data from service providers and to offset the challenges faced by patients. This paper proposes a novel hybrid of big data and statistical machine learning techniques, named MapReduce-based iterative support vector machine (MR-ISVM), which provides a set of steps for the automatic detection of fraudulent claims in health insurance databases. The experimental results show that the MR-ISVM classifier outperforms other support vector machine (SVM) kernel classifiers in classification and detection. The results also show a reduction in the computational time needed to process healthcare insurance claims, without compromising classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear kernel and 79.98% for the radial basis function kernel.
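The abstract does not spell out the MR-ISVM steps, so the following is only a generic sketch of the idea of training a linear SVM iteratively in a map/reduce style, not the authors' algorithm. Every name here (`map_partition`, `reduce_gradients`, `train`, `predict`), the toy claim features, and the learning-rate and regularization values are assumptions made for illustration. Each round, a map step computes hinge-loss subgradients on one data partition, and a reduce step averages them before the weights are updated.

```python
def map_partition(w, b, partition):
    """Map step: sum hinge-loss subgradients over one partition of claims."""
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for features, label in partition:  # label: +1 fraudulent, -1 legitimate (assumed encoding)
        score = sum(wi * xi for wi, xi in zip(w, features)) + b
        if label * score < 1.0:  # inside the margin or misclassified
            for i, xi in enumerate(features):
                grad_w[i] -= label * xi
            grad_b -= label
    return grad_w, grad_b, len(partition)


def reduce_gradients(mapped):
    """Reduce step: merge the partition results into one averaged subgradient."""
    total = sum(count for _, _, count in mapped)
    dim = len(mapped[0][0])
    grad_w = [sum(m[0][i] for m in mapped) / total for i in range(dim)]
    grad_b = sum(m[1] for m in mapped) / total
    return grad_w, grad_b


def train(partitions, dim, rounds=200, lr=0.1, lam=0.01):
    """Repeat map and reduce rounds, updating a linear SVM after each round."""
    w, b = [0.0] * dim, 0.0
    for _ in range(rounds):
        mapped = [map_partition(w, b, part) for part in partitions]  # map
        grad_w, grad_b = reduce_gradients(mapped)                    # reduce
        # Subgradient step on the L2-regularized hinge loss.
        w = [wi - lr * (gw + lam * wi) for wi, gw in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, features):
    """Return +1 (fraudulent) or -1 (legitimate) for one claim."""
    score = sum(wi * xi for wi, xi in zip(w, features)) + b
    return 1 if score >= 0 else -1


# Hypothetical, linearly separable toy data split across two partitions.
partitions = [
    [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((-1.0, 0.0), -1)],
    [((2.0, 3.0), 1), ((3.0, 2.0), 1), ((0.0, -1.0), -1), ((-1.0, -1.0), -1)],
]
w, b = train(partitions, dim=2)
```

In a real MapReduce deployment the list comprehension in `train` would be replaced by distributed map tasks; the sketch keeps everything in one process so that the data flow stays visible.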
A rule-based machine learning model for financial fraud detection IJECEIAES
Financial fraud is a growing problem that poses a significant threat to the banking industry, the government sector, and the public. In response, financial institutions must continuously improve their fraud detection systems. Although preventative and security precautions are implemented to reduce financial fraud, criminals are constantly adapting and devising new ways to evade fraud prevention systems. Classifying transactions as legitimate or fraudulent poses a significant challenge for existing classification models because the datasets are highly imbalanced. This research aims to develop rules that detect fraudulent transactions without using any resampling technique. The effectiveness of the rule-based model (RBM) is assessed using a variety of metrics: accuracy, specificity, precision, recall, confusion matrix, Matthews correlation coefficient (MCC), and receiver operating characteristic (ROC) values. The proposed rule-based model is compared with several existing machine learning models, namely random forest (RF), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbor (KNN), naive Bayes (NB), and logistic regression (LR), using two benchmark datasets. The experimental results show that the proposed rule-based model outperformed the other methods, reaching an accuracy of 0.99 and a precision of 0.99.
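Most of the metrics named in the abstract above can be derived from the four counts of a binary confusion matrix. The following is a minimal sketch rather than the paper's code; the function name `metrics` and the example counts are assumptions chosen for illustration. It shows why accuracy alone can mislead on a highly imbalanced dataset.

```python
import math


def metrics(tp, tn, fp, fn):
    """Compute common classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical case: 990 legitimate and 10 fraudulent transactions,
# scored by a model that labels every transaction as legitimate.
always_legit = metrics(tp=0, tn=990, fp=0, fn=10)
```

In this hypothetical case the model reaches 0.99 accuracy while its recall and MCC are both 0, which is why imbalanced fraud datasets are usually evaluated with more than one metric.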
A comprehensive study on disease risk predictions in machine learning IJECEIAES
Over recent years, multiple disease risk prediction models have been developed. These models use various patient characteristics to estimate the probability of outcomes over a certain period of time and hold the potential to improve decision making and individualize care. Discovering hidden patterns and interactions in medical databases, along with the growing evaluation of disease prediction models, has become crucial. Traditional clinical findings require many trials, which can complicate disease prediction. This paper presents a comprehensive study of the different strategies used to predict disease. Applying these techniques to healthcare data has improved risk prediction models for identifying patients who would benefit from disease management programs that reduce hospital readmission and healthcare cost, but the results of these efforts have been mixed.
The Case Study of an Early Warning Models for the Telecare Patients in Taiwan IJERA Editor
To propose a practical early warning analysis model for telecare patients, this study applied data mining technology to investigate the classification of patient groups by disease severity and incidence, using data contained in the telecare database of a clinic. The ultimate purpose of this study was to provide a new direction for telecare system planning and strategy development.
The subject of this case study was a private clinic that provides a telecare system to patients in Taiwan. We used three data mining techniques (discriminant analysis, logistic regression, and artificial neural network) to construct an early warning analysis model based on several factors: demographic variables, pathological signals, health management index, diagnosis and treatment records, and emergency notification signals.
According to the results, the telecare system can build a stronger physician-patient relationship by monitoring patients' physiological conditions in advance, reminding them to practice self-management, and even taking them to the hospital for observation. A comparison of discriminative rates showed that the artificial neural network model had the highest overall correct classification rate, 85.52%, and thus is a tool worthy of recommendation.
Detecting health insurance fraud using analytics Nitin Verma
Any healthcare organization that exchanges money with service providers, customers, and vendors is prone to health insurance fraud and abuse. Health plans around the world are losing more money than the amount of the Medical Loss Ratio (MLR). Examples of fraud include billing for services not rendered, misrepresenting the diagnosis to fraudulently collect payment, soliciting, offering, or receiving a kickback, and unbundling or "exploding" charges; the list goes on.
The real difference between fraud and abuse is the person's intent. Both acts have the same impact: they divert valuable resources from health plans that would otherwise be used to offer economical plans, provide efficient services to subscribers, and pay higher reimbursement to providers.
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTION mlaij
Fraud is a critical issue in our society today. Losses due to payment fraud are increasing as e-commerce keeps evolving. Organizations, governments, and individuals have experienced huge losses due to payment fraud. Merchant Savvy projects that global losses due to payment fraud will increase to about $40.62 billion in 2027. Among all types of payment fraud, credit card fraud results in the highest loss. Therefore, we intend to leverage the potential of machine learning to deal with the problem of credit card fraud, in a way that can be generalized to other fraud types. This paper compares the performance of logistic regression, decision trees, random forest classifier, isolation forest, local outlier factor, and one-class support vector machines (SVM) based on their AUC and F1-score. We applied the SMOTE technique to handle the imbalanced nature of the data and compared the performance of the supervised models on the oversampled data with their performance on the raw data. From the results, the random forest classifier outperformed the other models, with a higher AUC score and a better F1-score on both the actual and the oversampled data. Oversampling the data did not change the result of the decision trees. One-class SVM performs better than isolation forest in terms of AUC score but has a very low F1-score compared to isolation forest. The local outlier factor had the poorest performance.
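The oversampling step mentioned in the abstract above can be illustrated with a simplified version of the SMOTE idea: each synthetic minority sample is placed at a random point on the line between a minority sample and one of its nearest minority neighbours. This is a sketch for intuition only, not the paper's pipeline or any library's implementation; the function name `smote_sketch`, the parameter defaults, and the toy points are assumptions.

```python
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - c) ** 2 for a, c in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (c - a) for a, c in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class (fraud) samples with two features each.
fraud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote_sketch(fraud, n_new=6)
```

Because every synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies, which is the property that distinguishes this approach from simply duplicating rows.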
The rise of Fintech, changing consumer behavior, and advanced technologies are disrupting the entire financial services industry, including its most prominent member, insurance.
The insurance industry has been using data to calculate risks for years. Still, with new technology now available to collect and analyze large volumes of data for patterns and for better risk prediction and calculation, the value of understanding how to store and analyze that data has grown exponentially (Liu et al., 2018).
Insurers are at an early stage of discovering the potential of big data, and multiple technology companies are investigating how to derive value from such technology (Pisoni, 2020).
Supervised and unsupervised data mining approaches in loan default prediction IJECEIAES
Given the paramount importance of data mining in organizations and the possible contribution of a data-driven customer classification recommender system for loan-extending financial institutions, the study applied supervised and unsupervised data mining approaches to derive the best classifier of loan default. A total of 900 instances with determined attributes and class labels were used for the training and cross-validation processes, while prediction used 100 new instances without class labels. In the training phase, J48 with a confidence factor of 50% attained a classification accuracy of 76.85%, k-nearest neighbors (k-NN) with k = 3 attained the highest accuracy among the IBk variants (78.38%), naïve Bayes attained 76.65%, and logistic regression attained 77.31%. k-NN 3 and logistic regression have the highest classification accuracy, F-measures, and kappa statistics. Applying these algorithms to the test set yielded 48 non-defaulters and 52 defaulters for k-NN 3, and 44 non-defaulters and 56 defaulters for logistic regression. Implications are discussed in the paper.
Problem Reduction in Online Payment System Using Hybrid Model IJMIT JOURNAL
Applications such as online auctions, shopping, and electronic billing all involve the problem of fraudulent transactions. Detecting online fraud is one of the challenging fields in web development and online phantom transactions. Because research databases contain no definitive specification of online fraud, the techniques to evaluate and stop it are also still under study. We provide an approach that uses a Hidden Markov Model (HMM) and mobile implicit authentication to determine whether the user interacting online is fraudulent. We propose a model based on these approaches to counter fraud and prevent loss to the customer. Our technique is more parameterized than traditional approaches, so the chance of flagging a legitimate user as fraudulent is reduced.
THE TRANSFORMATION RISK-BENEFIT MODEL OF ARTIFICIAL INTELLIGENCE:BALANCING RI...gerogepatton
This paper summarizes the most cogent advantages and risks associated with Artificial Intelligence from an
in-depth review of the literature. Then the authors synthesize the salient risk-related models currently being
used in AI, technology and business-related scenarios. Next, in view of an updated context of AI along with
theories and models reviewed and expanded constructs, the writers propose a new framework called “The
Transformation Risk-Benefit Model of Artificial Intelligence” to address the increasing fears and levels of
AI risk. Using the model characteristics, the article emphasizes practical and innovative solutions where
benefits outweigh risks and three use cases in healthcare, climate change/environment and cyber security to
illustrate unique interplay of principles, dimensions and processes of this powerful AI transformational
model.
THE TRANSFORMATION RISK-BENEFIT MODEL OF ARTIFICIAL INTELLIGENCE:BALANCING RI...ijaia
This paper summarizes the most cogent advantages and risks associated with Artificial Intelligence from an
in-depth review of the literature. Then the authors synthesize the salient risk-related models currently being
used in AI, technology and business-related scenarios. Next, in view of an updated context of AI along with
theories and models reviewed and expanded constructs, the writers propose a new framework called “The
Transformation Risk-Benefit Model of Artificial Intelligence” to address the increasing fears and levels of
AI risk. Using the model characteristics, the article emphasizes practical and innovative solutions where
benefits outweigh risks and three use cases in healthcare, climate change/environment and cyber security to
illustrate unique interplay of principles, dimensions and processes of this powerful AI transformational
model.
Advancing the cybersecurity of the healthcare system with self- optimising an...Petar Radanliev
This article advances the knowledge on teaching and training new artificial intelligence algorithms, for securing, preparing,
and adapting the healthcare system to cope with future pandemics. The core objective is to develop a concept healthcare
system supported by autonomous artificial intelligence that can use edge health devices with real-time data. The article constructs two case scenarios for applying cybersecurity with autonomous artificial intelligence for (1) self-optimising predictive cyber risk analytics of failures in healthcare systems during a Disease X event (i.e., undefined future pandemic), and (2) self-adaptive forecasting of medical production and supply chain bottlenecks during future pandemics. To construct the two testing scenarios, the article uses the case of Covid-19 to synthesise data for the algorithms – i.e., for optimising and securing digital healthcare systems in anticipation of Disease X. The testing scenarios are built to tackle the logistical challenges and disruption of complex production and supply chains for vaccine distribution with optimisation algorithms.
Big data analytics and its impact on internet users Struggler Ever
Big data analytic tools are promising techniques for future prediction in many aspects of our life. The need for such predictive techniques has been increasing exponentially. Even though many challenges and risks still concern researchers and decision makers, the outcome of using these techniques will considerably revolutionize our world and usher in a new era of technology.
Fake accounts detection on social media using stack ensemble system IJECEIAES
In today's world, social media has spread widely, and people's social lives have become deeply associated with social media use. People use it to communicate with each other, share events and news, and even run businesses. The huge growth in social media and the massive number of users have lured attackers to distribute harmful content through fake accounts, leading to a large number of people falling victim to those accounts. In this work, we propose a mechanism for identifying fake accounts on the social media site Twitter that uses two methods to preprocess data and extract the most effective features: the Spearman correlation coefficient and the chi-square test. For classification, we used supervised machine learning algorithms in an ensemble system (stack method), with random forest, support vector machine, and naive Bayes algorithms in the first level of the stack and the logistic regression algorithm as a meta classifier. The stack ensemble system was shown to be effective, achieving the best results when compared to the individual algorithms used within it, with accuracy reaching 99%.
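One of the two feature-selection measures named in the abstract above, the Spearman correlation coefficient, can be computed as the Pearson correlation of ranked values. The sketch below is illustrative only and is not the paper's preprocessing code; the helper names and the toy feature values (`followers`, `is_fake`) are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((c - mean_y) ** 2 for c in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical feature and label: follower count versus a fake-account flag.
followers = [5, 12, 300, 450, 8, 900]
is_fake = [1, 1, 0, 0, 1, 0]
rho = spearman(followers, is_fake)
```

A feature whose absolute coefficient against the label is close to 1 would rank high in such a selection step, while a coefficient near 0 suggests little monotonic relationship.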
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption has increased quickly, contributing to climate change
that is evident in unusual flooding, droughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they have succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in relation to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results consider contributions made
by women in various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency, and offer recommendations on how
to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are discussed.
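The droop control method mentioned in the abstract above is commonly written as two linear relations: frequency falls as active power rises, and voltage falls as reactive power rises. The sketch below uses that textbook form and is not the paper's controller; the function names, the nominal values, and the droop slopes `m_p` and `n_q` are assumptions chosen for illustration.

```python
def droop_setpoints(p_kw, q_kvar, f_nom=50.0, v_nom=1.0, m_p=0.001, n_q=0.002):
    """Textbook P-f and Q-V droop: return (frequency in Hz, voltage in per unit)."""
    frequency = f_nom - m_p * p_kw   # m_p in Hz per kW (assumed value)
    voltage = v_nom - n_q * q_kvar   # n_q in per unit per kvar (assumed value)
    return frequency, voltage


def share_active_power(total_p_kw, slopes, f_nom=50.0):
    """Steady-state sharing: all generators settle at one common frequency."""
    stiffness = sum(1.0 / m for m in slopes)
    delta_f = total_p_kw / stiffness       # common frequency drop in Hz
    shares = [delta_f / m for m in slopes]  # a smaller slope carries more load
    return f_nom - delta_f, shares


# Two hypothetical distributed generators supplying a 30 kW islanded load.
frequency, shares = share_active_power(30.0, [0.001, 0.002])
```

With these assumed slopes the first generator carries twice the load of the second, which illustrates how droop slopes set the power sharing among generators without communication between them.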
Similar to MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry
Detecting health insurance fraud using analytics Nitin Verma
Any Healthcare organization that exchanges money with service providers, customers and vendors are prone to health insurance fraud and abuse. Health plans around the world are losing more money than the amount of the Medical Loss Ratio (MLR). Examples of fraud include: billing for services not rendered, misrepresenting the diagnosis to fraudulently collect payment, soliciting, offering, or receiving a kickback, unbundling or "exploding" charges and the never ending list goes on and on forever.
The real difference between fraud and abuse is the person's intent. Both acts have the same impact: they detract valuable resources from the Health Plans that would otherwise be used to offer economical plans and provide efficient services to the subscribers and higher reimbursement to the providers.
MACHINE LEARNING ALGORITHMS FOR CREDIT CARD FRAUD DETECTIONmlaij
Fraud is a critical issue in our society today. Losses due to payment fraud are on the increase as ecommerce keeps evolving. Organizations, governments, and individuals have experienced huge losses due
to payment. Merchant Savvy projects that global losses due to payment fraud will increase to about $40.62
billion in 2027 . Among all payment fraud, credit card fraud results in a higher loss. Therefore, we intend
to leverage the potential of machine learning to deal with the problem of fraud in credit cards which can
be generalized to other fraud types. This paper compares the performance of logistic regression, decision
trees, random forest classifier, isolation forest, local outlier factor, and one-class support vector machines
(SVM) based on their AUC and F1-score. We applied a smote technique to handle the imbalanced nature
of the data and compared the performance of the supervised models on the oversampled data to the raw
data. From the results, the Random Forest classifier outperformed the other models with a higher AUC
score and better f1-score on both the actual and oversampled data. Oversampling the data didn't change
the result of the decision trees. One-class SVM performs better than isolation forest in terms of AUC score
but has a very low f1-score compared to isolation forest. The local outlier factor had the poorest
performance.
The rise of Fintech, changing consumer behavior, and advanced technologies are disrupting equally all the financial services industry, among which also it’s most prominent member, insurance
The insurance industry has been using data to calculate risks for years, still, with new technology now available to collect and analyze large volumes of data for patterns and better risk prediction and calculation, the value of understanding how to store and analyze it has grown exponentially (Liu et al., 2018).
Insurers are at their early stage of discovering the potential of big data, and multiple technology companies are investigate how to make value of such technology (Pisoni, 2020)
Supervised and unsupervised data mining approaches in loan default prediction IJECEIAES
Given the paramount importance of data mining in organizations and the possible contribution of a data-driven customer classification recommender systems for loan-extending financial institutions, the study applied supervised and supervised data mining approaches to derive the best classifier of loan default. A total of 900 instances with determined attributes and class labels were used for the training and cross-validation processes while prediction used 100 new instances without class labels. In the training phase, J48 with confidence factor of 50% attained the highest classification accuracy (76.85%), k-nearest neighbors (k-NN) 3 the highest (78.38%) in IBk variants, naïve Bayes has a classification accuracy of 76.65%, and logistic has 77.31% classification accuracy. k-NN 3 and logistic have the highest classification accuracy, F-measures, and kappa statistics. Implementation of these algorithms to the test set yielded 48 non-defaulters and 52 defaulters for k-NN 3 while 44 non-defaulters and 56 defaulters under logistic. Implications were discussed in the paper.
Problem Reduction in Online Payment System Using Hybrid ModelIJMIT JOURNAL
Online auction, shopping, electronic billing etc. all such types of application involves problems of fraudulent transactions. Online fraud occurrence and its detection is one of the challenging fields for web development and online phantom transaction. As no-secure specification of online frauds is in research database, so the techniques to evaluate and stop them are also in study. We are providing an approach with Hidden Markov Model (HMM) and mobile implicit authentication to find whether the user interacting online is a fraud or not. We propose a model based on these approaches to counter the occurred fraud and prevent the loss of the customer. Our technique is more parameterized than traditional approaches and so, chances of detecting legitimate user as a fraud will reduce.
THE TRANSFORMATION RISK-BENEFIT MODEL OF ARTIFICIAL INTELLIGENCE:BALANCING RI...gerogepatton
This paper summarizes the most cogent advantages and risks associated with Artificial Intelligence from an
in-depth review of the literature. Then the authors synthesize the salient risk-related models currently being
used in AI, technology and business-related scenarios. Next, in view of an updated context of AI along with
theories and models reviewed and expanded constructs, the writers propose a new framework called “The
Transformation Risk-Benefit Model of Artificial Intelligence” to address the increasing fears and levels of
AIrisk. Using the model characteristics, the article emphasizes practical and innovative solutions where
benefitsoutweigh risks and three use cases in healthcare, climate change/environment and cyber security to
illustrate unique interplay of principles, dimensions and processes of this powerful AI transformational
model.
Advancing the cybersecurity of the healthcare system with self- optimising an...Petar Radanliev
This article advances the knowledge on teaching and training new artificial intelligence algorithms, for securing, preparing,
and adapting the healthcare system to cope with future pandemics. The core objective is to develop a concept healthcare
system supported by autonomous artificial intelligence that can use edge health devices with real-time data. The article constructs two case scenarios for applying cybersecurity with autonomous artificial intelligence for (1) self-optimising predictive cyber risk analytics of failures in healthcare systems during a Disease X event (i.e., undefined future pandemic), and (2) self-adaptive forecasting of medical production and supply chain bottlenecks during future pandemics. To construct the two testing scenarios, the article uses the case of Covid-19 to synthesise data for the algorithms – i.e., for optimising and securing digital healthcare systems in anticipation of Disease X. The testing scenarios are built to tackle the logistical challenges and disruption of complex production and supply chains for vaccine distribution with optimisation algorithms.
Big data analytics and its impact on internet usersStruggler Ever
Big data analytics tools are promising techniques for predicting future outcomes in many aspects of our lives, and the need for such predictive techniques has been increasing exponentially. Even though many challenges and risks still concern researchers and decision makers, the outcomes of using these techniques will considerably change our world and usher in a new era of technology.
Fake accounts detection on social media using stack ensemble systemIJECEIAES
In today’s world, social media has spread widely, and people's social lives have become deeply associated with social media use. They use it to communicate with each other, share events and news, and even run businesses. The huge growth in social media and the massive number of users has lured attackers to distribute harmful content through fake accounts, leading to a large number of people falling victim to those accounts. In this work, we propose a mechanism for identifying fake accounts on the social media site Twitter by using two methods to preprocess data and extract the most effective features: the Spearman correlation coefficient and the chi-square test. For classification, we used supervised machine learning algorithms based on the ensemble system (stack method) by using random forest, support vector machine, and naive Bayes algorithms in the first level of the stack, and the logistic regression algorithm as a meta classifier. The stack ensemble system was shown to be effective in achieving the best results when compared to the algorithms used with it, with accuracy reaching 99%.
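As an illustration of the Spearman-based feature selection mentioned in the abstract above, the following is a minimal sketch, not the authors' implementation. It ranks features by the absolute Spearman correlation with the class label. The feature names, the toy values, and the labelling (1 = fake) are hypothetical and chosen only for the example.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rank_features(features, labels):
    """Sort feature names by |Spearman correlation with the label|, strongest first."""
    scores = {name: abs(spearman(col, labels)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: two account features and a fake/genuine label.
features = {
    "followers_count": [5, 3000, 12, 4500, 8, 2500],
    "account_age_days": [700, 30, 45, 900, 20, 400],
}
labels = [1, 0, 1, 0, 1, 0]  # 1 = fake account (assumed encoding)
ranking = rank_features(features, labels)
```

In this toy data the follower count separates the two classes completely, so it ranks first. A real pipeline would keep only the top-scoring features before training the stacked classifiers.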
Similar to MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry (20)
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption increased quickly, contributing to climate change
that is evident in unusual flooding and droughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in terms of their relevance to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out research gaps, and offer recommendations on how
to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are discussed.
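The abstract above names droop control without giving its equations. As a rough sketch, assuming the conventional P-f and Q-V droop laws (the paper's exact formulation may differ), the frequency and voltage references can be computed as follows. All coefficient values and set points are hypothetical.

```python
def droop_frequency(f_nominal, m_p, p_measured, p_reference):
    """Conventional P-f droop: frequency falls as active power rises above its set point."""
    return f_nominal - m_p * (p_measured - p_reference)


def droop_voltage(v_nominal, n_q, q_measured, q_reference):
    """Conventional Q-V droop: voltage falls as reactive power rises above its set point."""
    return v_nominal - n_q * (q_measured - q_reference)


# Hypothetical operating point: a 1 kW active and 500 var reactive load step.
f_ref = droop_frequency(f_nominal=50.0, m_p=1e-4, p_measured=6000.0, p_reference=5000.0)
v_ref = droop_voltage(v_nominal=400.0, n_q=2e-3, q_measured=1500.0, q_reference=1000.0)
```

With these assumed coefficients the load step lowers the frequency reference by 0.1 Hz and the voltage reference by 1 V, which is how droop lets several generators share load changes without communication.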
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current state of research on smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions is presented. To that effect, we searched the Web of
Science (WoS) and Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems associated with power systems is the islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detection zones, power quality
degradation, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis yields the following overall
weights: passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligence-based methods (20.8%), based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
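To show how AHP turns pairwise judgments into weights like those quoted above, here is a minimal sketch using the row geometric-mean approximation. This is an assumption made for illustration: the abstract does not say which calculation Expert Choice applied, and the 3x3 comparison matrix below is hypothetical, not taken from the paper.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights via normalized row geometric means.

    pairwise[i][j] states how strongly alternative i is preferred over j,
    with pairwise[j][i] == 1 / pairwise[i][j] and ones on the diagonal.
    """
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent judgments for three alternatives A, B, C:
# A is preferred 2x over B and 4x over C; B is preferred 2x over C.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparison)
```

For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the principal-eigenvector weights (here 4/7, 2/7 and 1/7). For inconsistent judgments the two can differ slightly, and a consistency ratio is normally checked as well.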
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
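For readers unfamiliar with perturb and observe, the following is a minimal sketch of the classic fixed-step algorithm, not the modified variant proposed in the paper above. The parabolic power curve is a hypothetical stand-in for a real PV characteristic, with its peak placed at 30 V for illustration.

```python
def perturb_and_observe(power_at, v_start, step, iterations):
    """Classic fixed-step P&O: keep perturbing in the direction that raised power."""
    voltage = v_start
    previous_power = power_at(voltage)
    direction = 1  # +1 raises the operating voltage, -1 lowers it
    history = [voltage]
    for _ in range(iterations):
        voltage += direction * step
        power = power_at(voltage)
        if power < previous_power:
            direction = -direction  # power dropped, so reverse the perturbation
        previous_power = power
        history.append(voltage)
    return voltage, history


def toy_pv_power(voltage):
    """Hypothetical PV power curve with a single maximum of 900 W at 30 V."""
    return 900.0 - (voltage - 30.0) ** 2


final_voltage, trace = perturb_and_observe(toy_pv_power, v_start=20.0, step=1.0, iterations=50)
```

After reaching the peak, the operating point keeps stepping between 29 V, 30 V and 31 V. This is the steady-state oscillation that the abstract lists as a limitation of traditional MPPT, and it is what a variable step size aims to reduce.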
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronizes
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed, in which the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded-system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA designs. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapid, remote monitoring of the status parameters of solar cell systems (solar irradiance, temperature, and humidity) is critical to enhancing their efficiency. Hence, in the present article, an improved smart internet of things (IoT) prototype based on an embedded system using the NodeMCU ESP8266 (ESP-12E) was built and tested experimentally. Three different regions in Egypt (Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitoring data of solar irradiance, temperature, and humidity were visualized live by Ubidots through the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system made it a strong candidate for application in monitoring solar cell systems. In addition, the solar power radiation results for the three regions strongly favor Luxor and Cairo over El-Beheira as suitable places to build a solar cell station.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies and malicious intrusions expose several security loopholes, leading to performance degradation and threats to data security in IoT operations. IoT security systems must therefore monitor the IoT network and prevent unwanted events from occurring in it. Recently, various technical solutions based on machine learning (ML) models have been developed for identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to misclassification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention rely on supervised learning, which requires a large amount of labeled training data, and such datasets are very hard to source in a network as large as the IoT. To address this problem, this study introduces an efficient learning mechanism to strengthen IoT security. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with related works, the experimental outcome shows that the model performs well on a benchmark dataset, achieving an improved detection accuracy of approximately 99.21%.
Developing a smart system for infant incubators using the internet of things ...IJECEIAES
This research develops an incubator system that integrates the internet of things and artificial intelligence to improve care for premature babies. The system workflow starts with sensors that collect data from the incubator. Then, the data is sent in real time to the internet of things (IoT) broker Eclipse Mosquitto using the message queue telemetry transport (MQTT) protocol version 5.0. After that, the data is stored in a database for analysis using the long short-term memory network (LSTM) method and displayed in a web application using an application programming interface (API) service. The experiments produced 2,880 rows of data stored in the database. The correlation coefficient between the target attribute and other attributes ranges from 0.23 to 0.48. Next, several experiments were conducted to evaluate the model-predicted value on the test data. The best results are obtained using a two-layer LSTM configuration model, each layer with 60 neurons, and a lookback setting of 6. This model produces an R² value of 0.934, with a root mean square error (RMSE) value of 0.015 and a mean absolute error (MAE) of 0.008. In addition, the R² value was also evaluated for each attribute used as input, with values between 0.590 and 0.845.
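The three evaluation metrics quoted in the abstract above (R², RMSE and MAE) follow standard definitions. The sketch below computes them in plain Python on made-up numbers. The sample values are hypothetical and unrelated to the incubator dataset.

```python
from math import sqrt


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared prediction error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed values and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

mae = mean_absolute_error(observed, predicted)
rmse = root_mean_square_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together. R² is undefined when the observed values are all identical, a case this sketch does not guard against.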
A review on internet of things-based stingless bee's honey production with im...IJECEIAES
Honey is produced exclusively by honeybees and stingless bees, both of which are well adapted to tropical and subtropical regions such as Malaysia. Stingless bees are known for producing small amounts of honey with a unique flavor profile. One identified problem is that many stingless bee colonies collapse due to weather, temperature and environment. It is critical to understand the relationship between the production of stingless bee honey and environmental conditions to improve honey production. Thus, this paper presents a review of stingless bee honey production and prediction modeling. About 54 previous studies have been analyzed and compared to identify the research gaps. A framework for modeling the prediction of stingless bee honey is derived. The results present a comparison and analysis of internet of things (IoT) monitoring systems, honey production estimation, convolutional neural networks (CNNs), and automatic identification methods for bee species. Among image detection methods, the three best-performing approaches are CNN at 98.67%, densely connected convolutional networks with YOLO v3 at 97.7%, and DenseNet201 convolutional networks at 99.81%. This study is significant in assisting researchers to develop a model for predicting stingless bee honey output, which is important for a stable economy and food security.
A trust based secure access control using authentication mechanism for intero...IJECEIAES
The internet of things (IoT) is a revolutionary innovation in many aspects of our society including interactions, financial activity, and global security such as the military and battlefield internet. Due to the limited energy and processing capacity of network devices, security, energy consumption, compatibility, and device heterogeneity are the long-term IoT problems. As a result, energy and security are critical for data transmission across edge and IoT networks. Existing IoT interoperability techniques need more computation time, have unreliable authentication mechanisms that break easily, lose data easily, and have low confidentiality. In this paper, a key agreement protocol-based authentication mechanism for IoT devices is offered as a solution to this issue. This system makes use of information exchange, which must be secured to prevent access by unauthorized users. Using the compact Contiki/Cooja simulator, the performance and design of the suggested framework are validated. The simulation findings are evaluated based on detection of malicious nodes after 60 minutes of simulation. The suggested trust method, which is based on privacy access control, reduced packet loss ratio to 0.32%, consumed 0.39% power, and had the greatest average residual energy of 0.99 mJ at 10 nodes.
Fuzzy linear programming with the intuitionistic polygonal fuzzy numbersIJECEIAES
In real-world applications, data are subject to ambiguity due to several factors; fuzzy sets and fuzzy numbers offer a powerful tool to model such ambiguity. In case of hesitation, the complement of a membership value in fuzzy numbers can be different from the non-membership value, in which case we can model using intuitionistic fuzzy numbers, as they provide flexibility by defining both a membership and a non-membership function. In this article, we consider the intuitionistic fuzzy linear programming problem with intuitionistic polygonal fuzzy numbers, which is a generalization of the previous polygonal fuzzy numbers found in the literature. We present a modification of the simplex method that can be used to solve any general intuitionistic fuzzy linear programming problem after approximating the problem by an intuitionistic polygonal fuzzy number with n edges. This method is given in a simple tableau formulation, and then applied to numerical examples for clarity.
The performance of artificial intelligence in prostate magnetic resonance im...IJECEIAES
Prostate cancer is the predominant form of cancer observed in men worldwide. The application of magnetic resonance imaging (MRI) as a guidance tool for conducting biopsies has been established as a reliable and well-established approach in the diagnosis of prostate cancer. The diagnostic performance of MRI-guided prostate cancer diagnosis exhibits significant heterogeneity due to the intricate and multi-step nature of the diagnostic pathway. The development of artificial intelligence (AI) models, specifically through the utilization of machine learning techniques such as deep learning, is assuming an increasingly significant role in the field of radiology. In the realm of prostate MRI, a considerable body of literature has been dedicated to the development of various AI algorithms. These algorithms have been specifically designed for tasks such as prostate segmentation, lesion identification, and classification. The overarching objective of these endeavors is to enhance diagnostic performance and foster greater agreement among different observers within MRI scans for the prostate. This review article aims to provide a concise overview of the application of AI in the field of radiology, with a specific focus on its utilization in prostate MRI.
Seizure stage detection of epileptic seizure using convolutional neural networksIJECEIAES
According to the World Health Organization (WHO), seventy million individuals worldwide suffer from epilepsy, a neurological disorder. While electroencephalography (EEG) is crucial for diagnosing epilepsy and monitoring the brain activity of epilepsy patients, it requires a specialist to examine all EEG recordings to find epileptic behavior. This procedure needs an experienced doctor, and a precise epilepsy diagnosis is crucial for appropriate treatment. To identify epileptic seizures, this study employed a convolutional neural network (CNN) based on raw scalp EEG signals to discriminate between preictal, ictal, postictal, and interictal segments. The potential of these characteristics is explored by examining how well time-domain signals work in the detection of epileptic signals using the intracranial Freiburg Hospital (FH), scalp Children's Hospital Boston-Massachusetts Institute of Technology (CHB-MIT), and Temple University Hospital (TUH) EEG databases. To test the viability of this approach, two types of experiments were carried out: binary classification (preictal, ictal, and postictal each versus interictal) and four-class classification (interictal versus preictal versus ictal versus postictal). The average accuracy for stage detection using the CHB-MIT database was 84.4%, while the Freiburg database's time-domain signals had an accuracy of 79.7%; the highest accuracy, 94.02%, was obtained on the TUH EEG database when comparing the interictal stage to the preictal stage.
Analysis of driving style using self-organizing maps to analyze driver behaviorIJECEIAES
Modern life is strongly associated with the use of cars, but the increase in their acceleration and maneuverability leads to a dangerous driving style for some drivers. In these conditions, the development of a method for tracking driver behavior is relevant. The article provides an overview of existing methods and models for assessing the functioning of motor vehicles and driver behavior. Based on this, a combined algorithm for recognizing driving style is proposed. To do this, a set of input data was formed, including 20 descriptive features about the environment, the driver's behavior and the characteristics of the functioning of the car, collected using OBD II. The generated data set is sent to the Kohonen network, where clustering is performed according to driving style and degree of danger. Assigning the driving characteristics to a particular cluster makes it possible to move on to the specific indicators of an individual driver and to take individual driving characteristics into account. The method makes it possible to identify potentially dangerous driving styles, which can help prevent accidents.
Hyperspectral object classification using hybrid spectral-spatial fusion and ...IJECEIAES
Because of its spectral-spatial and temporal resolution of greater areas, hyperspectral imaging (HSI) has found widespread application in the field of object classification. HSI is typically used to accurately determine an object's physical characteristics as well as to locate related objects with appropriate spectral fingerprints. As a result, HSI has been extensively applied to object identification in several fields, including surveillance, agricultural monitoring, environmental research, and precision agriculture. However, because of their enormous size, objects require a lot of time to classify; for this reason, fusion of both spectral and spatial features has been applied. The existing classification strategy leads to increased misclassification, and the feature fusion method is unable to preserve the inherent semantic features of objects. This study addresses these research difficulties by introducing a hybrid spectral-spatial fusion (HSSF) technique to minimize feature size while maintaining the intrinsic qualities of objects. Lastly, a soft-margin kernel is proposed for a multi-layer deep support vector machine (MLDSVM) to reduce misclassification. The standard Indian Pines dataset is used for the experiment, and the outcome demonstrates that the HSSF-MLDSVM model performs substantially better in terms of accuracy and Kappa coefficient.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
block diagram and signal flow graph representation
MapReduce-iterative support vector machine classifier: novel fraud detection systems in healthcare insurance industry
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 13, No. 1, February 2023, pp. 756~769
ISSN: 2088-8708, DOI: 10.11591/ijece.v13i1.pp756-769
Journal homepage: http://ijece.iaescore.com
Jenita Mary Arockiam, Angelin Claret Seraphim Pushpanathan
Department of Computer Science, College of Science and Humanities, SRM Institute of Science and Technology, Tamil Nadu, India
Article Info
Article history: Received Nov 30, 2021; Revised Jul 14, 2022; Accepted Aug 19, 2022
Keywords: Big data; Fraud detection; Insurance claims; Iterative support vector machine; MapReduce framework
ABSTRACT
Fraud in healthcare insurance claims is one of the significant research challenges affecting the growth of healthcare services. Healthcare fraud is committed by subscribers, companies and providers. A decision support system is developed to automate the processing of claim data from service providers and to offset the patient’s challenges. This paper proposes a novel hybridized big data and statistical machine learning technique, named MapReduce-based iterative support vector machine (MR-ISVM), which provides a set of sophisticated steps for the automatic detection of fraudulent claims in health insurance databases. The experimental results show that the MR-ISVM classifier outperforms other support vector machine (SVM) kernel classifiers in classification and detection. The results also show a reduction in the computational time for processing healthcare insurance claims without compromising classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear kernel and 79.98% for the radial basis function kernel.
This is an open access article under the CC BY-SA license.
Corresponding Author:
Angelin Claret Seraphim Pushpanathan
Department of Computer Science, College of Science and Humanities, SRM Institute of Science and Technology
Kattankulathur, Chennai District, Tamil Nadu 603203, India
Email: angelins@srmist.edu.in
1. INTRODUCTION
Recent advancements in communication and digital technologies have revolutionized the modern world, creating a highly connected environment among communicating entities. Networks such as social platforms, e-commerce, blogs, industrial trading, banking and insurance networks are growing along with the development of communication technologies. A tremendous volume of data is being generated from these networks. A billion transactions are carried out in a fraction of a second. A vast array of information is easily accessible to fraudsters (or attackers) who create anonymous platforms [1]. With the growth of such networks, fraudsters have found several opportunities to manipulate data without the user’s knowledge. Many organizations employ preventive measures to secure their networks and data from internal and external threats with the help of digital technologies. Special attention is paid to the interactions and activities among inter-network entities [2]–[5].
A wide range of machine learning (ML) algorithms is continually being explored across different fields of real-time applications. In recent years, ML has gained prominence due to the popularity of big data [6]–[8]. An ML problem is the problem of learning from experience with respect to some tasks and performance measures. ML helps users uncover the structure of the data and develop prediction models from large datasets. It has proliferated through learning algorithms, rich sets of information and dynamic computing environments.
Figure 1 illustrates the role of ML algorithms in big data. The ML component is surrounded by four elements: big data, system, user, and domain. The communication flows between all elements are bidirectional. Large and complex financial data is given as the input to the ML component, and the extensive data computation then becomes a part of big data.
Here, the user provides domain analysis and feedback to the learning element, which eases the decision-making process. The domain component provides context guidelines to the learned models. The system component deals with the infrastructure module, which covers the use of computing environments such as distributed computing and edge computing [9].
The growth of connected assistive devices and communication technologies has opened a path for thieves to manipulate data, leading to severe financial losses for the healthcare sector. Hence, researchers have explored data security analytics [10]. In the insurance sector, fraudulent activities distress both customers and insurers and erode the trust and loyalty between them. Diverse processes and products in healthcare services are being designed with the use of medical technologies. Healthcare insurance management systems administer the different insurance companies and healthcare organizations in the marketplace. Generally, they include two models, namely the payment model and the claim management model. The real challenge pertains to the claim management process, which allows for more advanced analysis such as fraud management [11], [12]. Figure 2 illustrates the scope of the fraud detection approach during the insurance claim management process.
Figure 1. Role of machine learning (ML) in big data
Figure 2. Scope of fraud detection approach
There have been a myriad of studies on detecting fraudulent claims in the healthcare industry. This review analyzes them from two aspects, class imbalance and feature representation, both of which are classical problems in ML algorithms [13]. Due to the improper definition of features, an imbalance occurs between the classes, i.e. one class has many data samples whereas the other has few. Finding abnormal claims is a challenging task because of these issues.
The performance of ML techniques in financial fraud has been surveyed in [14]. The general techniques involved in ML are descriptive techniques, predictive techniques, artificial intelligence techniques and hybrid techniques. The analysis stated that hybrid techniques, the genetic algorithm and the support vector machine (SVM) outperformed other techniques [15]–[17]. Solving financial fraud remains a never-ending task because of class imbalance. Fraud and abuse are the two factors that drive up healthcare costs. The Brazilian healthcare market is highly affected by class imbalance [18]. Service providers were asked to find the link between fraud and abuse in the claim authorization process. ML algorithms such as SVM, C4.5, random forest and naive Bayes were analyzed using cross-validation over the distribution of treatment methods. The results stated that random forest was not affected by class imbalance, whereas the other ML algorithms were affected when the class distribution changed.
Insurance fraud has become more complex due to the accumulation of large volumes of data, which has been addressed by big data predictive modelling [19]. Distributed ML algorithms were tested with parallel computing tools to differentiate the fraud records. Fraudulent patterns change over time, and thus an imbalance between the detected patterns creates trouble for detection approaches. Concept drift [20] is a domain that encompasses dynamic data, i.e. data that changes over time. Labelling of unsupervised data requires concept learning. The authors dealt with automatic labelling of unsupervised data using the concept drift approach. A permutation test was conducted over each statistical data, and the p-value determined the class of the data. Since it follows one fixed algorithm, the effects of class imbalance are high in noisy data.
Decision support tools require intensive SME analysis when it comes to prepayment and post-payment control models. The presence of outliers in medical data lowered the accuracy of the detection framework. Outlier detection techniques [21] were explored to find the misclassified patterns, i.e. the false positive rate (FPR). They were tested on Medicaid data comprising 650,000 healthcare claims and 369 dentists of one state. Improper cluster formation increased the FPR, and the estimated clusters implied an improper class distribution. The comprehensive services provided by the healthcare sector have become more portable with the adoption of Android technologies [22]. Tracking claimed benefits requires timely prediction, but the complex data granularity lowered the accuracy of the framework. Thus, methods such as semi-supervised isomap (SSIsomap) activity clustering, simple local outlier factor (SimLOF) outlier detection, and Dempster-Shafer theory-based evidence aggregation were studied on a real-world dataset. The behavior profile pattern also alters when the data size increases, which strongly influences the estimated frequent itemsets.
The provider-consumer model incurs considerable expense for healthcare systems. Thus, anomalies were detected from the provider and consumer models [23]. Brazilian healthcare records from 2008 to 2015 were collected and evaluated using bipartite graphs and the k-nearest neighbors (k-NN) algorithm. The bipartite graphs were employed to find the relationships between the two models. The detected similar patterns were then classified into potential-provider and anomaly classes using k-NN. The approach was validated using cost and effectiveness as performance measures. Instead of the number of hospitals, the available cities and consumer scores were used for effectiveness estimation. Therefore, feature representation is essential in graph-based approaches.
Several researchers have explored Medicaid to discover fraud in medical data beyond the transaction level [24]. A multidimensional data analysis was designed for fraud classification using Sparrow’s insights. The fraud patterns were also discovered from unsupervised data. The data was classified into six classes based on the levels at which fraudulent data patterns were identified. It was concluded that inadequate training data had lowered the performance of supervised ML techniques. Building on this, an ML model was designed to detect fraud committed by physicians [25]. In billing procedures, fraud may be external or internal. Irrespective of the claimant, physicians were also misusing billing procedures, which is challenging to detect. Hence, a multinomial naive Bayes algorithm was designed to resolve the multi-class classification using 5-fold cross-validation. The classification was done by interchanging features such as field experts, specialty, and provider types. Based on the F-score, the fraud levels in procedures performed by physicians were reported. The model built an association among different levels of physicians when handling the claim data.
Association rule mining is also employed to recognize fraudulent patterns. It is used to construct associations/correlations between features. Initially, the transaction data was transformed into a set of clusters, and then some standard association rules [26] were framed. Based on the lift and confidence values, the data samples were classified into fraudulent and non-fraudulent claims. The analysis of claim data concentrated on the feature extraction phase rather than the classification phase. However, some features are discarded in terms of big data analytics. The involvement of varied actors and commodities [27] in healthcare insurance claims has imposed different challenges on ML techniques. Therefore, an interactive framework for unsupervised data analysis was required, using pairwise computational models such as analytical hierarchical processing (AHP) and expectation-maximization (EM). Data from CGM Turkey for private insurance companies was validated using the area under the curve (AUC). It was stated that the independent analysis of actors and commodities reduced the time needed to predict fraud. The fragmented nature of feature representation has brought significant changes to the fact-finding process of institutions.
The patient rule induction method (PRIM) [28] was designed to extract anomalous patterns in a big data context. It was implemented on the Center for Medicare Services (CMS) 2014 dataset, which improved the feature space. While partitioning the feature space, an in-depth analysis of the different classes is not done. Since it computes conditional probabilities on features, the activities of physicians are not traced. Heuristic approaches to defining optimal fraud indicators are not possible due to the high accumulation of false claims. Fake billing fraud is more common than other frauds, especially in auto/vehicle insurance claims [29]. Comparison models were designed using random forest, naive Bayes and decision tree and evaluated with confusion-matrix measures. They were implemented on a synthetic dataset, and it was concluded that random forest outperformed the other two models. Feature modelling plays a significant part in designing classifiers that reduce false positive and false negative rates. Analyzing camouflage behaviors [30] is a troublesome task for classification approaches because such behavior persists only for a short period. Patient cluster divergence-based healthcare insurance fraudster detection (PCDHIFD) was designed to classify the fraud caused by camouflage behaviors [30]. With the help of patient admission date features, the correlation values between patients, hospitals and providers were computed with a graph-based dense peak clustering approach. Then, a cluster divergence value was used to detect fraudulent patients. The F-measure improved by 15% over other classification models. Interpreting the medical admission-oriented features affects the classifiers in the camouflage behavior analysis.
This research study proposes a novel fraud detection model that hybridizes the strengths of big data and ML approaches to solve insurance claim classification. It reduces the effects of class imbalance over voluminous multi-class data. The insurance claim data is preprocessed using the MapReduce framework, which scales up the data processing capabilities. Deploying the iterative support vector machine (ISVM) classifier on the processed data helps to classify fraudulent providers by executing the specified iterative conditions. The proposed MapReduce-based iterative support vector machine (MR-ISVM) classifier achieves the objective of classification accuracy with less computational time.
2. METHOD
Class imbalance and feature modelling are mutually dependent in supervised ML techniques. Multi-class learning (MCL) is a challenging domain at the intersection of ML and big data analytics. MCL has received less research attention than single-class learning (SCL). MCL is defined as the problem of associating an instance with more than one class, even for binary labels. Conventional methods do not support MCL because it reduces the prediction accuracy of the application framework. MCL requires a systematic approach that handles medical data effectively and enhances cost saving and detection efficiency. A feature selection technique (FST) has to handle categorical, continuous, and high-dimensional data to advance the MCL domain. Let us define the problem in vector form. Each instance in the database is represented as $a = \{a_1, \ldots, a_p\}$, where $p$ represents the final instance. The data instance is obtained from the domain $D = \{A_1, \ldots, A_p\}$. Each instance $a$ is then associated with class labels, denoted $l = \{l_1, \ldots, l_q\}$, where $q$ represents the final class value. The class labels are obtained from the class domain $C = \{L_1, L_2, \ldots, L_Q\}$. Each class label $j$ contains a possible set of class variables, represented as $C_j = (1, 2, \ldots, K_j)$.
This research aims to frame a fraud detection model that detects mishandling of the claiming process using ML algorithms. Specifically, the proposed method is designed to discover provider abuse by analyzing the variables used in treatment, disease and claim. The steps of the proposed process are explained in detail. Figure 3 presents the block diagram of the proposed research. The proposed research comprises five phases, briefly explained as follows:
a) Data acquisition: It is the first step, which describes the datasets.
b) Data preprocessing: It is the second step, which describes how the collected datasets are organized.
c) Feature selection: It is the third step, which describes the selection of features used for constructing the training classifier.
d) Classification: It is the fourth step, which presents the workflow of the proposed classifier.
e) Detection: It is the final step, which handles the testing data.
Figure 3. Block diagram of the proposed study
2.1. Data acquisition
The dataset is collected from a well-known public repository, “Healthcare provider fraud detection analysis” [31]. Provider fraud is one of the biggest scams prevailing in the healthcare industry. When physicians mishandle disease and treatment details, the providers increase the medical costs. The metadata of the dataset is presented in section 3. The collected dataset determines the success rate of the research objectives.
2.2. Data preprocessing
The task of this step is to organize the data in the datasets efficiently. This is achieved by eliminating missing values and duplicate data and by developing efficient data partitioning. The examination of missing values and duplicate data is described in the next section. The data partitioning is developed using a novel MapReduce technique. It is found that multiple claim IDs are generated with various providers, differentiated by disease. Owing to this, the MapReduce technique is applied to the ‘inpatient’ and ‘outpatient’ tables, and a new table is created based on the disorders. As the name suggests, the MapReduce technique consists of mapper and reducer functions. The mapper function is expressed as
$$\text{Mapper}: (k_1, v_1) \rightarrow [(k_2, v_2)] \tag{1}$$
The reducer function is expressed as
$$\text{Reducer}: (k_2, |v_2|) \rightarrow [(k_3, v_3)] \tag{2}$$
where $k_1$ and $k_2$ are the input key and the output key; $v_1$ and $v_2$ are the input value and the output value; $(k_3, v_3)$ are the final output key and value obtained from the reducer function; and $|v_2|$ is the final data list.
Figure 4 presents the workflow of the MapReduce technique. It consists of four functions, namely splitting, mapping, partitioning and reducing. The mapper and reducer functions execute in parallel on the input datasets by creating many subsets under different cluster nodes. The intermediate output values of the mapper function serve as the input to the reducer function. Based on the user-defined number of partitions (p) and the partitioning function, the MapReduce technique executes the following steps: i) a unique processor is created for the master and the slave nodes; ii) master nodes are responsible for assigning tasks to the nodes in the mapper and reducer jobs; iii) for the user-defined partition values, each partition runs on a mapper node; iv) the output values, i.e. the keys and intermediate values of a mapper job, are preserved in local files on local storage; v) the keys and intermediate values in the local files are then assigned to the reducer job; and vi) after the reducer job completes, the reduced output with the final values is stored at the master node.
Figure 4. Workflow of the MapReduce technique
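The mapper and reducer signatures in equations (1) and (2) can be illustrated with a minimal single-process Python sketch. It walks through the splitting, mapping, partitioning and reducing stages on three toy claims and builds a per-disease table. The record fields (`provider`, `diagnosis`, `amount`), the toy partitioner and the per-diagnosis summary are illustrative assumptions, not the schema or the cluster setup used in the paper; on a real cluster the mapper and reducer jobs would run on separate nodes.

```python
from collections import defaultdict

def mapper(claim_id, record):
    """(k1, v1) -> [(k2, v2)]: re-key one claim by its diagnosis code."""
    return [(record["diagnosis"], (claim_id, record["provider"], record["amount"]))]

def partition(pairs, num_partitions):
    """Assign each intermediate key to one partition and group its values."""
    buckets = [defaultdict(list) for _ in range(num_partitions)]
    for key, value in pairs:
        # Toy deterministic partitioner (assumption): character-code sum modulo p.
        index = sum(ord(ch) for ch in key) % num_partitions
        buckets[index][key].append(value)
    return buckets

def reducer(diagnosis, values):
    """(k2, list of v2) -> [(k3, v3)]: summarize all claims for one diagnosis."""
    providers = sorted({provider for _, provider, _ in values})
    total = sum(amount for _, _, amount in values)
    return [(diagnosis, {"claims": len(values), "providers": providers, "total": total})]

def map_reduce(claims, num_partitions=2):
    intermediate = []
    for claim_id, record in claims.items():                  # splitting and mapping
        intermediate.extend(mapper(claim_id, record))
    output = {}
    for bucket in partition(intermediate, num_partitions):   # partitioning
        for key, values in bucket.items():                   # reducing
            for out_key, out_value in reducer(key, values):
                output[out_key] = out_value
    return output

# Hypothetical claims: two providers bill the same diagnosis code D10.
claims = {
    "CLM1": {"provider": "PRV1", "diagnosis": "D10", "amount": 100},
    "CLM2": {"provider": "PRV2", "diagnosis": "D10", "amount": 250},
    "CLM3": {"provider": "PRV1", "diagnosis": "D20", "amount": 75},
}
disease_table = map_reduce(claims)
```

Keying the intermediate pairs by diagnosis code is one way to obtain the disease-based table described in the text; a different key (for example, the provider ID) would give a provider-level view with the same two functions.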
2.3. Feature extraction and selection
Feature selection is the third step, which deals with extracting the features required to build an efficient training classifier. The data table contains a large set of features, and thus the importance of each feature is studied to eliminate irrelevant features. Linear discriminant analysis (LDA) is performed over the data table. The objective of LDA is to find the linear combination of features that separates two or more classes of objects. The most desired features are obtained by reducing the dimensionality before building the classifier. LDA is the most suitable model for preserving multiple classes with reduced dimensions. The claiming procedure depends on different aspects of the patients’ medical reports. It is found that a beneficiary holds multiple claiming strategies for multiple diseases. The amount is claimed based on the disease code, the treatment code and the total amount. Here, three types of variables are considered, viz. claiming variables, disease variables and treatment variables. In this step, we intend to find the ‘confidence score’ of the bills given by the provider. The estimated confidence score helps verify the attributes taken for creating, validating and verifying the statements provided by the provider. The confidence score function is
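calculated as:
$$Z = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_d x_d \tag{3}$$
$$\text{Confidence score}(\beta) = \frac{\beta^T \mu_1 - \beta^T \mu_2}{\beta^T C \beta} \quad \text{(score function of a class)} \tag{4}$$
$$\text{Confidence score}(\beta) = \frac{Z_1 - Z_2}{\text{variance of } Z \text{ within groups}} \quad \text{(score function between the classes)} \tag{5}$$
For the given score function, the aim is to estimate the linear coefficients of the variables that maximize the score, which are given as:
$$\beta = C^{-1}(\mu_1 - \mu_2) \quad \text{(coefficients of the model)} \tag{6}$$
$$C = \frac{1}{n_1 + n_2}(n_1 C_1 + n_2 C_2) \quad \text{(pooled covariance matrix)} \tag{7}$$
where $\beta$: coefficients of the linear model, $C_1$ and $C_2$: covariance matrices, $\mu_1$ and $\mu_2$: mean vectors, and $n_1$ and $n_2$: numbers of samples in the two classes.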
The discriminant assessment can be done by computing the Mahalanobis distance between the two groups:
$$\Delta^2 = \beta^T(\mu_1 - \mu_2) \tag{8}$$
Finally, a new data point can be classified into $C_1$ (default) or $C_2$ (not-default) by applying the condition
$$\beta^T\left(x - \frac{\mu_1 + \mu_2}{2}\right) \geq \log\frac{p(C_1)}{p(C_2)} \tag{9}$$
where $\beta^T$: coefficient vector, $x$: data vector, $\frac{\mu_1 + \mu_2}{2}$: mean vector, and $\frac{p(C_1)}{p(C_2)}$: class probability. Depending on the obtained scores, the relevant features are extracted and selected for classification.
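The two-class LDA quantities in equations (6) to (9) can be worked through with a short pure-Python sketch. It assumes exactly two hypothetical numeric features per claim, divides by n when estimating each class covariance, and takes equal class priors so that the log-prior term in equation (9) is zero. The sample values and function names are illustrative and are not taken from the paper's dataset.

```python
def mean_vector(rows):
    """Mean of each of the two features."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def covariance(rows, mu):
    """2x2 covariance matrix of one class (divides by n, a simplifying assumption)."""
    n = len(rows)
    c = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                c[i][j] += d[i] * d[j] / n
    return c

def pooled_covariance(c1, n1, c2, n2):
    """Equation (7): C = (n1*C1 + n2*C2) / (n1 + n2)."""
    return [[(n1 * c1[i][j] + n2 * c2[i][j]) / (n1 + n2) for j in range(2)]
            for i in range(2)]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def lda_fit(class1, class2):
    mu1, mu2 = mean_vector(class1), mean_vector(class2)
    c = pooled_covariance(covariance(class1, mu1), len(class1),
                          covariance(class2, mu2), len(class2))
    c_inv = inverse_2x2(c)
    diff = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    # Equation (6): beta = C^-1 (mu1 - mu2)
    beta = [c_inv[i][0] * diff[0] + c_inv[i][1] * diff[1] for i in range(2)]
    # Equation (8): Mahalanobis distance between the two groups
    delta2 = beta[0] * diff[0] + beta[1] * diff[1]
    return beta, mu1, mu2, delta2

def lda_assign(x, beta, mu1, mu2):
    """Equation (9) with equal priors: the right-hand side is log(1) = 0."""
    mid = [(mu1[i] + mu2[i]) / 2 for i in range(2)]
    score = sum(beta[i] * (x[i] - mid[i]) for i in range(2))
    return "C1" if score >= 0 else "C2"

# Hypothetical two-feature samples for the two classes.
class1 = [(2.0, 3.0), (3.0, 3.5), (4.0, 5.0), (3.0, 4.5)]
class2 = [(7.0, 8.0), (8.0, 8.5), (9.0, 10.0), (8.0, 9.5)]
beta, mu1, mu2, delta2 = lda_fit(class1, class2)
label_a = lda_assign((2.5, 3.5), beta, mu1, mu2)
label_b = lda_assign((8.5, 9.0), beta, mu1, mu2)
```

A larger value of `delta2` indicates better-separated groups, which is how the Mahalanobis distance serves as a discriminant assessment. With unequal priors the threshold is no longer zero and would have to be set from the class probabilities.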
2.4. Classification
Iterative support vector machine (ISVM) is employed to ease the classification tasks with minimized
computational efforts. It extracts the provider data via feedback loops in an iterative manner. Initially, a
hyperplane data cube is created by combining source data tables and their principal components. Then, a
general SVM is applied to the hyperplane data cube that generates the classification map. MapReduce
framework is employed to receive the required information from the SVM based classification map. The
output obtained from the applied preprocessing technique is combined with the other hyperplane data cube
for the next iteration. The iterative process continues until the stopping criterion is met.
The proposed steps of the ISVM are given as:
a) Initially, let us consider K classes of interest, $\{C_q\}_{q=1}^{K}$.
b) Initialize the conditions: let K be the number of classes, k = 1, and $\Omega^{(0)} = \{\text{Data tables}\} \cup \{PC_1\}$,
where $PC_1$ is the principal component of the considered data tables.
c) Derive the classification map $\text{Class-Map}_{SVM}^{(0)}$ by executing the SVM on $\Omega^{(0)}$.
d) Based on the generated class map, K classification maps are created for the j-th iteration. The data (x, y) of the data
table under the k-th classification map is represented as,
$$B_{SVM}^{(j)}(x, y) = \begin{cases} 1, & \text{if } \text{data}(x, y) \in \text{class}_{SVM} \\ 0, & \text{otherwise} \end{cases} \quad (10)$$
e) Apply the preprocessing technique to $B_{SVM}^{(j)}(x, y)$; the filtered inputs are
represented as $\text{preprocessing}_{SVM}^{(j)}(x, y)$.
f) Create the new hyperplane data cube as,
$$\Omega^{(j)} = \Omega^{(j-1)} \cup \{\text{preprocessing}_{SVM,1}^{(j)}\} \cup \cdots \cup \{\text{preprocessing}_{SVM,n}^{(j)}\}$$
g) Execute the SVM on $\Omega^{(j)}$ to generate $\text{Class-Map}_{SVM}^{(j)}$.
h) A stopping rule is defined for terminating the iteration, i.e., the feedback process; it is explained in the next
section.
i) If $\text{Class-Map}_{SVM}^{(j)}$ satisfies the stopping rule, the ISVM is stopped and the final classification
map is declared.
j) Otherwise, set j = j + 1 and continue from step (d).
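The control flow of steps (c) to (j) can be sketched as a short loop. This is only a skeleton under stated assumptions, not the authors' code: the data cube is modeled as a list of per-record layers, the classifier, preprocessing step and stopping rule are passed in as callables, and `threshold_classifier` is a hypothetical stand-in used in place of an SVM purely so that the example runs on its own.

```python
def binary_class_maps(class_map):
    """Step (d), Eq. (10): one 0/1 map per class label found in class_map."""
    classes = sorted(set(class_map))
    return [[1 if label == k else 0 for label in class_map] for k in classes]


def iterative_classification(data_cube, run_classifier, preprocess,
                             stopping_rule, max_iter=20):
    """Feedback loop of steps (c) to (j); returns (final map, iterations)."""
    class_map = run_classifier(data_cube)              # step (c)
    for j in range(1, max_iter + 1):
        maps = binary_class_maps(class_map)            # step (d)
        filtered = [preprocess(m) for m in maps]       # step (e)
        data_cube = data_cube + filtered               # step (f)
        new_map = run_classifier(data_cube)            # step (g)
        if stopping_rule(new_map, class_map):          # steps (h) and (i)
            return new_map, j
        class_map = new_map                            # step (j)
    return class_map, max_iter


def threshold_classifier(cube):
    """Hypothetical stand-in for the SVM: label 1 if the layer mean > 0.5."""
    n_records = len(cube[0])
    return [1 if sum(layer[i] for layer in cube) / len(cube) > 0.5 else 0
            for i in range(n_records)]


# Toy run: one feature layer, identity preprocessing, stop when maps repeat.
final_map, iterations = iterative_classification(
    [[0.9, 0.1, 0.8, 0.2]],
    threshold_classifier,
    lambda m: m,
    lambda new, old: new == old,
)
```

Passing the classifier and stopping rule as arguments keeps the loop independent of any particular SVM library; a real SVM wrapper could be substituted for the stand-in without changing the loop.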
2.5. Framing of stopping rule for ISVM
The main concept behind the stopping rule of ISVM is to compare the classification maps obtained
from the j-th and (j-1)-th iterations. The Tanimoto index (TI) is employed to measure the agreement between the
two generated classification maps. It is given as,
$$TI^{(j)} = \frac{|Size_j \cap Size_{j-1}|}{|Size_j \cup Size_{j-1}|} \quad (11)$$
where $Size_j$ and $Size_{j-1}$ are the classification maps of the j-th and (j-1)-th iterations.
TI ranges over [0, 1], and a threshold value 𝛽 is defined. If the TI of the obtained classification maps
exceeds the given threshold 𝛽, the iteration is stopped. Figure 5 represents the functionality of the
ISVM.
Figure 5. Functional block diagram of the ISVM
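The stopping rule in (11) can be illustrated with a few lines of Python. This is a sketch under assumptions: each classification map is represented as the set of record identifiers assigned to the class of interest, the threshold value 0.95 is hypothetical, and two empty maps are treated as identical (TI = 1).

```python
def tanimoto_index(current_map, previous_map):
    """Eq. (11): |intersection| / |union| of two classification maps."""
    current, previous = set(current_map), set(previous_map)
    union = current | previous
    if not union:
        # Assumption: two empty maps are considered identical.
        return 1.0
    return len(current & previous) / len(union)


def should_stop(current_map, previous_map, threshold=0.95):
    """Stop iterating once TI exceeds the threshold (hypothetical value)."""
    return tanimoto_index(current_map, previous_map) > threshold


# Records flagged in iterations j-1 and j (hypothetical identifiers).
flagged_previous = {"PRV01", "PRV02", "PRV03", "PRV04"}
flagged_current = {"PRV02", "PRV03", "PRV04", "PRV05"}
ti = tanimoto_index(flagged_current, flagged_previous)
```

A higher threshold demands closer agreement between consecutive maps and therefore more iterations before the loop terminates.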
3. RESULTS AND DISCUSSION
The proposed framework is applied to real-world insurance provider data obtained from a medical
fraud provider detection dataset. The previous model is compared with the proposed model using institution-level
variables. From the Medicare data warehouse, beneficiary, inpatient and outpatient data details are preserved
in different tables. Table 1 presents the tables and their details. Note that the distinguishing feature
between the inpatient and outpatient data is the absence of the diagnosis code in the outpatient table.
Table 1. Used database tables and their details
| Data table | Details | No. of features |
|---|---|---|
| Beneficiary | Basic information of the patients, i.e., outpatient as well as inpatient details such as gender, claiming details, and reimbursement details | 25 |
| Outpatient | Details of the patients who visited hospitals but were not admitted, e.g., ClaimID, ProviderID, and PhysicianID | 27 |
| Inpatient | Details of the patients admitted in the hospital, e.g., ClaimID, ProviderID, PhysicianID, and Diagnosis code | 30 |
The data is preprocessed using the MapReduce technique, which cleans the medical treatment records and
claiming records by removing missing values and fixing errors. As a result, 5,000 records are used for modeling.
Provider ID and Beneficiary ID are the primary keys, and claim details such as the reimbursement and the
deductible amount are taken as the secondary key values of this study. A beneficiary can hold both inpatient
and outpatient data; thus, the data is organized using the
MapReduce framework, which is given as:
BENE11001 → (Inpatient, 3) & (Outpatient, 0)
BENE11002 → (Inpatient, 0) & (Outpatient, 1)
BENE11014→ (Inpatient, 1) & (Outpatient, 1)
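The key-value organization shown above can be mimicked in plain Python with explicit map, shuffle and reduce phases. This is a single-process sketch for illustration only, not a Hadoop job and not the authors' code; the record layout (a `BeneID` field and a hypothetical `type` field) and the function names are assumptions.

```python
from collections import defaultdict


def map_claim(claim):
    """Map phase: emit ((beneficiary, claim type), 1) for one claim record."""
    yield (claim["BeneID"], claim["type"]), 1


def shuffle(pairs):
    """Shuffle phase: group emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts for one key."""
    return key, sum(values)


def count_claims(claims):
    """Return {BeneID: {'Inpatient': n, 'Outpatient': m}}."""
    pairs = (pair for claim in claims for pair in map_claim(claim))
    reduced = dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())
    summary = defaultdict(lambda: {"Inpatient": 0, "Outpatient": 0})
    for (bene_id, claim_type), n in reduced.items():
        summary[bene_id][claim_type] = n
    return dict(summary)


# Hypothetical claim records chosen to reproduce the three examples above.
claims = [
    {"BeneID": "BENE11001", "type": "Inpatient"},
    {"BeneID": "BENE11001", "type": "Inpatient"},
    {"BeneID": "BENE11001", "type": "Inpatient"},
    {"BeneID": "BENE11002", "type": "Outpatient"},
    {"BeneID": "BENE11014", "type": "Inpatient"},
    {"BeneID": "BENE11014", "type": "Outpatient"},
]
organized = count_claims(claims)
```

In a distributed setting the shuffle step is performed by the framework; here it is written out so that the grouping by key is visible.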
Table 2 presents the sample records organized using the MapReduce framework. The key idea is
to obtain “clean and organized” data for which previous results can be reused, i.e., the framework splits the input
data into smaller volumes quickly and stably. Smaller data volumes make it more likely that each
volume is clean. However, very small data volumes increase the overhead; thus, the designed
MapReduce framework, as a preprocessor, must ensure both stability and speed. To address the data
imbalance issue, the MapReduce framework examines the cardinality of the majority and
minority classes. In contrast to the synthetic minority oversampling technique (SMOTE), the proposed
MapReduce technique modifies the intrinsic data learning process. The developed Java-based
decision support engine is connected to MySQL using Java Server Pages (JSP) scripts. The feature
extraction process on the preprocessed data involves claims cost validation. A new data table, ‘Unbundled
date’, is created and linked with the proposed ISVM classifier. The claims are split into two groups, namely:
i) claims with the approved costs within each diagnostic related group and ii) claims with the disapproved
costs within each diagnostic related group.
Table 2. Sample records organized using MapReduce framework
| Beneficiary ID | Count |
|---|---|
| BENE11002 | 1 |
| BENE11003 | 2 |
| BENE11004 | 12 |
| BENE11005 | 8 |
| BENE11006 | 1 |
| BENE11007 | 4 |
| BENE11008 | 1 |
| BENE11009 | 2 |
| BENE11132 | 16 |
| BENE11012 | 15 |
| BENE11016 | 15 |
| BENE11024 | 11 |
| BENE11045 | 11 |
LDA is used to extract a nominal subset of attributes that preserves the probability distribution of the data
classes, so that the separated classes remain close to the original class data distribution. A
new data table is constructed to hold the estimated ‘confidence score’ of the bills given by the provider. The
features chosen based on the LDA are attendance data, hospital code, diagnostic related group, claim bill,
and drug bill. The dataset is split into 70% for training and 30% for testing the ISVM. The approved
claims are then fed into the ISVM training classifier. The records that meet the LDA confidence score
criteria are classified first. Each instance of this dataset is labeled as “Fraud provider” or “Legal
Provider”. The proposed iterative conditions fed into the ISVM classifier to detect the fraud providers are:
i) the count of total BeneID is compared with the count of total ClaimID for each provider; if the count of BeneID is
greater than the count of ClaimID, the provider is labeled as a fraud provider; ii) claimStartDate and ClaimEndDate are
matched with PatientAdmissionDate and DischargeDate; and iii) the inpatient claim admit diagnosis code is
matched with the outpatient diagnosis code.
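Conditions (i) and (ii) can be expressed as simple checks over claim records. The sketch below is one possible reading, not the authors' implementation: the dictionary keys are hypothetical field names modeled on the identifiers in the text, counts are taken over distinct identifiers, and "matched" is interpreted as exact equality of dates. Condition (iii) would follow the same pattern as (ii) and is omitted.

```python
from datetime import date


def identity_flags(claims):
    """Condition (i): flag a provider whose distinct BeneID count exceeds
    its distinct ClaimID count."""
    beneficiaries, claim_ids = {}, {}
    for c in claims:
        beneficiaries.setdefault(c["ProviderID"], set()).add(c["BeneID"])
        claim_ids.setdefault(c["ProviderID"], set()).add(c["ClaimID"])
    return {p: len(beneficiaries[p]) > len(claim_ids[p]) for p in beneficiaries}


def dates_match(claim):
    """Condition (ii): claim period equals the admission/discharge period."""
    return (claim["ClaimStartDate"] == claim["PatientAdmissionDate"]
            and claim["ClaimEndDate"] == claim["DischargeDate"])


# Hypothetical records: PRV1 reuses one ClaimID for two beneficiaries.
claims = [
    {"ProviderID": "PRV1", "BeneID": "BENE1", "ClaimID": "CLM1",
     "ClaimStartDate": date(2009, 1, 5), "ClaimEndDate": date(2009, 1, 9),
     "PatientAdmissionDate": date(2009, 1, 5), "DischargeDate": date(2009, 1, 9)},
    {"ProviderID": "PRV1", "BeneID": "BENE2", "ClaimID": "CLM1",
     "ClaimStartDate": date(2009, 2, 1), "ClaimEndDate": date(2009, 2, 3),
     "PatientAdmissionDate": date(2009, 2, 1), "DischargeDate": date(2009, 2, 4)},
    {"ProviderID": "PRV2", "BeneID": "BENE3", "ClaimID": "CLM2",
     "ClaimStartDate": date(2009, 3, 1), "ClaimEndDate": date(2009, 3, 2),
     "PatientAdmissionDate": date(2009, 3, 1), "DischargeDate": date(2009, 3, 2)},
    {"ProviderID": "PRV2", "BeneID": "BENE3", "ClaimID": "CLM3",
     "ClaimStartDate": date(2009, 4, 1), "ClaimEndDate": date(2009, 4, 2),
     "PatientAdmissionDate": date(2009, 4, 1), "DischargeDate": date(2009, 4, 2)},
]
flags = identity_flags(claims)
date_checks = [dates_match(c) for c in claims]
```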
After each classification, the confusion matrix is displayed. The matrix is composed of the counts of
true legal, true fraud, false legal, and false fraud providers.
a) True legal provider (TLP): the count of ‘approved costs’ correctly classified as legal provider
by the ISVM classifier.
b) True fraud provider (TFP): the count of ‘disapproved costs’ correctly classified as fraud
provider by the ISVM classifier.
c) False legal provider (FLP): the count of ‘disapproved costs’ incorrectly classified as legal
provider by the ISVM classifier.
d) False fraud provider (FFP): the count of ‘approved costs’ incorrectly classified as fraud
provider by the ISVM classifier.
Figure 6 presents the proposed implementation framework. The following performance metrics are
employed to evaluate the MR-ISVM classifier.
a) Accuracy: the ratio of correctly classified samples to the total number of data samples.
The accuracy metric is most meaningful on balanced datasets and is expressed as,
$$Accuracy = \frac{TLP + TFP}{TLP + TFP + FLP + FFP} \quad (12)$$
b) Precision: the proportion of true legal providers among all samples classified as legal providers.
It indicates the confidence level of the detection and is expressed as,
$$Precision = \frac{TLP}{TLP + FLP} \quad (13)$$
c) Recall: the proportion of true legal providers among all samples that are actually legal providers. It indicates the efficiency of
the detection rate and is expressed as,
$$Recall = \frac{TLP}{TLP + FFP} \quad (14)$$
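The three metrics in (12) to (14) follow directly from the four confusion-matrix counts. The sketch below uses hypothetical counts chosen only for illustration; the function names and the convention of returning 0.0 for an empty denominator are assumptions.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(tlp, tfp, flp, ffp):
    """Compute accuracy (12), precision (13) and recall (14).

    tlp: true legal, tfp: true fraud, flp: false legal, ffp: false fraud.
    """
    accuracy = safe_ratio(tlp + tfp, tlp + tfp + flp + ffp)
    precision = safe_ratio(tlp, tlp + flp)
    recall = safe_ratio(tlp, tlp + ffp)
    return accuracy, precision, recall


# Hypothetical confusion-matrix counts for 100 test claims.
accuracy, precision, recall = evaluate(tlp=80, tfp=10, flp=5, ffp=5)
```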
Figure 6. Implementation framework
Table 3 and Figure 7 present the number of fraud records available in the testing datasets. The
accuracy of the MR-ISVM classifier is evaluated from its ability to classify and detect fraudulent
providers. The proposed MR-ISVM classifier is tested using 10-fold cross-validation over the hyperparameters (C,𝛽).
A random search is performed over the ISVM training parameters until the claims data samples are classified optimally.
The sample screenshots of the proposed framework are shown in Figure 8.
Table 3. Fraud provider types based on data volume
| Fraud provider types \ Sample data size | 1,000 | 2,000 | 3,000 | 4,000 | 5,000 |
|---|---|---|---|---|---|
| Identity-wise analysis | 6 | 8 | 45 | 34 | 22 |
| Date-wise analysis | 0 | 56 | 34 | 12 | 98 |
| Diagnosis code-wise analysis | 45 | 23 | 35 | 122 | 406 |
Figure 7. Number of fraud data
Figure 8. Sample screenshots
Table 4 and Figures 9 to 11 present the statistics of the SVM classifiers for each sample data size.
The confusion matrix, also known as the error matrix, helps to visualize the performance of the iterative
SVM classifier. As the sample data size increases under the given iterative conditions, the classification and
detection of fraud claims generally improve, although not monotonically (Table 4).
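Table 4. Summary statistics of SVM classifiers on sample sizes
| Kernels used | Data size | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| Linear | 1000 | 68.03 | 56.00 | 66.90 |
| Linear | 2000 | 73.03 | 79.80 | 0.001 |
| Linear | 3000 | 78.90 | 86.34 | 65.00 |
| Linear | 4000 | 83.45 | 78.09 | 56.12 |
| Linear | 5000 | 73.09 | 81.03 | 67.98 |
| Radial basis function | 1000 | 72.45 | 65.45 | 63.09 |
| Radial basis function | 2000 | 70.12 | 87.34 | 25.67 |
| Radial basis function | 3000 | 81.45 | 98.33 | 45.34 |
| Radial basis function | 4000 | 89.34 | 67.89 | 39.45 |
| Radial basis function | 5000 | 86.56 | 85.00 | 78.90 |
| Iterative loop | 1000 | 73.45 | 58.91 | 89.76 |
| Iterative loop | 2000 | 96.78 | 94.35 | 98.34 |
| Iterative loop | 3000 | 89.56 | 96.45 | 87.46 |
| Iterative loop | 4000 | 83.56 | 97.88 | 40.78 |
| Iterative loop | 5000 | 95.34 | 97.32 | 83.45 |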
Figure 9. Accuracy analysis
Figure 10. Precision analysis
Figure 11. Recall analysis
Table 5 presents the average performance of the different SVM classifiers. It is
observed that the MR-ISVM classifier performs best, with an accuracy of 87.73%, a precision of
88.98% and a recall of 79.95%. Compared to the radial basis function and linear kernels, the
MR-ISVM is better at classifying and detecting the fraud provider.
Figure 12 presents the analysis of the computational time of the MR-ISVM classifier against the linear
and radial basis function kernels. Along with the classification performance, the time required to process the sample datasets is
significant in this study. The analysis shows that the computational time increases
with the volume of the sample dataset.
Figure 13 presents the comparative analysis between the existing and proposed techniques. The
proposed MR-ISVM classifier takes less computational time than the linear and radial basis function kernels. The
variation in time is owing to training on the dataset organized by the MapReduce framework. Since
data volumes have been growing widely and rapidly in recent times, the growing demand for computational resources
calls for proper and accurate machine learning approaches.
Table 5. Average performance analysis of SVM classifiers
| Kernels used | Accuracy (%) | Precision (%) | Recall (%) |
|---|---|---|---|
| Linear | 75.3 | 76.25 | 51.2 |
| Radial basis function | 79.98 | 80.8 | 50.49 |
| Iterative loop | 87.73 | 88.98 | 79.95 |
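The averages in Table 5 can be checked against the per-size rows of Table 4 with a short script. The numbers below are copied from Table 4; the variable and function names are illustrative, and rounding the means to two decimals is an assumption about how the averages were reported.

```python
from statistics import mean

# (accuracy, precision, recall) per data size 1000..5000, from Table 4.
TABLE_4 = {
    "Linear": [(68.03, 56.00, 66.90), (73.03, 79.80, 0.001),
               (78.90, 86.34, 65.00), (83.45, 78.09, 56.12),
               (73.09, 81.03, 67.98)],
    "Radial basis function": [(72.45, 65.45, 63.09), (70.12, 87.34, 25.67),
                              (81.45, 98.33, 45.34), (89.34, 67.89, 39.45),
                              (86.56, 85.00, 78.90)],
    "Iterative loop": [(73.45, 58.91, 89.76), (96.78, 94.35, 98.34),
                       (89.56, 96.45, 87.46), (83.56, 97.88, 40.78),
                       (95.34, 97.32, 83.45)],
}


def kernel_averages(rows):
    """Column-wise mean of (accuracy, precision, recall), rounded to 2 places."""
    return tuple(round(mean(column), 2) for column in zip(*rows))


averages = {kernel: kernel_averages(rows) for kernel, rows in TABLE_4.items()}
```

The computed means agree with Table 5 to within 0.01 percentage points; the small residual differences are consistent with truncation rather than rounding in the reported values.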
Figure 12. Computational time analysis
Figure 13. Comparative techniques with proposed MR-ISVM
4. CONCLUSION
The healthcare industry generates a tremendous amount of data from heterogeneous data sources
like medical reports, hospital devices, and billing systems. The healthcare data transactions are too complex
and voluminous to be computed by conventional methods. Fraud detection is one of the major research areas
that need to be scaled up in real-time scenarios. It is a kind of risk management control activity. Class
imbalance and feature modeling are the major issues that degrade the performance of machine learning
approaches on healthcare data. This research work aims to introduce a novel fraud detection model by
hybridizing the qualities of big data and machine learning approaches. The collected insurance claims data is
preprocessed using the MapReduce framework that categorizes the voluminous claims data. The required
features related to the disease, treatment, and total amount are modeled using LDA. The ISVM approach is
explored for its strength in separating the claims data into legal and fraud providers. The soft
margin function enables the separation of the claims data, which is done through iterative conditions. Thus, the fraud
detection system combines the two approaches and achieves higher fraud detection accuracy.
The implementation analysis has demonstrated that the MR-ISVM classifier outperforms the other SVM kernel
classifiers in classification and detection. The results show a positive impact
in reducing the computational time for processing healthcare insurance claims without compromising the
classification accuracy. The proposed MR-ISVM classifier achieves 87.73% accuracy, compared with 75.3% for the linear
kernel and 79.98% for the radial basis function kernel.
REFERENCES
[1] L. Akoglu, H. Tong, and D. Koutra, “Graph based anomaly detection and description: a survey,” Data Mining and Knowledge
Discovery, vol. 29, no. 3, pp. 626–688, May 2015, doi: 10.1007/s10618-014-0365-y.
[2] A. A. Aburomman and M. bin Ibne Reaz, “A novel weighted support vector machines multiclass classifier based on differential
evolution for intrusion detection systems,” Information Sciences, vol. 414, pp. 225–246, Nov. 2017, doi:
10.1016/j.ins.2017.06.007.
[3] I. Sadgali, N. Sael, and F. Benabbou, “Human behavior scoring in credit card fraud detection,” IAES International Journal of
Artificial Intelligence (IJ-AI), vol. 10, no. 3, pp. 698–706, Sep. 2021, doi: 10.11591/ijai.v10.i3.pp698-706.
[4] P. P. Vishwakarma, A. K. Tripathy, and S. Vemuru, “An empiric path towards fraud detection and protection for NFC-enabled
mobile payment system,” TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 17, no. 5,
pp. 2313–2320, Oct. 2019, doi: 10.12928/telkomnika.v17i5.12290.
[5] R. A. I. Alhayali, M. Aljanabi, A. H. Ali, M. A. Mohammed, and T. Sutikno, “Optimized machine learning algorithm for
intrusion detection,” Indonesian Journal of Electrical Engineering and Computer Science (IJEECS), vol. 24, no. 1, pp. 590–599,
Oct. 2021, doi: 10.11591/ijeecs.v24.i1.pp590-599.
[6] E. Raguseo, “Big data technologies: An empirical investigation on their adoption, benefits and risks for companies,” International
Journal of Information Management, vol. 38, no. 1, pp. 187–195, Feb. 2018, doi: 10.1016/j.ijinfomgt.2017.07.008.
[7] S. M. Erfani, S. Rajasegarar, S. Karunasekera, and C. Leckie, “High-dimensional and large-scale anomaly detection using a linear
one-class SVM with deep learning,” Pattern Recognition, vol. 58, pp. 121–134, Oct. 2016, doi: 10.1016/j.patcog.2016.03.028.
[8] W.-C. Lin, S.-W. Ke, and C.-F. Tsai, “CANN: An intrusion detection system based on combining cluster centers and nearest
neighbors,” Knowledge-Based Systems, vol. 78, pp. 13–21, Apr. 2015, doi: 10.1016/j.knosys.2015.01.009.
[9] A. Fernández, C. J. Carmona, M. J. del Jesus, and F. Herrera, “A view on fuzzy systems for big data: Progress and opportunities,”
Int. Journal of Computational Intelligence Systems, vol. 9, no. 1, pp. 69–80, 2016, doi: 10.1080/18756891.2016.1180820.
[10] H. Hassani and E. S. Silva, “Forecasting with big data: A review,” Annals of Data Science, vol. 2, no. 1, pp. 5–19, Mar. 2015, doi:
10.1007/s40745-015-0029-9.
[11] A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,”
IEEE Communications Surveys & Tutorials, vol. 18, no. 2, pp. 1153–1176, 2016, doi: 10.1109/COMST.2015.2494502.
[12] I. A. T. Hashem et al., “The role of big data in smart city,” International Journal of Information Management, vol. 36, no. 5,
pp. 748–758, Oct. 2016, doi: 10.1016/j.ijinfomgt.2016.05.002.
[13] H. Joudaki et al., “Using data mining to detect health care fraud and abuse: A review of literature,” Global Journal of Health
Science, vol. 7, no. 1, pp. 194–202, Aug. 2014, doi: 10.5539/gjhs.v7n1p194.
[14] I. Sadgali, N. Sael, and F. Benabbou, “Performance of machine learning techniques in the detection of financial frauds,” Procedia
Computer Science, vol. 148, pp. 45–54, 2019, doi: 10.1016/j.procs.2019.01.007.
[15] E. M. Hussein Saeed, H. A. Saleh, and E. A. Khalel, “Classification of mammograms based on features extraction techniques
using support vector machine,” Computer Science and Information Technologies, vol. 2, no. 3, pp. 121–131, Nov. 2020, doi:
10.11591/csit.v2i3.p121-131.
[16] M. D. Salawu et al., “A chi-square-SVM based pedagogical rule extraction method for microarray data analysis,” International
Journal of Advances in Applied Sciences, vol. 9, no. 2, pp. 93–100, Jun. 2020, doi: 10.11591/ijaas.v9.i2.pp93-100.
[17] M. Moukhafi, K. El Yassini, and B. Seddik, “Intrusions detection using optimized support vector machine,” International Journal
of Advances in Applied Sciences, vol. 9, no. 1, pp. 62–66, Mar. 2020, doi: 10.11591/ijaas.v9.i1.pp62-66.
[18] J. C. Cassimiro, A. M. Santana, P. S. Neto, and R. L. Rabelo, “Investigating the effects of class imbalance in learning the claim
authorization process in the Brazilian health care market,” in 2017 International Joint Conference on Neural Networks (IJCNN),
May 2017, pp. 3265–3272, doi: 10.1109/IJCNN.2017.7966265.
[19] P. Dora and G. H. Sekharan, “Healthcare insurance fraud detection leveraging big data analytics,” G. Hari Sekharan, vol. 4,
no. 4, pp. 2073–2076, 2015.
[20] N. Che and W. Janusz, “Unsupervised labeling of data for supervised learning and its application to medical claims prediction,”
Computer Science, vol. 14, no. 3, pp. 191–214, 2013, doi: 10.7494/csci.2013.14.2.191.
[21] G. van Capelleveen, M. Poel, R. M. Mueller, D. Thornton, and J. van Hillegersberg, “Outlier detection in healthcare fraud: A case
study in the Medicaid dental domain,” International Journal of Accounting Information Systems, vol. 21, pp. 18–31, Jun. 2016,
doi: 10.1016/j.accinf.2016.04.001.
[22] Y. Gao, C. Sun, R. Li, Q. Li, L. Cui, and B. Gong, “An efficient fraud identification method combining manifold learning and
outliers detection in mobile healthcare services,” IEEE Access, vol. 6, pp. 60059–60068, 2018, doi:
10.1109/ACCESS.2018.2875516.
[23] L. F. M. Carvalho, C. H. C. Teixeira, W. Meira, M. Ester, O. Carvalho, and M. H. Brandao, “Provider-consumer anomaly
detection for healthcare systems,” in 2017 IEEE International Conference on Healthcare Informatics (ICHI), Aug. 2017,
pp. 229–238, doi: 10.1109/ICHI.2017.75.
[24] D. Thornton, R. M. Mueller, P. Schoutsen, and J. van Hillegersberg, “Predicting healthcare fraud in medicaid: A
multidimensional data model and analysis techniques for fraud detection,” Procedia Technology, vol. 9, pp. 1252–1264, 2013,
doi: 10.1016/j.protcy.2013.12.140.
[25] R. A. Bauder, T. M. Khoshgoftaar, A. Richter, and M. Herland, “Predicting medical provider specialties to detect anomalous
insurance claims,” in 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI), Nov. 2016,
pp. 784–790, doi: 10.1109/ICTAI.2016.0123.
[26] S. Kareem, R. Binti Ahmad, and A. B. Sarlan, “Framework for the identification of fraudulent health insurance claims using
association rule mining,” in 2017 IEEE Conference on Big Data and Analytics (ICBDA), Nov. 2017, pp. 99–104, doi:
10.1109/ICBDAA.2017.8284114.
[27] I. Kose, M. Gokturk, and K. Kilic, “An interactive machine-learning-based electronic fraud and abuse detection system in
healthcare insurance,” Applied Soft Computing, vol. 36, pp. 283–299, Nov. 2015, doi: 10.1016/j.asoc.2015.07.018.
[28] S. Sadiq, Y. Tao, Y. Yan, and M.-L. Shyu, “Mining anomalies in Medicare big data using patient rule induction method,” in 2017
IEEE Third International Conference on Multimedia Big Data (BigMM), Apr. 2017, pp. 185–192, doi: 10.1109/BigMM.2017.56.
[29] R. Roy and K. T. George, “Detecting insurance claims fraud using machine learning techniques,” in 2017 International
Conference on Circuit, Power and Computing Technologies (ICCPCT), Apr. 2017, pp. 1–6, doi: 10.1109/ICCPCT.2017.8074258.
[30] C. Sun, Q. Li, H. Li, Y. Shi, S. Zhang, and W. Guo, “Patient cluster divergence healthcare insurance fraudster detection,” IEEE
Access, vol. 7, pp. 14162–14170, 2019, doi: 10.1109/ACCESS.2018.2886680.
[31] R. A. Gupta, “Medical provider fraud detection dataset,” Kaggle, 2019. Accessed Oct 15, 2021. [Online]. Available:
https://www.kaggle.com/rohitrox/medical-provider-fraud-detection.
BIOGRAPHIES OF AUTHORS
Jenita Mary Arockiam Currently pursuing research as a Research Scholar in the
Department of Computer Science, College of Science and Humanities, SRM Institute of Science
and Technology. Received bachelor’s degree from the Computer Science Department, J. A.
College affiliated to MK University, Madurai in 2000. Master of Computer Application degree
from MGR arts-science College, Periyar University, Salem in 2003. M.Phil. degree from Periyar
University, Salem in 2008. Having Twelve years of experience in teaching field and very much
interested in big data analytics especially machine learning. She can be contacted at email:
ja3368@srmist.edu.in.
Angelin Claret Seraphim Pushpanathan Currently working as an Assistant
Professor in the Department of Computer Science at SRM Institute of Science and Technology,
Chennai. Received doctorate degree from Bharathiar University, Coimbatore. Having 15 years of
teaching experience and published many articles related to component-based power distribution
system under various SCI and Scopus indexed journal. Area of research focuses on component-
based technology, power distribution system, internet of things, machine learning, artificial
intelligence with healthcare. She can be contacted at email: angelins@srmist.edu.in.