Clustering is a data mining technique that groups very large amounts of data through unsupervised learning in order to gain new knowledge. It can be applied to many types of data and many fields. One sector that needs this technique is business, especially banking, where fraud is often encountered in transaction processes. This motivates clustering fraud data in transactions. The clustering algorithm used here is Louvain’s algorithm, which can cluster large amounts of data represented as a graph. The Louvain algorithm is optimized with colored graphs to make labeling easier in follow-up research. In this study, 33,491 non-fraud transactions and 241 fraud transactions were grouped. The results indicate that clustering with Louvain’s algorithm identifies fraud data with 90% accuracy.
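As a rough illustration of what Louvain's algorithm optimizes, the sketch below computes modularity for a partition of a small undirected graph, runs only the first (local moving) phase of Louvain, and then assigns one color label per community. The toy graph, the function names, and the palette are assumptions made for illustration, not the study's data or implementation; the full algorithm also merges communities into super-nodes and repeats, and the abstract does not specify how coloring is combined with Louvain.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    inside = defaultdict(int)   # edges with both endpoints in the same community
    degree = defaultdict(int)   # sum of node degrees per community
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            inside[community[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

def local_moving(edges, nodes):
    """First Louvain phase: greedily move each node to its best neighbouring community."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    community = {n: n for n in nodes}          # every node starts in its own community
    improved = True
    while improved:
        improved = False
        for n in nodes:
            best_c, best_q = community[n], modularity(edges, community)
            for c in sorted({community[x] for x in neighbours[n]}):
                trial = dict(community)
                trial[n] = c
                q = modularity(edges, trial)   # brute-force recomputation, fine for a toy graph
                if q > best_q + 1e-12:         # accept strict improvements only
                    best_c, best_q = c, q
            if best_c != community[n]:
                community[n] = best_c
                improved = True
    return community

# Hypothetical graph: two triangles joined by a single edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = local_moving(edges, nodes=range(6))

# One color per community, used here as a simple stand-in for labeling
palette = ["red", "blue", "green"]
color_of = {c: palette[i] for i, c in enumerate(sorted(set(communities.values())))}
labels = {n: color_of[c] for n, c in communities.items()}
```

On this toy graph the two triangles end up in separate communities, which is the partition with the highest modularity among those the local moves can reach.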
Detecting Suspicious Accounts in Money Laundering - IRJET Journal
This document summarizes a research paper that proposes a new method for detecting suspicious accounts involved in money laundering. The method uses hash-based association mining to generate frequent transaction datasets from a large database of transactions. A graph theoretical approach is then used to identify the traversal path of transactions between accounts, allowing agents and beneficiaries in the money laundering process to be identified. The method was tested on artificial transaction datasets of up to 17,000 transactions and was able to successfully find paths between frequent accounts. An enhancement was also proposed that compares transaction amounts to previous years to identify significant increases that may indicate suspicious activity.
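The idea of tracing a traversal path between accounts can be sketched with a breadth-first search over a directed transfer graph. The summary does not say which traversal the paper uses, so breadth-first search is an assumption here, and the account names and the `find_path` helper are hypothetical.

```python
from collections import deque

def find_path(transfers, source, target):
    """Return one shortest chain of accounts from source to target, or None."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, []).append(receiver)
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Hypothetical (sender, receiver) pairs between frequent accounts
transfers = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]
path = find_path(transfers, "A", "E")
intermediaries = path[1:-1]   # accounts the money passes through on the way
```

In this sketch the accounts between the first and last entry of the path play the role of intermediaries, and the last entry is the beneficiary.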
Analysis on Fraud Detection Mechanisms Using Machine Learning Techniques - IRJET Journal
1) The document discusses using machine learning techniques like Random Forest Classifier and AdaBoost to detect fraud in blockchain transactions through an ensemble model.
2) It analyzes the individual accuracy of Random Forest and AdaBoost classifiers, finding accuracies over 99.99%, then ensembles their predictions using a stacking method.
3) The stacking ensemble model combines the predictions of the Random Forest and AdaBoost models into a new training set to potentially provide even more accurate fraud detection compared to the individual models.
This document summarizes a research paper that aims to predict taxes using machine learning. It discusses using feature extraction and random forest algorithms on tax data to classify taxpayers and detect potential fraud. The methodology extracts numerical features from text data, uses random forest classification, and decision trees to predict future tax amounts owed. It finds this machine learning approach outperforms traditional statistical methods for tax prediction and fraud detection, helping improve tax compliance and design.
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da... - theijes
Data mining works to extract information known in advance from the enormous quantities of data which can lead to knowledge. It provides information that helps to make good decisions. The effectiveness of data mining lies in access to knowledge, with the goal of discovering the hidden facts contained in databases through the use of multiple technologies. Clustering is organizing data into clusters or groups such that they have high intra-cluster similarity and low inter-cluster similarity. This paper deals with the K-means clustering algorithm, which collects data based on their characteristics and attributes and performs the clustering by reducing the distances between the data and the cluster centers. The algorithm is applied using the open source tool WEKA, with the Insurance dataset as its input.
IRJET- Online Crime Reporting and Management System using Data Mining - IRJET Journal
This document describes an online crime reporting and management system that uses data mining. The system allows users and police to file crime reports, complaints, and missing person reports online. It notifies users if they enter a high crime alert area and provides crime rates by area. The system registers complaints through a web app where users can upload images/videos. It uses KNN, AES encryption, and K-means algorithms. The goal is to help law enforcement track and solve crimes more efficiently.
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM... - IJNSA Journal
Building practical and efficient intrusion detection systems in computer networks is important in industrial areas today, and machine learning provides a set of effective algorithms to detect network intrusion. To find appropriate algorithms for building such systems, it is necessary to evaluate various types of machine learning algorithms based on specific criteria. In this paper, we propose a novel evaluation formula which incorporates 6 indexes into our comprehensive measurement, including precision, recall, root mean square error, training time, sample complexity and practicability, in order to find algorithms which have a high detection rate, low training time, need fewer training samples and are easy to use, for example in constructing, understanding and analyzing models. A detailed evaluation process is designed to get all necessary assessment indicators, and 6 kinds of machine learning algorithms are evaluated. Experimental results illustrate that Logistic Regression shows the best overall performance.
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM... - IJNSA Journal
This document proposes a novel evaluation approach to find lightweight machine learning algorithms for intrusion detection. It incorporates 6 evaluation indexes: precision, recall, root mean square error, training time, sample complexity, and practicability. The evaluation formula calculates a score for each algorithm based on F1 score and penalty values. The document defines penalty values for the practicability of 6 machine learning algorithms (decision tree, naive bayes, multilayer perceptron, radial basis function network, logistic regression, support vector machine). Experimental results on intrusion detection datasets will evaluate the algorithms based on the proposed approach.
An Identification and Detection of Fraudulence in Credit Card Fraud Transacti... - IRJET Journal
This document summarizes a research paper that proposes a system for detecting fraudulent credit card transactions using data mining techniques. The system uses the Apriori algorithm to perform frequent item set mining on a credit card transaction dataset. It then uses the Support Vector Machine (SVM) classification method to match new transactions to either a legal transaction pattern database or a fraudulent transaction pattern database that was formed based on users' previous transactions. The results showed this proposed method achieved better fraud detection with a lower false alarm rate than existing methods like Hidden Markov Models.
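The frequent item set mining step can be illustrated with a compact Apriori sketch. The attribute tokens, the support threshold, and the `apriori` helper below are hypothetical and only show the join, prune, and count cycle; the SVM matching step of the summarized system is not covered.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while candidates:
        # Count step: how many transactions contain each candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

# Hypothetical transactions described by categorical attributes
transactions = [
    {"online", "night", "high_amount"},
    {"online", "night"},
    {"online", "high_amount"},
    {"in_store", "day"},
    {"online", "night", "high_amount"},
]
frequent = apriori(transactions, min_support=3)
```

With a minimum support of 3, the pair of "night" and "high_amount" occurs only twice, so the three-item set containing it is pruned before it is ever counted.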
Predicting reaction based on customer's transaction using machine learning a... - IJECEIAES
Banking advertisements are important because they help target specific customers to subscribe to packages or other deals, for example by giving current customers more fixed-term deposit offers. This is done through promotional advertisements on the Internet or media pages, and the task is the responsibility of the shopping department. Many banks and telecommunications firms store their customers' data in order to build a relationship with them, offer them the best deals, and make sure an offer is appropriate for the client while assuring the company that it can recover these deposits. The Portuguese bank increases its sales by establishing a relationship with its customers. This study proposes a prediction model, built with machine learning algorithms, of how a customer will react to fixed-term deposit offers based on their past record. The classification is binary, i.e., it predicts whether or not a customer will accept these offers. Four classifiers were used: the k-nearest neighbor (k-NN) algorithm, decision tree, naive Bayes, and support vector machines (SVM). The best result was obtained from the decision tree classifier with an accuracy of 91%, and the SVM classifier reached an accuracy of 89%.
Fraudulent Activities Detection in E-commerce Websites - IRJET Journal
This document discusses methods for detecting fraudulent activities on e-commerce websites using machine learning algorithms. It begins with an introduction to e-commerce and the growing problem of fraud in online transactions. Then, it reviews related work applying techniques like decision trees, random forests, neural networks and deep learning for fraud detection. The document describes the methodology used, including preprocessing data, training and testing machine learning models, and evaluating performance. Specifically, it outlines approaches like k-nearest neighbors, decision trees, random forests and extreme gradient boosting for classification. Finally, it provides details on the dataset and features used for detecting fraudulent transactions based on user information, purchase details, devices and IP addresses.
Intrusion Detection System using K-Means Clustering and SMOTE - IRJET Journal
The document proposes a hybrid sampling technique combining K-Means clustering and SMOTE oversampling to address data imbalance in network intrusion detection systems. It first applies K-Means clustering to handle outliers in the data, and then uses SMOTE to generate additional samples for the minority (intrusion) class. This produces a balanced dataset for training classification models like Random Forest and CNN. The technique is evaluated on the NSL-KDD dataset and achieves accuracy above 94% with these models, outperforming an alternative approach using DBSCAN and SMOTE.
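The oversampling half of this technique can be sketched as follows. This is a minimal, standard-library version of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); the sample points, the parameter names, and the `smote` helper are assumptions, and the K-Means outlier handling that the summary places before this step is not shown.

```python
import math
import random

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (the base itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        nearest = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(nearest)
        gap = rng.random()   # position along the segment base -> neighbour, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic

# Hypothetical minority-class (intrusion) samples with two features
minority = [(1.0, 1.0), (1.5, 1.2), (1.2, 1.8), (2.0, 2.0)]
new_points = smote(minority, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.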
IRJET- Survey on Credit Card Security System for Bank Transaction using N... - IRJET Journal
This document summarizes a research paper that proposes a credit card fraud detection system using machine learning algorithms like Naive Bayes and Random Forest. It begins with an abstract describing the use of these classification algorithms on credit card transaction data to model past transactions and identify fraudulent ones. Then, it provides background on the increasing problem of credit card fraud and motivates the need for this research. It describes the Naive Bayes and Random Forest algorithms and evaluates their performance on this fraud detection task. Finally, it reviews related work applying machine learning to credit card fraud detection and discusses the results.
Online stream mining approach for clustering network traffic - eSAT Journals
Abstract: A large body of research has been proposed on intrusion detection systems, which has led to the implementation of agent-based intelligent IDS (IIDS), non-intelligent IDS (NIDS), signature-based IDS, etc. While building such IDS models, learning algorithms that work on the flow of network traffic play a crucial role in the accuracy of IDS systems. The proposed work focuses on implementing a novel method to cluster network traffic which eliminates the limitations of existing online clustering algorithms, and on proving its robustness and accuracy over a large stream of network traffic arriving at an extremely high rate. We compare the existing algorithm with the novel methods to analyse accuracy and complexity. Keywords: NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
A NOVEL CHARGING AND ACCOUNTING SCHEME IN MOBILE AD-HOC NETWORKS - IJNSA Journal
Because of the lack of infrastructure in mobile ad hoc networks (MANETs), their proper functioning must rely on co-operations among mobile nodes. However, mobile nodes tend to save their own resources and may be reluctant to forward packets for other nodes. One approach to encourage co-operations among nodes is to reward nodes that forward data for others. Such an incentive-based scheme requires a charging and accounting framework to control and manage rewards and fines (collected from users committing infractions). In this paper, we propose a novel charging and accounting scheme for MANETs. We present a detailed description of the proposed scheme and demonstrate its effectiveness via formal proofs and simulation results [15]. We develop a theoretical game model that offers advice to network administrators about the allocation of resources for monitoring mobile nodes. The solution provides the optimal monitoring probability, which discourages nodes from cheating because the gain would be compensated by the penalty.
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact... - IRJET Journal
This document presents a comparative study of machine learning approaches for credit card fraud detection. It explores isolation forest, local outlier factor (LOF), and one-class support vector machine (SVM) algorithms for fraud detection and compares their performance. The document first collects a dataset of credit card transactions containing both legitimate and fraudulent examples. It then implements and evaluates the machine learning models, assessing their ability to accurately identify fraud while minimizing false positives. Statistical tests and visualization techniques like ROC curves are used to analyze the models' performance. The best-performing aspects of each model are identified to inform optimal fraud detection.
Association rule visualization technique - mustafasmart
This document describes a project submitted for a degree in computer science. It discusses studying techniques for visualizing association rules discovered from databases by developed algorithms. The project aims to identify the strengths and weaknesses of these visualization techniques to determine the most appropriate for solving a main drawback of association rules, which is the huge number of extracted rules that cannot be manually inspected. The document provides background on data mining, association rules, and functional dependencies. It then outlines chapters that will explain the knowledge discovery process, association rule mining, and visualization techniques used for association rule visualization.
A Comparative Study for Credit Card Fraud Detection System using Machine Lear... - IRJET Journal
This document presents a comparative study of machine learning models for credit card fraud detection. It discusses various machine learning and deep learning techniques used for credit card fraud detection systems, including neural networks, decision trees, logistic regression, random forests, convolutional neural networks, and more. It reviews related literature on using meta learning and neural networks for fraud detection. The paper aims to compare the performance of these different models for credit card fraud detection using datasets from banks containing labeled fraudulent and non-fraudulent transactions.
Analysis on different Data mining Techniques and algorithms used in IOT - IJERA Editor
In this paper, we discuss five data mining functionalities in IoT that affect performance: data anomaly detection, data clustering, data classification, feature selection, and time series prediction. Some important algorithms for each functionality are also reviewed, showing their advantages and limitations, along with some new algorithms that are still a research direction. In this way we present a knowledge view of data mining in IoT.
Unsupervised Learning for Credit Card Fraud Detection - IRJET Journal
This document discusses using unsupervised learning techniques for credit card fraud detection. It proposes using clustering algorithms like self-organizing maps (SOM) to group transactions and detect anomalies. After clustering the data, association rule mining would be applied to each cluster to identify relationships between transaction attributes that indicate potential fraud. Peer group analysis is also mentioned as an unsupervised technique that monitors behavior over time to identify transactions deviating from a cardholder's normal spending patterns. The document concludes that this approach enables automated creation and improvement of fraud detection rules as transaction data evolves over time.
A Soft Set-based Co-occurrence for Clustering Web User Transactions - TELKOMNIKA JOURNAL
This document proposes a soft set-based approach for clustering web user transactions to achieve lower computational complexity and higher clustering purity compared to previous rough set approaches. Unlike rough set approaches that use similarity, the proposed approach uses a co-occurrence approach based on soft set theory. The soft set representation of web user transactions allows modeling as a binary-valued information system. The approach is evaluated in comparison to two previous rough set-based approaches, demonstrating better performance with over 100% lower computational complexity and higher cluster purity.
IRJET - Reporting and Management System for Online Crime - IRJET Journal
This document discusses the development of an online crime reporting and management system. The system allows users to view crime rates in different areas, receive alerts if they enter a high crime area, and file complaints online by uploading images and videos. It also allows police to track the status of complaints. The system uses algorithms like KNN, AES encryption, and K-means clustering. It discusses the benefits of online crime reporting compared to traditional methods. Design details are provided like the use case diagram, actors, and encryption process steps. Data mining techniques are also mentioned for analyzing crime patterns.
Extract Business Process Performance using Data Mining - IJERA Editor
This paper aims to analyze the performance of a business process using process mining. Performance is very important, especially in large systems. The process of repairing devices was used as a case study, and the Fuzzy Miner algorithm was used to analyze the performance of the process model.
This document analyzes a dataset of over 280,000 credit card transactions to detect fraudulent transactions using machine learning techniques. It first discusses challenges in credit card fraud detection like imbalanced data and lack of standard evaluation metrics. It then evaluates techniques like support vector machines, random forests, and local outlier factors. Analysis of the dataset found the data is highly skewed with few fraud cases. While models could achieve high accuracy by predicting all transactions as valid, other metrics are needed. The document concludes by implementing a local outlier factor model to detect patterns in fraudulent transactions, though accuracy in detecting fraud was low.
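The point about high accuracy on skewed data can be made concrete with a small worked example. The counts below are hypothetical (they are not taken from the summarized dataset) and the `metrics` helper is an assumption; it simply derives accuracy, precision, and recall from confusion-matrix counts.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical data: 10,000 transactions, 20 of them fraudulent.
# A model that labels everything "valid" has no true or false positives.
always_valid = metrics(tp=0, fp=0, tn=9980, fn=20)

# A model that catches 15 of the 20 frauds at the cost of 30 false alarms.
detector = metrics(tp=15, fp=30, tn=9950, fn=5)
```

Here the always-valid model scores 99.8% accuracy while catching no fraud at all, whereas the detector has slightly lower accuracy (99.65%) but a recall of 75%, which is why accuracy alone is a poor metric on imbalanced data.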
Machine Learning, K-means Algorithm Implementation with R - IRJET Journal
This document discusses the implementation of the K-means clustering algorithm using R programming. It begins with an introduction to machine learning and the different types of machine learning algorithms. It then focuses on the K-means algorithm, describing the steps of the algorithm and how it is used for cluster analysis in unsupervised learning. The document then demonstrates implementing K-means clustering in R by generating sample data, initializing random centroids, calculating distances between data points and centroids, assigning data points to clusters based on closest centroid, recalculating centroids, and plotting the results. It concludes that K-means clustering is useful for gaining insights into dataset structure and was successfully implemented in R.
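The steps listed above (initialize random centroids, compute distances, assign points to the closest centroid, recalculate centroids) can be sketched as below. The summarized document works in R; this sketch uses Python with only the standard library, a fixed set of hypothetical sample points instead of generated data, and an assumed `kmeans` helper, and it omits the plotting step.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initialise with k random data points
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its cluster
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:         # converged: assignments no longer change
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical sample data: two well-separated groups of 2-D points
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
centroids, clusters = kmeans(points, k=2)
```

On data this well separated, the two groups are recovered whichever points are drawn as initial centroids; on real data the result can depend on the initialization.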
Mining Stream Data using k-Means clustering Algorithm - Manishankar Medi
This document discusses using k-means clustering to analyze urban road traffic stream data. Stream data arrives continuously over time and is challenging to process due to its high volume, velocity and volatility. The document proposes using a sliding window technique with k-means clustering to analyze recent urban traffic data and visualize clusters in real-time to provide insights into traffic patterns and congested roads. This analysis could help travelers and authorities respond to traffic issues more quickly.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... - IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, emphasizing addressing false positives and resource efficiency.
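To show how global accuracy and mean IoU can differ on the same prediction, the sketch below computes both for a tiny pair of label masks. The masks, class ids, and function names are hypothetical and unrelated to the study's data; the definitions are the usual per-class intersection over union and the fraction of correctly labeled pixels.

```python
def iou(pred, truth, cls):
    """Intersection over union for one class on flat label sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
    return intersection / union if union else float("nan")

def mean_iou(pred, truth, classes):
    """Unweighted average of the per-class IoU values."""
    scores = [iou(pred, truth, c) for c in classes]
    return sum(scores) / len(scores)

def global_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(truth)

# Hypothetical 2x4 masks flattened row by row: 0 = background, 1 = tumor
truth = [0, 0, 1, 1, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 0, 0, 1, 1]

accuracy = global_accuracy(pred, truth)       # 7 of 8 pixels correct
tumor_iou = iou(pred, truth, cls=1)           # 4 shared pixels out of 5 in the union
score = mean_iou(pred, truth, classes=[0, 1])
```

A single mislabeled pixel costs 12.5 points of accuracy here but pulls both class IoU values down, which is one reason the two metrics can sit far apart when one class dominates the image.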
Embedded machine learning-based road conditions and driving behavior monitoring - IJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
More Related Content
Similar to The role of Louvain-coloring clustering in the detection of fraud transactions
Predicting reaction based on customer's transaction using machine learning a... | IJECEIAES
Banking advertisements are important because they help target specific customers to subscribe to packages or other deals, for example by giving current customers more fixed-term deposit offers. This is done through promotional advertisements on the Internet or media pages, and the task is the responsibility of the shopping department. Many banks and telecommunications firms store their customers' data in order to build a relationship with them, offer them the best deals, and match offers to clients while giving the company assurance of recovering these deposits. The Portuguese bank increases its sales by establishing a relationship with its customers. This study proposes a prediction model, built with machine learning algorithms, of how a customer will react to fixed-term deposits or other offers, based on their past record. The classification is binary, i.e., it predicts whether or not a customer will accept these offers. Four classifiers were used: the k-nearest neighbor (k-NN) algorithm, decision tree, naive Bayes, and support vector machines (SVM). The best result was obtained from the decision tree classifier with an accuracy of 91%, followed by the SVM classifier with an accuracy of 89%.
Fraudulent Activities Detection in E-commerce Websites | IRJET Journal
This document discusses methods for detecting fraudulent activities on e-commerce websites using machine learning algorithms. It begins with an introduction to e-commerce and the growing problem of fraud in online transactions. Then, it reviews related work applying techniques like decision trees, random forests, neural networks and deep learning for fraud detection. The document describes the methodology used, including preprocessing data, training and testing machine learning models, and evaluating performance. Specifically, it outlines approaches like k-nearest neighbors, decision trees, random forests and extreme gradient boosting for classification. Finally, it provides details on the dataset and features used for detecting fraudulent transactions based on user information, purchase details, devices and IP addresses.
Intrusion Detection System using K-Means Clustering and SMOTE | IRJET Journal
The document proposes a hybrid sampling technique combining K-Means clustering and SMOTE oversampling to address data imbalance in network intrusion detection systems. It first applies K-Means clustering to handle outliers in the data, and then uses SMOTE to generate additional samples for the minority (intrusion) class. This produces a balanced dataset for training classification models like Random Forest and CNN. The technique is evaluated on the NSL-KDD dataset and achieves accuracy above 94% with these models, outperforming an alternative approach using DBSCAN and SMOTE.
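The oversampling step mentioned above can be illustrated with a minimal sketch. It shows only the interpolation idea usually associated with SMOTE: a synthetic point is placed on the segment between a minority sample and one of its nearest minority neighbors. The function name `smote_samples`, its parameters, and the toy data are assumptions made for illustration, not the paper's implementation, and the K-Means outlier-handling step is omitted.

```python
import random

def smote_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper: needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Indices of the k nearest minority neighbors, excluding the base point.
        neighbor_ids = sorted(
            (j for j in range(len(minority)) if j != base_idx),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(minority[j], base)),
        )[:k]
        neighbor = minority[rng.choice(neighbor_ids)]
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Toy minority class (illustrative values only).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_samples(minority, n_new=5)
```

Because every synthetic point lies between two existing minority points, the new samples stay inside the region the minority class already occupies.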
IRJET- Survey on Credit Card Security System for Bank Transaction using N... | IRJET Journal
This document summarizes a research paper that proposes a credit card fraud detection system using machine learning algorithms like Naive Bayes and Random Forest. It begins with an abstract describing the use of these classification algorithms on credit card transaction data to model past transactions and identify fraudulent ones. Then, it provides background on the increasing problem of credit card fraud and motivates the need for this research. It describes the Naive Bayes and Random Forest algorithms and evaluates their performance on this fraud detection task. Finally, it reviews related work applying machine learning to credit card fraud detection and discusses the results.
Online stream mining approach for clustering network traffic | eSAT Journals
Abstract: A large body of research has been proposed on intrusion detection systems, leading to the implementation of agent-based intelligent IDS (IIDS), non-intelligent IDS (NIDS), signature-based IDS, etc. When building such IDS models, algorithms that learn from the flow of network traffic play a crucial role in the accuracy of the IDS. The proposed work focuses on implementing a novel method to cluster network traffic that eliminates the limitations of existing online clustering algorithms and proves its robustness and accuracy over a large stream of network traffic arriving at an extremely high rate. We compare the existing algorithm with the novel methods to analyse accuracy and complexity. Keywords: NIDS, Data Stream Mining, Online Clustering, RAH algorithm, Online Efficient Incremental Clustering algorithm
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
A NOVEL CHARGING AND ACCOUNTING SCHEME IN MOBILE AD-HOC NETWORKS | IJNSA Journal
Because of the lack of infrastructure in mobile ad hoc networks (MANETs), their proper functioning must rely on co-operations among mobile nodes. However, mobile nodes tend to save their own resources and may be reluctant to forward packets for other nodes. One approach to encourage co-operations among nodes is to reward nodes that forward data for others. Such an incentive-based scheme requires a charging and accounting framework to control and manage rewards and fines (collected from users committing infractions). In this paper, we propose a novel charging and accounting scheme for MANETs. We present a detailed description of the proposed scheme and demonstrate its effectiveness via formal proofs and simulation results [15]. We develop a theoretical game model that offers advice to network administrators about the allocation of resources for monitoring mobile nodes. The solution provides the optimal monitoring probability, which discourages nodes from cheating because the gain would be compensated by the penalty.
Machine Learning-Based Approaches for Fraud Detection in Credit Card Transact... | IRJET Journal
This document presents a comparative study of machine learning approaches for credit card fraud detection. It explores isolation forest, local outlier factor (LOF), and one-class support vector machine (SVM) algorithms for fraud detection and compares their performance. The document first collects a dataset of credit card transactions containing both legitimate and fraudulent examples. It then implements and evaluates the machine learning models, assessing their ability to accurately identify fraud while minimizing false positives. Statistical tests and visualization techniques like ROC curves are used to analyze the models' performance. The best-performing aspects of each model are identified to inform optimal fraud detection.
Association rule visualization technique | mustafasmart
This document describes a project submitted for a degree in computer science. It discusses studying techniques for visualizing association rules discovered from databases by developed algorithms. The project aims to identify the strengths and weaknesses of these visualization techniques to determine the most appropriate for solving a main drawback of association rules, which is the huge number of extracted rules that cannot be manually inspected. The document provides background on data mining, association rules, and functional dependencies. It then outlines chapters that will explain the knowledge discovery process, association rule mining, and visualization techniques used for association rule visualization.
A Comparative Study for Credit Card Fraud Detection System using Machine Lear... | IRJET Journal
This document presents a comparative study of machine learning models for credit card fraud detection. It discusses various machine learning and deep learning techniques used for credit card fraud detection systems, including neural networks, decision trees, logistic regression, random forests, convolutional neural networks, and more. It reviews related literature on using meta learning and neural networks for fraud detection. The paper aims to compare the performance of these different models for credit card fraud detection using datasets from banks containing labeled fraudulent and non-fraudulent transactions.
Analysis on different Data mining Techniques and algorithms used in IOT | IJERA Editor
In this paper, we discuss five functionalities of data mining in IoT that affect performance: data anomaly detection, data clustering, data classification, feature selection, and time series prediction. Some important algorithms for each functionality are also reviewed, showing their advantages and limitations, as well as some new algorithms that are in the research direction. We present a knowledge view of data mining in IoT.
Unsupervised Learning for Credit Card Fraud Detection | IRJET Journal
This document discusses using unsupervised learning techniques for credit card fraud detection. It proposes using clustering algorithms like self-organizing maps (SOM) to group transactions and detect anomalies. After clustering the data, association rule mining would be applied to each cluster to identify relationships between transaction attributes that indicate potential fraud. Peer group analysis is also mentioned as an unsupervised technique that monitors behavior over time to identify transactions deviating from a cardholder's normal spending patterns. The document concludes that this approach enables automated creation and improvement of fraud detection rules as transaction data evolves over time.
A Soft Set-based Co-occurrence for Clustering Web User Transactions | TELKOMNIKA JOURNAL
This document proposes a soft set-based approach for clustering web user transactions to achieve lower computational complexity and higher clustering purity compared to previous rough set approaches. Unlike rough set approaches that use similarity, the proposed approach uses a co-occurrence approach based on soft set theory. The soft set representation of web user transactions allows modeling as a binary-valued information system. The approach is evaluated in comparison to two previous rough set-based approaches, demonstrating better performance with over 100% lower computational complexity and higher cluster purity.
IRJET - Reporting and Management System for Online Crime | IRJET Journal
This document discusses the development of an online crime reporting and management system. The system allows users to view crime rates in different areas, receive alerts if they enter a high crime area, and file complaints online by uploading images and videos. It also allows police to track the status of complaints. The system uses algorithms like KNN, AES encryption, and K-means clustering. It discusses the benefits of online crime reporting compared to traditional methods. Design details are provided like the use case diagram, actors, and encryption process steps. Data mining techniques are also mentioned for analyzing crime patterns.
Extract Business Process Performance using Data Mining | IJERA Editor
This paper aims to analyze the performance of a business process using process mining. Performance is very important, especially in large systems. The process of repairing devices was used as a case study, and the Fuzzy Miner algorithm was used to analyze the performance of the process model.
This document analyzes a dataset of over 280,000 credit card transactions to detect fraudulent transactions using machine learning techniques. It first discusses challenges in credit card fraud detection like imbalanced data and lack of standard evaluation metrics. It then evaluates techniques like support vector machines, random forests, and local outlier factors. Analysis of the dataset found the data is highly skewed with few fraud cases. While models could achieve high accuracy by predicting all transactions as valid, other metrics are needed. The document concludes by implementing a local outlier factor model to detect patterns in fraudulent transactions, though accuracy in detecting fraud was low.
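The point about accuracy on skewed data can be made concrete with a small worked example. The class counts below (20 fraud cases in 10,000 transactions) are hypothetical and are not taken from the dataset described above; the helper names `accuracy` and `recall` are likewise illustrative.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives (fraud) that were predicted positive."""
    actual = sum(1 for t in y_true if t == positive)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    return hits / actual if actual else 0.0

# Hypothetical skewed data: 1 = fraud, 0 = valid.
y_true = [1] * 20 + [0] * 9980
# A "model" that labels every transaction as valid.
y_pred = [0] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)
print(baseline_accuracy)  # 0.998
print(baseline_recall)    # 0.0
```

The all-valid baseline scores 99.8% accuracy while catching no fraud at all, which is why recall or a similar class-sensitive metric is needed alongside accuracy.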
Machine Learning, K-means Algorithm Implementation with R | IRJET Journal
This document discusses the implementation of the K-means clustering algorithm using R programming. It begins with an introduction to machine learning and the different types of machine learning algorithms. It then focuses on the K-means algorithm, describing the steps of the algorithm and how it is used for cluster analysis in unsupervised learning. The document then demonstrates implementing K-means clustering in R by generating sample data, initializing random centroids, calculating distances between data points and centroids, assigning data points to clusters based on closest centroid, recalculating centroids, and plotting the results. It concludes that K-means clustering is useful for gaining insights into dataset structure and was successfully implemented in R.
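The steps listed above (random initial centroids, distance calculation, assignment to the closest centroid, centroid recalculation) can be sketched as follows. The summarized document uses R; this sketch uses Python instead, and the function name `kmeans`, the `initial` parameter, and the sample data are assumptions for illustration rather than the document's code.

```python
import random

def kmeans(points, k, iterations=20, seed=0, initial=None):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Random initial centroids drawn from the data unless given explicitly.
    centroids = list(initial) if initial is not None else rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2 + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels

# Hypothetical sample data: two separated groups of points.
data_rng = random.Random(1)
group_a = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(50)]
group_b = [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(50)]
centroids, labels = kmeans(group_a + group_b, k=2)
```

With random initialization the result can depend on the starting centroids, so in practice the algorithm is often run several times and the best clustering kept.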
Mining Stream Data using k-Means clustering Algorithm | Manishankar Medi
This document discusses using k-means clustering to analyze urban road traffic stream data. Stream data arrives continuously over time and is challenging to process due to its high volume, velocity and volatility. The document proposes using a sliding window technique with k-means clustering to analyze recent urban traffic data and visualize clusters in real-time to provide insights into traffic patterns and congested roads. This analysis could help travelers and authorities respond to traffic issues more quickly.
Advanced control scheme of doubly fed induction generator for wind turbine us... | IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par... | IJECEIAES
Wide application of proportional-integral-differential (PID)-regulator in industry requires constant improvement of methods of its parameters adjustment. The paper deals with the issues of optimization of PID-regulator parameters with the use of neural network technology methods. A methodology for choosing the architecture (structure) of neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, as well as the form and type of activation function. Algorithms of neural network training based on the application of the method of minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
An improved modulation technique suitable for a three level flying capacitor ... | IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed
simplified modulation technique paves the way for more straightforward and
efficient control of multilevel inverters, enabling their widespread adoption and
integration into modern power electronic systems. Through the amalgamation of
sinusoidal pulse width modulation (SPWM) with a high-frequency square wave
pulse, this controlling technique attains energy equilibrium across the coupling
capacitor. The modulation scheme incorporates a simplified switching pattern
and a decreased count of voltage references, thereby simplifying the control
algorithm.
A review on features and methods of potential fishing zone | IJECEIAES
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It explores features such as sea surface temperature (SST) and sea surface height (SSH), along with the classifiers used to classify the data. The study underscores the importance of examining potential fishing zones using advanced analytical techniques. It thoroughly explores the methodologies employed by researchers, covering both past and current approaches. The examination centers on data characteristics and the application of classification algorithms to potential fishing zones. Furthermore, the prediction of potential fishing zones relies significantly on the effectiveness of classification algorithms. Previous research has assessed the performance of models like support vector machines (SVM), naïve Bayes, and artificial neural networks (ANN). In earlier results, SVM classified test data for fisheries classification with 97.6% accuracy, compared with 94.2% for naïve Bayes. By considering recent works in this area, several recommendations for future work are presented to further improve the performance of potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f... | IJECEIAES
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make small devices that boost productivity. The goal is to optimize device density. Scientists are reducing connection delays to improve circuit performance. This helped them understand three-dimensional integrated circuit (3D IC) concepts, which stack active devices and create vertical connections to diminish latency and reduce interconnects. Electrical interference is a big concern with 3D integrated circuits. Researchers have developed and tested through-silicon vias (TSVs) and substrates to decrease electrical signal interference. This study illustrates a novel noise coupling reduction method using several electrical interference models. A 22% drop in electrical interference from signal-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p... | IJECEIAES
Climate change's impact on the planet has forced the United Nations and governments to promote green energies and electric transportation. The deployment of photovoltaic (PV) and electric vehicle (EV) systems has gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to include financial support and stability. This paper introduces a hybrid PV and EV system to support industrial and commercial plants. It covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram, which sets the priorities and requirements of the system, is presented. The proposed approach allows plants to improve their power stability, especially during power outages. The presented information supports researchers and plant owners in completing the necessary analysis while promoting the deployment of clean energy. The result of a case study representing a dairy milk farmer supports the theoretical work and highlights its benefits to existing plants. The short return on investment supports the novelty of the proposed approach for a sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
Bibliometric analysis highlighting the role of women in addressing climate ch... | IJECEIAES
Fossil fuel consumption has increased quickly, contributing to climate change
that is evident in unusual flooding, droughts, and global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they have succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in relation to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out areas of research deficiency, and offer recommendations on how
to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter... | IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
Enhancing battery system identification: nonlinear autoregressive modeling fo... | IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a survey | IJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ... | IJECEIAES
One of the problems that are associated to power systems is islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis shows that the overall
weight of passive methods (24.7%), active methods (7.8%), hybrid methods
(5.6%), remote methods (14.5%), signal processing-based methods (26.6%),
and computational intelligent-based methods (20.8%) based on the
comparison of all criteria together. Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronize
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed, the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embeddedsystem design approach targeting FPGA technologies that are fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that allows users to seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring the status parameters of solar cell systems (solar irradiance, temperature, and humidity) is critical to enhancing their efficiency. Hence, in the present article, an improved smart internet of things (IoT) prototype based on an embedded system built on the NodeMCU ESP8266 (ESP-12E) was implemented experimentally. Three regions in Egypt (Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity with the proposed IoT system. The monitored solar irradiance, temperature, and humidity data were visualized live by Ubidots over the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and rapidity of obtaining monitoring results with the proposed IoT system make it a strong candidate for monitoring solar cell systems. Moreover, the measured solar power radiation in the three regions strongly favors Luxor and Cairo over El-Beheira as sites for a solar cell station.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
Batteries -Introduction – Types of Batteries – discharging and charging of battery - characteristics of battery –battery rating- various tests on battery- – Primary battery: silver button cell- Secondary battery :Ni-Cd battery-modern battery: lithium ion battery-maintenance of batteries-choices of batteries for electric vehicle applications.
Fuel Cells: Introduction- importance and classification of fuel cells - description, principle, components, applications of fuel cells: H2-O2 fuel cell, alkaline fuel cell, molten carbonate fuel cell and direct methanol fuel cells.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman's Rimland, and Hegemonic Stability theories, the study examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. It critically analyzes primary and secondary research documents to elaborate the role of
China’s geo-economic outreach in Central Asian countries and its future prospects. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. 
By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
The role of Louvain-coloring clustering in the detection of fraud transactions
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 14, No. 1, February 2024, pp. 608~616
ISSN: 2088-8708, DOI: 10.11591/ijece.v14i1.pp608-616
Journal homepage: http://ijece.iaescore.com
The role of Louvain-coloring clustering in the detection of fraud transactions
Heru Mardiansyah², Saib Suwilo¹,², Erna Budhiarti Nababan¹, Syahril Efendi²
¹Department of Information Technology, Faculty of Computer Science and Information Technology, Universitas Sumatera Utara, Medan, Indonesia
²Department of Computer Science, Faculty of Computer Science and Information Technology, Universitas Sumatera Utara, Medan, Indonesia
Article Info
Article history: Received May 12, 2023; Revised Jul 13, 2023; Accepted Jul 17, 2023
ABSTRACT
Clustering is a data mining technique that groups very large amounts of
data to gain new knowledge through unsupervised learning. It can be
applied to many types of data and fields. One sector that needs this
technique is business, especially banking, where fraud is often
encountered in transaction processes. This raises interest in clustering
fraud data in transactions. A clustering algorithm is needed, namely the
Louvain algorithm, which can cluster large amounts of data represented
as a graph. In this work, the Louvain algorithm is optimized with colored
graphs to facilitate subsequent labeling. In this study, 33,491 non-fraud
transactions and 241 fraudulent transactions were clustered. Clustering
with the Louvain algorithm identifies fraudulent data with an accuracy
of 90%.
Keywords: Cluster, Clustering, Data mining, Fraud, Graph, Louvain algorithm, Transaction
This is an open access article under the CC BY-SA license.
Corresponding Author:
Saib Suwilo
Department of Information Technology, Faculty of Computer Science and Information Technology,
Universitas Sumatera Utara
Medan, Indonesia
Email: saib@usu.ac.id
1. INTRODUCTION
Clustering is one of the data mining techniques based on unsupervised learning [1], [2]. Unsupervised
learning is learning without labeled information, so that new knowledge is derived from large data sources
[3], [4]. Clustering aims to group data with the same characteristics into one region and data with other
characteristics into other regions [5], [6]. Many studies rely on clustering. For example, [7] performed
managerial clustering on a large amount of data about the expertise of decision makers within a company, so
that the result could serve as a reference when changing leadership. In addition, [8] categorized behavior on
bitcoin. Job operations require different amounts of time, and the operations of a job cannot be processed
simultaneously because they must be carried out sequentially. The problem is
choosing the best sequence of work activities and determining which machine can perform each task so as to
minimize machine idle and standby time [9].
Clustering can be applied in various fields, including the currently growing business sector [10].
In business, many problems involving big data can be addressed [11]. Many types of businesses also need
data mining techniques to gain new knowledge [12]. One example is banking, whose financial transaction
processes are often targeted by one type of crime, namely fraud in the transaction process, including crimes
declared as fabrication [13], [14].
Fraud in banking transactions causes considerable harm to various parties [15]. For example, [16]
performed fraud detection using a machine learning approach and produced a related-party transactions
(RPTs) knowledge graph to measure performance and verify fraud in transactions. In addition, [17] grouped
fraudulent transactions in a data mining process with the aim of increasing company profits and obtaining
optimal results, so that management can improve the quality of the company.
Given these problems, and to contribute to the field of computer science, the right algorithm must
be chosen to detect fraud [18]. Recorded transaction processes produce large volumes of data that can be
turned into new knowledge [19]. Thus, an optimal algorithm is needed to detect fraudulent transactions. For
example, [20] created an anti-fraud system in banking to monitor fraud in transactions, in which the system
is given knowledge through a clustering approach to data mining.
In business processes, the Louvain algorithm is often used for clustering [18]. The Louvain
algorithm is a graph-based community clustering algorithm. Transaction data are therefore represented as a
graph, which facilitates the detection of fraud as a community [21]. The Louvain algorithm was also applied
in [22] to cluster local energy market travel routes in the US, where the optimization model of the Louvain
algorithm rests on modularity based on a combination of graph-intrinsic parameters, giving an accuracy of
0.05 to 0.15. In this study, the Louvain algorithm is optimized with a coloring function so that decision
makers can quickly obtain labels for the fraud transaction community; this is the Louvain-Coloring
algorithm approach.
2. MATERIAL AND METHOD
2.1. Dataset
The dataset in this research uses community banking transaction data where the data consists of
33,491 non-fraud transactions and 241 fraudulent transactions. The Louvain algorithm is used because it can
cluster the data to detect communities. However, one shortcoming of the Louvain algorithm on large
datasets is its relatively slow processing time. In terms of nodes and data proximity, on the other hand, it
makes it possible to check whether any remaining data lie close to the fraudulent data.
2.2. Louvain-coloring algorithm in clustering
The Louvain algorithm is an unsupervised learning algorithm that can group fraud data in
banking transactions [23]. It combines modularity optimization and community
aggregation, maximizing the modularity defined in (1) [24].
$$M = \frac{1}{2m}\sum_{i,j}\left(A_{ij} - P_{ij}\right)\delta(c_i, c_j) = \frac{1}{2m}\sum_{i,j}\left(A_{ij} - \frac{k_i k_j}{2m}\right)\delta(c_i, c_j) \qquad (1)$$
$A_{ij}$ is the entry of the adjacency matrix, that is, the weight of the edge connecting nodes $i$ and $j$;
$P_{ij} = k_i k_j/(2m)$ is the corresponding expected weight; $k_i = \sum_j A_{ij}$ is the degree of node $i$;
$c_i$ is the community to which node $i$ is assigned; the $\delta$-function $\delta(u, v)$ is 1 if $u = v$ and 0
otherwise; and $m = \frac{1}{2}\sum_{ij} A_{ij}$ is the total edge weight of the graph. Louvain randomly
orders all nodes in the network for the modularity optimization method [21]. Then, one by one, it
removes each node and inserts it into another community $C$ until there is no appreciable increase in
modularity (an input parameter), as shown in (2):
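$$\Delta M = \left[\frac{\Sigma_{in} + 2k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2\right] - \left[\frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2\right] \qquad (2)$$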
Louvain randomizes the order of the existing nodes in the modularity optimization method. To obtain a
definite increase, Louvain removes each node and inserts it into a community $C$ other than its own until no
visible increase occurs. The gain then simplifies to (3) [25]:
$$\Delta M = \frac{k_{i,in}}{m} - \frac{2\,\Sigma_{tot}\,k_i}{(2m)^2} \;\rightarrow\; m\,\Delta M = k_{i,in} - \frac{\Sigma_{tot}\,k_i}{2m} \qquad (3)$$
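Equation (3) can be checked numerically against the definition in (1). The sketch below is only an illustration on a hypothetical six-node toy graph (two triangles joined by one edge), not the paper's dataset; the helper names `adjacency`, `modularity`, and `gain_eq3` are assumptions introduced here. It moves a node that sits alone in its community into a neighbouring community and compares the direct change in modularity with the shortcut of (3).

```python
# Toy undirected graph with unit weights (hypothetical data, not from the paper).
EDGES = {("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0,
         ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0}


def adjacency(edges):
    """Build a symmetric adjacency map {node: {neighbour: weight}}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def modularity(adj, community_of):
    """Modularity M of equation (1) for a node -> community mapping."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())
    total = 0.0
    for i in adj:
        for j in adj:
            if community_of[i] == community_of[j]:  # delta(c_i, c_j) = 1
                total += adj[i].get(j, 0.0) - degree[i] * degree[j] / two_m
    return total / two_m


def gain_eq3(adj, community_of, node, target):
    """Gain of equation (3) for moving a node that is alone into `target`."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())
    # k_{i,in}: weight of the links from `node` into the target community
    k_i_in = sum(w for nbr, w in adj[node].items() if community_of[nbr] == target)
    # Sigma_tot: total degree of the target community (node not yet inside)
    sigma_tot = sum(degree[u] for u in adj if community_of[u] == target)
    return k_i_in / (two_m / 2) - 2 * sigma_tot * degree[node] / two_m ** 2


ADJ = adjacency(EDGES)
before = {"a": 0, "b": 0, "c": 1, "d": 2, "e": 2, "f": 2}  # node c is alone
after = dict(before, c=0)                                   # c joins {a, b}
direct = modularity(ADJ, after) - modularity(ADJ, before)
shortcut = gain_eq3(ADJ, before, "c", 0)
print(round(direct, 6), round(shortcut, 6))
```

On this toy graph both values equal 8/49, so the two numbers printed agree. The shortcut needs only the links of the moved node and the total degree of the target community, which is why it is cheap to evaluate for every candidate move.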
While $k_{i,in}$ and $\Sigma_{tot}$ must be calculated for each candidate community, the node-specific term
$k_i/(2m)$ depends only on the node. In this way, that term is only recomputed when a different node is
considered during modularity optimization [26]. After the first stage is complete, all nodes belonging to the
same community are combined into one giant node. The links that connect the giant nodes are the sums of
the previous links that connected nodes of the respective different communities.
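The aggregation step described above can be sketched as follows. This is a minimal illustration on hypothetical toy data; the function name `aggregate` is an assumption, and keeping links inside a community as a self-loop on the giant node follows the usual Louvain convention rather than anything stated in this paper.

```python
from collections import defaultdict

# Hypothetical toy graph: two triangles joined by the edge c-d.
EDGES = {("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0,
         ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0}
COMMUNITY_OF = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}


def aggregate(edges, community_of):
    """Collapse every community into one giant node and sum the merged links."""
    merged = defaultdict(float)
    for (u, v), weight in edges.items():
        cu, cv = community_of[u], community_of[v]
        key = (min(cu, cv), max(cu, cv))  # undirected; (c, c) is a self-loop
        merged[key] += weight
    return dict(merged)


GIANT_EDGES = aggregate(EDGES, COMMUNITY_OF)
print(GIANT_EDGES)
```

Here the three links inside each triangle collapse into a self-loop of weight 3.0, and the single link between the triangles becomes a link of weight 1.0 between the two giant nodes. The next level of the hierarchy then runs the same modularity optimization on this smaller graph.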
2.3. General architecture
This research requires steps that are directed and focused on its goals. So, to support the
research process, a general architecture was formed. Figure 1 shows the Louvain process of this paper, which
consists of the following steps:
− Collect datasets
− Perform data preprocessing as needed
− Perform clustering with the Louvain algorithm with n scenarios.
− Perform optimization with Louvain-Coloring based on (4).
$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j) \qquad (4)$$
Figure 1. Louvain process
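The clustering and labeling steps of the process can be sketched on a toy graph. The sketch below is not the authors' implementation: it runs only the first (local moving) phase of Louvain, using the gain of (3) scaled by $m$, visits nodes in a fixed sorted order instead of a random one, and uses a placeholder coloring rule (red for a community containing a flagged node, green otherwise), since the paper does not spell out the coloring step. The graph, the flagged node, and the names `local_moving` and `color_communities` are all hypothetical.

```python
# Hypothetical toy graph: two triangles joined by the edge c-d.
EDGES = {("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0,
         ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0}


def adjacency(edges):
    """Build a symmetric adjacency map {node: {neighbour: weight}}."""
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj


def local_moving(adj, max_iterations=10):
    """First Louvain phase: move nodes greedily while modularity improves."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())
    community_of = {u: u for u in adj}  # every node starts in its own community
    sigma_tot = dict(degree)            # total degree of each community
    for _ in range(max_iterations):
        moved = False
        for node in sorted(adj):        # fixed order keeps the sketch reproducible
            current = community_of[node]
            sigma_tot[current] -= degree[node]  # take the node out
            links = {}                  # k_{i,in} for each neighbouring community
            for nbr, w in adj[node].items():
                c = community_of[nbr]
                links[c] = links.get(c, 0.0) + w

            def gain(c):
                # m * Delta M from equation (3)
                return links.get(c, 0.0) - sigma_tot[c] * degree[node] / two_m

            best = current
            for c in links:
                if gain(c) > gain(best):
                    best = c
            community_of[node] = best
            sigma_tot[best] += degree[node]
            moved = moved or best != current
        if not moved:
            break
    return community_of


def color_communities(community_of, flagged):
    """Placeholder coloring: red if a community holds any flagged node."""
    flagged_communities = {community_of[n] for n in flagged}
    return {c: ("red" if c in flagged_communities else "green")
            for c in set(community_of.values())}


communities = local_moving(adjacency(EDGES))
colors = color_communities(communities, flagged={"e"})
print(communities, colors)
```

On this toy graph the two triangles end up as two communities, and only the one containing the flagged node `e` is colored red. A full Louvain run would follow this phase with aggregation and repeat both phases level by level.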
3. RESULTS AND DISCUSSION
Clustering is carried out on communities that commit fraud in transactions, especially in banking.
The dataset is based on public data consisting of 33,732 records, of which 33,491 are labeled as not
fraudulent and 241 as fraudulent transactions. To measure how accurate the knowledge obtained from the
clustering technique is, it is tested with the Louvain algorithm. After data exploration, it is checked
whether transactions that are not labeled as fraudulent come close to, or can be identified as, fraud.
3.1. Louvain algorithm to detect community
This test uses 6 scenarios to detect communities with the Louvain algorithm, using several
combinations of the available parameter values, which are discussed in the test scenarios. The parameters
and their default values are as follows. maxLevel (default=10) is the maximum number of levels in the
community hierarchy built by the two-phase process; the higher the level, the larger the communities, which
may not be exactly what is desired. maxIteration (default=10) bounds the first phase, which repeats
iteratively until the increase in modularity is negligible or the specified number of iterations is reached.
Tolerance (default=0.0001) is the threshold for a negligible increase; the smaller the tolerance, the higher
the resulting modularity may be, but more iterations may be required.
Concurrency (default=4) is the number of threads that can run simultaneously.
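The interaction between maxIteration and tolerance can be illustrated with a small sketch. The dataclass and the function `iterations_used` below are hypothetical, the snake_case field names merely mirror the parameters described above, and the modularity trace is invented for illustration. The sketch assumes the stopping rule is "stop when the increase in modularity falls below the tolerance or the iteration limit is reached".

```python
from dataclasses import dataclass


@dataclass
class LouvainParams:
    """Defaults as listed in the text (field names are illustrative)."""
    max_levels: int = 10
    max_iterations: int = 10
    tolerance: float = 0.0001
    concurrency: int = 4


def iterations_used(modularity_per_iteration, params):
    """Count the iterations that run before the assumed stopping rule fires."""
    previous = None
    used = 0
    for value in modularity_per_iteration[:params.max_iterations]:
        used += 1
        # Stop once the increase in modularity is below the tolerance.
        if previous is not None and value - previous < params.tolerance:
            break
        previous = value
    return used


# Invented modularity values after each iteration of the first phase.
TRACE = [0.50, 0.70, 0.80, 0.80005, 0.90]

default_run = iterations_used(TRACE, LouvainParams())
tight_run = iterations_used(TRACE, LouvainParams(tolerance=1e-9))
capped_run = iterations_used(TRACE, LouvainParams(max_iterations=2))
print(default_run, tight_run, capped_run)
```

With the default tolerance the fourth iteration improves modularity by less than 0.0001, so the loop stops there; a much smaller tolerance lets it continue to the fifth iteration, and a limit of two iterations cuts it off early. This is the trade-off the scenarios below explore by varying one parameter at a time.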
3.1.1. Scenario 1
Scenario 1 uses the default values of maxLevels, maxIteration, tolerance, and
concurrency, and compares the test results of the Louvain algorithm and Louvain coloring. Louvain
coloring's modularity value increased by 0.981199% compared to Louvain. Louvain coloring also has a better
processing time, with a time reduction of 57.82%. The numbers of communities produced by the Louvain and
Louvain coloring algorithms differ by only 0.07% (normal). The test results show a modularity
value of 0.981199, communityID 181075, userCount 206, flaggedCount 7, and flaggedRatio 0.033891.
Comparing all scenarios, scenario 5 is the best scenario. Based on the test results for scenario 1, the
Louvain coloring algorithm is better than the Louvain algorithm.
3.1.2. Scenario 2
Scenario 2 uses a maxLevels value of 10 and the default values for maxIteration,
tolerance, and concurrency, and compares the test results of the Louvain algorithm and Louvain coloring.
Louvain’s modularity value is 0.981199% higher than that of Louvain coloring. The processing time of
Louvain coloring is better than that of Louvain, with a time reduction of 23.53%. The numbers of communities
produced by the Louvain algorithm and Louvain coloring differ by only 0.03% (normal). So, based
on the test results for scenario 2, the Louvain algorithm is better than the Louvain coloring algorithm. Testing
is carried out by varying the maxLevel parameter from 1 to 10, while the other parameters,
namely maxIteration, tolerance, and concurrency, use the default values. Table 1 shows the scenario 2 results.
Table 1. Scenario 2 results
Max Level  Modularity  Community count  Community ID  User count  Flagged count  Ratio     Accuracy
1          0.847114    32,907           195490        2           2              1.000000  88
2          0.957023    16,527           182733        7           4              0.571429  88
3          0.974713    13,626           195231        40          6              0.150000  88
4          0.978536    12,617           187751        127         7              0.055118  88
5          0.979545    12,211           197388        145         7              0.048276  88
6          0.980187    12,001           202202        147         7              0.047619  88
7          0.980621    11,884           25771         154         7              0.045455  88
8          0.980805    11,790           60065         181         7              0.038674  88
9          0.980904    11,727           180048        155         7              0.045161  88
10         0.980971    11,662           197265        159         7              0.044025  88
Table 1 shows the test results for the maxLevel combinations. The result parameters modularity,
communityCount, communityID, userCount, flaggedCount, and ratio take different values as the
maxLevel parameter changes, whereas the accuracy does not change across the tested maxLevel values.
A maxLevel value of 10 gives the highest modularity of all tested values.
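The Ratio column in Table 1 appears to be the flagged count divided by the user count of the reported community. The short check below recomputes it from the table's own values; the tuple layout and the name `flagged_ratio` are illustrative choices, not taken from the paper.

```python
# (maxLevel, user count, flagged count, reported ratio) copied from Table 1.
TABLE_1 = [
    (1, 2, 2, 1.000000), (2, 7, 4, 0.571429), (3, 40, 6, 0.150000),
    (4, 127, 7, 0.055118), (5, 145, 7, 0.048276), (6, 147, 7, 0.047619),
    (7, 154, 7, 0.045455), (8, 181, 7, 0.038674), (9, 155, 7, 0.045161),
    (10, 159, 7, 0.044025),
]


def flagged_ratio(user_count, flagged_count):
    """Share of flagged users in a community, rounded to six decimals."""
    return round(flagged_count / user_count, 6)


# Largest gap between the recomputed and the reported ratio over all rows.
max_gap = max(abs(flagged_ratio(users, flagged) - reported)
              for _, users, flagged, reported in TABLE_1)
print(max_gap)
```

Every recomputed ratio matches the reported value to six decimals. The ratio falls as maxLevel grows because the reported community absorbs more users while its flagged count stays at 7 from level 4 onward.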
3.1.3. Scenario 3
Scenario 3 uses a maxIteration value of 700 and the default values for maxLevels,
tolerance, and concurrency, and compares the test results of the Louvain algorithm and Louvain coloring.
Louvain coloring’s modularity value is 0.000128 higher than that of Louvain. The processing time of Louvain
is better than that of Louvain coloring, with a time reduction of 80.23%. The number of communities produced
by the Louvain coloring algorithm is better than that of Louvain. So, the test results for scenario 3 are that
the Louvain coloring algorithm is better than the Louvain algorithm. The test is carried out by varying the
maxIteration parameter, while the other parameters, namely maxLevel, tolerance, and concurrency, use the
default values. Table 2 shows the scenario 3 results. The values used for the maxIteration parameter are
10, 100, 200, 300, 400, 500, 600, 700, 800, 900, and 1,000. According to the test results, a maxIteration
value of 700 gives the maximum modularity value of 0.847158.
5. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 14, No. 1, February 2024: 608-616
Table 2. Scenario 3 results
Max iterations  Modularity  Community count  Community ID  User count  Flagged count  Ratio     Accuracy
10              0.846946    329,014          195490        2           2              1.000000  88
100             0.847163    32,903           184405        3           2              0.666667  88
200             0.847097    32,907           195490        2           2              1.000000  88
300             0.847039    32,908           195490        2           2              1.000000  88
400             0.847080    32,906           195490        2           2              1.000000  88
500             0.846724    32,921           195490        2           2              1.000000  88
600             0.847041    32,905           196688        2           2              1.000000  88
700             0.847158    32,904           195490        2           2              1.000000  88
800             0.847061    32,907           195490        2           2              1.000000  88
900             0.847039    32,907           195490        2           2              1.000000  88
1000            0.847002    32,907           195490        2           2              1.000000  88
3.1.4. Scenario 4
Scenario 4 uses a tolerance value of 0.000000001 and the default values for maxLevels, maxIterations,
and concurrency. Comparing the Louvain algorithm with Louvain coloring, the modularity value of Louvain
is 0.0063% higher than that of Louvain coloring. The processing time of Louvain coloring is better than
that of Louvain, with a time reduction of 17.45%. The numbers of communities produced by the Louvain and
Louvain coloring algorithms differ by 0.05%, which is considered normal. The test result for scenario 4
is therefore that the Louvain coloring algorithm is better than the Louvain algorithm. The test varies
the tolerance parameter and keeps the other parameters, namely maxLevels, maxIterations, and
concurrency, at their default values; Table 3 shows the scenario 4 results. The tolerance values used
are 0.01, 0.001, 0.0001, 0.00001, 0.000001, 0.0000001, 0.00000001, 0.000000001, 0.0000000001, and
0.00000000001. The highest modularity value, 0.847597, is obtained at a tolerance of 0.000000001.
Table 3. Scenario 4 results
Tolerance      Modularity  Community count  Community ID  User count  Flagged count  Ratio     Accuracy
0.01           0.844313    33,080           199429        2           2              1.000000  88
0.001          0.846621    324,925          195490        2           2              1.000000  88
0.0001         0.847105    32,908           195490        2           2              1.000000  88
0.00001        0.847053    32,908           195490        2           2              1.000000  88
0.000001       0.847156    32,906           195490        2           2              1.000000  88
0.0000001      0.847102    32,906           195490        2           2              1.000000  88
0.00000001     0.847073    32,906           195490        2           2              1.000000  88
0.000000001    0.847597    32,906           195490        2           2              1.000000  88
0.0000000001   0.847205    32,906           194648        2           2              1.000000  88
0.00000000001  0.847188    32,906           195490        2           2              1.000000  88
3.1.5. Scenario 5
Scenario 5 uses a concurrency value of 1 and the default values for maxLevels, maxIterations, and
tolerance. Comparing the Louvain and Louvain coloring algorithms, the modularity value of Louvain
coloring is 0.0024% higher than that of Louvain. The time from Louvain coloring is also better than that
of Louvain, with a time reduction of 55.96%. The numbers of communities produced by the Louvain and
Louvain coloring algorithms differ by 0.12%. On this basis, the test result for scenario 5 is that the
Louvain coloring algorithm is better than the Louvain algorithm. The test varies the concurrency
parameter and keeps the other parameters, namely maxLevels, maxIterations, and tolerance, at their
default values; Table 4 shows the scenario 5 results. The concurrency values used are 1, 2, 3, and 4.
The highest modularity value, 0.847234, is obtained at a concurrency of 1.
Table 4. Scenario 5 results
Concurrency  Modularity  Community count  Community ID  User count  Flagged count  Ratio     Accuracy
1            0.847234    32,889           195490        2           2              1.000000  88
2            0.846653    32,914           195490        2           2              1.000000  88
3            0.847034    32,913           195490        2           2              1.000000  88
4            0.846975    32,910           194597        2           2              1.000000  88
6. Int J Elec & Comp Eng ISSN: 2088-8708
The role of Louvain-coloring clustering in the detection of fraud transactions (Heru Mardiansyah)
3.2. Louvain-coloring algorithm optimization to detect community
In this scenario, the tests are run with parameter values set for maxLevels, maxIterations, tolerance,
and concurrency. The test yields a Q_m value of 0.981199, communityID 181075, userCount 206,
flaggedCount 7, and flaggedRatio 0.033891. Compared with all of the Louvain scenarios, this scenario can
be considered the best. The results are shown in Table 5.
The scenarios summarized in Table 5 make the optimal results clearly visible. The results also show
that, in the dataset analyzed, fraudulent transactions have some closeness to the community of good,
non-fraudulent transactions. The resulting graph is shown in Figure 2.
Table 5. Optimization results of the Louvain-Coloring algorithm for detecting fraud transactions
          Louvain                                  Louvain coloring
Scenario  Time (ms)  Modularity  Community count   Time (ms)  Modularity  Community count
1         354        0.980996    11,678            296        0.980959    11,686
2         348        0.981066    11,672            219        0.980867    11,662
3         233        0.981239    11,635            193        0.980925    11,685
4         265        0.980977    11,682            184        0.980907    11,682
5         471        0.981008    11,667            265        0.981183    11,656
Figure 2. Graph of fraud transaction community
Figure 2 shows the result after the Louvain algorithm has been optimized with Louvain coloring: 19
data points lie close to the fraud community. The modularity value of the Louvain coloring algorithm is
better than that of the Louvain algorithm. Of the 5 test scenarios carried out, the Louvain coloring
algorithm excels in 4, namely scenarios 1, 3, 4, and 5, while the Louvain algorithm excels only in
scenario 2. The modularity value of the Louvain coloring algorithm ranges from a lowest value of
0.980969 to a highest value of 0.981207, whereas that of the Louvain algorithm ranges from 0.981037 to
0.981157. The highest modularity value of the Louvain coloring algorithm is obtained in scenario 1,
which uses default parameter values, while the highest modularity value of the Louvain algorithm is
obtained in scenario 4, which uses a tolerance value of 0.000000001 and default values for the other
parameters.
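Modularity is the score on which the two algorithms are compared here. As a reminder of what the number measures, the standard Newman modularity of an undirected, unweighted graph can be written as Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c, and d_c the total degree of the nodes in c. The sketch below evaluates it on a toy graph of two triangles joined by one edge. The toy graph and the function name are illustrative assumptions and are unrelated to the paper's dataset or implementation.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Mapping, Tuple

Edge = Tuple[Hashable, Hashable]


def modularity(edges: Iterable[Edge], community: Mapping[Hashable, Hashable]) -> float:
    """Newman modularity for an undirected, unweighted graph without self-loops."""
    edge_list = list(edges)
    m = len(edge_list)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: total degree of nodes in c
    for u, v in edge_list:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
single = {node: "A" for node in range(6)}

q_split = modularity(EDGES, split)    # one community per triangle
q_single = modularity(EDGES, single)  # everything in one community
print(q_split, q_single)
```

Splitting the toy graph into its two triangles gives Q = 5/14, while placing every node in one community gives Q = 0, which is why a higher modularity is read as a better partition.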
The Louvain coloring algorithm has its shortest processing time in scenario 4, at 123 ms, and its
longest in scenario 1, at 394 ms. The Louvain algorithm has its shortest processing time in scenario 2,
at 136 ms, and its longest in scenario 1, at 934 ms. Both algorithms are therefore slowest in
scenario 1, where the time difference between them, 540 ms, is substantial.
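The size of the scenario 1 gap can be made concrete with a short calculation. The sketch below assumes that a time reduction is measured relative to the slower Louvain run; the function name is hypothetical, and the two timings are the longest times quoted in the paragraph above.

```python
def time_reduction_pct(baseline_ms: float, candidate_ms: float) -> float:
    """Percentage by which candidate_ms is shorter than baseline_ms."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (baseline_ms - candidate_ms) / baseline_ms * 100.0


LOUVAIN_S1_MS = 934   # longest Louvain time (scenario 1)
COLORING_S1_MS = 394  # longest Louvain coloring time (scenario 1)

gap_ms = LOUVAIN_S1_MS - COLORING_S1_MS
reduction = time_reduction_pct(LOUVAIN_S1_MS, COLORING_S1_MS)
print(f"gap = {gap_ms} ms, reduction = {reduction:.2f}%")
```

With these two figures the gap is 540 ms, a reduction of roughly 57.8% relative to Louvain.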
The Louvain coloring algorithm produces its fewest communities in scenario 2, with a total of 11,641,
and its most in scenario 4, with a total of 11,668. The Louvain algorithm likewise produces its fewest
communities in scenario 2, with a total of 11,649, and its most in scenario 4, with a total of 11,664.
It is notable that both algorithms produce their smallest and largest community counts in the same
scenarios. The difference in the number of communities produced is very small, in a range of 4 to 14
communities (<1% of the communities), so the researchers consider that the number of communities
generated can be excluded from the assessment of the algorithms. The Louvain-coloring algorithm is thus
able to solve the clustering problem in transaction fraud by forming a new label, which facilitates the
decision-making process.
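The claim that the community counts differ by less than 1% can be checked directly from the counts quoted above. The sketch below assumes the gap is measured relative to the larger of the two counts; the function and dictionary names are hypothetical.

```python
def relative_gap_pct(count_a: int, count_b: int) -> float:
    """Absolute difference between two counts as a percentage of the larger one."""
    largest = max(count_a, count_b)
    if largest <= 0:
        raise ValueError("counts must be positive")
    return abs(count_a - count_b) / largest * 100.0


# (Louvain, Louvain coloring) community counts quoted in the text.
COUNTS = {
    "scenario 2 (fewest)": (11_649, 11_641),
    "scenario 4 (most)": (11_664, 11_668),
}

gaps = {label: relative_gap_pct(*pair) for label, pair in COUNTS.items()}
for label, gap in gaps.items():
    print(f"{label}: {gap:.4f}%")

# Upper end of the quoted range: 14 communities against the smallest count.
worst_case = 14 / 11_641 * 100.0
print(f"worst case: {worst_case:.4f}%")
```

All three values fall well below 1%, which is consistent with leaving the community count out of the comparison.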
4. CONCLUSION
This research focuses on clustering big data, namely fraudulent and non-fraudulent transactions in
banking. The data received comprise 33,491 non-fraud data and 241 fraudulent transaction data. To
detect fraud transactions, the data are clustered with a data mining technique. The results show that
the clustering process using the Louvain algorithm is able to solve clustering problems in fraudulent
transactions: based on the Louvain algorithm, the amount of received data that approaches the fraud
data increases, with an accuracy of 90%. To make labeling easier, the Louvain algorithm is optimized by
presenting the data in a colored graph, which makes fraud easier to detect. This provides a basis for
future prediction or forecasting of the behavior of fraud perpetrators in the clustering process.
REFERENCES
[1] A. Triayudi, W. O. Widyarto, L. Kamelia, I. Iksal, and S. Sumiati, “CLG clustering for dropout prediction using log-data
clustering method,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 10, no. 3, pp. 764–770, Sep. 2021, doi:
10.11591/ijai.v10.i3.pp764-770.
[2] W. Ramdhan, O. S. Sitompul, E. B. Nababan, and Sawaluddin, “Clustering algorithm-based pandemic multicluster framework
analysis: k-means and k-medoids,” in 2022 IEEE International Conference of Computer Science and Information Technology
(ICOSNIKOM), Oct. 2022, pp. 1–6, doi: 10.1109/ICOSNIKOM56551.2022.10034927.
[3] A. R. Lubis, S. Prayudani, M. Lubis, and Al-Khowarizmi, “Analysis of the Markov chain approach to detect blood sugar level,”
Journal of Physics: Conference Series, vol. 1361, no. 1, Nov. 2019, doi: 10.1088/1742-6596/1361/1/012052.
[4] S. Lailiyah, E. Yulsilviana, and R. Andrea, “Clustering analysis of learning style on Anggana High School student,”
TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 17, no. 3, pp. 1409–1416, Jun. 2019, doi:
10.12928/telkomnika.v17i3.9101.
[5] A. A. Baker and Y. Ghadi, “Cancerous lung nodule detection in computed tomography images,” TELKOMNIKA
(Telecommunication Computing Electronics and Control), vol. 18, no. 5, pp. 2432–2438, Oct. 2020, doi:
10.12928/telkomnika.v18i5.15523.
[6] I. M. O. Widyantara, I. P. N. Hartawan, A. A. I. N. E. Karyawati, N. I. Er, and K. B. Artana, “Automatic identification system-
based trajectory clustering framework to identify vessel movement pattern,” IAES International Journal of Artificial Intelligence
(IJ-AI), vol. 12, no. 1, pp. 1–11, Mar. 2023, doi: 10.11591/ijai.v12.i1.pp1-11.
[7] S. E. Hashemi, F. Gholian-Jouybari, and M. Hajiaghaei-Keshteli, “A fuzzy C-means algorithm for optimizing data clustering,”
Expert Systems with Applications, vol. 227, Oct. 2023, doi: 10.1016/j.eswa.2023.120377.
[8] Ş. Telli and X. Zhao, “Clustering in Bitcoin balance,” Finance Research Letters, vol. 55, Jul. 2023, doi:
10.1016/j.frl.2023.103904.
[9] E. B. Nababan, O. S. Sitompul, and J. Sianturi, “Analysis of tabu list length on job shop scheduling problem,” Journal of Physics:
Conference Series, vol. 1235, no. 1, Jun. 2019, doi: 10.1088/1742-6596/1235/1/012047.
[10] A. C. Benabdellah, A. Benghabrit, and I. Bouhaddou, “A survey of clustering algorithms for an industrial context,” Procedia
8. Int J Elec & Comp Eng ISSN: 2088-8708
The role of Louvain-coloring clustering in the detection of fraud transactions (Heru Mardiansyah)
615
Computer Science, vol. 148, pp. 291–302, 2019, doi: 10.1016/j.procs.2019.01.022.
[11] J. M. D. Delgado et al., “Robotics and automated systems in construction: Understanding industry-specific challenges
for adoption,” Journal of Building Engineering, vol. 26, Art. no. 100868, pp. 1-11, Nov. 2019, doi:
10.1016/j.jobe.2019.100868.
[12] A. Al-Khowarizmi, R. Syah, and M. Elveny, “The model of business intelligence development by applying cooperative society
based financial technology,” in Proceedings of Sixth International Congress on Information and Communication Technology,
Springer Singapore, 2022, pp. 117–125.
[13] K. W. F. Ma and T. McKinnon, “COVID-19 and cyber fraud: emerging threats during the pandemic,” Journal of Financial
Crime, vol. 29, no. 2, pp. 433–446, Mar. 2022, doi: 10.1108/JFC-01-2021-0016.
[14] A. Srivastav and S. Prajapat, “Text similarity algorithms to determine Indian penal code sections for offence report,”
IAES International Journal of Artificial Intelligence (IJ-AI), vol. 11, no. 1, pp. 34–40, Mar. 2022, doi:
10.11591/ijai.v11.i1.pp34-40.
[15] M. Leo, S. Sharma, and K. Maddulety, “Machine learning in banking risk management: a literature review,” Risks, vol. 7, no. 1,
Mar. 2019, doi: 10.3390/risks7010029.
[16] X. Mao, H. Sun, X. Zhu, and J. Li, “Financial fraud detection using the related-party transaction knowledge graph,” Procedia
Computer Science, vol. 199, pp. 733–740, 2022, doi: 10.1016/j.procs.2022.01.091.
[17] M. Jans, J. M. van der Werf, N. Lybaert, and K. Vanhoof, “A business process mining application for internal transaction
fraud mitigation,” Expert Systems with Applications, vol. 38, no. 10, pp. 13351–13359, Sep. 2011, doi:
10.1016/j.eswa.2011.04.159.
[18] J. M. Arockiam and A. C. S. Pushpanathan, “MapReduce-iterative support vector machine classifier: novel fraud detection
systems in healthcare insurance industry,” International Journal of Electrical and Computer Engineering (IJECE), vol. 13, no. 1,
pp. 756–769, Feb. 2023, doi: 10.11591/ijece.v13i1.pp756-769.
[19] I. A. Ajah and H. F. Nweke, “Big data and business analytics: trends, platforms, success factors and applications,” Big Data and
Cognitive Computing, vol. 3, no. 2, Jun. 2019, doi: 10.3390/bdcc3020032.
[20] I. Vorobyev and A. Krivitskaya, “Reducing false positives in bank anti-fraud systems based on rule induction in distributed tree-
based models,” Computers and Security, vol. 120, Sep. 2022, doi: 10.1016/j.cose.2022.102786.
[21] S. Shirazi, H. Baziyad, N. Ahmadi, and A. Albadvi, “A new application of Louvain algorithm for identifying disease fields
using big data techniques,” Journal of Biostatistics and Epidemiology (Quarterly), vol. 5, no. 3, 2019, doi:
10.18502/jbe.v5i3.3613.
[22] W. Zhang, “Improving commuting zones using the Louvain community detection algorithm,” Economics Letters, vol. 219, Oct.
2022, doi: 10.1016/j.econlet.2022.110827.
[23] J. Zhang, J. Fei, X. Song, and J. Feng, “An improved Louvain algorithm for community detection,” Mathematical Problems in
Engineering, vol. 2021, pp. 1–14, Nov. 2021, doi: 10.1155/2021/1485592.
[24] N. S. Sattar and S. Arifuzzaman, “Scalable distributed Louvain algorithm for community detection in large graphs,” The Journal
of Supercomputing, vol. 78, no. 7, pp. 10275–10309, May 2022, doi: 10.1007/s11227-021-04224-2.
[25] C. Makris, D. Pettas, and G. Pispirigos, “Distributed community prediction for social graphs based on Louvain algorithm,”
in IFIP Advances in Information and Communication Technology, Springer International Publishing, 2019,
pp. 500–511.
[26] J. Qin, M. Li, and Y. Liang, “Minimum cost consensus model for CRP-driven preference optimization analysis in large-scale
group decision making using Louvain algorithm,” Information Fusion, vol. 80, pp. 121–136, Apr. 2022, doi:
10.1016/j.inffus.2021.11.001.
BIOGRAPHIES OF AUTHORS
Heru Mardiansyah was born in Medan, Indonesia, in 1980. He is a student of the Faculty of
Computer Science and Information Technology, University of North Sumatra, received a Master's
degree from the University of North Sumatra in 2019, and is currently pursuing a Ph.D. program
at the University of North Sumatra. His research areas are data science, big data, machine
learning, neural networks, artificial intelligence, and business intelligence. He can be
contacted at email: heru@students.usu.ac.id.
Saib Suwilo is a professor at Universitas Sumatera Utara, Medan, Indonesia. He was born in
Simalungun, North Sumatera. He received a Drs. in mathematics from USU Medan in 1992, an
M.Sc. in mathematics from the University of Wyoming in 1994, and a Ph.D. in mathematics from
the University of Wyoming in 2001. He can be contacted at email: saib@usu.ac.id.
Erna Budhiarti Nababan is an associate professor at Universitas Sumatera Utara, Medan,
Indonesia. Erna Budhiarti Nababan was born in Bandung, West Java, and received an S.Pd. from
Institut Keguruan dan Ilmu Pendidikan (IKIP Bandung) in 1981, an MIT from Universiti
Kebangsaan Malaysia (UKM Malaysia) in 2005, and a Dr. from Universiti Kebangsaan Malaysia
(UKM Malaysia) in 2010. The author can be contacted at email: ernabrn@usu.ac.id.
Syahril Efendi is a professor at Universitas Sumatera Utara, Medan, Indonesia. His main
research interests are data science, big data, and machine learning. He can be contacted at
email: syaril1@usu.ac.id and syahnyata1@gmail.com.