COMMERCE FAKE PRODUCT REVIEWS MONITORING AND DETECTION SYSTEM
Abstract
Online consumer reviews play an important role in helping consumers judge the quality
and authenticity of products on e-commerce platforms. However, the persistent presence of fake
reviews on these platforms has significantly harmed their operation and development.
In this study, we develop a novel supervised probabilistic method to
detect fake reviews by utilizing the difference in the distribution of non-fraudulent reviews and
that of fake reviews. Specifically, we first derive the univariate distributions of several unique
features (linguistic, behavioral, and interrelationship features). We then integrate these
distributions into two mixed distributions according to their labels to represent the overall
difference between non-fraudulent reviews and fake reviews. Next, we randomly generate
synthetic review data points with different labels from the above mixed distributions. Finally,
we train a Multilayer Perceptron model on these synthetic review data to obtain a
classifier. We conducted experiments to test the model on several original real-world
review datasets. Numerical results indicated that the proposed supervised method
outperformed some well-known sampling models and fake review detection methods in terms
of classification accuracy. Moreover, we extend the proposed method to handle scenarios
with small samples of raw review data. This study contributes to the literature by exploiting
the difference in the distribution of non-fraudulent reviews and that of fraudulent reviews,
which can improve the accuracy of fake review detection for online platforms.
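The data-generation step described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes each feature follows a Gaussian distribution within each label and that features are sampled independently, since the abstract does not specify how the univariate distributions are combined. The two features and all review values are hypothetical.

```python
import random
import statistics

def fit_univariate(rows):
    """Fit one (mean, std) pair per feature column. Gaussian form is an assumption."""
    columns = list(zip(*rows))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

def sample_synthetic(params, n, rng):
    """Draw n synthetic feature vectors, sampling each feature independently."""
    return [[rng.gauss(mu, sigma) for mu, sigma in params] for _ in range(n)]

# Hypothetical labeled reviews: [review length in words, reviews posted per day]
genuine = [[120, 0.2], [95, 0.1], [150, 0.3], [110, 0.2]]
fake = [[30, 4.0], [25, 6.0], [40, 5.0], [35, 3.5]]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
genuine_params = fit_univariate(genuine)
fake_params = fit_univariate(fake)

# Label 0 = non-fraudulent, label 1 = fake
synthetic = (
    [(x, 0) for x in sample_synthetic(genuine_params, 500, rng)]
    + [(x, 1) for x in sample_synthetic(fake_params, 500, rng)]
)
```

The resulting `synthetic` list of (features, label) pairs is the kind of balanced training set that would then be passed to a Multilayer Perceptron classifier.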
Existing System
Detecting fake reviews on e-commerce platforms is critical for maintaining credibility
and trust among users. A supervised general mixed probability approach offers a robust method
to sift through reviews and identify potential fraudulent ones. By leveraging a combination of
machine learning algorithms and probabilistic models, this approach aims to analyze various
features within reviews to distinguish between genuine and fake content. The system employs
a supervised learning framework, utilizing a labeled dataset to train the model. Features such
as sentiment analysis, linguistic patterns, reviewer behavior, review timing, and product
information are considered to create a comprehensive feature set. These features are then
processed through a mixed probability model, which combines the strengths of different
probabilistic techniques, such as Bayesian methods or Hidden Markov Models, to assess the
likelihood of a review being authentic or deceptive. By employing a mixed probability
approach, this system can effectively handle diverse types of fake reviews, adapting to evolving
strategies used by malicious actors. Additionally, continuous model retraining and adaptation
ensure its ability to stay updated with new trends in fraudulent review practices. The goal of
this approach is not only to accurately detect fake reviews but also to provide e-commerce
platforms with a scalable and adaptable solution to maintain the integrity of their review
systems, fostering a trustworthy environment for both consumers and businesses.
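As one illustration of how a Bayesian component might assess the likelihood of a review being deceptive, the sketch below applies Bayes' rule with per-feature Gaussian likelihoods (a naive Bayes form). The Gaussian shape, the independence between features, the two features, and every parameter value are assumptions made for this example; the text does not specify them.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def posterior_fake(x, fake_params, genuine_params, prior_fake=0.5):
    """Return P(fake | x) via Bayes' rule, multiplying per-feature likelihoods."""
    like_fake = prior_fake
    like_genuine = 1.0 - prior_fake
    for xi, (mf, sf), (mg, sg) in zip(x, fake_params, genuine_params):
        like_fake *= gaussian_pdf(xi, mf, sf)
        like_genuine *= gaussian_pdf(xi, mg, sg)
    return like_fake / (like_fake + like_genuine)

# Hypothetical (mean, std) per feature: [review length in words, reviews per day]
fake_params = [(30.0, 10.0), (5.0, 1.5)]
genuine_params = [(120.0, 30.0), (0.2, 0.5)]

short_burst_review = posterior_fake([28, 4.5], fake_params, genuine_params)
long_occasional_review = posterior_fake([130, 0.1], fake_params, genuine_params)
```

With these assumed parameters, a short review from a very active account scores close to 1, and a long review from an occasional reviewer scores close to 0.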
Drawbacks of the Existing System
• Data Dependence: This approach relies heavily on labeled datasets for training.
Obtaining and maintaining a large and diverse labeled dataset can be challenging and
costly. Moreover, the model's effectiveness might decrease if the dataset doesn't
adequately represent evolving fraudulent review tactics.
• Feature Engineering Complexity: Extracting relevant features from reviews requires
sophisticated natural language processing (NLP) techniques. Designing and
engineering these features can be complex and computationally intensive. Additionally,
the model's performance depends heavily on the quality and relevance of these features.
• Adaptability to New Techniques: Fraudulent review strategies evolve over time, and
new methods constantly emerge. The model might struggle to adapt quickly to these
changes, requiring frequent updates and retraining to maintain its effectiveness.
• Resource Intensive: Implementing and maintaining a mixed probability approach can
be computationally demanding. This might pose challenges for smaller e-commerce
platforms with limited resources.
Proposed System
• Data Collection: Acquisition of the review dataset, which needs to be diverse and
labeled.
• Preprocessing and Feature Engineering: Data preprocessing and the selection of
features (linguistic, behavioral, temporal) for model training.
• Supervised Learning Framework: A mixed probability approach involving Bayesian
classifiers, Hidden Markov Models, or ensemble methods.
• Model Training and Evaluation: Model training, validation, and performance
evaluation using appropriate metrics.
Algorithm
• Sentiment Analysis: Using algorithms like VADER (Valence Aware Dictionary and
sEntiment Reasoner) or supervised machine learning models to determine sentiment
polarity.
• NLP Techniques: Leveraging techniques like word embeddings (Word2Vec, GloVe)
or language models (BERT, GPT) for semantic understanding.
• Linguistic Features: Analyzing word frequency, syntactic patterns, or grammar
structures.
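A minimal sketch of the feature-extraction idea is shown below. It is not VADER: the tiny polarity lexicon is invented for illustration and stands in for a real sentiment resource. The specific features (word count, type-token ratio, exclamation count) are assumed examples of linguistic features.

```python
import re
from collections import Counter

# Toy polarity lexicon, invented for illustration; a real lexicon is far larger.
POSITIVE = {"good", "great", "excellent", "love", "amazing", "best"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "worst", "broken"}

def extract_features(text):
    """Return simple linguistic and sentiment features for one review."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)  # word frequencies
    pos = sum(counts[w] for w in POSITIVE)
    neg = sum(counts[w] for w in NEGATIVE)
    total = len(tokens)
    return {
        "word_count": total,
        # Share of distinct words; low values indicate repetitive text
        "type_token_ratio": len(counts) / total if total else 0.0,
        "exclamation_count": text.count("!"),
        # Net share of polar words, in the range [-1, 1]
        "polarity": (pos - neg) / total if total else 0.0,
    }

features = extract_features("Best product ever! Amazing, amazing quality. Love it!")
```

Each review's feature dictionary can then serve as one row of input to the classifier.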
Advantages
• Incorporates Various Features: Leverages linguistic, temporal, and behavioral
attributes within reviews, offering a comprehensive assessment for identifying
fraudulent patterns.
• Comprehensive Feature Set: Utilizes diverse features such as sentiment analysis,
linguistic patterns, reviewer behavior, and temporal information, improving the
accuracy of detecting fake reviews.
• Mixed Probability Models: Combines different probabilistic techniques, allowing the
system to adapt to emerging fraudulent review strategies over time.
• Robust Classification: Considers multiple dimensions, minimizing misclassification
of genuine reviews as fake, thus reducing false alarms.
Hardware Specification
• Processor : Intel Core i3 processor
• RAM : 4 GB
• Hard disk : 500 GB

Software Specification
• Operating System : Windows 10/11
• Front End : Java Swing
• Back End : MySQL Server
• IDE Tools : Eclipse
• Browser : Microsoft Edge
