This document summarizes an article that proposes a new method called Opinion Pattern Mining Segmentation (OPMS) based on Probabilistic Principal Component Analysis (PPCA). The method segments user profiles and behavior patterns from product reviews more efficiently compared to traditional methods like random forests. It reduces dimensionality using a covariance matrix in the PPCA process, improving segmentation efficiency by up to 9% and decreasing false positive rates. The method was tested on product review data and showed improvements in segmentation efficiency, user product preference accuracy, and reduced opinion pattern mining time compared to other methods.
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...ijaia
This paper uses a case based study – “product sales estimation” on real-time data to help us understand
the applicability of linear and non-linear models in machine learning and data mining. A systematic
approach has been used here to address the given problem statement of sales estimation for a particular set
of products in multiple categories by applying both linear and non-linear machine learning techniques on
a data set of selected features from the original data set. Feature selection is a process that reduces the
dimensionality of the data set by excluding those features which contribute minimal to the prediction of the
dependent variable. The next step in this process is training the model that is done using multiple
techniques from linear & non-linear domains, one of the best ones in their respective areas. Data Remodeling
has then been done to extract new features from the data set by changing the structure of the
dataset & the performance of the models is checked again. Data Remodeling often plays a very crucial and
important role in boosting classifier accuracies by changing the properties of the given dataset. We then try
to explore and analyze the various reasons due to which one model performs better than the other & hence
try and develop an understanding about the applicability of linear & non-linear machine learning models.
The target mentioned above being our primary goal, we also aim to find the classifier with the best possible
accuracy for product sales estimation in the given scenario.
IRJET- Physical Design of Approximate Multiplier for Area and Power EfficiencyIRJET Journal
This document summarizes research on using statistical measures and machine learning techniques to perform sentiment analysis on product reviews. The researchers collected product review data from online sources and analyzed the sentiment and opinions expressed in the text using support vector machine classifiers. They classified reviews as positive or negative and analyzed key product features that were discussed. The results demonstrated that statistical sentiment analysis can help companies better understand customer feedback and identify popular product versions or attributes. Several related works applying techniques like naive Bayes, lexicon-based methods and aspect-based sentiment analysis on reviews from domains like movies, hotels and restaurants are also summarized.
IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...IRJET Journal
This document compares the Naive Bayes and Support Vector Machine machine learning algorithms for sentiment analysis. It discusses how each algorithm works, including vectorization, parameter tuning, and terminology related to evaluating model performance such as bias, variance, cross-validation, and ROC curves. An experiment is described that applies both algorithms to movie, product, and service reviews from public datasets to determine which performs better for sentiment classification based on various evaluation metrics like accuracy, precision, recall and F1 score. The results are analyzed to understand which algorithm may be better suited for different use cases and how future work could improve model performance.
IRJET- Product Aspect Ranking and its ApplicationIRJET Journal
The document presents a framework for product aspect ranking using consumer reviews from online sources. It aims to identify important aspects of products by extracting aspects from reviews, classifying the sentiment on each aspect, and ranking aspects based on frequency and influence on overall consumer sentiment. The framework includes data preprocessing of reviews, aspect identification by extracting frequent nouns, sentiment classification of reviews as positive, negative or neutral, and a probabilistic ranking algorithm to determine important aspects. It is proposed that identifying and ranking important product aspects can help consumers make purchase decisions and help companies improve products. The framework is implemented and evaluated on consumer reviews from various sources and products.
IRJET- An Efficient Ensemble Machine Learning System for Restaurant Recom...IRJET Journal
This document discusses building an efficient machine learning system for restaurant recommendations. It analyzes different ensemble machine learning models including stochastic gradient descent and random forest on Yelp restaurant review data from Phoenix and Scottsdale, USA. The random forest model achieved the best accuracy with a mean square error of 2.633 for Phoenix and 2.518 for Scottsdale. The document also outlines various recommendation techniques like matrix factorization, latent factor models, and converting it to a regression problem to predict restaurant ratings.
A Literature Survey: Fuzzy Logic and Qualitative Performance Evaluation of Su...theijes
This document provides a literature review on using fuzzy logic to evaluate the qualitative performance of supply chain management. It discusses how qualitative performance measures are difficult to incorporate into traditional quantitative models due to their subjective nature. Fuzzy logic can help address this by allowing degrees of membership and linguistic variables. The document reviews literature on qualitative supply chain performance measurement and the role of fuzzy logic. It discusses how fuzzy logic can help with the fuzzification and evaluation of qualitative supply chain performance measures and variables through defining linguistic terms and membership functions.
An effective way to optimize key performance factors of supply chainIAEME Publication
This document summarizes an article from the International Journal of Management that discusses optimizing key performance factors in supply chain management. The article begins with an abstract that outlines the goal of using analytical techniques to optimize costs in the outward supply chain. It then reviews relevant literature on supply chain performance measurement and modeling supply chain systems. The methodology section outlines the steps taken, which include identifying key parameters that influence performance, formulating the problem as minimizing total supply chain costs given constraints, validating the model, and implementing the solution. The conclusion emphasizes the importance of supply chain performance measurement for competitiveness.
This document discusses machine learning algorithms and their applications. It begins with an abstract discussing supervised, unsupervised, and reinforcement learning techniques. It then discusses machine learning in more detail, explaining that machine learning algorithms represent data instances with a set of features and classify instances based on their labels. The main focus is on supervised and unsupervised learning techniques and their performance parameters. It provides an overview of support vector machines, neural networks, and other machine learning algorithms. In summary, the document provides a survey of different machine learning techniques, how they work, and their applications.
Understanding the Applicability of Linear & Non-Linear Models Using a Case-Ba...ijaia
This paper uses a case based study – “product sales estimation” on real-time data to help us understand
the applicability of linear and non-linear models in machine learning and data mining. A systematic
approach has been used here to address the given problem statement of sales estimation for a particular set
of products in multiple categories by applying both linear and non-linear machine learning techniques on
a data set of selected features from the original data set. Feature selection is a process that reduces the
dimensionality of the data set by excluding those features which contribute minimal to the prediction of the
dependent variable. The next step in this process is training the model that is done using multiple
techniques from linear & non-linear domains, one of the best ones in their respective areas. Data Remodeling
has then been done to extract new features from the data set by changing the structure of the
dataset & the performance of the models is checked again. Data Remodeling often plays a very crucial and
important role in boosting classifier accuracies by changing the properties of the given dataset. We then try
to explore and analyze the various reasons due to which one model performs better than the other & hence
try and develop an understanding about the applicability of linear & non-linear machine learning models.
The target mentioned above being our primary goal, we also aim to find the classifier with the best possible
accuracy for product sales estimation in the given scenario.
IRJET- Physical Design of Approximate Multiplier for Area and Power EfficiencyIRJET Journal
This document summarizes research on using statistical measures and machine learning techniques to perform sentiment analysis on product reviews. The researchers collected product review data from online sources and analyzed the sentiment and opinions expressed in the text using support vector machine classifiers. They classified reviews as positive or negative and analyzed key product features that were discussed. The results demonstrated that statistical sentiment analysis can help companies better understand customer feedback and identify popular product versions or attributes. Several related works applying techniques like naive Bayes, lexicon-based methods and aspect-based sentiment analysis on reviews from domains like movies, hotels and restaurants are also summarized.
IRJET - Support Vector Machine versus Naive Bayes Classifier:A Juxtaposition ...IRJET Journal
This document compares the Naive Bayes and Support Vector Machine machine learning algorithms for sentiment analysis. It discusses how each algorithm works, including vectorization, parameter tuning, and terminology related to evaluating model performance such as bias, variance, cross-validation, and ROC curves. An experiment is described that applies both algorithms to movie, product, and service reviews from public datasets to determine which performs better for sentiment classification based on various evaluation metrics like accuracy, precision, recall and F1 score. The results are analyzed to understand which algorithm may be better suited for different use cases and how future work could improve model performance.
IRJET- Product Aspect Ranking and its ApplicationIRJET Journal
The document presents a framework for product aspect ranking using consumer reviews from online sources. It aims to identify important aspects of products by extracting aspects from reviews, classifying the sentiment on each aspect, and ranking aspects based on frequency and influence on overall consumer sentiment. The framework includes data preprocessing of reviews, aspect identification by extracting frequent nouns, sentiment classification of reviews as positive, negative or neutral, and a probabilistic ranking algorithm to determine important aspects. It is proposed that identifying and ranking important product aspects can help consumers make purchase decisions and help companies improve products. The framework is implemented and evaluated on consumer reviews from various sources and products.
IRJET- An Efficient Ensemble Machine Learning System for Restaurant Recom...IRJET Journal
This document discusses building an efficient machine learning system for restaurant recommendations. It analyzes different ensemble machine learning models including stochastic gradient descent and random forest on Yelp restaurant review data from Phoenix and Scottsdale, USA. The random forest model achieved the best accuracy with a mean square error of 2.633 for Phoenix and 2.518 for Scottsdale. The document also outlines various recommendation techniques like matrix factorization, latent factor models, and converting it to a regression problem to predict restaurant ratings.
A Literature Survey: Fuzzy Logic and Qualitative Performance Evaluation of Su...theijes
This document provides a literature review on using fuzzy logic to evaluate the qualitative performance of supply chain management. It discusses how qualitative performance measures are difficult to incorporate into traditional quantitative models due to their subjective nature. Fuzzy logic can help address this by allowing degrees of membership and linguistic variables. The document reviews literature on qualitative supply chain performance measurement and the role of fuzzy logic. It discusses how fuzzy logic can help with the fuzzification and evaluation of qualitative supply chain performance measures and variables through defining linguistic terms and membership functions.
An effective way to optimize key performance factors of supply chainIAEME Publication
This document summarizes an article from the International Journal of Management that discusses optimizing key performance factors in supply chain management. The article begins with an abstract that outlines the goal of using analytical techniques to optimize costs in the outward supply chain. It then reviews relevant literature on supply chain performance measurement and modeling supply chain systems. The methodology section outlines the steps taken, which include identifying key parameters that influence performance, formulating the problem as minimizing total supply chain costs given constraints, validating the model, and implementing the solution. The conclusion emphasizes the importance of supply chain performance measurement for competitiveness.
This document discusses machine learning algorithms and their applications. It begins with an abstract discussing supervised, unsupervised, and reinforcement learning techniques. It then discusses machine learning in more detail, explaining that machine learning algorithms represent data instances with a set of features and classify instances based on their labels. The main focus is on supervised and unsupervised learning techniques and their performance parameters. It provides an overview of support vector machines, neural networks, and other machine learning algorithms. In summary, the document provides a survey of different machine learning techniques, how they work, and their applications.
This document describes a customer decision support system prototype that aims to provide insights about consumers to small businesses. The prototype collects and analyzes customer transaction and product data using algorithms like customer confidence prediction and customer priority classification. It stores data in databases and uses a modular architecture with input/output modules, data cleaning, business logic, and a user interface. The goal is to help small businesses by suggesting targeted products and discounts to increase sales based on customer analytics. Testing showed the prototype could efficiently share anonymous customer data between stores and potentially lower sales times.
CHURN ANALYSIS AND PLAN RECOMMENDATION FOR TELECOM OPERATORSJournal For Research
This document summarizes research on customer churn prediction and retention strategies in the telecommunications industry. It provides an abstract for a research paper on using a hybrid machine learning classifier and rule engine to predict customer churn and recommend retention plans. The document then reviews related work applying techniques like fuzzy logic, ordinal regression, artificial neural networks, and ensemble methods to predict churn. It outlines the problem of predicting churn and proposing recommendation rules. The proposed solution is a hybrid machine learning model to predict churn with high accuracy and explain reasons for churn to help recommend plans to retain customers.
Clustering Prediction Techniques in Defining and Predicting Customers Defecti...IJECEIAES
With the growth of the e-commerce sector, customers have more choices, a fact which encourages them to divide their purchases amongst several ecommerce sites and compare their competitors‟ products, yet this increases high risks of churning. A review of the literature on customer churning models reveals that no prior research had considered both partial and total defection in non-contractual online environments. Instead, they focused either on a total or partial defect. This study proposes a customer churn prediction model in an e-commerce context, wherein a clustering phase is based on the integration of the k-means method and the Length-RecencyFrequency-Monetary (LRFM) model. This phase is employed to define churn followed by a multi-class prediction phase based on three classification techniques: Simple decision tree, Artificial neural networks and Decision tree ensemble, in which the dependent variable classifies a particular customer into a customer continuing loyal buying patterns (Non-churned), a partial defector (Partially-churned), and a total defector (Totally-churned). Macroaveraging measures including average accuracy, macro-average of Precision, Recall, and F-1 are used to evaluate classifiers‟ performance on 10-fold cross validation. Using real data from an online store, the results show the efficiency of decision tree ensemble model over the other models in identifying both future partial and total defection.
Qais Yahya Hatim has a PhD in industrial engineering and operations research from Penn State University. He currently works as a statistician and operations research analyst at the FDA, where he applies statistical methods like data mining, multivariate analysis, and Bayesian inference to evaluate pharmaceutical quality data. His research experience includes engineering statistics, supply chain modeling, and manufacturing optimization. He has worked on projects involving production simulation, finite element analysis, and statistical modeling at NIST and IAEC.
IRJET- Analysis of Brand Value Prediction based on Social Media DataIRJET Journal
This document presents a study that analyzes brand value prediction based on social media data using different sentiment analysis techniques. The study compares lexicon-based sentiment analysis tools SentiWordNet and TextBlob, and also evaluates supervised machine learning classifiers Naive Bayes and CNN. The CNN model achieved the highest accuracy of 94.4% when applied to a dataset of Amazon product reviews, outperforming the Naive Bayes model which achieved 82% accuracy. The study concludes that hybrid methods combining lexicon-based and machine learning approaches can effectively analyze sentiment from large social media datasets.
This document discusses opinion mining and sentiment analysis for business intelligence purposes. It provides an overview of related work on extracting opinions from text to classify sentiments. The paper surveys techniques like lexicon-based approaches and machine learning algorithms for sentiment classification. It also discusses how opinion mining can help business analysts extract relevant information from large amounts of unstructured data on the web to make informed decisions. Future work may involve applying techniques like neural networks and improving information retrieval from XML data sources.
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...csandit
Opinion mining also known as sentiment analysis, involves customer satisfactory patterns,
sentiments and attitudes toward entities, products, services and their attributes. With the rapid
development in the field of Internet, potential customer’s provides a satisfactory level of
product/service reviews. The high volume of customer reviews were developed for
product/review through taxonomy-aware processing but, it was difficult to identify the best
reviews. In this paper, an Associative Regression Decision Rule Mining (ARDRM) technique is
developed to predict the pattern for service provider and to improve customer satisfaction based
on the review comments. Associative Regression based Decision Rule Mining performs twosteps
for improving the customer satisfactory level. Initially, the Machine Learning Bayes
Sentiment Classifier (MLBSC) is used to classify the class labels for each service reviews. After
that, Regressive factor of the opinion words and Class labels were checked for Association
between the words by using various probabilistic rules. Based on the probabilistic rules, the
opinion and sentiments effect on customer reviews, are analyzed to arrive at specific set of
service preferred by the customers with their review comments. The Associative Regressive
Decision Rule helps the service provider to take decision on improving the customer satisfactory
level. The experimental results reveal that the Associative Regression Decision Rule Mining
(ARDRM) technique improved the performance in terms of true positive rate, Associative
Regression factor, Regressive Decision Rule Generation time and Review Detection Accuracy of
similar pattern.
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...IRJET Journal
This document proposes an adaptive recommendation system to provide accurate service recommendations over big data. It combines content-based, item-based, and knowledge-based recommendation techniques using an adaptive collaborative filtering approach. The system aims to improve scalability, accuracy, and address cold-start problems. It uses clustering to group similar services together to reduce data size and improve recommendation accuracy. The system architecture includes administrative and visitor modules to manage products and provide recommendations respectively. Service recommendations are generated by matching users to similar neighborhoods based on item preferences.
This document summarizes strategies for improving the effectiveness of marketing and sales discussed in various research papers. It first introduces common data mining algorithms like Apriori, FP-growth, and Eclat that are used to find frequently purchased item sets by analyzing customer transaction data. This information can then be used to better target products to customers. The document then summarizes 11 research papers that examine techniques like sentiment analysis of reviews to predict sales, using price indexes to understand market impacts, and designing marketing management systems. The goal of these strategies is to enhance customer relationships and make more informed decisions about product placement, pricing, and promotions.
IRJET- Survey of Classification of Business Reviews using Sentiment AnalysisIRJET Journal
1. The document discusses using machine learning algorithms like Naive Bayes and Linear SVC to classify reviews of businesses as positive or negative based on sentiment analysis of the text.
2. It explores feature selection methods like information gain to identify important features that help determine sentiment. It also discusses using tools like SentiWordNet to assign sentiment scores to words.
3. The proposed system applies a lexical approach using SentiWordNet to quantify word sentiment scores, then uses feature selection and machine learning classifiers like Naive Bayes and Linear SVC to determine the overall sentiment polarity of reviews with over 90% accuracy.
IRJET - Characterizing Products’ Outcome by Sentiment Analysis and Predicting...IRJET Journal
This document discusses characterizing products' outcomes using early reviews and sentiment analysis. The researchers use support vector machines (SVM) to analyze the sentiment of early reviews for products from e-commerce websites to predict whether products will be successful or fail. They define early reviews as those posted soon after a product launch. The SVM model is trained on labeled early review data to classify reviews as positive, negative or neutral sentiment. They also use a statistical method called PER to identify early reviewers based on users who frequently post early reviews. The goal is to help companies understand which types of products may be successful by analyzing early reviewer sentiment.
Sales analysis using product rating in data mining techniqueseSAT Journals
Abstract
In this paper a new product rating approach for mathematically and graphically analyzing sales of same type of products from different manufactures and with most frequent combination of items is proposed. In product sales market there is no specific rating for product of same type and combination of product purchasing pattern. By this we retrieve the best combination of products with mathematically rating. By this rating and pattern we can make graphical representation of rating and combination of product of same type to compare them with other . Data mining provide more abstract knowledge to analyze business functionalities with retail product data. The purpose of product is to fulfill need of customer , based upon it there are different company makes product of same type , by analyzing it mathematically best one can be calculated thing such as customer satisfaction , product efficiency , popularity among them.
Keywords: Data Mining, Sales Report, Product rating, Threshold value.
IRJET- Book Recommendation System using Item Based Collaborative FilteringIRJET Journal
This document describes an item-based collaborative filtering approach for a book recommendation system. It discusses different recommendation system techniques including collaborative filtering, content-based filtering, and hybrid filtering. It then focuses on item-based collaborative filtering, explaining how it calculates item similarities using adjusted cosine similarity and makes predictions using weighted sums. The document tests the approach on the Goodbooks10k dataset and evaluates it using mean absolute error, finding lower error rates with more neighbor items. In conclusion, item-based collaborative filtering is an effective approach for book recommendations.
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...IRJET Journal
1. The document presents a comparative analysis of machine learning algorithms for predicting customer churn in the telecom industry.
2. Logistic regression, random forest, and balanced random forest classifiers were evaluated on a dataset of 25,000 customers described by 111 variables.
3. The balanced logistic regression model that used SMOTE to address class imbalance achieved the best performance with an area under the ROC curve of 0.861, accurately predicting churn with an accuracy of 77% and recall of 76% on the test set.
Identification of important features and data mining classification technique...IJECEIAES
Employees absenteeism at the work costs organizations billions a year. Prediction of employees’ absenteeism and the reasons behind their absence help organizations in reducing expenses and increasing productivity. Data mining turns the vast volume of human resources data into information that can help in decision-making and prediction. Although the selection of features is a critical step in data mining to enhance the efficiency of the final prediction, it is not yet known which method of feature selection is better. Therefore, this paper aims to compare the performance of three well-known feature selection methods in absenteeism prediction, which are relief-based feature selection, correlation-based feature selection and information-gain feature selection. In addition, this paper aims to find the best combination of feature selection method and data mining technique in enhancing the absenteeism prediction accuracy. Seven classification techniques were used as the prediction model. Additionally, cross-validation approach was utilized to assess the applied prediction models to have more realistic and reliable results. The used dataset was built at a courier company in Brazil with records of absenteeism at work. Regarding experimental results, correlationbased feature selection surpasses the other methods through the performance measurements. Furthermore, bagging classifier was the best-performing data mining technique when features were selected using correlation-based feature selection with an accuracy rate of (92%).
Empirical Model of Supervised Learning Approach for Opinion MiningIRJET Journal
This summarizes an empirical model for opinion mining using supervised learning with an integrated alignment model and naive Bayesian classification model. The proposed model aims to automatically identify user reviews of products as positive or negative and provide an aggregated product rating based on review sentiment analysis and rankings. An alignment model is used to match keywords between source and target reviews to determine sentiment polarity. If a match is not found, the review is sent to a naive Bayesian classification model for sentiment analysis and rating. A rank aggregation model then considers data parameters like user ID, time, and rank to generate a ranked list of products based on ratings and sentiment analysis while excluding short-duration sessions or redundant comments. The proposed hybrid model aims to provide more accurate results for product sentiment analysis
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov DisplayIRJET Journal
This document summarizes a research paper that proposes a method for sentiment analysis of customer reviews using a Hidden Markov Model. It first discusses how online retailers receive large numbers of customer reviews for products and how it is difficult to analyze the overall sentiment from all reviews. The proposed method involves using a Hidden Markov Model to analyze each review sentence and determine if it expresses a positive or negative sentiment. The model is trained on a dataset of customer reviews that have been part-of-speech labeled. Experimental results found that the trained Hidden Markov Model achieved high precision and accuracy in classifying the sentiment of reviews.
Customer Clustering Based on Customer Purchasing Sequence DataIJERA Editor
Customer clustering has become a priority for enterprises because of the importance of customer relationship management. Customer clustering can improve understanding of the composition and characteristics of customers, thereby enabling the creation of appropriate marketing strategies for each customer group. Previously, different customer clustering approaches have been proposed according to data type, namely customer profile data, customer value data, customer transaction data, and customer purchasing sequence data. This paper considers the customer clustering problem in the context of customer purchasing sequence data. However, two major aspects distinguish this paper from past research: (1) in our model, a customer sequence contains itemsets, which is a more realistic configuration than previous models, which assume a customer sequence would merely consist of items; and (2) in our model, a customer may belong to multiple clusters or no cluster, whereas in existing models a customer is limited to only one cluster. The second difference implies that each cluster discovered using our model represents a crucial type of customer behavior and that a customer can exhibit several types of behavior simultaneously. Finally, extensive experiments are conducted through a retail data set, and the results show that the clusters obtained by our model can provide more accurate descriptions of customer purchasing behaviors.
The disruptometer: an artificial intelligence algorithm for market insightsjournalBEEI
Social media data mining is rapidly developing to be a mainstream tool for marketing insights in today’s world, due to the abundance of data and often freely accessed information. In this paper, we propose a framework for market research purposes called the Disruptometer. The algorithm uses keywords to provide different types of market insights from data crawling. The preliminary algorithm data-mines information from Twitter and outputs 2 parameters-Product-to-Market Fit and Disruption Quotient, which is obtained from a brand’s customer value proposition, problem space, and incumbent space. The algorithm has been tested with a venture capitalist portfolio company and market research firm to show high correlated results. Out of 4 brand use cases, 3 obtained identical results with the
analysts ‘studies.
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Sagar Deogirkar
Comparing the State-of-the-Art Deep Learning with Machine Learning algorithms performance on TF-IDF vector creation for Sentiment Analysis using Airline Tweeter Data Set.
Face Recognition using Discrete Wavelet Transform and Principle Component Analysis features of MATLAB.
for processing video go to: https://www.youtube.com/watch?v=X67b0NULO98
- Four potential Yoplait Light TV ads were tested among 5478 respondents to determine if a $25 million ad campaign would be worthwhile.
- Key findings showed that respondents with positive evaluations of the ads were more likely to purchase the products, the storyline conclusion drove the evaluation of the ads, and positive ad appeals positively correlated with brand recall while negative appeals negatively correlated.
- It was concluded that the ad campaign would be worth $25 million, but further research should measure actual purchase behavior and improve brand recall.
This document describes a customer decision support system prototype that aims to provide insights about consumers to small businesses. The prototype collects and analyzes customer transaction and product data using algorithms like customer confidence prediction and customer priority classification. It stores data in databases and uses a modular architecture with input/output modules, data cleaning, business logic, and a user interface. The goal is to help small businesses by suggesting targeted products and discounts to increase sales based on customer analytics. Testing showed the prototype could efficiently share anonymous customer data between stores and potentially lower sales times.
CHURN ANALYSIS AND PLAN RECOMMENDATION FOR TELECOM OPERATORSJournal For Research
This document summarizes research on customer churn prediction and retention strategies in the telecommunications industry. It provides an abstract for a research paper on using a hybrid machine learning classifier and rule engine to predict customer churn and recommend retention plans. The document then reviews related work applying techniques like fuzzy logic, ordinal regression, artificial neural networks, and ensemble methods to predict churn. It outlines the problem of predicting churn and proposing recommendation rules. The proposed solution is a hybrid machine learning model to predict churn with high accuracy and explain reasons for churn to help recommend plans to retain customers.
Clustering Prediction Techniques in Defining and Predicting Customers Defecti...IJECEIAES
With the growth of the e-commerce sector, customers have more choices, a fact which encourages them to divide their purchases amongst several ecommerce sites and compare their competitors‟ products, yet this increases high risks of churning. A review of the literature on customer churning models reveals that no prior research had considered both partial and total defection in non-contractual online environments. Instead, they focused either on a total or partial defect. This study proposes a customer churn prediction model in an e-commerce context, wherein a clustering phase is based on the integration of the k-means method and the Length-RecencyFrequency-Monetary (LRFM) model. This phase is employed to define churn followed by a multi-class prediction phase based on three classification techniques: Simple decision tree, Artificial neural networks and Decision tree ensemble, in which the dependent variable classifies a particular customer into a customer continuing loyal buying patterns (Non-churned), a partial defector (Partially-churned), and a total defector (Totally-churned). Macroaveraging measures including average accuracy, macro-average of Precision, Recall, and F-1 are used to evaluate classifiers‟ performance on 10-fold cross validation. Using real data from an online store, the results show the efficiency of decision tree ensemble model over the other models in identifying both future partial and total defection.
Qais Yahya Hatim has a PhD in industrial engineering and operations research from Penn State University. He currently works as a statistician and operations research analyst at the FDA, where he applies statistical methods like data mining, multivariate analysis, and Bayesian inference to evaluate pharmaceutical quality data. His research experience includes engineering statistics, supply chain modeling, and manufacturing optimization. He has worked on projects involving production simulation, finite element analysis, and statistical modeling at NIST and IAEC.
IRJET- Analysis of Brand Value Prediction based on Social Media DataIRJET Journal
This document presents a study that analyzes brand value prediction based on social media data using different sentiment analysis techniques. The study compares lexicon-based sentiment analysis tools SentiWordNet and TextBlob, and also evaluates supervised machine learning classifiers Naive Bayes and CNN. The CNN model achieved the highest accuracy of 94.4% when applied to a dataset of Amazon product reviews, outperforming the Naive Bayes model which achieved 82% accuracy. The study concludes that hybrid methods combining lexicon-based and machine learning approaches can effectively analyze sentiment from large social media datasets.
This document discusses opinion mining and sentiment analysis for business intelligence purposes. It provides an overview of related work on extracting opinions from text to classify sentiments. The paper surveys techniques like lexicon-based approaches and machine learning algorithms for sentiment classification. It also discusses how opinion mining can help business analysts extract relevant information from large amounts of unstructured data on the web to make informed decisions. Future work may involve applying techniques like neural networks and improving information retrieval from XML data sources.
Associative Regressive Decision Rule Mining for Predicting Customer Satisfact...csandit
Opinion mining also known as sentiment analysis, involves customer satisfactory patterns,
sentiments and attitudes toward entities, products, services and their attributes. With the rapid
development in the field of Internet, potential customer’s provides a satisfactory level of
product/service reviews. The high volume of customer reviews were developed for
product/review through taxonomy-aware processing but, it was difficult to identify the best
reviews. In this paper, an Associative Regression Decision Rule Mining (ARDRM) technique is
developed to predict the pattern for service provider and to improve customer satisfaction based
on the review comments. Associative Regression based Decision Rule Mining performs twosteps
for improving the customer satisfactory level. Initially, the Machine Learning Bayes
Sentiment Classifier (MLBSC) is used to classify the class labels for each service reviews. After
that, Regressive factor of the opinion words and Class labels were checked for Association
between the words by using various probabilistic rules. Based on the probabilistic rules, the
opinion and sentiments effect on customer reviews, are analyzed to arrive at specific set of
service preferred by the customers with their review comments. The Associative Regressive
Decision Rule helps the service provider to take decision on improving the customer satisfactory
level. The experimental results reveal that the Associative Regression Decision Rule Mining
(ARDRM) technique improved the performance in terms of true positive rate, Associative
Regression factor, Regressive Decision Rule Generation time and Review Detection Accuracy of
similar pattern.
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...IRJET Journal
This document proposes an adaptive recommendation system to provide accurate service recommendations over big data. It combines content-based, item-based, and knowledge-based recommendation techniques using an adaptive collaborative filtering approach. The system aims to improve scalability, accuracy, and address cold-start problems. It uses clustering to group similar services together to reduce data size and improve recommendation accuracy. The system architecture includes administrative and visitor modules to manage products and provide recommendations respectively. Service recommendations are generated by matching users to similar neighborhoods based on item preferences.
This document summarizes strategies for improving the effectiveness of marketing and sales discussed in various research papers. It first introduces common data mining algorithms like Apriori, FP-growth, and Eclat that are used to find frequently purchased item sets by analyzing customer transaction data. This information can then be used to better target products to customers. The document then summarizes 11 research papers that examine techniques like sentiment analysis of reviews to predict sales, using price indexes to understand market impacts, and designing marketing management systems. The goal of these strategies is to enhance customer relationships and make more informed decisions about product placement, pricing, and promotions.
IRJET- Survey of Classification of Business Reviews using Sentiment AnalysisIRJET Journal
1. The document discusses using machine learning algorithms like Naive Bayes and Linear SVC to classify reviews of businesses as positive or negative based on sentiment analysis of the text.
2. It explores feature selection methods like information gain to identify important features that help determine sentiment. It also discusses using tools like SentiWordNet to assign sentiment scores to words.
3. The proposed system applies a lexical approach using SentiWordNet to quantify word sentiment scores, then uses feature selection and machine learning classifiers like Naive Bayes and Linear SVC to determine the overall sentiment polarity of reviews with over 90% accuracy.
IRJET - Characterizing Products’ Outcome by Sentiment Analysis and Predicting...IRJET Journal
This document discusses characterizing products' outcomes using early reviews and sentiment analysis. The researchers use support vector machines (SVM) to analyze the sentiment of early reviews for products from e-commerce websites to predict whether products will be successful or fail. They define early reviews as those posted soon after a product launch. The SVM model is trained on labeled early review data to classify reviews as positive, negative or neutral sentiment. They also use a statistical method called PER to identify early reviewers based on users who frequently post early reviews. The goal is to help companies understand which types of products may be successful by analyzing early reviewer sentiment.
Sales analysis using product rating in data mining techniqueseSAT Journals
Abstract
In this paper a new product rating approach for mathematically and graphically analyzing sales of same type of products from different manufactures and with most frequent combination of items is proposed. In product sales market there is no specific rating for product of same type and combination of product purchasing pattern. By this we retrieve the best combination of products with mathematically rating. By this rating and pattern we can make graphical representation of rating and combination of product of same type to compare them with other . Data mining provide more abstract knowledge to analyze business functionalities with retail product data. The purpose of product is to fulfill need of customer , based upon it there are different company makes product of same type , by analyzing it mathematically best one can be calculated thing such as customer satisfaction , product efficiency , popularity among them.
Keywords: Data Mining, Sales Report, Product rating, Threshold value.
IRJET- Book Recommendation System using Item Based Collaborative FilteringIRJET Journal
This document describes an item-based collaborative filtering approach for a book recommendation system. It discusses different recommendation system techniques including collaborative filtering, content-based filtering, and hybrid filtering. It then focuses on item-based collaborative filtering, explaining how it calculates item similarities using adjusted cosine similarity and makes predictions using weighted sums. The document tests the approach on the Goodbooks10k dataset and evaluates it using mean absolute error, finding lower error rates with more neighbor items. In conclusion, item-based collaborative filtering is an effective approach for book recommendations.
Comparative Analysis of Machine Learning Algorithms for their Effectiveness i...IRJET Journal
1. The document presents a comparative analysis of machine learning algorithms for predicting customer churn in the telecom industry.
2. Logistic regression, random forest, and balanced random forest classifiers were evaluated on a dataset of 25,000 customers described by 111 variables.
3. The balanced logistic regression model that used SMOTE to address class imbalance achieved the best performance with an area under the ROC curve of 0.861, accurately predicting churn with an accuracy of 77% and recall of 76% on the test set.
Identification of important features and data mining classification technique...IJECEIAES
Employees absenteeism at the work costs organizations billions a year. Prediction of employees’ absenteeism and the reasons behind their absence help organizations in reducing expenses and increasing productivity. Data mining turns the vast volume of human resources data into information that can help in decision-making and prediction. Although the selection of features is a critical step in data mining to enhance the efficiency of the final prediction, it is not yet known which method of feature selection is better. Therefore, this paper aims to compare the performance of three well-known feature selection methods in absenteeism prediction, which are relief-based feature selection, correlation-based feature selection and information-gain feature selection. In addition, this paper aims to find the best combination of feature selection method and data mining technique in enhancing the absenteeism prediction accuracy. Seven classification techniques were used as the prediction model. Additionally, cross-validation approach was utilized to assess the applied prediction models to have more realistic and reliable results. The used dataset was built at a courier company in Brazil with records of absenteeism at work. Regarding experimental results, correlationbased feature selection surpasses the other methods through the performance measurements. Furthermore, bagging classifier was the best-performing data mining technique when features were selected using correlation-based feature selection with an accuracy rate of (92%).
Empirical Model of Supervised Learning Approach for Opinion MiningIRJET Journal
This summarizes an empirical model for opinion mining using supervised learning with an integrated alignment model and naive Bayesian classification model. The proposed model aims to automatically identify user reviews of products as positive or negative and provide an aggregated product rating based on review sentiment analysis and rankings. An alignment model is used to match keywords between source and target reviews to determine sentiment polarity. If a match is not found, the review is sent to a naive Bayesian classification model for sentiment analysis and rating. A rank aggregation model then considers data parameters like user ID, time, and rank to generate a ranked list of products based on ratings and sentiment analysis while excluding short-duration sessions or redundant comments. The proposed hybrid model aims to provide more accurate results for product sentiment analysis
IRJET- Slant Analysis of Customer Reviews in View of Concealed Markov DisplayIRJET Journal
This document summarizes a research paper that proposes a method for sentiment analysis of customer reviews using a Hidden Markov Model. It first discusses how online retailers receive large numbers of customer reviews for products and how it is difficult to analyze the overall sentiment from all reviews. The proposed method involves using a Hidden Markov Model to analyze each review sentence and determine if it expresses a positive or negative sentiment. The model is trained on a dataset of customer reviews that have been part-of-speech labeled. Experimental results found that the trained Hidden Markov Model achieved high precision and accuracy in classifying the sentiment of reviews.
Customer Clustering Based on Customer Purchasing Sequence DataIJERA Editor
Customer clustering has become a priority for enterprises because of the importance of customer relationship management. Customer clustering can improve understanding of the composition and characteristics of customers, thereby enabling the creation of appropriate marketing strategies for each customer group. Previously, different customer clustering approaches have been proposed according to data type, namely customer profile data, customer value data, customer transaction data, and customer purchasing sequence data. This paper considers the customer clustering problem in the context of customer purchasing sequence data. However, two major aspects distinguish this paper from past research: (1) in our model, a customer sequence contains itemsets, which is a more realistic configuration than previous models, which assume a customer sequence would merely consist of items; and (2) in our model, a customer may belong to multiple clusters or no cluster, whereas in existing models a customer is limited to only one cluster. The second difference implies that each cluster discovered using our model represents a crucial type of customer behavior and that a customer can exhibit several types of behavior simultaneously. Finally, extensive experiments are conducted through a retail data set, and the results show that the clusters obtained by our model can provide more accurate descriptions of customer purchasing behaviors.
The disruptometer: an artificial intelligence algorithm for market insightsjournalBEEI
Social media data mining is rapidly developing to be a mainstream tool for marketing insights in today’s world, due to the abundance of data and often freely accessed information. In this paper, we propose a framework for market research purposes called the Disruptometer. The algorithm uses keywords to provide different types of market insights from data crawling. The preliminary algorithm data-mines information from Twitter and outputs 2 parameters-Product-to-Market Fit and Disruption Quotient, which is obtained from a brand’s customer value proposition, problem space, and incumbent space. The algorithm has been tested with a venture capitalist portfolio company and market research firm to show high correlated results. Out of 4 brand use cases, 3 obtained identical results with the
analysts ‘studies.
Comparative Study of Machine Learning Algorithms for Sentiment Analysis with ...Sagar Deogirkar
Comparing the State-of-the-Art Deep Learning with Machine Learning algorithms performance on TF-IDF vector creation for Sentiment Analysis using Airline Tweeter Data Set.
Face Recognition using Discrete Wavelet Transform and Principle Component Analysis features of MATLAB.
for processing video go to: https://www.youtube.com/watch?v=X67b0NULO98
- Four potential Yoplait Light TV ads were tested among 5478 respondents to determine if a $25 million ad campaign would be worthwhile.
- Key findings showed that respondents with positive evaluations of the ads were more likely to purchase the products, the storyline conclusion drove the evaluation of the ads, and positive ad appeals positively correlated with brand recall while negative appeals negatively correlated.
- It was concluded that the ad campaign would be worth $25 million, but further research should measure actual purchase behavior and improve brand recall.
This document provides an introduction to multivariate statistics. It begins with background on the Indian Statistical Institute where the author is located. It then discusses some common myths about multivariate statistics, defining it as the analysis of relationships between sets of variables. The document lists several multivariate statistical tools and provides examples of research questions they could address related to women and child development. It also summarizes some published studies utilizing multivariate techniques like principal component analysis, correspondence analysis, cluster analysis, and MANOVA.
The document discusses component-level design which occurs after architectural design. It aims to create a design model from analysis and architectural models. Component-level design can be represented using graphical, tabular, or text-based notations. The key aspects covered include:
- Defining a software component as a modular building block with interfaces and collaboration
- Designing class-based components following principles like open-closed and dependency inversion
- Guidelines for high cohesion and low coupling in components
- Designing conventional components using notations like sequence, if-then-else, and tabular representations
Face Recognition using PCA-Principal Component Analysis using MATLABSindhi Madhuri
PCA is used for face recognition. It involves calculating eigenvectors from a training set of face images to define a feature space called "eigenfaces". A new face is recognized by projecting it onto this space and comparing to existing faces. PCA works by identifying directions of maximum variance in the training data, capturing the most important information about faces with fewer vectors. Potential applications include identification, security, and human-computer interaction. However, it is sensitive to changes in lighting and expression.
The document discusses component-based software engineering and defines a software component. A component is a modular building block defined by interfaces that can be independently deployed. Components are standardized, independent, composable, deployable, and documented. They communicate through interfaces and are designed to achieve reusability. The document outlines characteristics of components and discusses different views of components, including object-oriented, conventional, and process-related views. It also covers topics like component-level design principles, packaging, cohesion, and coupling.
Steam turbines work by converting the energy of expanding steam into rotational motion. They have several key components and come in two main types: impulse and reaction. Impulse turbines use nozzles to direct high velocity steam onto turbine blades for impulse, while reaction turbines utilize both fixed and moving blades to expand steam. Common problems in steam turbines include stress corrosion cracking, corrosion fatigue, thermal fatigue, and pitting due to chemical attack from corrosive elements in the steam. Proper lubrication and preventing blade deterioration are important for optimizing steam turbine performance and lifespan.
Gesture Recognition using Principle Component Analysis & Viola-Jones AlgorithmIJMER
Gesture recognition pertains to recognizing meaningful expressions of motion by a human,
involving the hands, arms, face, head, and/or body. It is of utmost importance in designing an intelligent
and efficient human–computer interface. The applications of gesture recognition are manifold, ranging
from sign language through medical rehabilitation to virtual reality. In this paper, we provide a survey on
gesture recognition with particular emphasis on hand gestures and facial expressions. Applications
involving wavelet transform and principal component analysis for face and hand gesture recognition on
digital images
IRJET - Customer Churn Analysis in Telecom IndustryIRJET Journal
This document discusses using machine learning techniques like logistic regression to analyze customer data and predict customer churn in the telecom industry. It proposes a system to build a churn prediction model using logistic regression on historical customer data to identify high-risk customers. The system would have options to view results, perform training and testing on new data, and analyze performance. It would also include a recommender system to recommend suitable plans for identified churn customers based on their usage patterns. The results show the model can predict churn with 80% accuracy and identify similar customers who may also churn.
IRJET-User Profile based Behavior Identificaton using Data Mining TechniqueIRJET Journal
This document presents a model for analyzing customer behavior on online shopping sites using data mining techniques. Clickstream data is collected from customers and analyzed to predict shopping behaviors and provide recommendations. The Naive Bayes algorithm is used to classify customers into categories based on likely purchased and viewed product categories. Recommendations are then provided to customers in their predicted interested categories. The model aims to increase sales by understanding customer interests and loyalty to specific product types.
IRJET-Survey on Identification of Top-K Competitors using Data MiningIRJET Journal
This document presents a method for identifying the top-K competitors of a given product using customer reviews and sentiment analysis. It defines competitiveness between two products based on how much their target markets overlap. The method uses natural language processing to analyze sentiment in customer reviews as positive, negative or neutral. It then calculates competitiveness scores between all products to identify the top competitors for a given product. The proposed approach is tested on multiple datasets from different domains like cameras, mobiles, TVs and laptops. It provides an efficient and scalable method for identifying competitors from large datasets.
IRJET- Opinion Mining and Sentiment Analysis for Online ReviewIRJET Journal
This document summarizes a research paper that proposes a system for conducting sentiment analysis on online product reviews. The system uses a dual sentiment analysis approach that trains a classifier on both original reviews and sentiment-reversed reviews to address issues with polarity shifts. It generates random keys for users to access the review system and uses clustering algorithms to differentiate positive and negative words in reviews and provide an overall product rating. The goal is to help users make more informed purchasing decisions based on genuine reviews by preventing fake reviews from improperly influencing ratings.
This document summarizes research on how the Analytic Hierarchy Process (AHP) methodology and human behavior influence the e-scouting process. Experiments were conducted comparing how student "buyers" selected products versus a virtual decision maker using AHP. The results showed that an AHP-based decision support system helped buyers interpret strategic guidelines. When product features were clearly defined, human and AHP evaluations aligned. But differences emerged when features were unknown or unlimited. The research concluded AHP always improved e-scouting strategies, but only improved methods when human evaluations were limited to defined features.
IRJET- Opinion Summarization using Soft Computing and Information RetrievalIRJET Journal
This document presents an approach for opinion summarization of online user reviews through various stages including data acquisition, preprocessing, feature extraction, classification, and representation to generate a comparative feature-based statistical summary. The proposed system first collects user reviews from online sources, cleans the data through preprocessing techniques, extracts product features, classifies reviews using a sentiment dictionary, and finally represents the results in charts and graphs to guide users in making purchase decisions. The system aims to address the challenges of analyzing large amounts of unstructured user review data written in natural language to automatically generate concise summaries of customer opinions and assessments of different product features.
Measurement model of software quality in user’s perceptioneSAT Journals
Abstract An increasing emphasis on consumer demand and expanded development budgets of software development firms fuel the need to upgrade software quality. Software quality is largely measured by quality standards and guidelines. This paper presents a method for modeling users’ perception of software quality. The method aims to improve the quality of data derived from user opinion surveys and facilitate the analysis of such data. The proposed model offers a way to measure users’ opinion in early stages of product release and a way of predicting the opinion subsequently formed after their opinion revisions using the initial measurements. Therefore, this work develops a conceptual software quality measurement model for evaluating software quality to decrease the perceptive and expectative (or quality) measuring gap between a software development firm and the end user’s requirements. Index Terms: Software quality, Software development, Quality Measurement, Quality Evaluations, & Quality Attributes
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IRJET- Customer Buying Prediction using Machine-Learning Techniques: A SurveyIRJET Journal
1) The document discusses using machine learning techniques to predict customer purchasing and churn based on their personal and behavioral data.
2) It reviews several machine learning algorithms that have been used for prediction, including random forest, logistic regression, naive bayes, and support vector machines.
3) Deep learning techniques are also discussed, including the use of convolutional neural networks to reveal hidden patterns in customer data and predict purchases and churn.
CUSTOMER SEGMENTATION IN SHOPPING MALL USING CLUSTERING IN MACHINE LEARNINGIRJET Journal
This document discusses using clustering algorithms in machine learning to segment customers in a shopping mall. It aims to identify groups of customers with similar characteristics like gender, age, spending habits to more effectively market to each group. Specifically, it uses k-means clustering to segment customers and visualize differences in gender and age. It then examines their annual income and proposes that segmentation focuses on improving customer spending scores. The proposed system uses machine learning approaches like k-means clustering which is more accurate and efficient than traditional manual methods for analyzing customer data and finding insights to identify customer segments.
IRJET- Analysis of Rating Difference and User InterestIRJET Journal
This document summarizes a research paper that proposes a collaborative filtering recommendation algorithm that incorporates rating differences and user interests. It first adds a rating difference factor to the traditional collaborative filtering algorithm. It then calculates user interests based on item attributes and the similarity between user interests. Recommendations are made by weighting user rating differences and interest similarities. The proposed algorithm is shown to reduce error rates and improve accuracy compared to traditional collaborative filtering.
IRJET - Sentiment Analysis and Rumour Detection in Online Product ReviewsIRJET Journal
This document summarizes research on sentiment analysis and rumor detection in online product reviews. It discusses several techniques for sentiment classification and rumor detection, including using convolutional neural networks, recurrent neural networks, attention mechanisms, and sentiment lexicons. The document also examines applying these techniques to datasets from e-commerce sites to classify reviews as positive, negative, or neutral and identify deceptive reviews. Additionally, it proposes models that incorporate sentiment analysis to provide more personalized product recommendations and discusses applying these models and sentiment features to improve recommendation system performance.
Automated Feature Selection and Churn Prediction using Deep Learning ModelsIRJET Journal
This document discusses using deep learning models for churn prediction in the telecommunications industry. It begins with an introduction to churn prediction and feature selection challenges. It then provides an overview of deep learning techniques, including artificial neural networks, convolutional neural networks, and their applications. The document proposes three deep learning architectures for churn prediction and experiments with them on two telecom datasets. The results show deep learning models can achieve performance comparable to traditional models without manual feature engineering.
This document discusses a smart tourism recommender system that uses machine learning algorithms to provide personalized tourism recommendations to users. It first introduces the problem of information overload that tourists face when searching for travel information online. It then discusses how the proposed system aims to overcome this issue by using a naive Bayes classifier to analyze a user's input and preferences and provide the most relevant tourism location recommendations. The system architecture and challenges of building recommender systems at scale are also outlined. Finally, it summarizes the implemented website and discusses using machine learning algorithms like naive Bayes to classify users' requirements and provide better assistance in finding desired travel results.
Recommender System- Analyzing products by mining Data StreamsIRJET Journal
This document discusses several papers related to recommender systems and analyzing products and reviews. It discusses using data mining techniques like SVM, Naive Bayes and clustering algorithms to build recommendation systems for small businesses based on product sales and reviews. It also discusses detecting fake reviews using language analysis and summarizes papers on using Power BI for data visualization and analyzing research data. Key aspects covered include using data streams to provide recommendations in real-time, detecting fake reviews, using data visualization tools like Power BI for analysis, and combining clustering and association rule mining for recommendations.
IRJET- Study of Data Analysis on Machines of Pharmaceutical Manufacturing Ind...IRJET Journal
This document discusses analyzing data from machines in a pharmaceutical manufacturing industry. Specifically, it focuses on analyzing data from a checkweigher machine, which checks weights of packaged products, and a metal detector machine, which detects metal impurities in tablets. The goal is to analyze reports from these machines to determine faults, increase efficiency, and carry out maintenance and traceback activities. The methodology uses Python libraries like NumPy, Pandas, and Matplotlib to analyze the machine data in a Jupyter Notebook, including visualizing statistics and trends over time. This will provide insights to improve operations in the pharmaceutical manufacturing process.
IRJET- Scalable Content Aware Collaborative Filtering for Location Recommenda...IRJET Journal
This document proposes a scalable content-aware collaborative filtering (ICCF) system for location recommendation that avoids negative sampling. ICCF considers user profiles, textual content, and relationships between concepts to learn user preferences. It evaluates features like dimensionality and estimate accuracy, and relates ICCF to graph Laplacian regularization. The system was evaluated on a large-scale LBSN where ICCF improved accuracy over other collaborative filtering methods by addressing data sparsity issues through an injection approach. Naive Bayes and collaborative filtering algorithms are used and the system aims to provide personalized and diverse location recommendations while preserving user privacy.
The document describes a study on decentralized production distribution planning in a supply chain using intelligent agents. It proposes using intelligent agents to coordinate between decentralized production and distribution facilities to minimize costs. The agents work iteratively, with the Distribution Agent generating a plan given infinite production capacity, then the Production Agent generating a feasible production plan, and they iterate until their plans are aligned. A mathematical model is formulated and an example numerical illustration is provided to demonstrate the approach. The results show the total cost decreases with each iteration as the agents coordinate to find an optimal joint production-distribution plan.
IRJET- Recommendation System for Electronic Products using BigDataIRJET Journal
This document proposes a recommendation system for electronic products using big data and agglomerative hierarchical clustering. It discusses how recommendation systems work to predict user preferences and overcome information overload. The proposed system would analyze product data sets, cluster similar products, and recommend products to users based on their preferences and other product ratings and reviews. It aims to reduce the time users spend deciding which product to purchase by providing personalized recommendations. The system design involves users registering and querying the system, which then performs map reduce on the data sets clustered by an administrator to filter and return recommended product lists to the user.
IRJET- User Preferences and Similarity EstimationIRJET Journal
This document summarizes a research paper that proposes a new method for estimating the similarity between items based on user preferences. The proposed method considers user preferences when calculating similarity, unlike existing methods that only look at item attributes. It uses an "inverse top-k query" that returns the users for whom an item is in their top-k set, rather than the top k items for a user. The Jaccard coefficient is used to calculate similarity between items based on the results of inverse top-k queries. The proposed method aims to provide more personalized recommendations by more closely matching items to individual user preferences.
Similar to Opinion pattern mining based on probabilistic principle component analysis report (20)
Mechanical properties of hybrid fiber reinforced concrete for pavementseSAT Journals
Abstract
The effect of addition of mono fibers and hybrid fibers on the mechanical properties of concrete mixture is studied in the present
investigation. Steel fibers of 1% and polypropylene fibers 0.036% were added individually to the concrete mixture as mono fibers and
then they were added together to form a hybrid fiber reinforced concrete. Mechanical properties such as compressive, split tensile and
flexural strength were determined. The results show that hybrid fibers improve the compressive strength marginally as compared to
mono fibers. Whereas, hybridization improves split tensile strength and flexural strength noticeably.
Keywords:-Hybridization, mono fibers, steel fiber, polypropylene fiber, Improvement in mechanical properties.
Material management in construction – a case studyeSAT Journals
Abstract
The objective of the present study is to understand about all the problems occurring in the company because of improper application
of material management. In construction project operation, often there is a project cost variance in terms of the material, equipments,
manpower, subcontractor, overhead cost, and general condition. Material is the main component in construction projects. Therefore,
if the material management is not properly managed it will create a project cost variance. Project cost can be controlled by taking
corrective actions towards the cost variance. Therefore a methodology is used to diagnose and evaluate the procurement process
involved in material management and launch a continuous improvement was developed and applied. A thorough study was carried
out along with study of cases, surveys and interviews to professionals involved in this area. As a result, a methodology for diagnosis
and improvement was proposed and tested in selected projects. The results obtained show that the main problem of procurement is
related to schedule delays and lack of specified quality for the project. To prevent this situation it is often necessary to dedicate
important resources like money, personnel, time, etc. To monitor and control the process. A great potential for improvement was
detected if state of the art technologies such as, electronic mail, electronic data interchange (EDI), and analysis were applied to the
procurement process. These helped to eliminate the root causes for many types of problems that were detected.
Managing drought short term strategies in semi arid regions a case studyeSAT Journals
Abstract
Drought management needs multidisciplinary action. Interdisciplinary efforts among the experts in various fields of the droughts
prone areas are helpful to achieve tangible and permanent solution for this recurring problem. The Gulbarga district having the total
area around 16, 240 sq.km, and accounts 8.45 per cent of the Karnataka state area. The district has been situated with latitude 17º 19'
60" North and longitude of 76 º 49' 60" east. The district is situated entirely on the Deccan plateau positioned at a height of 300 to
750 m above MSL. Sub-tropical, semi-arid type is one among the drought prone districts of Karnataka State. The drought
management is very important for a district like Gulbarga. In this paper various short term strategies are discussed to mitigate the
drought condition in the district.
Keywords: Drought, South-West monsoon, Semi-Arid, Rainfall, Strategies etc.
Life cycle cost analysis of overlay for an urban road in bangaloreeSAT Journals
Abstract
Pavements are subjected to severe condition of stresses and weathering effects from the day they are constructed and opened to traffic
mainly due to its fatigue behavior and environmental effects. Therefore, pavement rehabilitation is one of the most important
components of entire road systems. This paper highlights the design of concrete pavement with added mono fibers like polypropylene,
steel and hybrid fibres for a widened portion of existing concrete pavement and various overlay alternatives for an existing
bituminous pavement in an urban road in Bangalore. Along with this, Life cycle cost analyses at these sections are done by Net
Present Value (NPV) method to identify the most feasible option. The results show that though the initial cost of construction of
concrete overlay is high, over a period of time it prove to be better than the bituminous overlay considering the whole life cycle cost.
The economic analysis also indicates that, out of the three fibre options, hybrid reinforced concrete would be economical without
compromising the performance of the pavement.
Keywords: - Fatigue, Life cycle cost analysis, Net Present Value method, Overlay, Rehabilitation
Laboratory studies of dense bituminous mixes ii with reclaimed asphalt materialseSAT Journals
Abstract
The issue of growing demand on our nation’s roadways over that past couple of decades, decreasing budgetary funds, and the need to
provide a safe, efficient, and cost effective roadway system has led to a dramatic increase in the need to rehabilitate our existing
pavements and the issue of building sustainable road infrastructure in India. With these emergency of the mentioned needs and this
are today’s burning issue and has become the purpose of the study.
In the present study, the samples of existing bituminous layer materials were collected from NH-48(Devahalli to Hassan) site.The
mixtures were designed by Marshall Method as per Asphalt institute (MS-II) at 20% and 30% Reclaimed Asphalt Pavement (RAP).
RAP material was blended with virgin aggregate such that all specimens tested for the, Dense Bituminous Macadam-II (DBM-II)
gradation as per Ministry of Roads, Transport, and Highways (MoRT&H) and cost analysis were carried out to know the economics.
Laboratory results and analysis showed the use of recycled materials showed significant variability in Marshall Stability, and the
variability increased with the increase in RAP content. The saving can be realized from utilization of recycled materials as per the
methodology, the reduction in the total cost is 19%, 30%, comparing with the virgin mixes.
Keywords: Reclaimed Asphalt Pavement, Marshall Stability, MS-II, Dense Bituminous Macadam-II
Laboratory investigation of expansive soil stabilized with natural inorganic ...eSAT Journals
This document summarizes a study on stabilizing expansive black cotton soil with the natural inorganic stabilizer RBI-81. Laboratory tests were conducted to evaluate the effect of RBI-81 on the soil's engineering properties. The tests showed that with 2% RBI-81 and 28 days of curing, the unconfined compressive strength increased by around 250% and the CBR value improved by approximately 400% compared to the untreated soil. Overall, the study found that RBI-81 effectively improved the strength properties of the black cotton soil and its suitability as a soil stabilizer was supported.
Influence of reinforcement on the behavior of hollow concrete block masonry p...eSAT Journals
Abstract
Reinforced masonry was developed to exploit the strength potential of masonry and to solve its lack of tensile strength. Experimental
and analytical studies have been carried out to investigate the effect of reinforcement on the behavior of hollow concrete block
masonry prisms under compression and to predict ultimate failure compressive strength. In the numerical program, three dimensional
non-linear finite elements (FE) model based on the micro-modeling approach is developed for both unreinforced and reinforced
masonry prisms using ANSYS (14.5). The proposed FE model uses multi-linear stress-strain relationships to model the non-linear
behavior of hollow concrete block, mortar, and grout. Willam-Warnke’s five parameter failure theory has been adopted to model the
failure of masonry materials. The comparison of the numerical and experimental results indicates that the FE models can successfully
capture the highly nonlinear behavior of the physical specimens and accurately predict their strength and failure mechanisms.
Keywords: Structural masonry, Hollow concrete block prism, grout, Compression failure, Finite element method,
Numerical modeling.
Influence of compaction energy on soil stabilized with chemical stabilizereSAT Journals
This document summarizes a study on the influence of compaction energy on soil stabilized with a chemical stabilizer. Laboratory tests were conducted on locally available loamy soil treated with a patented polymer liquid stabilizer and compacted at four different energy levels. The study found that increasing the compaction effort increased the density of both untreated and treated soil, but the rate of increase was lower for stabilized soil. Treating the soil with the stabilizer improved its unconfined compressive strength and resilient modulus, and reduced accumulated plastic strain, with these properties further improved by higher compaction efforts. The stabilized soil exhibited strength and performance benefits compared to the untreated soil.
Geographical information system (gis) for water resources managementeSAT Journals
This document describes a hydrological framework developed in the form of a Hydrologic Information System (HIS) to meet the information needs of various government departments related to water management in a state. The HIS consists of a hydrological database coupled with tools for collecting and analyzing spatial and non-spatial water resources data. It also incorporates a hydrological model to indirectly assess water balance components over space and time. A web-based GIS portal was created to allow users to access and visualize the hydrological data, as well as outputs from the SWAT hydrological model. The framework is intended to facilitate integrated water resources planning and management across different administrative levels.
Forest type mapping of bidar forest division, karnataka using geoinformatics ...eSAT Journals
Abstract
The study demonstrate the potentiality of satellite remote sensing technique for the generation of baseline information on forest types
including tree plantation details in Bidar forest division, Karnataka covering an area of 5814.60Sq.Kms. The Total Area of Bidar
forest division is 5814Sq.Kms analysis of the satellite data in the study area reveals that about 84% of the total area is Covered by
crop land, 1.778% of the area is covered by dry deciduous forest, 1.38 % of mixed plantation, which is very threatening to the
environmental stability of the forest, future plantation site has been mapped. With the use of latest Geo-informatics technology proper
and exact condition of the trees can be observed and necessary precautions can be taken for future plantation works in an appropriate
manner
Keywords:-RS, GIS, GPS, Forest Type, Tree Plantation
Factors influencing compressive strength of geopolymer concreteeSAT Journals
Abstract
To study effects of several factors on the properties of fly ash based geopolymer concrete on the compressive strength and also the
cost comparison with the normal concrete. The test variables were molarities of sodium hydroxide(NaOH) 8M,14M and 16M, ratio of
NaOH to sodium silicate (Na2SiO3) 1, 1.5, 2 and 2.5, alkaline liquid to fly ash ratio 0.35 and 0.40 and replacement of water in
Na2SiO3 solution by 10%, 20% and 30% were used in the present study. The test results indicated that the highest compressive
strength 54 MPa was observed for 16M of NaOH, ratio of NaOH to Na2SiO3 2.5 and alkaline liquid to fly ash ratio of 0.35. Lowest
compressive strength of 27 MPa was observed for 8M of NaOH, ratio of NaOH to Na2SiO3 is 1 and alkaline liquid to fly ash ratio of
0.40. Alkaline liquid to fly ash ratio of 0.35, water replacement of 10% and 30% for 8 and 16 molarity of NaOH and has resulted in
compressive strength of 36 MPa and 20 MPa respectively. Superplasticiser dosage of 2 % by weight of fly ash has given higher
strength in all cases.
Keywords: compressive strength, alkaline liquid, fly ash
Experimental investigation on circular hollow steel columns in filled with li...eSAT Journals
Abstract
Composite Circular hollow Steel tubes with and without GFRP infill for three different grades of Light weight concrete are tested for
ultimate load capacity and axial shortening , under Cyclic loading. Steel tubes are compared for different lengths, cross sections and
thickness. Specimens were tested separately after adopting Taguchi’s L9 (Latin Squares) Orthogonal array in order to save the initial
experimental cost on number of specimens and experimental duration. Analysis was carried out using ANN (Artificial Neural
Network) technique with the assistance of Mini Tab- a statistical soft tool. Comparison for predicted, experimental & ANN output is
obtained from linear regression plots. From this research study, it can be concluded that *Cross sectional area of steel tube has most
significant effect on ultimate load carrying capacity, *as length of steel tube increased- load carrying capacity decreased & *ANN
modeling predicted acceptable results. Thus ANN tool can be utilized for predicting ultimate load carrying capacity for composite
columns.
Keywords: Light weight concrete, GFRP, Artificial Neural Network, Linear Regression, Back propagation, orthogonal
Array, Latin Squares
Experimental behavior of circular hsscfrc filled steel tubular columns under ...eSAT Journals
This document summarizes an experimental study that tested circular concrete-filled steel tube columns with varying parameters. 45 specimens were tested with different fiber percentages (0-2%), tube diameter-to-wall-thickness ratios (D/t from 15-25), and length-to-diameter (L/d) ratios (from 2.97-7.04). The results found that columns filled with fiber-reinforced concrete exhibited higher stiffness, equal ductility, and enhanced energy absorption compared to those filled with plain concrete. The load carrying capacity increased with fiber content up to 1.5% but not at 2.0%. The analytical predictions of failure load closely matched the experimental values.
Evaluation of punching shear in flat slabseSAT Journals
Abstract
Flat-slab construction has been widely used in construction today because of many advantages that it offers. The basic philosophy in
the design of flat slab is to consider only gravity forces; this method ignores the effect of punching shear due to unbalanced moments
at the slab column junction which is critical. An attempt has been made to generate generalized design sheets which accounts both
punching shear due to gravity loads and unbalanced moments for cases (a) interior column; (b) edge column (bending perpendicular
to shorter edge); (c) edge column (bending parallel to shorter edge); (d) corner column. These design sheets are prepared as per
codal provisions of IS 456-2000. These design sheets will be helpful in calculating the shear reinforcement to be provided at the
critical section which is ignored in many design offices. Apart from its usefulness in evaluating punching shear and the necessary
shear reinforcement, the design sheets developed will enable the designer to fix the depth of flat slab during the initial phase of the
design.
Keywords: Flat slabs, punching shear, unbalanced moment.
Evaluation of performance of intake tower dam for recent earthquake in indiaeSAT Journals
Abstract
Intake towers are typically tall, hollow, reinforced concrete structures and form entrance to reservoir outlet works. A parametric
study on dynamic behavior of circular cylindrical towers can be carried out to study the effect of depth of submergence, wall thickness
and slenderness ratio, and also effect on tower considering dynamic analysis for time history function of different soil condition and
by Goyal and Chopra accounting interaction effects of added hydrodynamic mass of surrounding and inside water in intake tower of
dam
Key words: Hydrodynamic mass, Depth of submergence, Reservoir, Time history analysis,
Evaluation of operational efficiency of urban road network using travel time ...eSAT Journals
This document evaluates the operational efficiency of an urban road network in Tiruchirappalli, India using travel time reliability measures. Traffic volume and travel times were collected using video data from 8-10 AM on various roads. Average travel times, 95th percentile travel times, and buffer time indexes were calculated to assess reliability. Non-motorized vehicles were found to most impact reliability on one road. A relationship between buffer time index and traffic volume was developed. Finally, a travel time model was created and validated based on length, speed, and volume.
Estimation of surface runoff in nallur amanikere watershed using scs cn methodeSAT Journals
Abstract
The development of watershed aims at productive utilization of all the available natural resources in the entire area extending from
ridge line to stream outlet. The per capita availability of land for cultivation has been decreasing over the years. Therefore, water and
the related land resources must be developed, utilized and managed in an integrated and comprehensive manner. Remote sensing and
GIS techniques are being increasingly used for planning, management and development of natural resources. The study area, Nallur
Amanikere watershed geographically lies between 110 38’ and 110 52’ N latitude and 760 30’ and 760 50’ E longitude with an area of
415.68 Sq. km. The thematic layers such as land use/land cover and soil maps were derived from remotely sensed data and overlayed
through ArcGIS software to assign the curve number on polygon wise. The daily rainfall data of six rain gauge stations in and around
the watershed (2001-2011) was used to estimate the daily runoff from the watershed using Soil Conservation Service - Curve Number
(SCS-CN) method. The runoff estimated from the SCS-CN model was then used to know the variation of runoff potential with different
land use/land cover and with different soil conditions.
Keywords: Watershed, Nallur watershed, Surface runoff, Rainfall-Runoff, SCS-CN, Remote Sensing, GIS.
Estimation of morphometric parameters and runoff using rs & gis techniqueseSAT Journals
This document summarizes a study that used remote sensing and GIS techniques to estimate morphometric parameters and runoff for the Yagachi catchment area in India over a 10-year period. Morphometric analysis was conducted to understand the hydrological response at the micro-watershed level. Daily runoff was estimated using the SCS curve number model. The results showed a positive correlation between rainfall and runoff. Land use/land cover changes between 2001-2010 were found to impact estimated runoff amounts. Remote sensing approaches provided an effective means to model runoff for this large, ungauged area.
Effect of variation of plastic hinge length on the results of non linear anal...eSAT Journals
Abstract The nonlinear Static procedure also well known as pushover analysis is method where in monotonically increasing loads are applied to the structure till the structure is unable to resist any further load. It is a popular tool for seismic performance evaluation of existing and new structures. In literature lot of research has been carried out on conventional pushover analysis and after knowing deficiency efforts have been made to improve it. But actual test results to verify the analytically obtained pushover results are rarely available. It has been found that some amount of variation is always expected to exist in seismic demand prediction of pushover analysis. Initial study is carried out by considering user defined hinge properties and default hinge length. Attempt is being made to assess the variation of pushover analysis results by considering user defined hinge properties and various hinge length formulations available in literature and results compared with experimentally obtained results based on test carried out on a G+2 storied RCC framed structure. For the present study two geometric models viz bare frame and rigid frame model is considered and it is found that the results of pushover analysis are very sensitive to geometric model and hinge length adopted. Keywords: Pushover analysis, Base shear, Displacement, hinge length, moment curvature analysis
Effect of use of recycled materials on indirect tensile strength of asphalt c...eSAT Journals
Abstract
Depletion of natural resources and aggregate quarries for the road construction is a serious problem to procure materials. Hence
recycling or reuse of material is beneficial. On emphasizing development in sustainable construction in the present era, recycling of
asphalt pavements is one of the effective and proven rehabilitation processes. For the laboratory investigations reclaimed asphalt
pavement (RAP) from NH-4 and crumb rubber modified binder (CRMB-55) was used. Foundry waste was used as a replacement to
conventional filler. Laboratory tests were conducted on asphalt concrete mixes with 30, 40, 50, and 60 percent replacement with RAP.
These test results were compared with conventional mixes and asphalt concrete mixes with complete binder extracted RAP
aggregates. Mix design was carried out by Marshall Method. The Marshall Tests indicated highest stability values for asphalt
concrete (AC) mixes with 60% RAP. The optimum binder content (OBC) decreased with increased in RAP in AC mixes. The Indirect
Tensile Strength (ITS) for AC mixes with RAP also was found to be higher when compared to conventional AC mixes at 300C.
Keywords: Reclaimed asphalt pavement, Foundry waste, Recycling, Marshall Stability, Indirect tensile strength.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Introduction- e - waste – definition - sources of e-waste– hazardous substances in e-waste - effects of e-waste on environment and human health- need for e-waste management– e-waste handling rules - waste minimization techniques for managing e-waste – recycling of e-waste - disposal treatment methods of e- waste – mechanism of extraction of precious metal from leaching solution-global Scenario of E-waste – E-waste in India- case studies.
A SYSTEMATIC RISK ASSESSMENT APPROACH FOR SECURING THE SMART IRRIGATION SYSTEMSIJNSA Journal
The smart irrigation system represents an innovative approach to optimize water usage in agricultural and landscaping practices. The integration of cutting-edge technologies, including sensors, actuators, and data analysis, empowers this system to provide accurate monitoring and control of irrigation processes by leveraging real-time environmental conditions. The main objective of a smart irrigation system is to optimize water efficiency, minimize expenses, and foster the adoption of sustainable water management methods. This paper conducts a systematic risk assessment by exploring the key components/assets and their functionalities in the smart irrigation system. The crucial role of sensors in gathering data on soil moisture, weather patterns, and plant well-being is emphasized in this system. These sensors enable intelligent decision-making in irrigation scheduling and water distribution, leading to enhanced water efficiency and sustainable water management practices. Actuators enable automated control of irrigation devices, ensuring precise and targeted water delivery to plants. Additionally, the paper addresses the potential threat and vulnerabilities associated with smart irrigation systems. It discusses limitations of the system, such as power constraints and computational capabilities, and calculates the potential security risks. The paper suggests possible risk treatment methods for effective secure system operation. In conclusion, the paper emphasizes the significant benefits of implementing smart irrigation systems, including improved water conservation, increased crop yield, and reduced environmental impact. Additionally, based on the security analysis conducted, the paper recommends the implementation of countermeasures and security approaches to address vulnerabilities and ensure the integrity and reliability of the system. By incorporating these measures, smart irrigation technology can revolutionize water management practices in agriculture, promoting sustainability, resource efficiency, and safeguarding against potential security threats.
Understanding Inductive Bias in Machine LearningSUTEJAS
This presentation explores the concept of inductive bias in machine learning. It explains how algorithms come with built-in assumptions and preferences that guide the learning process. You'll learn about the different types of inductive bias and how they can impact the performance and generalizability of machine learning models.
The presentation also covers the positive and negative aspects of inductive bias, along with strategies for mitigating potential drawbacks. We'll explore examples of how bias manifests in algorithms like neural networks and decision trees.
By understanding inductive bias, you can gain valuable insights into how machine learning models work and make informed decisions when building and deploying them.
International Conference on NLP, Artificial Intelligence, Machine Learning an...gerogepatton
International Conference on NLP, Artificial Intelligence, Machine Learning and Applications (NLAIM 2024) offers a premier global platform for exchanging insights and findings in the theory, methodology, and applications of NLP, Artificial Intelligence, Machine Learning, and their applications. The conference seeks substantial contributions across all key domains of NLP, Artificial Intelligence, Machine Learning, and their practical applications, aiming to foster both theoretical advancements and real-world implementations. With a focus on facilitating collaboration between researchers and practitioners from academia and industry, the conference serves as a nexus for sharing the latest developments in the field.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Opinion pattern mining based on probabilistic principle component analysis report
1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 564
OPINION PATTERN MINING BASED ON PROBABILISTIC
PRINCIPLE COMPONENT ANALYSIS REPORT
P.Saravana Kumar1
, A.Vijaya Kathiravan2
1
Assistant Professor, Department of Computer Applications, K.S.Rangasamy College of Technology,Tiruchengode,
Anna University, Chennai.
2
Assistant Professor, Department of Computer Science, Govt.Arts College, Periyar University, Salem
Abstract
Now days, Customer feedback and satisfaction is playing a significant role in commercial product to market. Customer can be
reviewed by other customer feedback and collect all the relevant information related to a particular product. Based on that the
decision can be taken to purchase the product. In the traditional method, Random forest predicted the impact of the review but not
worked with segmentation on the basis of multiple reviewer comments. At the same time, the variable cluster algorithm has been
addressed in the market segmentation for retailing the customer’s lifestyle. It has been provided with the segmentation method,
but not guide to full proof strategies for different product decision. Instead of that to guide different customers with a variety of
product feedback using pattern mining approaches. The product review pattern mining segmentation based on probabilistic
principle component analysis is proposed. The opinion mining, segments has categorized into several segments with pattern
analysis based on multiple review comments. This mechanism has reduced the dimensionality of the segmentation process using
the covariance matrix approach. The experiment uses the opinion rank review dataset information for further process. It increases
the segmentation efficient upto9% when compare with traditional and conventional methods. The experimentation has been done
with the important factor of opinion decision threshold, false positive rate, segmentation efficiency and customer product ratio
level along with customer behavioral feedback.
Keywords: Covariance Matrix ,Opinion Pattern Mining Segmentation, Probabilistic Principle Component Analysis, ,
Product Review
-------------------------------------------------------------------***------------------------------------------------------------------
1. INTRODUCTION
Opinion mining is a process for tracking the mood of the
public about a certain product. Segmentation has turned out
to be the primary conceptual model both in marketing theory
and in practice. With the increasing use of online reviews,
customers post the reviews of the products and dedicated
review sites. These reviews provide excellent sources for
obtaining the opinions of the valuable consumers about the
products, which are very useful to both potential customers
and product manufacturers. Techniques are now being
developed to exploit these sources to help companies and
individuals to gain such information effectively and easily.
Techniques are now being developed to exploit these
sources to help companies and individuals to gain such
information effectively and easily. Taiwan‟s economy as
described in [10] accompanied a model in the country‟s
developing market.
Discovery of customer relationship between huge databases
has been recognized to be useful in discerning marketing,
decision analysis, and business management. An important
application area of opinion mining relationship is the market
basket analysis, which demonstrates the buying behaviors of
customers. The main idea behind the framework of pattern
mining is to apply an efficient segmentation method that
distinguishes the customer likeness and unlikeness of the
product. By doing so, pattern mining helps to repeatedly
determine the relative amount by obtaining the assessment
results.
Spatiotemporal data representation in [5] followed the
association rule for discovering the knowledge but the
detailed customer requirements were not analyzed. The
dimensionality class measure was though precise to each of
the spatiotemporal data mining tasks but they were not
identified with effective ratio. Spatio-Temporal Association
Rules as presented in [7] developed an efficient algorithm to
reduce the linear run time. The interesting and verifiable
patterns were carried out using the world animal tracking
data set but not suitable for business processing mode.
Multiple-Instance Learning via Disambiguation (MILD) as
described in [16] recognized the correct optimistic instances
for business processing model.
Probabilistic Spatio temporal model for target event as
demonstrated in [14] identified the midpoint of the product
incident position but still advanced algorithms were not
developed for query optimization. Twitter user is observed
as a sensor and every tweet act as sensory information but
the segmentation operation were not carried out. Multiple
reviewer-level features as described in [1], helped to
measures the reviewers comment with extent of mining
subjectivity. Reviews have a fusion of objective with highly
skewed sentence that were associated with the product sales.
The product review tends to include the objective
2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 565
information. Random Forest predicted the impact of
reviews but not worked with the segmentation for different
user opinions.
In this work, focus is made on developing an effective
Opinion Pattern Mining Segmentation (OPMS) process. The
Opinion Pattern Mining performs the segmentation process
based on the Probabilistic Principle Component Analysis.
OPMS based on the PPCA report reduces the dimensionality
to make the segmentation process very easier. The
segmentation of user profile helps to easily segment the
behavioral pattern with determination of the maximum
likelihood. Probability Principle Component Analysis
updates the product reviews with the increased threshold
rate and reduced false positive rate.
2. OPINION PATTERN MINING
SEGMENTATION BASED ON THE
PROBABILISTIC PRINCIPLE COMPONENT
ANALYSIS
Opinion pattern mining with PPCA report aims to establish
the review of different user behavioral with orientation
results and produce the result with lesser false positive and
increased threshold rate. The architecture diagram of
opinion pattern mining segmentation based on PPCA is
described in Figure 1.
Fig 1 Architecture Diagram of PPCA on Opinion Pattern
Mining
As illustrated in Fig 1, opinion pattern mining based on the
probabilistic principle component analysis includes different
user behavior. Different user behavior with the maximum
likelihood is used for the estimation of orientation result.
The user behavior is segmented using the Probabilistic
Principle Component Analysis report. With the application
of PPCA, the threshold rate is increased and decreases the
false positive rate and also the dimensionality is reduced
using the covariance matrix. The covariance matrix
improves the segmentation efficiency and dimensionality
reduction in PPCA further reduces the opinion pattern
mining time.
2.1 Probabilistic Principle Component Analysis
Report
Based on the observation of the user behavior, PPCA report
is obtained that exhibit lesser dimensionality while
performing user behavior pattern segmentation. Each user
behavior 𝑈1, 𝑈2, 𝑈3 … . 𝑈𝑛 represents different
dimensionalities that have to be segmented for obtaining
different patterns according to the user behavior. The center
of dimensionality reduction in segmentation is formularized
as,
𝑈𝑛 =
1
𝑛
𝑈𝑖 − 𝑈𝑖+1
𝑛
𝑖=1 (1)
𝑈𝑛 denotes „n‟ user behavioral patterns whereas 𝑈𝑖, 𝑈𝑖+1 are
the obtained behavior patterns of each user. Each user
behavior is taken into account for providing efficient
opinion pattern mining without any dimensionality
reduction. Each user‟s carried out the step in (1) for
efficiently segmenting the user behavioral by avoiding
dimensionality reduction.
Fig 2: Probabilistic Procedure in PPCA
Fig 2 describes the step by step probabilistic usage in the
PPCA. The probabilistic step in the PPCA report which is
used to analyze and predict the modifications observed in
the behavior of the user. The benefit of using the probability
in PCA report is that it easily updates the likelihood value
according to the changes observed in user behavior based on
product reviewing.
PPCA covariance is the measure of variability co-exists that
exists in user behavior pattern for opinion pattern mining.
The ordered data move in the same pattern helps to reduce
the dimensions. Each element represents the user (in a
vector) identifies the behavioral pattern which is in the form
of scalar arbitrary point. The scalar arbitrary point is in the
form of finite number of observed empirical values specified
by a theoretical joint probabilistic distribution for
segmentation.
3. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 566
Fig 3 Dimensionality Reduction Procedure
Fig 3 reduces the dimensionality from the standard user
behavioral pattern to the reduced dimensional pattern.
Dimensionality represents the orthogonal projection and is
attained using the covariance parameterization. PPCA report
operates as an effective mechanism in opinion pattern
mining segmentation with reduced dimensionality rate.
2.2 Opinion Pattern Mining Segmentation
Opining pattern mining segmentation with PPCA report
performs the process of mining and segments the pattern
efficiently without any redundancy. PPCA estimate the
distribution of the pattern and segments effectively based on
similar user behavior patterns. The algorithmic description
of OPMS using the PPCA report is described as,
//Opinion Pattern mining with PPCA report
Begin
Input: User Input pattern „U1, U2, U3…Un‟
Output: Opinion Pattern mining with lesser false positive
ratio
For Each User
Step 1: Analyze each user behavior from „U1, U2, U3…Un‟
Step 2: User behavior Segmented into „S1‟,‟S2‟,‟S3, …‟Sn‟
Step 3: Opinion pattern used to attain user product reviews
follows
Step 3.1: Based on the maximum Likelihood
1
𝑛
𝜆1, 𝜆2, … … 𝜆 𝑛
𝑛+1
1 using Eigen Values
Step 3.2: Reduced the dimensionality using Covariance
matrix
Step 3.3: Covariance matrix = 𝑐𝑜𝑣 (𝑥𝑖𝑖,𝑗 , 𝑥𝑗 ) = 𝐸[(𝑥𝑖 −
𝜆𝑖)(𝑥𝑗 − 𝜆𝑗 ) computed
Step 4: Probabilistic update the opinion of different user on
different products
End For
Sep 5: Goto step 1
Step 6: Run P C A Analysis until user „Un‟ End
In OPMS theory, the complex type of user queries is also
segmented readily with the PPCA report. The segmented
form of opinion pattern mining is illustrated the Fig4.
Fig 4 Segmented opinion pattern mining
The segmented part illustrates that similar opinions are
grouped together using PPCA report. S1, S2, and S3 are the
three set of segmented principles associated with the each
user behavior pattern. Efficient segmenting of user profiles
obtains the users behavioral patterns (i.e.,) opinion pattern
mining with increased threshold rate and decrease the false
positives.
𝑇𝑟𝑒𝑠𝑜𝑙𝑑 𝐹𝑎𝑐𝑡𝑜𝑟 𝐴𝑛𝑎𝑙𝑦𝑠𝑖𝑠 𝑖𝑛 𝑂𝑃𝑀𝑆 (𝑇) = 𝜎2
∗ 𝑖 (2)
𝐹𝑎𝑙𝑠𝑒 𝑃𝑜𝑠𝑡𝑖𝑣𝑒 𝐹𝑎𝑐𝑜𝑟 𝐴𝑛𝑎𝑙𝑦𝑠𝑖𝑠 𝐹𝐴 = 1 − 1 − 𝑒 −𝑠 𝑁
(3)
The factor analysis is carried out using the segmented
principles in the opinion pattern mining. The factor analysis
with increased threshold rate improves the level of the user‟s
product trend ratio and false positive rate reduced to
improve the accuracy level on the user behavioral pattern
mining.
3. EXPERIMENTAL EVALUATION
Opinion Pattern Mining Segmentation based on the
Probabilistic Principle Component Analysis (PPCA) uses
JAVA platform with Weka tool for the experimental work.
PPCA report uses the OpinRank Review Dataset extracted
from the UCI repository for the experimental work. The
OpinRank dataset contains user reviews related to car and
hotels. The information is collected from the Tripadvisor
and Edmunds. The Tripadisor shows the 268000 reviews
and Edmunds reviewed the 51,240 reviews.
OpinRank Review Dataset contains the full review of the car
model from 2007. The review holds the 140-250 cars for
each year. The review data extracted the fields, including the
dates, author names, favorites and the full text review. The
total review is expected to be 51,240. The review of the
hotel for 10 different cities such as Dubai, Beijing, London,
New York City, New Delhi, San Francisco, Shanghai,
Montreal, Las Vegas, and Chicago are collected. OpinRank
Review Dataset has about 80 to 700 hotels in each city. The
total number of reviews on the hotel is expected to be
268,000.
4. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 567
The false positive ratio is the probability of incorrectly
rejecting the null suggestion for particular dataset
information. The false positive rate of the OPMS is defined
as,
𝐹𝑎𝑙𝑠𝑒 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑅𝑎𝑡𝑒 =
𝑉
𝐸0
∗ 100
Where V denotes the false positive rate of the product and
𝐸0 denotes the true result obtained from the customers.
Segmentation is defined as the important concept in
marketing to serve different types of customer.
Segmentation efficiency is measured in terms of the success
percentage (success %).
𝑃𝑎𝑡𝑡𝑒𝑟𝑛 𝑀𝑖𝑛𝑖𝑛𝑔 𝑇𝑖𝑚𝑒 = 𝑃1 − 𝑃2
Where 𝑃1 represents the Start time of pattern construction
and 𝑃2 denotes the End Time of Pattern construction.
Opinion pattern mining time is measured in terms of
seconds (Sec). Dimensionality reduction is measured as,
𝐷𝑅 =
(𝑁𝑜. 𝑜𝑓 𝑑𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛𝑠 − 𝑅𝑒𝑑𝑢𝑐𝑒𝑑 𝐷𝑖𝑚𝑒𝑛𝑠𝑖𝑜𝑛𝑠)
𝑃𝑟𝑜𝑐𝑒𝑠𝑠𝑖𝑛𝑔 𝑇𝑖𝑚𝑒
∗ 100
User product trend ratio level denotes the amount of
accurate product review attained by using the opinion
pattern mining, segmentation with the probability principle
component analysis report.
4. RESULT ANALYSIS
Opinion Pattern Mining Segmentation based on the
Probabilistic Principle Component Analysis (PPCA) is
compared against the Random Forest based classifier (RF)
method and Variable Clustering (VC) algorithm. OPMS is
evaluated using the OpinRank Review Dataset from the UCI
repository.
Fig 5 Segmentation Efficiency Measure
Fig 5 PPCA segments the user behavioral sequences
automatically into diverse behaviors. The covariance matrix
form improves the segmentation efficiency by 11 – 17 %
when compared with the RF method [1] and improved by 3
– 8 % when compared with the VCAlgorithm[2].
Fig. 6 User Product Trend Ratio Level Measure
Fig 6 The product reviewing with marginal distribution in
PPCA report updates the opinion patterns. The marginal
distribution is applied to update on the entire user behavioral
pattern and attain 3 – 6 % improved ratio level when
compared with the RF method [1].
Fig 7 Opinion Pattern Mining Time Measure
Fig 7 In order to compute the mining process the value of
„𝐶𝑀𝑟‟ is obtained. The reduction of dimensionality in
OPMS method reduces the opinion pattern mining time by 7
– 17 % when compared with the RF method [1]. The less
correlated value reduces the pattern mining time by 10 – 21
% in opinion pattern mining segmentation when compared
with the VC Algorithm [2].
5. CONCLUSION
The proposed opinion pattern mining, segmentation
approach based on the Probabilistic Principle Component
Analysis is a precious method in which segments, the useful
customer review information from large amounts of
repository data in an efficient manner. Opinion pattern
mining based segmentations show significance in data
mining technologies to reduce the false positive rate in
reasonably because the data organized with the maximum
likelihood mapping of the users‟ behavior. The experiment
result has produced that the opinion pattern mining,in which
the segmentation outperforms all the existing segmentations
work with 16.69% and improved decision threshold rate
0
20
40
60
80
100
50 150 250 350
SegmentationEfficiency
(Success%)
Pattern Size (KB)
RF method
VC
Algorithm
OPMS
method
0
10
20
30
40
50
60
70
80
90
20 60 100140
UserProductTrendRatio
(Gain%)
User Vector Points
RF method
VC
Algorithm
OPMS
method
0
200
400
600
800
1000
5 15 25 35
OpinionPatternMining
Time(sec)
No.of patterns
RF method
VC
Algorithm
OPMS
method
5. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 568
and system efficiency rate is also increased in the
conventional method. The probability PCA updated the
product reviews ratio level by 7.50% based on the users
behavioral reviews. An Opinrank review dataset from the
UCI repository is used to review the experimental result of
OPMS with traditional method of the parametric factors
such as opinion pattern mining time, false positive rate, and
dimensionality reduction rate. The proposed mechanism has
deployed it in the real time application fields such as online
commercial products based on the customer review
comments recommended. The proposed mechanism
guaranteed to produce efficient results.
REFERENCES
[1]. Anindya Ghose., Panagiotis G. Ipeirotis., “Estimating
the Helpfulness and Economic Impact of Product Reviews:
Mining Text and Reviewer Characteristics,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, 2010
[2]. V.L. Migueis., A.S. Camanho., Joao Falcao e Cunha.,
“Customer data mining for lifestyle segmentation,” Expert
Systems with Applications., Elsevier Journal., 2012
[3]. Archana Tomar., Vineet Richhariya., Mahendra Ku.
Mishra., “ A Improved Privacy Preserving Algorithm using
the Association rule mining in centralized database.,”
International Journal of Advanced Technology &
Engineering Research (IJATER) ISSN NO: 2250-3536
VOLUME 2, ISSUE 2, 2012
[4]. Marco Muselli., and Enrico Ferrari.,“Coupling Logical
Analysis of Data and Shadow Clustering for Partially
Defined Positive Boolean Function Reconstruction,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 23, NO. 1, JANUARY 2011
[5]. K.Venkateswara Rao., A.Govardhan., and
K.V.Chalapati Rao.,“ Spatiotemporal Data Mining:issues,
tasks and Applications.,” International Journal of Computer
Science & Engineering Survey (IJCSES) Vol.3, No.1, 2012
[6]. Swarupa Panmetsa., L.V.S.S., Ch. Raja Ramesh,
“Anonymization of the Sequential Patterns in Location
Based Service Environments,” International Journal of
Computer Technology & Research, IJCTR, ISSN 2319-
8184,Vol 1, Issue 1, October 2012
[7]. Florian Verhein., “Mining Complex Spatio-Temporal
Sequence Patterns,” Journal of Science, 2009
[8]. Ning Zhong., Yuefeng Li., Sheng-Tang Wu., “Effective
Pattern Discovery for Text Mining,” IEEE Transactions on
Knowledge and Data Engineering, Volume:24, Issue: 1, Jan
2012
[9]. Eric Hsueh-Chan Lu, Vincent S. Tseng., and Philip S.
Yu., “Mining Cluster-Based Temporal Mobile Sequential
Patterns in Location-Based Service Environments,”IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 23, NO. 6, JUNE 2011
[10]. Chih-Hao Wena.,, Shu-Hsien Liao., Wei-Ling Chang.,
Ping-Yu Hsu., “Mining shopping behavior in the Taiwan
luxury products market,” Expert Systems with
Applications., Elsevier Journal., 2012
[11]. Emilio Miguelanez., Pedro Patron., Keith E. Brown.,
Yvan R. Petillot., and David M. Lane., “Semantic
Knowledge-Based Framework to Improve the Situation
Awareness of Autonomous Underwater Vehicles,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 23, NO. 5, MAY 2011
[12]. Panagiotis Papadimitriou., Panayiotis Tsaparas, Ariel
Fuxman., and Lise Getoor., “TACI: Taxonomy-Aware
Catalog Integration,” IEEE TRANSACTIONS ON
KNOWLEDGE AND DATA ENGINEERING, VOL. 25,
NO. 7, JULY 2013
[13]. Tao Jiang., and Ah-Hwee Tan., “Learning Image-Text
Associations,” IEEE TRANSACTION ON KNOWLEDGE
AND DATA ENGINEERING., Volume:21 , Issue: 2, Feb
2009
[14]. Takeshi Sakaki, Makoto Okazaki, and Yutaka
Matsuo.,“Tweet Analysis for Real-Time Event Detection
and Earthquake Reporting System Development,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 25, NO. 4, APRIL 2013
[15]. Jae-Gil Lee., Jiawei Han., Xiaolei Li, and Hong
Cheng., “Mining Discriminative Patterns for Classifying
Trajectories on Road Networks,” IEEE TRANSACTIONS
ON KNOWLEDGE AND DATA ENGINEERING, VOL.
23, NO. 5, MAY 2011
[16]. Wu-Jun Li., and Dit-Yan Yeung., “MILD: Multiple-
Instance Learning via Disambiguation,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 22, NO. 1, JANUARY 2010
[17]. Oana Frunza., Diana Inkpen., and Thomas Tran., “A
Machine Learning Approach for Identifying Disease-
Treatment Relations in Short Texts,” IEEE
TRANSACTIONS ON KNOWLEDGE AND DATA
ENGINEERING, VOL. 23, NO. 6, JUNE 2011
[18]. Wenjing Zhang., and Xin Feng., “Event
Characterization and Prediction Based on Temporal Patterns
in Dynamic Data System,” IEEE TRANSACTIONS ON
KNOWLEDGE AND DATA ENGINEERING., 2013
[19]. Jung-Yi Jiang., Ren-Jia Liou., and Shie-Jue Lee., “A
Fuzzy Self-Constructing Feature Clustering Algorithm for
Text Classification,” IEEE TRANSACTIONS ON
KNOWLEDGE AND DATA ENGINEERING, VOL. 23,
NO. 3, MARCH 2011
[20]. Liang Wang., Christopher Leckie., Kotagiri
Ramamohanarao., and James Bezdek., “Automatically
Determining the Number of Clusters in Unlabeled Data
Sets,” IEEE TRANSACTIONS ON KNOWLEDGE AND
DATA ENGINEERING, VOL. 21, NO. 3, MARCH 2009
BIOGRAPHIES
P.SARAVANA KUMAR, is currently
working as an Assistant Professor in
Master of Computer Applications in
K.S.Rangasamy College of
Technology,Tiruchengode,Namakkal. He
is handling PG classes. He is also
pursuing Ph.D., in Computer Science under Periyar
University, Salem-11. He presented a paper in International
conferences and also he published in International Journal.
His area of interest is DataMining.
6. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
_______________________________________________________________________________________
Volume: 03 Issue: 11 | Nov-2014, Available @ http://www.ijret.org 569
Dr.A.VIJAYA KATHIRAVAN is
working as an Assistant Professor in
Computer Applications in PG and
Research Department of Computer
Science, Govt. Arts College
(Autonomous), Salem-07, TamilNadu,
INDIA. She received her M.Phil. in
Computer Science from Bharathiar University, Coimbatore
and she awarded her doctoral degree in Computer
Applications from University of Madras, Chennai. She has
published 6 Books, 3 papers in National Journal, 30 papers
in International Journal, 35 Papers in National Conference
Proceedings, 38 Papers in International Conference
Proceedings and a total of 112 publications. Her research
interests include data structures and algorithms,
data/text/web mining, search engines, web communities,
social network mining, machine learning, Natural Language
Processing, Organizational leadership and human resource
management.