This document summarizes recent approaches in opinion mining and sentiment analysis (OMSA) in online social networks. It discusses 15 different frameworks and methods that have been proposed, including the Twitter Opinion Mining framework, an election prediction model using user influence factors, and a fuzzy deep belief network approach. The document analyzes the key steps and characteristics of each approach, such as data collection, preprocessing, classification techniques, and performance results. Overall, the paper reviews the state-of-the-art in OMSA research and highlights areas for future improvements.
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features.
Scalable recommendation with social contextual informationeSAT Journals
Abstract Recommender systems are used to achieve effective and useful results in a social networks. The social recommendation will provide a social network structure but it is challenging to fuse social contextual factors which are derived from user’s motivation of social behaviors into social recommendation. Here, we introduce two contextual factors in recommender systems which are used to adopt a useful results namely a) individual preference and b) interpersonal influence. Individual preference analyze the social interests of an item content with user’s interest and adopt only users recommended results. Interpersonal influence is analyzing user-user interaction and their specific social relations. Beyond this, we propose a novel probabilistic matrix factorization method to fuse them in a latent space. The scalable algorithm provides a useful results by analyzing the ranking probability of each user social contextual information and also incrementally process the contextual data in large datasets. Keywords: social recommendation, individual preference, interpersonal influence, matrix factorization
Integrated bio-search approaches with multi-objective algorithms for optimiza...TELKOMNIKA JOURNAL
Optimal selection of features is very difficult and crucial to achieve, particularly for the task of classification. It is due to the traditional method of selecting features that function independently and generated the collection of irrelevant features, which therefore affects the quality of the accuracy of the classification. The goal of this paper is to leverage the potential of bio-inspired search algorithms, together with wrapper, in optimizing multi-objective algorithms, namely ENORA and NSGA-II to generate an optimal set of features. The main steps are to idealize the combination of ENORA and NSGA-II with suitable bio-search algorithms where multiple subset generation has been implemented. The next step is to validate the optimum feature set by conducting a subset evaluation. Eight (8) comparison datasets of various sizes have been deliberately selected to be checked. Results shown that the ideal combination of multi-objective algorithms, namely ENORA and NSGA-II, with the selected bio-inspired search algorithm is promising to achieve a better optimal solution (i.e. a best features with higher classification accuracy) for the selected datasets. This discovery implies that the ability of bio-inspired wrapper/filtered system algorithms will boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features.
A Novel Hybrid Voter Using Genetic Algorithm and Performance HistoryWaqas Tariq
Triple Modular Redundancy (TMR) is generally used to increase the reliability of real time systems where three similar modules are used in parallel and the final output is arrived at using voting methods. Numerous majority voting techniques have been proposed in literature however their performances are compromised for some typical set of module output value. Here we propose a new voting scheme for analog systems retaining the advantages of previous reported schemes and reduce the disadvantages associated with them. The scheme utilizes a genetic algorithm and previous performances history of the modules to calculate the final output. The scheme has been simulated using MATLAB and the performance of the voter has been compared with that of fuzzy voter proposed by Shabgahi et al [4]. The performance of the voter proposed here is better than the existing voters.
Survey on Supervised Method for Face Image Retrieval Based on Euclidean Dist...Editor IJCATR
Content-based image retrieval is a technique which uses visual contents to search images from large scale image databases
according to users' interests. Given a query face image, content-based face image retrieval tries to find similar face images from a large
image database. Initially face of the image is detected from the query image. After the removal of noise present in the image, it is
separated as patches. For each patch, the Local binary pattern (LBP) is extracted which improves the detection performance. LBP is a
type of feature used for classification in computer vision. The LBP operator assigns a label to every pixel of a gray level image. The
label mapping to a pixel is affected by the relationship between this pixel and its eight neighbors. Support Vector Machine (SVM) is
used then which will produce a model (based on the training data) that predicts the target values of the test data given only the test data
attributes. When the feature values are provided to the SVM classifier, it will train about the feature. Finally it will classify about the
result. SVM maps input vectors to a higher dimensional vector space where an optimal hyper plane is constructed. Among the
available hyper planes, there is one hyper plane alone that maximizes the distance between itself and the nearest data vectors of each
category. The Euclidean distance between the query image and database image is calculated and the index of the Euclidean distance is
sorted.The indexing scheme used for this purpose provides an efficient way to search the image. Then the corresponding image from
the database is retrieved based upon the index. This SVM classifier mainly improves the detection performance and the rate of
accuracy.
The purpose research is to develop the decision model of Multi-Criteria Group Decision Making (MCGDM) into Interval Value Fuzzy Multi-Criteria Group Decision Making (IV-FMCGDM), while the specific purpose is to construct decision-making model of Adaptive Interval Value Fuzzy Analytic Hierarchy Process (AIV- FAHP) uses Triangular Fuzzy Number (TFN) and group decision aggregation functions using Interval Value Geometric Means Aggregation (IV-GMA). The novelty research is to study the concept of group decision making by improving the middle point on the Interval Value Triangular Fuzzy Number (IV TFN). It provides more accurate modeling, and better rating performance, and more effective linguistic representation. This research produced a new decision-making model and algorithm based on AIV-FAHP used to measure the quality of e-learning.
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features.
Scalable recommendation with social contextual informationeSAT Journals
Abstract Recommender systems are used to achieve effective and useful results in a social networks. The social recommendation will provide a social network structure but it is challenging to fuse social contextual factors which are derived from user’s motivation of social behaviors into social recommendation. Here, we introduce two contextual factors in recommender systems which are used to adopt a useful results namely a) individual preference and b) interpersonal influence. Individual preference analyze the social interests of an item content with user’s interest and adopt only users recommended results. Interpersonal influence is analyzing user-user interaction and their specific social relations. Beyond this, we propose a novel probabilistic matrix factorization method to fuse them in a latent space. The scalable algorithm provides a useful results by analyzing the ranking probability of each user social contextual information and also incrementally process the contextual data in large datasets. Keywords: social recommendation, individual preference, interpersonal influence, matrix factorization
Integrated bio-search approaches with multi-objective algorithms for optimiza...TELKOMNIKA JOURNAL
Optimal selection of features is very difficult and crucial to achieve, particularly for the task of classification. It is due to the traditional method of selecting features that function independently and generated the collection of irrelevant features, which therefore affects the quality of the accuracy of the classification. The goal of this paper is to leverage the potential of bio-inspired search algorithms, together with wrapper, in optimizing multi-objective algorithms, namely ENORA and NSGA-II to generate an optimal set of features. The main steps are to idealize the combination of ENORA and NSGA-II with suitable bio-search algorithms where multiple subset generation has been implemented. The next step is to validate the optimum feature set by conducting a subset evaluation. Eight (8) comparison datasets of various sizes have been deliberately selected to be checked. Results shown that the ideal combination of multi-objective algorithms, namely ENORA and NSGA-II, with the selected bio-inspired search algorithm is promising to achieve a better optimal solution (i.e. a best features with higher classification accuracy) for the selected datasets. This discovery implies that the ability of bio-inspired wrapper/filtered system algorithms will boost the efficiency of ENORA and NSGA-II for the task of selecting and classifying features.
A Novel Hybrid Voter Using Genetic Algorithm and Performance HistoryWaqas Tariq
Triple Modular Redundancy (TMR) is generally used to increase the reliability of real time systems where three similar modules are used in parallel and the final output is arrived at using voting methods. Numerous majority voting techniques have been proposed in literature however their performances are compromised for some typical set of module output value. Here we propose a new voting scheme for analog systems retaining the advantages of previous reported schemes and reduce the disadvantages associated with them. The scheme utilizes a genetic algorithm and previous performances history of the modules to calculate the final output. The scheme has been simulated using MATLAB and the performance of the voter has been compared with that of fuzzy voter proposed by Shabgahi et al [4]. The performance of the voter proposed here is better than the existing voters.
Survey on Supervised Method for Face Image Retrieval Based on Euclidean Dist...Editor IJCATR
Content-based image retrieval is a technique which uses visual contents to search images from large scale image databases
according to users' interests. Given a query face image, content-based face image retrieval tries to find similar face images from a large
image database. Initially face of the image is detected from the query image. After the removal of noise present in the image, it is
separated as patches. For each patch, the Local binary pattern (LBP) is extracted which improves the detection performance. LBP is a
type of feature used for classification in computer vision. The LBP operator assigns a label to every pixel of a gray level image. The
label mapping to a pixel is affected by the relationship between this pixel and its eight neighbors. Support Vector Machine (SVM) is
used then which will produce a model (based on the training data) that predicts the target values of the test data given only the test data
attributes. When the feature values are provided to the SVM classifier, it will train about the feature. Finally it will classify about the
result. SVM maps input vectors to a higher dimensional vector space where an optimal hyper plane is constructed. Among the
available hyper planes, there is one hyper plane alone that maximizes the distance between itself and the nearest data vectors of each
category. The Euclidean distance between the query image and database image is calculated and the index of the Euclidean distance is
sorted.The indexing scheme used for this purpose provides an efficient way to search the image. Then the corresponding image from
the database is retrieved based upon the index. This SVM classifier mainly improves the detection performance and the rate of
accuracy.
The purpose research is to develop the decision model of Multi-Criteria Group Decision Making (MCGDM) into Interval Value Fuzzy Multi-Criteria Group Decision Making (IV-FMCGDM), while the specific purpose is to construct decision-making model of Adaptive Interval Value Fuzzy Analytic Hierarchy Process (AIV- FAHP) uses Triangular Fuzzy Number (TFN) and group decision aggregation functions using Interval Value Geometric Means Aggregation (IV-GMA). The novelty research is to study the concept of group decision making by improving the middle point on the Interval Value Triangular Fuzzy Number (IV TFN). It provides more accurate modeling, and better rating performance, and more effective linguistic representation. This research produced a new decision-making model and algorithm based on AIV-FAHP used to measure the quality of e-learning.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Simple Segmentation Approach for Unconstrained Cursive Handwritten Words in...CSCJournals
This paper presents a new, simple and fast approach for character segmentation of unconstrained handwritten words. The developed segmentation algorithm over-segments in some cases due to the inherent nature of the cursive words. However the over segmentation is minimum. To increase the efficiency of the algorithm an Artificial Neural Network is trained with significant amount of valid segmentation points for cursive words manually. Trained neural network extracts incorrect segmented points efficiently with high speed. For fair comparison benchmark database IAM is used. The experimental results are encouraging.
CLUSTER PRIORITY BASED SENTENCE RANKING FOR EFFICIENT EXTRACTIVE TEXT SUMMARIESecij
This paper presents a cluster priority ranking based approach for extractive automatic text summarization that aggregates different cluster ranks for final sentence scoring. This approach does not require any learning, feature weighting and semantic processing. Surface level features combinations are used for
individual cluster scoring. Proposed approach produces quality summaries without using title feature. Experimental results on DUC 2002 dataset proves robustness of proposed approach as compared to other surface level approaches using ROUGE evaluation matrices.
Learning to Rank Image Tags With Limited Training Examples1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
DEVNAGARI NUMERALS CLASSIFICATION AND RECOGNITION USING AN INTEGRATED APPROACHijfcstjournal
Character recognition has always been a challenging field for the researchers. There has been an astounding progress in the development of the systems for character recognition. OCR performs the recognition of the text in the scanned document image and converts it into editable form. The OCR process can have several stages like preprocessing, segmentation, recognition and post processing. The recognition generally, consists of feature extraction and classification. The choice of features and classification scheme affects the performance of OCR largely. In this paper, a classification scheme is proposed for the Devnagari numerals, which forms the basis for recognition. This approach integrates the structural features and water reservoir analogy based feature to classify the Devnagari numeral. In order to classify a single numeral, at most four checks are required. This increases the efficiency of the proposed scheme.
(Gaurav sawant & dhaval sawlani)bia 678 final project reportGaurav Sawant
PROJECT REPORT
• Performed memory-based collaborative filtering techniques like Cosine similarities, Pearson’s r & model-based Matrix Factorization techniques like Alternating Least Squares (ALS) method
• Studied the scalability of these methods on local machines & on Hadoop clusters
03 fauzi indonesian 9456 11nov17 edit septianIAESIJEECS
Since the rise of WWW, information available online is growing rapidly. One of the example is Indonesian online news. Therefore, automatic text classification became very important task for information filtering. One of the major issue in text classification is its high dimensionality of feature space. Most of the features are irrelevant, noisy, and redundant, which may decline the accuracy of the system. Hence, feature selection is needed. Maximal Marginal Relevance for Feature Selection (MMR-FS) has been proven to be a good feature selection for text with many redundant features, but it has high computational complexity. In this paper, we propose a two-phased feature selection method. In the first phase, to lower the complexity of MMR-FS we utilize Information Gain first to reduce features. This reduced feature will be selected using MMR-FS in the second phase. The experiment result showed that our new method can reach the best accuracy by 86%. This new method could lower the complexity of MMR-FS but still retain its accuracy.
A Heuristic Approach for Network Data Clusteringidescitation
In this growing world of technology there are lots of security threats received by
each and every area of computer networks. Most of the time the network security threats
produce high false positive and negative ratios, this creates an obstacle for any security
system to work improperly. The overwhelming threats make it challenging to understand
and manage the network data.
To address this problem we present a novel approach which eventually understand the
network data by clustering them without background knowledge of any threats according to
various parameters like source IP, Destination IP etc. And this approach saves
administrator’s time and energy in processing of large amount threats.
Entity-oriented sentiment analysis of tweets: results and problemsYuliya Rubtsova
This is summarization of the results of the reputation-oriented
Twitter task, which was held as part of SentiRuEval evaluation of Russian sentiment-analysis systems. The tweets in two domains: telecom companies and banks - were included in the evaluation. The task was to determine if an author of a tweet has a positive or negative attitude to a company mentioned in the message. The main issue of this paper is to analyze the current state and problems of approaches applied by the participants.
Feature selection is one of the most fundamental steps in machine learning. It is closely related to
dimensionality reduction. A commonly used approach in feature selection is ranking the individual
features according to some criteria and then search for an optimal feature subset based on an evaluation
criterion to test the optimality. The objective of this work is to predict more accurately the presence of
Learning Disability (LD) in school-aged children with reduced number of symptoms. For this purpose, a
novel hybrid feature selection approach is proposed by integrating a popular Rough Set based feature
ranking process with a modified backward feature elimination algorithm. The process of feature ranking
follows a method of calculating the significance or priority of each symptoms of LD as per their
contribution in representing the knowledge contained in the dataset. Each symptoms significance or
priority values reflect its relative importance to predict LD among the various cases. Then by eliminating
least significant features one by one and evaluating the feature subset at each stage of the process, an
optimal feature subset is generated. For comparative analysis and to establish the importance of rough set
theory in feature selection, the backward feature elimination algorithm is combined with two state-of-theart
filter based feature ranking techniques viz. information gain and gain ratio. The experimental results
show the proposed feature selection approach outperforms the other two in terms of the data reduction.
Also, the proposed method eliminates all the redundant attributes efficiently from the LD dataset without
sacrificing the classification performance.
AN EXPERIMENTAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGijscai
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features.
An Experimental Study of Feature Extraction Techniques in Opinion MiningIJSCAI Journal
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
A Simple Segmentation Approach for Unconstrained Cursive Handwritten Words in...CSCJournals
This paper presents a new, simple and fast approach for character segmentation of unconstrained handwritten words. The developed segmentation algorithm over-segments in some cases due to the inherent nature of the cursive words. However the over segmentation is minimum. To increase the efficiency of the algorithm an Artificial Neural Network is trained with significant amount of valid segmentation points for cursive words manually. Trained neural network extracts incorrect segmented points efficiently with high speed. For fair comparison benchmark database IAM is used. The experimental results are encouraging.
CLUSTER PRIORITY BASED SENTENCE RANKING FOR EFFICIENT EXTRACTIVE TEXT SUMMARIESecij
This paper presents a cluster priority ranking based approach for extractive automatic text summarization that aggregates different cluster ranks for final sentence scoring. This approach does not require any learning, feature weighting and semantic processing. Surface level features combinations are used for
individual cluster scoring. Proposed approach produces quality summaries without using title feature. Experimental results on DUC 2002 dataset proves robustness of proposed approach as compared to other surface level approaches using ROUGE evaluation matrices.
Learning to Rank Image Tags With Limited Training Examples1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
DEVNAGARI NUMERALS CLASSIFICATION AND RECOGNITION USING AN INTEGRATED APPROACHijfcstjournal
Character recognition has always been a challenging field for the researchers. There has been an astounding progress in the development of the systems for character recognition. OCR performs the recognition of the text in the scanned document image and converts it into editable form. The OCR process can have several stages like preprocessing, segmentation, recognition and post processing. The recognition generally, consists of feature extraction and classification. The choice of features and classification scheme affects the performance of OCR largely. In this paper, a classification scheme is proposed for the Devnagari numerals, which forms the basis for recognition. This approach integrates the structural features and water reservoir analogy based feature to classify the Devnagari numeral. In order to classify a single numeral, at most four checks are required. This increases the efficiency of the proposed scheme.
(Gaurav sawant & dhaval sawlani)bia 678 final project reportGaurav Sawant
PROJECT REPORT
• Performed memory-based collaborative filtering techniques like Cosine similarities, Pearson’s r & model-based Matrix Factorization techniques like Alternating Least Squares (ALS) method
• Studied the scalability of these methods on local machines & on Hadoop clusters
03 fauzi indonesian 9456 11nov17 edit septianIAESIJEECS
Since the rise of WWW, information available online is growing rapidly. One of the example is Indonesian online news. Therefore, automatic text classification became very important task for information filtering. One of the major issue in text classification is its high dimensionality of feature space. Most of the features are irrelevant, noisy, and redundant, which may decline the accuracy of the system. Hence, feature selection is needed. Maximal Marginal Relevance for Feature Selection (MMR-FS) has been proven to be a good feature selection for text with many redundant features, but it has high computational complexity. In this paper, we propose a two-phased feature selection method. In the first phase, to lower the complexity of MMR-FS we utilize Information Gain first to reduce features. This reduced feature will be selected using MMR-FS in the second phase. The experiment result showed that our new method can reach the best accuracy by 86%. This new method could lower the complexity of MMR-FS but still retain its accuracy.
A Heuristic Approach for Network Data Clusteringidescitation
In this growing world of technology there are lots of security threats received by
each and every area of computer networks. Most of the time the network security threats
produce high false positive and negative ratios, this creates an obstacle for any security
system to work improperly. The overwhelming threats make it challenging to understand
and manage the network data.
To address this problem we present a novel approach which eventually understand the
network data by clustering them without background knowledge of any threats according to
various parameters like source IP, Destination IP etc. And this approach saves
administrator’s time and energy in processing of large amount threats.
Entity-oriented sentiment analysis of tweets: results and problemsYuliya Rubtsova
This is summarization of the results of the reputation-oriented
Twitter task, which was held as part of SentiRuEval evaluation of Russian sentiment-analysis systems. The tweets in two domains: telecom companies and banks - were included in the evaluation. The task was to determine if an author of a tweet has a positive or negative attitude to a company mentioned in the message. The main issue of this paper is to analyze the current state and problems of approaches applied by the participants.
Feature selection is one of the most fundamental steps in machine learning. It is closely related to
dimensionality reduction. A commonly used approach in feature selection is ranking the individual
features according to some criteria and then search for an optimal feature subset based on an evaluation
criterion to test the optimality. The objective of this work is to predict more accurately the presence of
Learning Disability (LD) in school-aged children with reduced number of symptoms. For this purpose, a
novel hybrid feature selection approach is proposed by integrating a popular Rough Set based feature
ranking process with a modified backward feature elimination algorithm. The process of feature ranking
follows a method of calculating the significance or priority of each symptoms of LD as per their
contribution in representing the knowledge contained in the dataset. Each symptoms significance or
priority values reflect its relative importance to predict LD among the various cases. Then by eliminating
least significant features one by one and evaluating the feature subset at each stage of the process, an
optimal feature subset is generated. For comparative analysis and to establish the importance of rough set
theory in feature selection, the backward feature elimination algorithm is combined with two state-of-theart
filter based feature ranking techniques viz. information gain and gain ratio. The experimental results
show the proposed feature selection approach outperforms the other two in terms of the data reduction.
Also, the proposed method eliminates all the redundant attributes efficiently from the LD dataset without
sacrificing the classification performance.
AN EXPERIMENTAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGijscai
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features.
An Experimental Study of Feature Extraction Techniques in Opinion MiningIJSCAI Journal
The feature selection or extraction is the most important task in Opinion mining and Sentimental Analysis
(OSMA) for calculating the polarity score. These scores are used to determine the positive, negative, and
neutral polarity about the product, user reviews, user comments, and etc., in social media for the purpose
of decision making and Business Intelligence to individuals or organizations. In this paper, we have
performed an experimental study for different feature extraction or selection techniques available for
opinion mining task. This experimental study is carried out in four stages. First, the data collection process
has been done from readily available sources. Second, the pre-processing techniques are applied
automatically using the tools to extract the terms, POS (Parts-of-Speech). Third, different feature selection
or extraction techniques are applied over the content. Finally, the empirical study is carried out for
analyzing the sentiment polarity with different features
Neural Network Based Context Sensitive Sentiment AnalysisEditor IJCATR
Social media communication is evolving more in these days. Social networking site is being rapidly increased in recent years, which provides platform to connect people all over the world and share their interests. The conversation and the posts available in social media are unstructured in nature. So sentiment analysis will be a challenging work in this platform. These analyses are mostly performed in machine learning techniques which are less accurate than neural network methodologies. This paper is based on sentiment classification using Competitive layer neural networks and classifies the polarity of a given text whether the expressed opinion in the text is positive or negative or neutral. It determines the overall topic of the given text. Context independent sentences and implicit meaning in the text are also considered in polarity classification.
APPROXIMATE ANALYTICAL SOLUTION OF NON-LINEAR BOUSSINESQ EQUATION FOR THE UNS...mathsjournal
For one dimensional homogeneous, isotropic aquifer, without accretion the governing Boussinesq
equation under Dupuit assumptions is a nonlinear partial differential equation. In the present paper
approximate analytical solution of nonlinear Boussinesq equation is obtained using Homotopy
perturbation transform method(HPTM). The solution is compared with the exact solution. The
comparison shows that the HPTM is efficient, accurate and reliable. The analysis of two important aquifer
parameters namely viz. specific yield and hydraulic conductivity is studied to see the effects on the height
of water table. The results resemble well with the physical phenomena.
FEATURE SELECTION AND CLASSIFICATION APPROACH FOR SENTIMENT ANALYSISmlaij
Sentiment analysis and Opinion mining has emerged as a popular and efficient technique for information retrieval and web data analysis. The exponential growth of the user generated content has opened new horizons for research in the field of sentiment analysis. This paper proposes a model for sentiment analysis of movie reviews using a combination of natural language processing and machine learning approaches. Firstly, different data pre-processing schemes are applied on the dataset. Secondly, the behaviour of twoclassifiers, Naive Bayes and SVM, is investigated in combination with different feature selection schemes to
obtain the results for sentiment analysis. Thirdly, the proposed model for sentiment analysis is extended to
obtain the results for higher order n-grams.
The sarcasm detection with the method of logistic regressionEditorIJAERD
The prediction analysis is approach which may predict future possibilities. This research work is based on the
sarcasm detection from the text data. In the previous time SVM classification is applied for the sarcasm detection. The SVM
classifier classifies data based on the hyper plane which give low accuracy. To improve accuracy for sarcasm detection
logistic regression is applied during this work. The existing and proposed techniques are implemented in python and results
are analysed in terms of accuracy, execution time. The proposed approach has high accuracy and low execution time as
compared to SVM classifier for sarcasm detection.
A survey of modified support vector machine using particle of swarm optimizat...Editor Jacotech
The main objective of this survey paper is to provide a detailed description of Wireless Sensor Networks with Medium Access Control layer and Routing layer. In the medium access control layer, Event Driven Time Division Multiple Access protocol is studied and in Network layer, two routing protocols Bellman-Ford and Dynamic Source Routing are studied.
AN EFFICIENT FEATURE SELECTION IN CLASSIFICATION OF AUDIO FILEScscpconf
In this paper we have focused on an efficient feature selection method in classification of audio files.
The main objective is feature selection and extraction. We have selected a set of features for further
analysis, which represents the elements in feature vector. By extraction method we can compute a
numerical representation that can be used to characterize the audio using the existing toolbox. In this
study Gain Ratio (GR) is used as a feature selection measure. GR is used to select splitting attribute
which will separate the tuples into different classes. The pulse clarity is considered as a subjective
measure and it is used to calculate the gain of features of audio files. The splitting criterion is
employed in the application to identify the class or the music genre of a specific audio file from
testing database. Experimental results indicate that by using GR the application can produce a
satisfactory result for music genre classification. After dimensionality reduction best three features
have been selected out of various features of audio file and in this technique we will get more than
90% successful classification result.
In this paper we have focused on an efficient feature selection method in classification of audio files.
The main objective is feature selection and extraction. We have selected a set of features for further
analysis, which represents the elements in feature vector. By extraction method we can compute a
numerical representation that can be used to characterize the audio using the existing toolbox. In this
study Gain Ratio (GR) is used as a feature selection measure. GR is used to select splitting attribute
which will separate the tuples into different classes. The pulse clarity is considered as a subjective
measure and it is used to calculate the gain of features of audio files. The splitting criterion is
employed in the application to identify the class or the music genre of a specific audio file from
testing database. Experimental results indicate that by using GR the application can produce a
satisfactory result for music genre classification. After dimensionality reduction best three features
have been selected out of various features of audio file and in this technique we will get more than
90% successful classification result.
A Survey on Sentiment Analysis and Opinion MiningIJSRD
In Today’s world, the social media has given web users a place for expressing and sharing their thoughts and opinions on different topics or events. For this purpose, the opinion mining has gained the importance. Sentiment classification and Opinion Mining is the study of people’s opinion, emotions, attitude towards the product, services, etc. Sentiment Analysis and Opinion Mining are the two interchangeable terms. There are various approaches and techniques exist for Sentiment Analysis like Naïve Bayes, Decision Trees, Support Vector Machines, Random Forests, Maximum Entropy, etc. Opinion mining is a useful and beneficial way to scientific surveys, political polls, market research and business intelligence, etc. This paper presents a literature review of various techniques used for opinion mining and sentiment analysis.
A Survey on Sentiment Analysis and Opinion MiningIJSRD
In Today’s world, the social media has given web users a place for expressing and sharing their thoughts and opinions on different topics or events. For this purpose, the opinion mining has gained the importance. Sentiment classification and Opinion Mining is the study of people’s opinion, emotions, attitude towards the product, services, etc. Sentiment Analysis and Opinion Mining are the two interchangeable terms. There are various approaches and techniques exist for Sentiment Analysis like Naïve Bayes, Decision Trees, Support Vector Machines, Random Forests, Maximum Entropy, etc. Opinion mining is a useful and beneficial way to scientific surveys, political polls, market research and business intelligence, etc. This paper presents a literature review of various techniques used for opinion mining and sentiment analysis.
Profile Analysis of Users in Data Analytics DomainDrjabez
Data Analytics and Data Science is in the fast forward
mode recently. We see a lot of companies hiring people for data
analysis and data science, especially in India. Also, many
recruiting firms use stackoverflow to fish their potential
candidates. The industry has also started to recruit people based
on the shapes of expertise. Expertise of a personal is
metaphorically outlined by shapes of letters like I, T, M and
hyphen betting on her experiencein a section (depth) and
therefore the variety of areas of interest (width).This proposal
builds upon the work of mining shapes of user expertise in a
typical online social Question and Answer (Q&A) community
where expert users often answer questions posed by other
users.We have dealt with the temporal analysis of the expertise
among the Q&A community users in terms how the user/ expert
have evolved over time.
Keywords— Shapes of expertise, Graph communities, Expertise
evolution, Q&A community
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing discussed on the talk. ICT and testing must carry their part of global responsibility to help with the climat warming. We can minimize the carbon footprint but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be added with sustainability, and then measured continuously. Test environments can be used less, and in smaller scale and on demand. Test techniques can be used in optimizing or minimizing number of tests. Test automation can be used to speed up testing.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious on our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35: Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 To discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services.
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
2. 22 Computer Science & Information Technology (CS & IT)
2. RECENT DEVELOPMENTS IN OMSA
This section presents the recent development of OMSA approach with different frameworks,
methods, techniques, and algorithms. The main characteristics of OMSA approach is shown in
Table 1, which gives the complete or overall reference to the researchers and the process is
explained below. It is necessary to understand the present work to carry out future work without
duplication. First, the various framework and algorithms in opinion mining with data collection
approach, pre-processing stage and classification of polarity. Second, describes the various
framework and algorithms in sentiment analysis with data collection approach, pre-processing
stage and classification of polarity.
2.1 Twitter Opinion Mining (TOM) Frame work and Polarity Classification
Algorithm
Farhan Hassan Khan et al. [5] proposed a new TOM framework to predict the polarity of words
into positive or negative feelings in tweets, and to improve the accuracy level of this
classification. TOM framework is constructed into three stages. First, data acquisition process,
which is used to obtain the Twitter feeds with sparse features through Twitter streaming API from
OSN. Twitter4J library has been used to extract only English language tweets. Second, pre-processing,
which process each tweets individually for the refinement operations such as
detection and analysis of slangs/abbreviations, Lemmatization and correction, and stop words
removal. Then the refined tweets pass into the classifier. Third, Polarity Classification Algorithm
(PCA), it classifies the twitter feeds on basis of Enhanced Emoticon Classifier (EEC), Improved
Polarity Classifier (IPC), and SentiWordNet Classifier (SWNC) by using set of emoticons, a list
of positive and negative words, and SentiWordNet dictionary respectively. In this stage, reducing
the number of neutral tweets is the major issue. For this problem, final classification is performed
to indicate more accurate results than its predecessors based on the scores of EEC, IPC, and
SWNC.
2.2 Standard election prediction model by using User Influence factor
Malhar Anjaria et al. [10] introduced a model to predict the election result by applying the user
influence factor (re-tweets and each party garners) and extracting opinions using direct and
indirect feature on the basis of the supervised algorithms such as NB (simple probabilistic model),
MaxEnt (Uniform classification model), SVM (achieves maximum margin hyper plane), ANN
(feed forward network), and SVM with PCA (dimension reduction). This model is built into
several steps. Step1, data collection approach is used to collect tweets with Candidate’s name.
Step2, normalization and feature reduction includes the refinement operations Internet acronyms
and emoticons, duplicate tweets, candidate accounts, word expansion, URLs, repeated words and
repeated characters for to get original sentence format into usable format of tweets. Step3, feature
extraction and extended terms used for the purpose of unigram, bigram and a unigram + bigram.
Step4, machine learning methods, in which the polarity based classification applied to
independent features, and it fails to capture query specific sentiments so that the aforementioned
text classification algorithms used for direct features. Setp5, incorporating sentiment analysis
considers re-tweets into account for influence factor. In Step6, analyzing gender based votes
based on term frequency. Finally, US Presidential Election result 2012 and Karnataka State
Assembly Election 2013 results are shown that Twitter provides a reasonable accuracy.
3. Computer Science & Information Technology (CS & IT) 23
2.3 Similarity shaped membership function
Tapia-Rosero A et al. [15] employed a method to detect similarity shaped membership functions
in group decision making process. This method is constructed by using the symbolic notation,
similarity measure, and grouping membership functions by shape similarity. The symbolic
notation has two component algorithms to get shape-string and feature-string which represent a
sequence of symbols and sequence of linguistic terms respectively for each segment of
membership functions. The similarity measure used the linguistic terms from extremely short to
extremely long to obtain an overall similarity at unit interval. In a group decision making
environment, the grouping membership functions by shape similarity aims to gather groups using
similarity matrix.
2.4 Three-level similarity method (TLSM)
Jun ma et al. [8] stated a method to reduce the chance of applying inappropriate decisions in the
multi-criteria group decision making (MCGDM). In this aspect, a Gradual Aggregation
Algorithm (GAA) developed and established a TLSM. GAA faced two practical issues. First,
how to handle missing values. Second, how to generate a decision dynamically in MCGDM?. To
solve these issues, GAA is implemented in two ways, i.e., OGA (ordinary gradual algorithm) and
WGA (weighted gradual algorithm). The OGA does not explicitly process the criteria weights and
leaves it to the aggregation operator but the WGA does. TLSM will be measuring the similarity of
two participants opinion at three sequence levels, i.e., assessment level, criterion level, and
problem level. In Level-1, the term set will be divided into several semantic-equal groups for the
criterion and then used HCFSM. In Leve-2, identifies an appropriate SUF for each criterion at
PSA and PRW. In Level-3, each individual criterion provides a single perspective to observe
similarity of two opinions.
2.5 Unsupervised dependency analysis-based approach
Xiaolin Zheng et al. [17] presented an unsupervised dependency analysis-based approach to
extract AEP (Appraisal Expression Pattern) from reviews. The problem statement is defined at
different terminologies such as domain, aspect word, sentiment word, background word, and
review. Then the AEP is applied to represent the syntactic relationship between aspect and
sentiment words by using Shortest Dependency Path (SDP), Confidence score (CS) and
parameter inference. Finally, AEP-LDA (Latent Dirichlet Allocation) model is employed to
jointly identify the aspect and sentiment words. It outperforms the base line method when
compared to other supervised methods such as LDA, Local-LDA, Standard LDA, AEP-LDA (no-
AEP).
Pre-processing
Steps
Classification
Method (M) /
Technique
(T)
Result
Data
Collection
Process
Internet
Messages
Online Social
Networks
(OSN) Sites
Fig. 1 General System Architecture of OMSA approach
4. 24 Computer Science & Information Technology (CS & IT)
2.6 SuperedgeRank algorithm
Ning Ma et al. [11] introduced a SuperedgeRank algorithm to identify opinion leaders in online
public opinion supernetwork model. This model is built up into three stages. First, data processing
which identifies the main information from the public comments collected from internet and then
applies the ICTCLAS for splitting sentence into words. Second, online public opinion
supernetwork modeling includes four types of elements i.e., Social Subnetwork, Environment
Subnetwork, Psychological Subnetwork, and Viewpoint Subnetwork to form four layers of
supernetwork. Third, Indexes of online public opinion supernetwork includes Node Superdegree,
Superedge Degree to identify opinion leaders and Superedge-superedge Distance, and Superedge
Overlap to evaluate and verify the results. The aforesaid algorithm ranks the superedges in
supernetwork by using the indexes: the influential degree of information dissemination, the
transformation likelihood between different psychological types, and the similarity between
keywords of viewpoints.
2.7 Hybrid opinion mining framework for e-commerce application
Vinodhini G et al. [16] introduced two frameworks by the combination of classifiers with
principal component analysis (PCA) to reduce the dimension of feature set. First, PCA with
Bagging which is used to construct each member of the ensemble, and to predict the combination
over class labels. Second, Bayesian boosting model is employed using rapid miner tool. For data
pre-processing, a word vector representation of review sentences is created for the aforesaid
models. Finally, the results proved that the PCA is a suitable dimension reduction method for
bagged SVM and Bayesian boosting methods.
2.8 Opinion mining model for ontologies
Isidro Penalver-Martinez et al. [7] presented an innovative method called ontology based opinion
mining to improve the feature-based opinion mining by employing the ontologies in selection of
features and to provide a new vector analysis-based method for sentiment analysis. The
framework composed by four main modules namely, NLP module, Ontology-based feature
identification module, polarity identification module, and opinion mining module. In module1,
obtains the morphologic and syntactic structure of each sentence by including the pre-processing
and POS. In module2, extract the features from the opinions expressed by users. In module3,
provides the positive, negative and neutral values of nouns, adjectives and verbs. In module4, the
vector analysis enables an effective feature sentiment classification. Each feature is represented
using three coordinates. Finally, the results obtained in the movie review domain.
2.9 Sentiment extraction and change detection
Alvaro Ortigosa et al. [2] introduced a new method is called sentiment extraction and change
detection. The method includes the operations for extracting sentiments from texts are pre-processing,
segmentation into sentences, tokenization, emotion detection, interjection detection,
token score assignation (building the lexicon, removing the repetitive letters, spell checking),
syntactical analysis, polarity calculation, and for sentiment change detection are building the user
regular patter and then comparing weeks. In this aspect, a Facebook application is implemented in
SentBuk, which obtains the data from Facebook with the following permissions: Offline_access,
Read_stream, and User_about_me and using the user interface performs user regular pattern and
combining weeks for classification.
5. Computer Science & Information Technology (CS & IT) 25
2.10 Semi-supervised Laplacian Eigenmap (SS-LE)
Kyoungok Kim et al. [9] presented SS-LE to reduce the dimensionality of the data points. The
experimental process is taken into text cleaning, text refinement, vectorization, applying PCA,
and applying SS-LE to reduce the dimension to 2 or 3. SS-LE constructs the graph by utilizing
label information and without label information. From this two graphs, graph laplacian matrices
are calculated separately, and then dimensionality reduction process used to minimize the
distance between two data points and its neighbors by weights.
2.11 Three ensemble methods with five learners in Facebook application
Gang Wang et al. [6] conducted the comparative assessment to measure the performance of three
ensemble methods i.e., Bagging, Boosting, and Random Subspace with five learners: NB which is
a simple probabilistic classification method, MaxEnt doesn’t make any assumptions in relations
between features, Decision Tree is a sequence model for a sequence of simple tests, K-Nearest
Neighbor classifies majority of its vote of its neighbors, and SVM has ability to model non-linearity.
For the above mentioned three ensemble methods, the base learners are constructed
from the training data set using random independent, weighted versions, and random subspaces of
the feature space respectively.
2.12 VIKOR and Sentiment Analysis framework
Daekook Kang et al. [4] presented a new framework in two stages by combining the VIKOR
approach and sentiment analysis for measurement of customer satisfaction in mobile services.
VIKOR is a compromising ranking method for MCDM. First, data collection and pre-processing
stages involves into the operations like preprocessing data from relevant website. Second,
compiling dictionaries of service attributes and sentiment words, it combines the dictionary of
attributes and dictionary of sentiment words, and then expressed in verb phrases, adjective
phrases and adverbial phrases for sentiment words into positive, negative and neural polarity, at
the last assigns score with WordNet. Third, the constructed keyword vectors of customer’s
opinions with reference to the dictionaries. Fourth, the customer satisfaction is measured with
respect to each service attribute. Finally, evaluates the customer satisfaction by considering all
attributes.
2.13 Fuzzy deep belief network
Shusen Zhou et al. [14] constructed a two step semi- supervised learning method for the
sentiment classification. In this sense, the general DBN is trained by using abundant unlabeled
and labeled reviews, design a fuzzy membership function for each class of reviews, and then
DBN maps each review into the output space for constructing the FDBN. In the first step, all
unlabeled and labeled reviews trained by using the general DBN and estimate the parameters
based on mapping results of all reviews. In second step, all unlabeled and labeled reviews trained
based on membership functions. Also, AFD proposed by combining active learning with FDBN
for labeling and uses them for training.
2.14 Construction of constrained domain-specific sentiment lexicon
Sheng Huang et al. [13] proposed an automatic construction strategy of domain specific sentiment
lexicon based on constrained label propagation. In this method, each steps presented sequentially
as follows. In step1, sentiment term extraction, which is used to detect and extract candidate term
from the corpus. In this step, nouns and noun phrases are expressing objective states, adjectives,
6. 26 Computer Science & Information Technology (CS & IT)
verbs and their phrases are used for reviewed objects, adverbs and their phrases are used to
enhance or weaken the adjectives and verbs opinions. In step2, sentiments seeds extraction, which
maintains consistent sentiment polarities across multiple domains and it extracted from semi-structured
format i.e., Title, Pros, Cons, and Text. It describes rated aspect, positive and negative
aspect (nouns and adjective-noun phrases) respectively. In step3, association similarity graph
construction, which constructs similarity graph to propagate sentiment information. In step4,
constraints definition and extraction focused two types of constraints called contextual constraint
and morphological constraint. In contextual constraint, coherence or incoherence relations are
used to maintain sentiment polarity directly or reversely. In morphological constraint, coherence
or incoherence relations between sentiment terms and also maintains sentiment polarity directly
or reversely. In step5, constraint propagation defines a matrix and encodes the direct and reverse
constraints into pair-wise constraints. In step6, constrained label propagation in which each
sentiment term receives sentiment information from its neighbors and retains its initial polarity
label.
Table 1. Main characteristics of OMSA approach published in 2014.
7. Computer Science & Information Technology (CS & IT) 27
2.15 Ranked WordNet graph for sentiment polarity
Arturo Montejo-Raez et al. [3] employed a method for sentiment classification by using weights
of WordNet graph. In this method, the polarity of measurement consider in the interval [-1, 1] and
defines a function where values over zero refers positive polarity, values below zero refers
negative polarity and values to closer to zero refers neutral. This function is computed by
expanding senses and final estimation. Expanding senses intends to expand few concepts in order
to calculate the global polarity of the tweet by using the graph of WordNet according to random
walk algorithm. The final estimation i.e., final polarity score is evaluated by the combination of
SentiWordNet score and random walk weights.
3. EVALUATION RESULTS OF OMSA APPROACH
The OMSA approaches are demonstrated in particular resources with different datasets and its
volume as stated in Table 2. Also, the evaluated classification performance was analyzed by using
the feature selection process with the different types of metrics such as confusion matrices,
precision, recall and F-measure, etc., and their key findings in all the OMSA approach as shown
in Table 3. In this paper, we have used the Polarity Classification Algorithm (PCA) and
evaluation procedure to verify the accuracy for the above mentioned approach by using the
Sanders-Twitter Sentiment Corpus [18]. The corpus contained 5513 hand-classified tweets which
are focused on the topic of the companies (Apple, Google, Microsoft and Twitter) and products.
The tweet sentiments are labeled as positive, neutral, negative and irrelevant. The datasets count
is given in Table 4. These datasets has been measured with 43 trained tweets by using the
confusion matrices (Table 5), precision, recall, F-measure and accuracy. The results are shown in
Table 6 and Figure 2. In this analysis, the overall system accuracy is only considered and shown.
Table 4. Dataset count
Datasets No. of. Tweets
Apple 1313
Google 1381
Microsoft 1415
Twitter 1404
Table 5. Confusion matrix
P Q R S
P tpP ePQ ePR ePS
Q eQP tpQ eQR eQS
R eRP eRQ tpR eRS
S eSP eSQ eSR tpS
Precision P = tpP / (tpP+eQP+eRP+eSP), Recall P = tpP / (tpP+ePQ+ePR+ePS), F-measure = 2 x
(Precision x Recall) / (Precision + Recall), Accuracy = (True Positive + True Negative + True
Neutrals + True irrelevant) / (True Positive + False Positive + True Negative + False Negative +
True Neutrals + False Neutrals + True Irrelevant + False Irrelevant). Based on this results and
observations, the accuracy level of the OMSA approach remains the same for the resources.
8. 28 Computer Science & Information Technology (CS & IT)
Table 2. Results of OMSA approach with datasets and key findings
9. Computer Science & Information Technology (CS & IT) 29
Table 6. Dataset calculation for confusion matrices, precision, recall and accuracy
Figure 2. Comparing precision, recall, F-measure and accuracy with four different datasets
10. 30 Computer Science & Information Technology (CS & IT)
4. DISCUSSED CHALLENGES AND FUTURE DEVELOPMENTS IN OMSA
The challenges and future developments are discussed below for the above presented frameworks
of OMSA approach published in 2014. It is one of the important tools to the aspirant researchers
to focus on new innovative idea. Farhan Hassan Khan et al. [5] indicated that TOM framework
faced challenges due to their short length and irregular structure of the content such as named
entity recognition, anaphora resolution, parsing, sarcasm, sparsity, abbreviations, poor spellings,
punctuation and grammar, incomplete sentences. Also, suggested to compare the proposed
algorithm with TweetFeel and Sentiment 140 for further improvement of the accuracy. Malhar
Anjaria et al. [10] mentioned that the presence of all entities in unbiased and equal manner was
the biggest challenge to provide the accuracy in the standard election prediction model. But, there
is a chance of increasing the accuracy level in future by including the more influential factors as
age, educational background, employment, economic criterion, rural and urban and social
development index.
Tapia-Rosero A et al. [15] discussed that the group decision making process might be difficult
under a supervision of mediator. This work could be extended to final objectives i.e., strategic
planning, suitability analysis, and applications like fuzzy control, fuzzy time series to find the
similarities. Jun ma et al. [8] indicated the major issues that to reduce the risk of putting an
inappropriate decision making and measuring opinion similarity between the participants. In
GAA, the integrating information according to group of inputs and missing value and unclear
answers need to be studied in future. Xiaolin Zheng et al. [17] stated that to jointly identify aspect
and sentiment word with comparison of other models. In future work, AEP-LDA model to
assume at single sentence and extend at clause level and into aspect-based review summarization,
sentiment classification, and personalized recommendation systems.
Ning Ma et al. [11] discussed that online public opinion controlled by deleting long existing posts
with rumor from negative opinion leaders. After identifying the opinion leaders, the further work
should be focused on how to implement corresponding guidance and interference. Vinodhini G et
al. [16] misclassification is reduced, and the classification accuracy of negative opinion to be
improved. Isidro Penalver-Martinez et al. [7] addressed the present challenges that ontology-based
feature identification, and feature polarity identification. Alvaro Ortigosa et al. [2]
demonstrated to extract information from user messages and detect emotional changes. Further,
tests to be conducted to determine the values in each case for sentiment changes, and the
threshold to distinguish between small changes on user sentiment and significant changes.
Kyoungok Kim et al. [9] addressed the dimensionality reduction transformations into 2D or 3D
by using term frequency matrices, and further work suggested that the weight of the edges in the
label graph will be adjusted by using the sophisticated approach and to transform document into
term frequency. Gang Wang et al. [6] evaluated the ensemble methods, and specified that large
datasets need to be collected for validating the result in future i.e., improving the interpretability
of ensembles is an important research direction. Daekook Kang et al. [4] indicated that the
measurement of the customer satisfaction was conducted by surveys. It’s taken more time and
effort to collect useful information. Further studies suggested that to incorporate more advanced
techniques of sentiment analysis and validate the empirical results presented.
Shusen Zhou et al. [14] discussed the issues that embed the fuzzy knowledge to improve the
performance of semi-supervise based sentiment classification. Sheng Huang et al. [13]
incorporated the contextual and morphological constraints between sentiment terms, and
suggested the future works that incorporate more types of constraints knowledge between
sentiment terms and to distinguish the aspect-specific polarities. Arturo Montejo-Raez et al. [3]
11. Computer Science & Information Technology (CS & IT) 31
stated that how to deal with negation, and to study the context of a specific tweet among the time
line of tweets from the particular user in order to identify publisher’s mood and adjust final score.
Table 3. Performance Analysis of OMSA approach using feature extraction
Based on this observation, OMSA approach is not following any single algorithm or technique or
method for data collection process, pre-processing and classification, and to solve all issues in
social media for extracting the user’s opinion. Therefore, the problem is defined with some
specific application domain to estimate the polarity of the system. The OMSA approach is also
dealt with various models as shown in Table 1 such as descriptive, predictive and statistical for
the purpose of extracting opinions by using the applications of Fuzzy sets, grouping similarity
measure, group decision making process, policy creation process, OSN sites, and symbolic
notations. The analysis of result is very useful to overcome issues and to develop various new
methods or techniques without duplication.
12. 32 Computer Science & Information Technology (CS & IT)
5. CONCLUSION
Opinion mining and sentiment analysis is emerging as a challenging field as part of text mining in
social networks to an organization, Government and public. It has a lot of applications and
developments that to predict people’s polarity towards their decision making process. In this
paper, we conclude that the frameworks and algorithms of OMSA are presented with the data
collection process, preprocessing methods, classification and performance evaluation results as a
review.
REFERENCES
[1] Alejandro Pena-Ayala.: Educational data mining: A survey and a data mining-based analysis of recent
works. Expert Systems with Applications. 41, 1432-1462 (2014)
[2] Alvaro Ortigosa, Jose M. Martin, Rosa M. Carro.: Sentiment analysis in Facebook and its application
to e-learning. Computers in Human Behavior. 31, 527-541 (2014)
[3] Arturo Montejo-Raez, Eugenio Martinez-Camara, M. Teresa Martin-Valdivia, L. Alfonso Urena-
Lopez.: Ranked WordNet graph for Sentiment Polarity Classification in Twitter. Computer Speech
and Language. 28, 93-107 (2014)
[4] Daekook Kang, Yongtae Park.: Review-based measurement of customer satisfaction in mobile
service: Sentiment analysis and VIKOR approach. Expert Systems with Applications. 41, 1041-1050
(2014)
[5] Farhan Hassan Khan, Saba Bashir, Usman Qamar.: TOM: Twitter opinion mining framework using
hybrid classification scheme. Decision Support Systems. 57, 245-257 (2014)
[6] Gang Wang, Jianshan Sun, Jian Ma, Kaiquan Xu, Jibao Gu.: Sentiment Classification: The
contribution of ensemble learning. Decision support systems. 57, 77-93 (2014)
[7] Isidro Penalver-Martinez, Francisco Garcia-Sanchez, Rafael Valencia-Garcia, Miguel Angel
Rodriguez-Garcia, Valentin Moreno, Anabel Fraga, Jose Luis Sanchez-Cervantes.: Feature-based
opinion mining through ontologies. Expert Systems with Applications. 41, 5995-6008 (2014)
[8] Jun Ma, Jie Lu, Guangquan Zhang.: A three-level-similarity measuring method of participant
opinions in multiple-criteria group decision supports. Decision Support Systems. 59, 74-83 (2014)
[9] Kyoungok Kim, Jaewook Lee.: Sentiment visualization and classification via semi-supervised
nonlinear dimensionality reduction. Pattern Recognition. 47, 758-768 (2014)
[10] Malhar Anjaria & Ram Mohana Reddy Guddeti.: Influence factor based opinion mining of twitter
data using supervised learning. Sixth IEEE International conference on communication systems and
networks (COMSNETS). ISSN: 1409-5982, (2014)
[11] Ning Ma, Yijun Liu.: SuperedgeRank algorithm and its application in identifying opinion leader of
online public opinion supernetwork. Expert Systems with Applications. 41, 1357-1368 (2014)
[12] R.V. Vidhu Bhala, S. Abirami.: Trends in word sense disambiguation. Artificial Intelligence Review:
An International Science and Engineering Journal. DOI 10.1007/s10462-012-9331-5, Springer,
(2012)
[13] Sheng Huang, Zhendong Niu, Chongyang Shi.: Automatic construction of domain-specific sentiment
lexicon based on constrained label propagation. Knowledge-Based Systems. 56, 191-200 (2014)
[14] Shusen Zhou, Qingcai Chen, Xiaolong Wang.: Fuzzy deep belief networks for semi-supervised
sentiment classification. Neurocomputing. 131, 312-322 (2014)
[15] Tapia-Rosero A, A. Bronselaer, G. De Tre.: A method based on shape-similarity for detecting similar
opinions in group decision-making. Information Sciences. 258, 291-311 (2014)
[16] Vinodhini G, Chandrasekaran RM.: Measuring quality of hybrid opinion mining model for e-commerce
application. Measurement. 55, 101-109 (2014)
[17] Xiaolin Zheng, Zhen Lin, Xiaowei Wang, Kwei-Jay Lin, Meina Song.: Incorporating appraisal
expression patterns into topic modeling for aspect and sentiment word identification. Knowledge-
Based Systems. 61, 29-47 (2014)
[18] Niek J. Sanders.: Sanders-Twitter Sentiment Corpus, Version 2.0.1