This presentation gives a detailed description of how social media sentiment analysis is performed, what its scope is, and what benefits it offers in real-life scenarios.
Twitter sentiment analysis project done using R.
In this project we work with the tweet data made available to us by Twitter. We clean the tweets, break them into tokens, analyze each word using the bag-of-words concept, and then rate each word by its score as positive, negative, or neutral.
We used a Naive Bayes classifier as our base.
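The cleaning, tokenizing, and word-scoring steps described above can be sketched in a few lines. The project itself was done in R; this is a minimal Python illustration that uses only the standard library. The word lists `POSITIVE` and `NEGATIVE` and the function names are hypothetical stand-ins, not the project's actual lexicon or code, and the sketch uses plain lexicon counting rather than the Naive Bayes classifier mentioned above.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (assumption); a real project would load a full word list.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}


def clean_tweet(text):
    """Lowercase the tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation, and '#'
    return text


def tokenize(text):
    """Split the cleaned tweet into word tokens."""
    return clean_tweet(text).split()


def score_tweet(text):
    """Return (score, label) from bag-of-words counts against the toy lexicon."""
    counts = Counter(tokenize(text))
    positive_hits = sum(n for word, n in counts.items() if word in POSITIVE)
    negative_hits = sum(n for word, n in counts.items() if word in NEGATIVE)
    score = positive_hits - negative_hits
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    return score, label
```

For example, `score_tweet("I love this phone, great camera!")` counts two positive words and no negative ones, so the tweet is labelled positive.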
Sentiment analysis - Our approach and use cases | Karol Chlasta
I. Introduction to Sentiment Analysis and its applications.
II. How to approach Sentiment Analysis?
III. 2015 Elections in Poland on Twitter.com & Onet.pl.
Sentiment analysis, or opinion mining, is the process of identifying and categorizing the sentiment expressed in a particular text. The need for automatic sentiment retrieval from text is high because the number of reviews obtained from Internet sources like Twitter is huge. These reviews or opinions on popular products or events help determine public opinion on an issue. An averaged histogram model is proposed that treats text classification with a continuous-variable approach. After data cleaning and feature extraction from the reviews, an average histogram is constructed for each class (positive and negative), containing a generalized feature representation of that class. The histogram of every test element is then classified using k-NN, a Bayesian classifier, and an LSTM network. This work is then implemented as an Android application integrated with Twitter. The user provides the topic for analysis, and the application shows the result as the percentage of positive tweets about the topic, computed with the Bayesian classifier.
Sentiment analysis is a context-based mining of text that extracts and identifies subjective information from a given text or sentence. Here the main concept is extracting the sentiment of the text using machine learning techniques such as LSTM (long short-term memory). This text classification method analyses the incoming text and determines whether the underlying emotion is positive or negative, along with the probability associated with that judgment. The probability indicates the strength of the statement: a probability close to 0 implies that the sentiment is strongly negative, and a probability close to 1 means that the statement is strongly positive. A web application is created to deploy this model using Flask, a Python-based micro framework. The authors state that many other methods, such as RNN and CNN, are less efficient than LSTM. Dirash A R | Dr. S K Manju Bargavi "LSTM Based Sentiment Analysis" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-4, June 2021, URL: https://www.ijtsrd.com/papers/ijtsrd42345.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-processing/42345/lstm-based-sentiment-analysis/dirash-a-r
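How such a probability might be read can be shown with a small helper. This is only a sketch: the function name `interpret_probability`, the 0.5 decision boundary, and the strength formula are assumptions made for illustration and are not taken from the paper, and no LSTM model is involved here.

```python
def interpret_probability(p):
    """Map a model's positive-class probability to (label, strength).

    Assumptions: 0.5 is the decision boundary, and strength is the distance
    from that boundary rescaled to the range 0..1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    label = "positive" if p >= 0.5 else "negative"
    strength = abs(p - 0.5) * 2  # 0.0 at the boundary, 1.0 at p = 0 or p = 1
    return label, strength
```

Under these assumptions, a probability of 0.0 is read as a negative statement with maximal strength, while a probability of exactly 0.5 carries no strength either way.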
Sentiment analysis of Twitter data using Python | Hetu Bhavsar
Twitter is a popular social networking website where users post and interact with messages known as “tweets”. The area of sentiment analysis has emerged to automate the analysis of such data. It aims to identify opinionated data on the Web and classify it according to its polarity, i.e., whether it carries a positive or negative connotation. We will attempt to conduct sentiment analysis on tweets using several machine learning algorithms.
Sentiment analysis is an essential operation for understanding the polarity of a particular text, blog, etc. This presentation gives an introduction to SA and the approaches by which it can be designed.
Project report for a Twitter sentiment analysis done using Apache Flume, with the data analysed using Hive.
I intend to address the following questions:
How can raw tweets be used to find an audience’s perception or sentiment about a person?
How can Hadoop be used to solve this problem?
How can Apache Hive be used to organize the final data in a tabular format and query it?
How can a data visualization tool be used to display the findings?
It gives an overview of sentiment analysis, natural language processing, the phases of sentiment analysis using NLP, a brief idea of machine learning, the TextBlob API, and related topics.
Sentiment analysis using naive bayes classifier | Dev Sahu
This presentation contains a short description of the Naive Bayes classifier algorithm, a machine learning approach for sentiment detection and text classification.
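To make the algorithm concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing, written with the standard library only. The class name, whitespace tokenization, and the choice to skip words never seen in training are assumptions for illustration, not the presentation's own code.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # number of documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = doc.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        tokens = [t for t in doc.lower().split() if t in self.vocab]  # skip unseen words
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A usage example: after `model = NaiveBayesText().fit(["i love this movie", "i hate this movie"], ["pos", "neg"])`, calling `model.predict("love this")` compares the two class scores and returns the label with the higher one.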
Query a topic of interest and see the sentiment for the day as a pie chart, or for the week as a line chart, based on tweets gathered from twitter.com.
Perform live-stream Twitter sentiment analysis: classify the sentiment of a given text, then further analyze people's sentiments or emotions towards the entity.
A survey on sentiment analysis and opinion mining | eSAT Journals
Abstract: Sentiment analysis is a machine learning approach in which machines analyze and classify human sentiments, emotions, opinions, etc. about a topic, expressed in the form of text or speech. The textual data available on the web is increasing day by day. To increase product sales and improve customer satisfaction, most online shopping sites let customers write reviews about products. These reviews are large in number, and sentiment analysis can be used to mine the overall sentiment or opinion polarity from all of them. Manual analysis of such a large number of reviews is practically impossible, so an automated machine approach plays a significant role in solving this hard problem. The major challenge in the area of sentiment analysis and opinion mining lies in identifying the emotions expressed in these texts. This literature survey studies the sentiment analysis problem in depth and reviews other work done on the subject. Index Terms: Sentiment Analysis, Opinion Mining, Cross Domain Sentiment Analysis
A Pipeline for Distributed Topic and Sentiment Analysis of Tweets on Pivotal ... | Srivatsan Ramanujam
Unstructured data is everywhere: in the form of posts, status updates, bloglets or news feeds in social media, or in the form of customer interactions in call center CRM. While many organizations study and monitor social media to track brand value and target specific customer segments, in our experience blending the unstructured data with structured data to supplement data science models has been far more effective than working with it independently.
In this talk we will showcase an end-to-end topic and sentiment analysis pipeline we've built on the Pivotal Greenplum Database platform for Twitter feeds from GNIP, using open source tools like MADlib and PL/Python. We've used this pipeline to build regression models that predict commodity futures from tweets and to enhance churn models for telecom through topic and sentiment analysis of call center transcripts. All of this was possible because of the flexibility and extensibility of the platform we worked with.
Sentiment Analysis/Opinion Mining of Twitter Data on Unigram/Bigram/Unigram+Bigram Model using:
1. Machine Learning
2. Lexical Scores
3. Emoticon Scores
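The three feature models named above (unigram, bigram, and unigram+bigram) can be illustrated with a short sketch. The function names, the `mode` strings, and the whitespace tokenization are assumptions for illustration and are not taken from the tool itself.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, mode="unigram+bigram"):
    """Build the feature list for one of the three models."""
    if mode not in ("unigram", "bigram", "unigram+bigram"):
        raise ValueError("unknown mode: " + mode)
    tokens = text.lower().split()
    features = []
    if mode in ("unigram", "unigram+bigram"):
        features += ngrams(tokens, 1)
    if mode in ("bigram", "unigram+bigram"):
        features += ngrams(tokens, 2)
    return features
```

Bigrams keep short phrases such as "not good" together as a single feature, which a pure unigram model would split into two unrelated words.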
YouTube Video: https://youtu.be/VuR16P87yPE
Link to the WebPage: http://akirato.github.io/Twitter-Sentiment-Analysis-Tool
Github Page: https://github.com/Akirato/Twitter-Sentiment-Analysis-Tool
Current trends of opinion mining and sentiment analysis in social networks | eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
English parts of speech are a challenge for many Indonesian teachers. The content of these slides is taken entirely from a book (unfortunately I have completely forgotten the title and author). By grouping the parts of speech and providing some examples, the book tries to 'elucidate' the seemingly perplexing topic.
Improving Sentiment Analysis of Short Informal Indonesian Product Reviews usi... | TELKOMNIKA JOURNAL
Sentiment analysis of short informal texts like product reviews is especially challenging. Short texts are sparse, noisy, and lack context information. Given these difficulties, traditional text classification methods may not be suitable for analyzing the sentiment of short texts. A common approach to overcome these problems is to enrich the original texts with additional semantics so that they resemble a large document, to which traditional classification methods can then be applied. In this study, we developed an automatic sentiment analysis system for short informal Indonesian texts using Naïve Bayes and Synonym Based Feature Expansion. The system consists of three main stages: preprocessing and normalization, feature expansion, and classification. After preprocessing and normalization, we use Kateglo to find synonyms for every word in the original text and append them. Finally, the text is classified using Naïve Bayes. The experiment shows that the proposed method can improve the performance of sentiment analysis of short informal Indonesian product reviews. The best sentiment classification performance with the proposed feature expansion is an accuracy of 98%. The experiment also shows that feature expansion gives a larger improvement with a small amount of training data than with a large amount.
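The feature expansion stage can be pictured with a short sketch. The in-memory `SYNONYMS` table below is a hypothetical stand-in for the Kateglo lookup, and the function name and example words are illustrative assumptions, not the paper's code or data.

```python
# Hypothetical synonym table (assumption) standing in for a thesaurus lookup.
SYNONYMS = {
    "bagus": ["baik", "mantap"],
    "jelek": ["buruk"],
}


def expand_features(tokens, synonyms=SYNONYMS):
    """Append the synonyms of every token to the original token list."""
    expanded = list(tokens)
    for token in tokens:
        for synonym in synonyms.get(token, []):
            if synonym not in expanded:  # avoid adding the same word twice
                expanded.append(synonym)
    return expanded
```

The expanded token list, rather than the original short text, would then be passed to the classifier, so a review that uses a rare word can still match training reviews that use one of its synonyms.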
Three experiments I have done with data science, related to text analysis and integration. The focus is on the learnings rather than on details of how it was done with source code. I feel it is important to see this subject in relation to business problems rather than as a pure branch of statistics. Focusing on what had to be done enabled me to find the right solution in a complicated and very interesting subject.
Due to the fast growth of the World Wide Web, online communication has increased. In recent times the focus of communication has shifted to social networking. To enhance text-based methods of communication such as tweets, blogs and chats, it is necessary to examine the emotion of the user by studying the input text. Customers post online reviews of the products and services on offer at a website portal. This has given impetus to substantial growth in online purchasing, making opinion analysis a vital factor for business development. Sentiment analysis is used to analyze such text and reviews. Sentiment analysis is a subdomain of Natural Language Processing that captures a writer’s feelings about products, as expressed on the internet through comments or posts. It is used to find the opinion or response of the user, which may be positive, negative or neutral. This paper reviews sentiment analysis and discusses the challenges and issues involved in the process. Approaches to sentiment analysis using dictionaries such as SenticNet, SentiFul, SentiWordNet, and WordNet are studied. Dictionary-based approaches are efficient within a domain of study. Although a generalized dictionary like WordNet may be used, the accuracy of the classifier is affected by issues like negation, synonyms, sarcasm, etc.
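The negation issue mentioned above can be seen in a small dictionary-based scorer. The `LEXICON` weights and `NEGATIONS` list are hypothetical toy values, not entries from SentiWordNet or any of the other dictionaries named, and flipping only the word right after a negation is a deliberate simplification.

```python
# Hypothetical toy dictionary and negation words (assumptions for illustration).
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
NEGATIONS = {"not", "no", "never"}


def dictionary_score(text):
    """Sum dictionary weights, flipping the sign of a word that directly follows a negation."""
    score = 0.0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        if word in LEXICON:
            value = LEXICON[word]
            score += -value if negate else value
        negate = False  # the negation only reaches the very next word
    return score


def polarity(text):
    """Label a text as positive, negative, or neutral from its dictionary score."""
    score = dictionary_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Without the negation flag, "not good" would score as positive. Even with it, a phrase like "not very good" is still misread as positive, because the flag is cleared by the intervening word. Sarcasm is out of reach for this kind of scorer altogether.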
International Journal of Engineering Research and Development (IJERD) | IJERD Editor
NLP (Natural Language Processing) is a mechanism that helps computers understand natural languages like English. In general, computers can understand well-formed data such as tables. Natural language, however, is difficult for computers to interpret. NLP helps translate natural language into a form that modern computers can process easily. Financial Tracker is an approach that uses NLP as a tool to sort user messages into various categories. The approach can be applied at multiple levels. At a personal level, it lets users filter useful financial messages out of a large pile of junk text messages. From an industrial point of view, it can be useful in services like online loan disbursal, which are now entering the market. These services try to provide online loans to individuals quickly. From a business point of view, however, loan recovery from customers becomes a crucial aspect. As most such services can’t take strict legal action against fraudulent customers, loans should be provided only to customers who deserve them. This is where the model comes into the picture. As a business, we can read the messages in a user's inbox (after obtaining the user's permission). These messages can be filtered using NLP to distinguish the various types of messages in the inbox, which can then serve as input for further prediction and analysis of the user’s behaviour in money-related transactions.
The 't' in tel software development for tel research problems, pitfalls, and ... | Roland Klemke
At the core of TEL research are artefacts of digital technology, their design, implementation, application, and evaluation. Usually, these artefacts aim to fulfil a specific educational purpose and need to satisfy a number of requirements with respect to functionality, usability, scalability, or interoperability.
Software engineering is the discipline that structures, organises, and documents all aspects of the software development process in manageable steps. It explains all relevant stakeholder roles involved in the process and defines process models to handle the complexity of the software development process.
In research-oriented projects, software engineering goals and research goals often collide: software engineering strives to provide a fully fledged system with a complete set of functionality and broad coverage of use cases, whereas research aims to evaluate testable hypotheses based on specific aspects of a system. As a result, the complexity of the design steps and of the derived or developed solution conflicts with easy-to-measure results. Furthermore, project contexts and research contexts often collide, raising the question of how to develop technology that fulfills both development needs and research needs.
The lecture looks at typical situations that occur in technology-oriented research projects and tries to show approaches for handling their inherent complexity.
To document or not to document? An exploratory study on developers' motivatio... | Hayim Makabee
Abstract: Technical debt represents the situation in a project where developers accept compromises in one dimension of a system in order to meet urgent demands in other dimensions. These compromises incur a “debt”, on which “interest” has to be paid to maintain the long-term health of the project. One of the elements of technical debt is documentation debt due to under-documentation of the evolving system. In this exploratory study, our goal is to examine the different aspects of developers' motivation to document code. Specifically, we aim to identify the motivating and hindering aspects of documentation as perceived by the developers. The motivating aspects of code documenting we find include improving code comprehensibility, order, and quality. The hindering aspects include developers’ perception of documenting as a tedious, difficult, and time consuming task that interrupts the coding process. These findings may serve as a basis for developing guidelines toward improving documentation practices and encouraging developers to document their code thus reducing documentation debt.
TOWARDS AUTOMATIC DETECTION OF SENTIMENTS IN CUSTOMER REVIEWS | ijistjournal
Opinions play an important role in knowledge discovery and information retrieval, and opinion mining can be considered a subdiscipline of data mining. The automatic extraction of human opinions from web documents has attracted major interest. The main purpose of sentiment analysis here is to help online consumers decide whether to purchase new products. Opinion mining deals with finding the sentiments that individuals express through online reviews, surveys, feedback, personal blogs, etc. The vast increase in Internet use has brought a similar increase in the use of blogs, reviews, etc. The person who actually uses these reviews or blogs is mostly a consumer or a manufacturer. As most customers buy and sell products online, it becomes a company's responsibility to keep its products up to date. Companies currently collect product reviews from customers and use them to learn where they are weak or strong, which can be accomplished with the help of sentiment analysis. The objective of our research is therefore to build a tool that automatically extracts opinion words and determines their polarity using a dictionary, which reduces the manual effort of reading and evaluating these reviews. The research also illustrates the benefits of using unstructured text instead of training data, which is expensive. In this research effort we demonstrate a rule-based method in which product reviews are extracted from review sites and analysed, so that a person can know whether a particular product review is positive, negative, or neutral. The system will use an existing knowledge base to calculate positive and negative scores and, on that basis, decide whether a product is recommended or not. The system will evaluate the utility of lexical resources over training data.
Student information management system project report ii.pdf | Kamal Acharya
Our project is about student management. It mainly covers the various actions related to student details. It makes adding, editing and deleting student details easier, and also provides a less time-consuming process for viewing, adding, editing and deleting students' marks.
Courier management system project report.pdf | Kamal Acharya
Nowadays it is very important for people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend heavily on different transport systems, which mostly receive and deliver articles manually. There is no way to track the articles until they are received, and no way to let the customer know what happened in transit once some articles have been booked. In such a situation, we need a system that completely computerizes the cargo activities, including periodic tracking of the articles sent. This need is fulfilled by the Courier Management System, online software for cargo management staff that enables them to receive goods from a source, send them to a required destination, and track their status from time to time.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
Event Management System Vb Net Project Report.pdf | Kamal Acharya
In the present era, the scope of information technology is growing very fast. Hardly any area is untouched by this industry. The scope of information technology has become wider and includes business and industry, household business, communication, education, entertainment, science, medicine, engineering, distance learning, weather forecasting, career searching and so on.
My project, named “Event Management System”, is software that stores and maintains all events coordinated in a college. It also helps to print related reports. My project will help to record the events coordinated by faculties, with their name, event subject, date and details, in an efficient and effective way.
The aim is to build a system with which a user can record all events coordinated by a particular faculty. Our proposed system adds some features, such as security, that distinguish it from the existing system.
Final project report on grocery store management system..pdf | Kamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner, especially if your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website that retails various grocery products. This project allows users to view the available products and enables registered users to purchase desired products instantly using the Paytm or UPI payment processor (Instant Pay), or to place an order using the Cash on Delivery (Pay Later) option. This project gives Administrators and Managers easy access to view orders placed using the Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of technologies must be studied and understood. These include multi-tiered architecture, server-side and client-side scripting techniques, implementation technologies, programming languages (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. The objective of this project is to develop a basic shopping cart website for consumers and to learn about the technologies used to develop such a website.
This document will discuss each of the underlying technologies used to create and implement an e-commerce website.
Automobile Management System Project Report.pdf | Kamal Acharya
The proposed project is developed to manage automobiles in an automobile dealer company. The main modules in this project are login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner must log in to use the project. The username and password are verified, and if they are correct, the next form opens. If the username and password are not correct, an error message is shown.
When a customer searches for an automobile and it is available, they are taken to a page that shows the details of the automobile, including automobile name, automobile ID, quantity, price, etc. “Automobile Management System” is useful for maintaining automobiles and customers effectively, and hence helps establish a good relationship between the customer and the automobile organization. It contains various customized modules for maintaining automobile and stock information accurately and safely.
When an automobile is sold to a customer, stock is reduced automatically. When a new purchase is made, stock is increased automatically. While automobiles are being selected for sale, the proposed software automatically checks the total available stock of that particular item; if the total stock of that item is less than 5, the software notifies the user to purchase the item.
Also, when the user tries to sell items that are not in stock, the system warns the user that the stock is not enough. Customers of this system can search for an automobile and purchase it easily with a quick selection. On the other hand, the automobile shop manager can maintain the stock of automobiles accurately, overcoming the drawbacks of the existing system.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
TABLE OF CONTENTS
Student Declaration
Certificate from the Supervisor
Acknowledgement
Summary
List of Figures
List of Tables
List of Symbols and Acronyms
Chapter-1 Introduction
1.1 General Introduction
1.2 Current Open Problems/Issues
1.3 Problem Statement
1.4 Overview of proposed solution approach and Novelty/benefits
1.5 Tabular comparison of other existing approaches/solutions to the problem framed
Chapter-2 Literature Survey
2.1 Summary of papers studied
2.2 Integrated summary of the literature studied
Chapter-3 Analysis, Design and Modeling
3.1 Overall description of the project
3.2 Functional requirements
3.3 Non Functional requirements
3.4 Logical database requirements
3.5 Design Diagrams
3.5.1 Use Case diagrams
3.5.2 Class diagrams / Control Flow Diagrams
3.5.3 Sequence Diagram/Activity diagrams
Chapter-4 Implementation and Testing
4.1 Implementation details and issues
4.1.1 Implementation Issues
4.1.2 Algorithms (Module wise- with respect to design)
4.2 Risk Analysis and Mitigation
Chapter-5 Testing (Focus on Quality of Robustness and Testing)
5.1 Testing Plan
5.2 Component decomposition and type of testing required
5.3 List all test cases
5.4 Error and Exception Handling
5.5 Limitations of the solution
Chapter-6 Findings & Conclusion
6.1 Findings
6.2 Conclusion
6.3 Future Work
References
Brief Bio-data (Resume)
DECLARATION
I hereby declare that this submission is my own work and that, to the best of my knowledge and
belief, it contains no material previously published or written by another person nor material which
has been accepted for the award of any other degree or diploma of the university or other institute of
higher learning, except where due acknowledgment has been made in the text.
Place: Noida Signature:
Date: 04-06-2015 Name: Utkarsh
Enrollment No: 9911103587
CERTIFICATE
This is to certify that the work titled "Sentiment Analysis of Opinions", submitted by Utkarsh in
partial fulfillment of the requirements for the award of the degree of B. Tech of Jaypee Institute of
Information Technology University, Noida, has been carried out under my supervision. This work has
not been submitted partially or wholly to any other University or Institute for the award of this or
any other degree.
Signature of Supervisor
Name of Supervisor Mr. Sudhanshu Kulshrestha
Designation Asst. Prof., Deptt. of CSE
Date 04-06-2015
ACKNOWLEDGEMENT
The satisfaction which accompanies the successful completion of any project is incomplete without
the mention of names of those who made it possible, because success is the epitome of hard work,
perseverance, undeterred courage, zeal, determination and the most encouraging guidance and
advice which serve as the beacon light and crown our effort with success.
I would like to thank Mrs. Shelly Sachdeva (Project Coordinator) for her constructive instructions and
appreciation.
I am deeply indebted to Mr. Sudhanshu Kulshrestha (Project Guide) for his constant guidance,
constructive counselling and unfailing encouragement throughout the completion of this project.
I would also like to thank the other faculty members of the CSE department for their continuous help
in implementing this project and in finalizing this project report.
Lastly, I thank my family for their support and encouragement.
Signature of the Student
Name of Student Utkarsh
Enrollment Number 9911103587
Date 04-06-2015
SUMMARY
Sentiment analysis of opinions is an area of Natural Language Processing with several open
problems. The problem addressed in this project is to detect the sentiment in a given text and
output a score for its overall sentiment. The input is an opinion written as text in simple English.
The score is positive if the overall sentiment of the text is positive, negative if it is negative, and
zero if it is neutral. The project follows a linguistic approach and uses NLTK, an open source
Python library. Other available methods, such as Naive Bayes classifiers, can also be used to detect
sentiment. The project is written in Python 2.7 using IDLE as the editor, and the tkinter module is
used to take text as input and to display the output in a separate window.
__________________ __________________
Signature of Student Signature of Supervisor
Name: Utkarsh Name: Mr. Sudhanshu Kulshrestha
Date: 04-06-2015
LIST OF FIGURES
Sentiment Analysis
Use case diagrams
Class diagrams
Activity or Flow diagrams
IG diagram
LIST OF TABLES
1. Tabular comparison of other existing approaches/solutions to the problem framed
3. Risk analysis and mitigation plan
4. Component testing
5. Top risk on the basis of IG diagram
6. Mitigation Approaches
7. Additional resources needed for mitigation
8. Testing Plan
9. Testing Team Details
10. Test Environment
11. Component decomposition and type of testing
12. List of all test cases
LIST OF SYMBOLS & ACRONYMS
1. NLTK: Natural Language Toolkit
2. NLP: Natural Language Processing
3. IDLE: Python's Integrated Development and Learning Environment (Python editor)
4. IDE: Integrated Development Environment
5. OS: Operating System
6. DOS: Disk Operating System
Chapter-1 Introduction
1.1 General Introduction
Sentiment analysis is a part of natural language processing (NLP). NLP is used in data analytics to
extract meaningful information from large amounts of data. Sentiment analysis is one way to learn
about current market trends from what people are thinking or saying on social media. It has many
practical applications, for example finding out which party is gaining popularity in an election, or
helping a customer who reads reviews before buying something online. These applications become
harder to solve as the size of the data keeps increasing. A large part of the effort goes into arranging
the data into something meaningful before analysing it. This arrangement of data is called Text
Classification.
In this project, sentiment classification and analysis are performed in Python using the nltk module.
NLTK is a Python library for natural language processing tasks. It supports multiple languages, such
as English, Hindi and Chinese, for classifying text or data into something meaningful.
Text Classification can be performed in following ways:
1. Sentiment-Classification
2. Features-based-Sentiment-classification
3. Summarization-of-sentiments
Sentiment classification assigns a sentiment to the complete document according to the sentiments
or opinions expressed in the text. The feature-based approach instead classifies sentiment with
respect to specific features of an entity (noun) mentioned in the text, and so reveals which qualities
of that entity are judged good or bad. Opinion summarization is similar to text summarization, but it
gives a clear indication of the sentiment attached to the text. Its output is not a substring of the
given text; it restates what the text says about the entities in positive or negative words, so that a
whole document can be described in a few words without losing its gist. These types of
classification can be performed before the text is actually analysed. After text classification, the
words are tagged.
Sentiment classification can be performed at different levels:
1. Document level
2. Sentence level
3. Word level
English is one of the most preferred languages for natural language processing. This project works
on opinions written in English and does not support other languages.
Consider an example: "I watched the movie burger. The movie was very good and the actor did an
awesome job."
This text clearly talks about the movie and the actor and states a positive review. Now consider a
sarcastic sentence:
"When Modi returned from U.S.A., I got my 15 lakhs as promised by PM Modi"
The sentiment classifier is still not able to classify sarcasm. Sarcasm remains a big problem for data
analytics and a topic of research, and handling it in a program is much harder. There are two
approaches to sentiment analysis:
1. Linguistic approach
2. Machine Learning
1.2 Current Open Problems/Issues
1. Linguistic approach:
This is the basic approach to sentiment analysis. It tags the tokens and then analyses them.
The problems with this approach are:
Negation: This approach does not deal with negation well. Sometimes it produces a result
opposite in sense to the text.
Grammatically incorrect sentences: This approach matches words against datasets during
tagging. If a sentence with polarity is grammatically incorrect, its words may not match the
existing datasets of polar words. To be effective, the datasets must contain all the polar
words used in regular language.
Sometimes users say one thing but mean something else. Such sentences make sense to a
reader but cannot be analysed by a machine.
2. Machine learning approach: Methods that classify text, such as Naive Bayes or SVM, also
suffer from problems such as:
1. Sarcasm
2. Jumbling of words
3. Chat text or tweets: limited words to type
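The negation problem can be illustrated with a minimal sketch. The word lists below are hypothetical stand-ins for the project's datasets, and the rule of flipping the polarity of a word that directly follows a negator is one common heuristic, not the method used in this project.

```python
import re

# Hypothetical polar-word lists; the project's real datasets are larger.
POSITIVE = {"good", "awesome", "great"}
NEGATIVE = {"bad", "boring", "poor"}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def naive_score(text):
    """Count +1 per positive word and -1 per negative word, ignoring negation."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokenize(text))


def negation_aware_score(text):
    """Same count, but flip the polarity of a word that directly follows a negator."""
    score = 0
    previous = ""
    for token in tokenize(text):
        value = (token in POSITIVE) - (token in NEGATIVE)
        if previous in NEGATORS:
            value = -value
        score += value
        previous = token
    return score


print(naive_score("The movie was not good"))           # 1: the negation is missed
print(negation_aware_score("The movie was not good"))  # -1
```

Even the negation-aware variant only looks one word back, so a phrase such as "not very good" would still be scored as positive.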
1.3 Problem Statement
To take the given text as input, analyse it, and show the polarity score of the input text. If the score
is greater than zero the sentiment is positive, if it is less than zero the sentiment is negative, and a
score of zero means neutral.
The input is taken from the user as a string. The approach used in this project first splits the string
into tokens. When tokenization is complete, each token is tagged and then evaluated. This generates
a score which, after conversion from integer to string, is displayed on the screen.
1.4 Overview of proposed solution approach and Novelty/benefits
The proposed solution is built in Python 2.7 using the nltk module, with tkinter to take input from
the user and to display the output. It is based on the linguistic approach. It takes the string, tokenizes
it, tags the tokens, and matches the tagged tokens against the datasets in the database. Finally it
evaluates a score for each polar word and calculates the score for the given text.
The code is divided into classes:
splitter_class: The input is given as a string, and a whole paragraph cannot be evaluated as
it is. This class first splits the text into tokens (words) using the tokenizer function of the
nltk module.
pos_tagger_class: When splitting is done, this class adds a tag to each token so that the
tokens can be classified as verbs, nouns, adverbs, adjectives etc. It returns the tagged
sentences.
dictionary_tagger_class: This class uses the datasets available with the project to build a
dictionary of tagged tokens from the tokens produced by splitter_class.
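The following is a minimal sketch of how three such classes might fit together. It uses only the standard library as a stand-in: a regular expression replaces the nltk tokenizer, a small hypothetical lookup table replaces the nltk part-of-speech tagger, and an in-memory dictionary replaces the project's dataset files. The method names split, pos_tag and tag follow the class diagram; the method bodies are assumptions.

```python
import re


class Splitter(object):
    """Split a paragraph into sentences, and each sentence into word tokens."""

    def split(self, text):
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        return [re.findall(r"[A-Za-z']+", s) for s in sentences if s]


class POSTagger(object):
    """Attach a part-of-speech tag to each token (hypothetical lookup table)."""

    KNOWN_TAGS = {"good": "JJ", "awesome": "JJ", "very": "RB", "movie": "NN"}

    def pos_tag(self, sentences):
        # Unknown words default to "NN"; a real tagger is far more accurate.
        return [[(tok, self.KNOWN_TAGS.get(tok.lower(), "NN")) for tok in sent]
                for sent in sentences]


class DictionaryTagger(object):
    """Add polarity tags from a lexicon to each (token, POS) pair."""

    def __init__(self, lexicon):
        self.lexicon = lexicon

    def tag(self, pos_tagged_sentences):
        return [[(tok, pos, self.lexicon.get(tok.lower(), []))
                 for tok, pos in sent]
                for sent in pos_tagged_sentences]


# Hypothetical lexicon standing in for the dataset files.
lexicon = {"good": ["positive"], "awesome": ["positive"], "bad": ["negative"]}
sentences = Splitter().split("The movie was very good. The actor did an awesome job.")
tagged = DictionaryTagger(lexicon).tag(POSTagger().pos_tag(sentences))
print(tagged[0])
```

Each token ends up as a tuple of the word, its part-of-speech tag, and a list of polarity tags, which is empty for words that are not in the lexicon.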
1.5 Tabular comparison of other existing approaches/solutions to the problem framed
Linguistic approach: Simple approach, easy to code, and gives good results with simple texts.
Naive Bayes classifier approach: Machine learning approach. Works by first learning from one part of the text and then evaluating the other part of the text based on the learning outcome. Outperforms the linguistic approach.
Support Vector Machine approach: Machine learning approach. Works by analysing the data and finding patterns in the data to evaluate. Gives a better classification and better results.
Chapter-2 Literature Studied
2.1 Summary of Papers
Paper-1: Sentiment Analysis And Opinion Mining
By: G.Vinodhini and RM.Chandrasekaran [June 6,2012]
Department of Computer Science and Engineering, Annamalai
University, Annamalai Nagar-608002.
Summary
The internet today holds a large volume of data that is regularly updated and growing: social
networks, news, entertainment, reviews, blogs and discussion forums together provide a large
number of opinions. Data analytics work on sentiment analysis focuses on these opinions.
Researchers are currently working to build software that detects and classifies the texts available
online. Precise information extracted from these resources can tell us a lot about what users like and
dislike and what they want or do not want to buy. Sellers can use this information to offer better
deals to users, and users can use reviews to find better deals. After classification and analysis, the
data available on the internet can be very valuable to users.
This paper is a survey of the methods used in data analytics and sentiment analysis and of the
problems that exist in this area.
Weblink- http://www.dmi.unict.it/~faro/tesi/sentiment_analysis/SA2.pdf
Paper-2 : Boost up! Sentiment Categorization with Machine
Learning Techniques
By: Andrés Cassinelli, Chih-Wei Chen [June 5, 2009]
Summary
The paper notes that, for calculating the sentiment of a given text, opinion or review, its methods are
close to those of earlier work on review and sentiment analysis but produce more precise results. If
the methods are applied to multi-class classification, the results could be much the same. A
classification technique first uses part of the data as a training set to train itself and then evaluates
the rest of the data; in this way the technique described in the paper captures the relationship
between the objects efficiently.
Weblink- http://www.cs.cornell.edu/home/llee/papers/sentiment.pdf
Paper-3: Twitter as a Corpus for Sentiment Analysis and Opinion
Mining
By: Alexander Pak, Patrick Paroubek [2010]
Université de Paris-Sud, Laboratoire LIMSI-CNRS, Bâtiment 508, F-91405 Orsay Cedex, France
Summary
Today social network sites like Twitter, Facebook, Google Plus and LinkedIn are popular tools for
communicating with other people on the internet. Thousands of people share information with each
other. This information may be useful to some and of no use to others. If properly analysed, it could
be very useful for some purposes, for example in the form of opinions or results for others. These
social sites can therefore be very effective in generating useful information about many aspects of
present-day life. However, little work had been done in this area because social networking sites
had come into existence only recently. In this paper, the authors describe in detail the use of Twitter,
one of the most popular social networks at present, for sentiment analysis.
Weblink: http://lrec-conf.org/proceedings/lrec2010/pdf/385_Paper.pdf
2.2 Integrated Summary of the literature studied
Sentiment analysis is currently one of the popular topics in research. Work is going on in this area
for languages that had not been studied until now, such as Arabic, Hindi and Thai. Open source
libraries are available for programming languages such as Python and R, which make it easy to
analyse and process text. Sentiment analysis can be used for various purposes, such as reviewing
movies, a company's products or the company itself, or gauging the feelings or emotions of citizens
about a country. The most popular way to do this is to collect this information from social media
and analyse it. To turn it into something meaningful, classifier techniques must be used.
The data must be in a readable format, in English. Classifiers are used to tokenize or classify the
data. In the machine learning approach, a supervised learning technique learns to detect sentiments
from one part of the text and then analyses the sentiments of the rest of the text. The linguistic
approach is unsupervised: the text is first split into tokens, and tags are added to the tokens to
evaluate the sentiment of the text.
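To make the supervised route concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. The four training sentences are invented for illustration, and this is not the implementation found in NLTK or any other library, which would normally be preferred.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labelled_texts):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in labelled_texts:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_log_prob = None, float("-inf")
    for label in label_counts:
        log_prob = math.log(label_counts[label] / float(total_docs))
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            count = word_counts[label][word] + 1
            log_prob += math.log(count / float(total_words + len(vocabulary)))
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label


# Invented training data, for illustration only.
training_set = [
    ("the movie was good", "positive"),
    ("an awesome actor", "positive"),
    ("the plot was bad", "negative"),
    ("a boring movie", "negative"),
]
model = train(training_set)
print(classify("good actor", model))
print(classify("boring plot", model))
```

The first part of the data trains the model and the remaining text is then classified with it, which is the train-then-evaluate pattern described above.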
How to get lots of data to evaluate:
Social sites
1. Facebook.com
2. Twitter.com
3. LinkedIn.com
News websites and comments
Movie Reviewing sites
Products selling sites
1. Flipkart
2. Snapdeal
Blogs etc
Techniques used presently are:
Machine learning
1. Naive Bayes classifier
2. Support Vector Machine
3. Decision tree
Text Structure:
An array of sentences
Each sentence is in turn split into tokens
Each word or token is padded with 2 other tags in dictionary format. These added tags let
each token be recognized as a verb, noun, adjective, adverb etc., in order to check whether
the token is a polar word.
Separate datasets are kept so that each token can be matched with the words present in the
datasets.
First, collection of data is a concern. Useful data is required before any analysis can start.
Sentiment analysis is performed on data about a product or a review, where the user wants to know
whether the subject is good or not. Sentiments can carry various types of polarity or emotion about
a particular thing.
Summarizing opinions is also one of the major concerns of today's researchers. Summarizing
sentiments does not mean printing a subset or one part of the text. It means printing the data with a
precise sense in fewer words, while still including the subject of the text.
Fig. 1: Sentiment classification and analysis. Stages shown: Product Reviews, Sentiment identification (opinionative words or phrases), Feature selection (features), Sentiment classification, Sentiment polarity.
Chapter-3 Analysis, Design and Modeling
3.1 Overall description of the project
3.1.1 Introduction
This software was built on a 64-bit Windows 8 platform with the 32-bit version of Python 2.7.
It uses the "nltk" module, which can be downloaded from nltk.org. The input section uses the
tkinter module to get the input and to display the output. Tkinter must be installed before
the software can run on any system. All the setups listed above are available free of charge;
Python itself can be downloaded from the official Python website.
Purpose
This software can be used by anyone who wants to classify movie reviews, product
reviews or any other opinions as positive or negative.
Scope
The opinions must be in English and use simple words. Other languages are not
supported. Sarcasm and negation may not be handled well, so in those cases the result
may vary or be unexpected.
Product perspective
This software does not depend on any hardware or software other than the resources
provided by the system. A Python setup with the nltk and tkinter modules does all the
work required.
Product functions
This software takes a string typed by the user and produces the sentiment score. The
user types a string and waits for the output. Processing may take some time,
depending on the size of the text typed by the user.
User characteristics
The user can be anyone who wants to analyse data on the basis of the polarity of its
sentences. The software works in the same way for each user, and the execution time
of the text processing depends on the size of the data given by the user.
Constraints
The user must know English and know how to install the Python setups. Once the
Python setups are installed, no prior knowledge is required to use this software. The
hardware configuration must be met.
Assumptions and dependencies
The system must support 32-bit Python 2.7. The tkinter code may not work with
Python 3.x because the module is named Tkinter in Python 2 and tkinter in Python 3.
The Windows platform must be XP, Vista, 7 or 8. Memory must be at least 512 MB.
The system handles text files.
3.2 Functional Requirements:
Sentiment analysis has to be performed on text in English, and the output is one of:
a. Positive
b. Negative
c. Neutral (zero)
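A minimal sketch of how a numeric score could be mapped to these three outputs is shown below. The function name sentiment_label is hypothetical and is not taken from the project's source code.

```python
def sentiment_label(score):
    """Map a numeric polarity score to Positive, Negative, or Neutral."""
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


for score in (3, -2, 0):
    # The integer score is converted to a string before it is displayed.
    print(str(score) + " -> " + sentiment_label(score))
```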
3.3 Non Functional Requirements
Data selection: Data can be downloaded from the Stanford site, from various user review
sites, or from social networks. Reviews for movies and reviews for products must be
checked against the separate datasets listed in the database.
Accessibility: To access the data provided with nltk, run nltk.download() in IDLE.
Documentation: Each file contains comments for explanation.
Maintainability: The code does not need maintenance unless it is altered.
Portability: The user just needs to run the .py file on any system to analyse
reviews/opinions.
Reliability: It depends on the language structure of the opinions.
Response Time: Long reviews take more time to pre-process and tokenize.
3.4 Logical database requirements
For the database, separate files with the extension .yaml are kept with the source code in a separate
folder. The YAML format maps easily onto data structures that are common to many programming
languages, such as arrays and dictionaries. No SQL or other database concepts are used in the
project. The dataset files are attached to the source code through their directory/file paths using
Python file handling.
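As an illustration of reading such a dataset, the sketch below parses lines of the form `word: [tag]` into a Python dictionary. This flat layout, the tag names and the sample contents are assumptions, since the report does not show the files themselves. A YAML library such as PyYAML (yaml.safe_load) would normally be used; the hand-written parser here only keeps the example self-contained and handles nothing beyond this one layout.

```python
import io


def load_lexicon(lines):
    """Parse lines of the assumed form `word: [tag1, tag2]` into a dict."""
    lexicon = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        word, _, tags = line.partition(":")
        tags = tags.strip().strip("[]")
        lexicon[word.strip()] = [t.strip() for t in tags.split(",") if t.strip()]
    return lexicon


# Hypothetical file contents; a real run would pass open(path) instead.
sample_file = io.StringIO(
    u"# polar words\ngood: [positive]\nvery: [inc]\nbad: [negative]\n"
)
lexicon = load_lexicon(sample_file)
print(lexicon["good"])
```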
3.5 Design Diagrams
3.5.1 Use Case diagrams
[Use case diagram. Actor: user. System: background processing. Steps shown:
1. Input string
2. Takes input
3. Press Enter
4. Start processing the data, tokenization
5. Waiting for the output
6. Output screen appears with the sentiment score]
3.5.2 Class diagrams / Control Flow Diagrams
[Class diagram. Classes shown:
Object class (python)
Splitter_class: +init(), +split()
Pos_Tagger_Class: +init(), +pos_tag()
Dictionary_tagger_Class: +init(), +tag(), +tag_sentence()]
Chapter-4 Implementation details and issues
4.1 Implementation details and issues
The implementation is done in Python 2.7 using the nltk and tkinter modules. NLTK, which is open
source, is used for text processing. It provides many corpora for data analytics. These corpora can be
used to build grammars or taggers, which in turn can be used to tag tokens and generate efficiently
classified data.
To download corpora such as chats, books or novels for use in data analytics, run nltk.download()
in the Python editor after importing the module with "import nltk". This opens the NLTK
downloader, from which the documents required for sentiment analysis can be fetched.
The code uses file handling in Python, so check the paths carefully first. All the dataset files must be
in place and their path names must be given to dictionary_tagger_class.
Python 3.x may not be compatible with this code, because many functions, including those of
tkinter, changed in the 3.x versions of Python. The code contains 3 classes:
splitter_class: splits text into tokens
pos_tagger_class: tags the tokens
dictionary_tagger_class: turns the tagged tokens into a dictionary data type
4.1.1 Implementation Issues
Finding compatible functions in the nltk module and HTML parsing functions were among
the issues with the project. There are many changes between Python 2.7 and 3.x, so
keeping the syntax compatible with one version was also an issue. Tkinter also differs
between Python 2.7 and Python 3.x because of the changes introduced in Python 3.x.
4.1.2 Algorithms (Module wise- with respect to design)
The first module copies content from the web to download the reviews.
The second module tokenizes the text and converts it into lists of strings.
The third module tags the tokens with accurate tags.
The fourth module handles files, attaching the dataset files to the source code.
The fifth module builds the dictionary-tagged data members from the text tokens.
The sixth module displays the text together with its polar words and the result.
For input and output, Tkinter is used with Python. It takes the input and supplies it to the
sentiment analysis code; after processing, the sentiment analysis code returns the score,
which is displayed on the screen using Tkinter. Tkinter is a separate module for Python.
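The input and output flow can be sketched as follows. The names format_score and run_gui are hypothetical, and the layout (one entry field, one button, one label in a single window) is an assumption; the report describes the output as appearing in a separate window. The import is wrapped so that the sketch loads under either the Python 3 or the Python 2 module name.

```python
def format_score(score):
    """Convert the integer score to the string shown in the output label."""
    return "Sentiment score: " + str(score)


def run_gui(analyse):
    """Open a window that reads text and displays analyse(text) as a string.

    `analyse` is any function that takes a string and returns an integer score.
    """
    try:
        import tkinter as tk   # Python 3 module name
    except ImportError:
        import Tkinter as tk   # Python 2 module name

    root = tk.Tk()
    root.title("Sentiment Analysis")
    entry = tk.Entry(root, width=60)
    entry.pack()
    result = tk.StringVar()
    tk.Label(root, textvariable=result).pack()

    def on_submit(event=None):
        result.set(format_score(analyse(entry.get())))

    entry.bind("<Return>", on_submit)   # pressing Enter starts the analysis
    tk.Button(root, text="Analyse", command=on_submit).pack()
    root.mainloop()


print(format_score(2))
```

Calling run_gui with a scoring function opens the window; nothing graphical runs until that call is made.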
The approach is the linguistic approach. First, the text is tokenized using the tokenizer
function and tags are added to the tokens. The tokens are then matched against the existing
datasets, which are stored separately in files with the .yaml extension. If a token is found,
the tag attached to it is checked, and on the basis of that tag the token is evaluated as
positive or negative. If the token is not found in the datasets, it is treated as neutral.
Adjectives or adverbs increase the score in the direction of the polarity of the words they
accompany.
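The scoring rule described above can be sketched as follows. The lexicon, the "inc" tag for intensifying modifiers and the doubling factor are assumptions chosen for illustration; the report does not state the actual tags or weights.

```python
import re

# Hypothetical lexicon: polarity tags, plus "inc" for intensifying modifiers.
LEXICON = {
    "good": "positive", "awesome": "positive",
    "bad": "negative", "boring": "negative",
    "very": "inc", "really": "inc",
}


def score_text(text):
    """Sum +1/-1 per polar word; a preceding intensifier doubles that word's value."""
    score = 0
    previous_tag = None
    for token in re.findall(r"[a-z']+", text.lower()):
        tag = LEXICON.get(token)          # None means neutral: contributes 0
        value = {"positive": 1, "negative": -1}.get(tag, 0)
        if previous_tag == "inc":
            value *= 2                    # push further in the word's own direction
        score += value
        previous_tag = tag
    return score


print(score_text("The movie was very good"))     # 2
print(score_text("The plot was really boring"))  # -2
print(score_text("The actor arrived"))           # 0
```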
4.2 Risk Analysis and Mitigation
Risk Id | Description of risk | Risk area | Probability (P) | Impact (I) | RE (P*I) | Risk selected for mitigation (Y/N) | Mitigation plan | Classification
1 | Memory overflow/underflow | Memory | 0.001 | L | 0.001 | Y | Try/catch block | Code and unit test
2 | Invalid input (not string) | Conversion of data type problem, too large numbers, passing string of greater size than allowed | 0.3 | L | 0.3 | N | - | Code and unit test
7 | Improper use of function (not passing required parameters) | Prototyping | 0.3 | M | 0.9 | N | - | Coding implementation
3 | Performance | Time of execution | 0.3 | M | 0.9 | N | - | Development process
4 | Compiler not working | Compiler problem | 0.001 | L | 0.001 | Y | Re-install/Re-open | Environment and test
5 | Code not working | Code altered | 0.3 | M | 0.9 | N | - | Engineering specialities
6 | Unwanted output | Code altered | 0.1 | L | 0.1 | N | - | Engineering specialities
Interrelationship Graph
[Interrelationship graph. Nodes and weights: Memory (wt: 0.001), Code not working (wt: 0.9), Performance (wt: 0.9), Unwanted output (wt: 0.1), Prototyping (wt: 0.9), Compiler problem (wt: 0.001), Data type/range (wt: 0.3)]
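The exposure values in the risk table of Section 4.2 are consistent with RE = P × I when impact L is weighted 1 and M is weighted 3. These weights are inferred from the listed values and are not stated in the report. Under that assumption the column can be recomputed with a short sketch:

```python
# Numeric weights inferred from the table (L -> 1, M -> 3); an assumption.
IMPACT_WEIGHT = {"L": 1, "M": 3}

# (risk id, description, probability, impact) as listed in the risk table.
RISKS = [
    (1, "Memory overflow/underflow", 0.001, "L"),
    (2, "Invalid input (not string)", 0.3, "L"),
    (3, "Performance", 0.3, "M"),
    (4, "Compiler not working", 0.001, "L"),
    (5, "Code not working", 0.3, "M"),
    (6, "Unwanted output", 0.1, "L"),
    (7, "Improper use of function", 0.3, "M"),
]


def risk_exposure(probability, impact):
    """Return RE = P * I, rounded to avoid floating-point noise."""
    return round(probability * IMPACT_WEIGHT[impact], 3)


for risk_id, description, probability, impact in RISKS:
    print("%d %s %s" % (risk_id, description, risk_exposure(probability, impact)))
```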
S.No | Risk Area | # of Risk statements | Weights (in+out) | Total weight | Priority
1 | Code altered | 4 | 0.1+0.1+0.1+0.9 | 1.2 | High
2 | Memory | 2 | 0.001+0.3 | 0.301 | Low
3 | Data type/range | 2 | 0.3+0.9 | 1.2 | High
4 | Performance | 1 | 0.1 | 0.1 | Low
5 | Prototyping | 2 | 0.3+0.9 | 1.2 | High
6 | Compiler problem | 1 | 0.9 | 0.9 | Medium
Top risks, taken as the ones with maximum total weight from the graph:
Risk Id | Risk Statement | Risk Area | Priority of Risk area in IG
1 | Code not working/unwanted output | Code Altered | 1
Mitigation Approaches
Use a try/except block (Python's equivalent of try/catch) to handle invalid input.
Make function definitions private.
For a compiler problem, re-install or re-open it, or check the Python path in the environment
variable.
For unwanted output, check the range of the input values or the prototypes of the functions.
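The first mitigation can be sketched as follows. The function names and the 10000-character limit are hypothetical; the report does not give the actual limit or the code it uses.

```python
def validate_input(text, max_length=10000):
    """Raise an error unless `text` is a non-empty string within the size limit."""
    if not isinstance(text, str):
        raise TypeError("input must be a string, got %s" % type(text).__name__)
    if not text.strip():
        raise ValueError("input is empty")
    if len(text) > max_length:
        raise ValueError("input longer than %d characters" % max_length)
    return text


def safe_analyse(text, analyse):
    """Run analyse(text), returning None instead of crashing on invalid input."""
    try:
        return analyse(validate_input(text))
    except (TypeError, ValueError) as error:
        print("Invalid input: %s" % error)
        return None


print(safe_analyse("a good movie", len))  # len stands in for the real scorer
print(safe_analyse(12345, len))           # a number is rejected, not analysed
```

Rejecting non-string input before analysis corresponds to the "Invalid input (not string)" risk listed in the risk table.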
Date started | Date to complete | Owner
1-May-2015 | 15-May-2015 | Utkarsh
Additional resources needed for mitigation
Copy the source code for backup.
Chapter-5 Testing (Focus on Quality of Robustness and Testing)
5.1 Testing Plan
The source code for sentiment analysis was checked against different reviews taken from different
sites. A test file is also maintained for this purpose in a separate folder, and its output is saved as
well. The types of testing performed are listed here:
Type of Test | Will test be performed? | Comments/explanation | Software component
Requirement testing | Yes | - | -
Unit | Yes | Listed in first program | Source files
Integration | Yes | Linked with source file using file handling | Database files
Performance | Yes | Depends on the execution of text input | Length of text in tkinter
Stress | Yes | - | Compiled py files
Compliance | No | - | -
Security | No | Not hidden | Dot py file for implementation
Load | No | - | -
Volume | No | - | -
Example test cases | Yes | A number of test cases are written in the main file and added with the datasets | Main files and datasets
Compilation | Yes | For syntactical errors | Python source files
Test Team Details
Tester | Utkarsh | Performed all the test cases
Test Schedule
Activity | Start date | Completion date | Hours | Comments
Obtain input data | 01/05/2015 | 10/05/2015 | 3 hours/day | Input taken from various sources on internet
Test region setup | 11/05/2015 | 15/05/2015 | 3 hours/day | Input taken from various sources on internet
TEST ENVIRONMENT- Description of test platforms
Software Items
Operating system: Windows 8
Notepad
Python editor and compiler
tkinter and nltk
Hardware Items
A complete system with pre-installed software for running Python programs, and the nltk and
tkinter modules
5.2 Component decomposition and type of testing required
S.No | List of various components | Type of testing required | Technique of writing test cases
1 | TEST1 | Integration | White box
2 | TEST2 | Performance | Black box
3 | TEST3 | Example test cases | Black box
4 | TEST4 | Compilation | White box
5.3 List all test cases in prescribed format
Test cases for component
Test case Id | Input | Output | Status
TEST1 | Linked with file | Console output score | Pass
TEST2 | Datasets | Console output score | Pass
TEST3 | Numbers | Integral | Fail
TEST3 | String | Score | Pass
TEST3 | Review from online site | Score | Pass
TEST4 | Example test cases linked with separate files | Console output | Pass
5.4 Error and Exception Handling (mention debugging techniques with which you have corrected errors)
Test case id | Test case for component | Debugging technique
1 | Tkinter | Print or tracing
2 | Source code | Backtracking
5.5 Limitations of the solution
The source code does not work well for the following cases:
Grammatically ill-formed sentences.
Sentences containing sarcasm.
Negation, which may not be handled well.
Very large texts (text files of several MB), which Python takes a long time to process.
Jumbling of words in sentences.
Chapter-6 Findings & Conclusion
6.1 Findings
The sentiment analysis works well for simple English, but not for other languages. Sentences must be
simple and straightforward, because the approach does not handle cases such as jumbled word order
or sarcasm. Input can be taken from tkinter in text format and the result displayed the same way.
The nltk module works well for natural language processing. It also provides other techniques for
classifying text, such as a Naive Bayes classifier or SVM. nltk includes different kinds of tagging
functions to attach tags to tokens.
6.2 Conclusion
The approach used in the project works efficiently with plain English text. It is easy to code and
simple to understand, and it does not require constructing regular expressions. Built-in taggers
are available that can be used directly on the text. To make it more efficient, different
techniques can be combined. A Naive Bayes classifier or an SVM can work better for complex sentences.
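As a rough illustration of the Naive Bayes option mentioned above, the sketch below trains a tiny multinomial Naive Bayes model with add-one smoothing using only the Python standard library. It is not the project's code: the function names and the four training sentences are invented for the example, and a real system would more likely use a library implementation such as `nltk.NaiveBayesClassifier`.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labelled_sentences):
    """Collect per-label word counts and label counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in labelled_sentences:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label in label_counts:
        log_prob = math.log(label_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # skip words never seen in training
            count = word_counts[label][word] + 1
            log_prob += math.log(count / (total_words + len(vocabulary)))
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label


# Invented training data, for illustration only.
TRAINING_DATA = [
    ("i love this phone", "positive"),
    ("what a great movie", "positive"),
    ("i hate this phone", "negative"),
    ("what a terrible movie", "negative"),
]

if __name__ == "__main__":
    model = train_naive_bayes(TRAINING_DATA)
    print(classify("i love this great movie", model))
    print(classify("a terrible phone", model))
```

Because the model learns word weights from labelled examples instead of relying on fixed rules, it can adapt to phrasing that hand-written rules miss, provided enough training text is available.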
6.3 Future Work
Use machine learning techniques such as supervised learning: train on one part of the text
and use the trained model to analyze the rest of the text.
Combine different techniques and compare the results of the combined approach.
Extend this work to other languages such as Hindi.
Construct a regular grammar, with custom regular expressions, to make the tagging part
more efficient.
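The last item, tagging with custom regular expressions, could look roughly like the following sketch. The suffix rules, the tag names and the `regex_tag` function are assumptions made for illustration, not part of the project; nltk offers a comparable ready-made class, `RegexpTagger`, which takes a similar list of (pattern, tag) pairs.

```python
import re

# Hypothetical rules, tried in order; the first matching pattern wins.
TAG_PATTERNS = [
    (re.compile(r"^-?\d+(\.\d+)?$"), "CD"),    # numbers
    (re.compile(r".*ing$"), "VBG"),            # gerunds
    (re.compile(r".*ed$"), "VBD"),             # simple past verbs
    (re.compile(r".*ly$"), "RB"),              # adverbs
    (re.compile(r".*(ful|ous|able)$"), "JJ"),  # adjectives
    (re.compile(r".*s$"), "NNS"),              # plural nouns
]
DEFAULT_TAG = "NN"  # fall back to "noun" when no rule matches


def regex_tag(sentence):
    """Tag each whitespace-separated token using the first matching rule."""
    tagged = []
    for token in sentence.split():
        for pattern, tag in TAG_PATTERNS:
            if pattern.match(token.lower()):
                tagged.append((token, tag))
                break
        else:
            tagged.append((token, DEFAULT_TAG))
    return tagged


if __name__ == "__main__":
    print(regex_tag("She quickly watched 2 wonderful movies"))
```

Suffix rules like these are fast and need no training data, but they mis-tag irregular words (for example, "She" falls through to the default noun tag), so they are usually combined with a trained tagger.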
References
[1] http://en.wikipedia.org/wiki/sentiment-analysis1
[2] http://inltk.org
[3] http://marl.gi2mo.org/img/class_diagram_v0.2.png
[4] http://www.nltk.org/books
[5] http://nlp.stanford.edu/IR-book/html/htmledition/edit-distance-1.html
[6] https://wiki.python.org/moin/TkInter
[7] www.tutorialspoint.com/python/python_sending_email.htm
Appendix
A. Time Line
Milestone dates: 01-02, 04-03, 20-03, 25-04, 10-05, 25-05, 04-06
Activities, in order: Synopsis; Study research papers and Implementation; Midterm report; Implementation; Testing; Report
Resume
Utkarsh
Date of Birth: 15-08-1993
E-Mail: soniutkarsh@ymail.com
Phone No.: +91-8468088422 Codechef Profile:Utkarsh3587
Interests:
Data Structures
Algorithms
Operating Systems
Object Oriented Programming
Education:
B.Tech., Computer Science & Engineering-2015
Jaypee Institute of Information Technology , Noida
4th year (7th Semester), Current CGPA: 6.2/10.
Senior Secondary-2010
Sardar Patel Public Senior Secondary School , Delhi
CBSE with 74.6% .
Secondary-2008
Sardar Patel Public Senior Secondary School , Delhi
CBSE with 83.8%.
Skillset:
Programming Languages: C , C++
Operating Systems : Ubuntu , Windows
Web Technologies: HTML, CSS, JavaScript
Projects:
Hybrid Cross Platform Application
This project was built on the PhoneGap platform using web technologies such as HTML, CSS and
JavaScript. In this project I implemented functionality such as downloading study
material, playing quizzes, reading newspapers and a few other functions.
Face Recognition Application using OpenCV for Android
This was an Android application project based on image processing using the OpenCV libraries. It
detects faces and recognizes them on the basis of stored images.