SlideShare a Scribd company logo
Tweet Segmentation and Its Application to Named Entity Recognition
Abstract:
Twitter has attracted millions of users to share and disseminate most up-to-
date information, resulting in large volumes of data produced everyday.
However, many applications in Information Retrieval (IR) and Natural
Language Processing (NLP) suffer severely from the noisy and short nature
of tweets. In this paper, we propose a novel framework for tweet
segmentation in a batch mode, called HybridSeg. By splitting tweets into
meaningful segments, the semantic or context information is well
preserved and easily extracted by the downstream applications. HybridSeg
finds the optimal segmentation of a tweet by maximizing the sum of the
stickiness scores of its candidate segments. The stickiness score considers
the probability of a segment being a phrase in English (i.e., global context)
and the probability of a segment being a phrase within the batch of tweets
(i.e., local context). For the latter, we propose and evaluate two models to
derive local context by considering the linguistic features and term-
dependency in a batch of tweets, respectively. HybridSeg is also designed
to iteratively learn from confident segments as pseudo feedback.
Experiments on two tweet data sets show that tweet segmentation quality
is significantly improved by learning both global and local contexts
compared with using global context alone. Through analysis and
comparison, we show that local linguistic features are more reliable for
learning local context compared with term-dependency. As an application,
we show that high accuracy is achieved in named entity recognition by
applying segment-based part-of-speech (POS) tagging.
Existing System:
Many organizations have been reported to create and monitor targeted
Twitter streams to collect and understand users’ opinions. Targeted Twitter
stream is usually constructed by filtering tweets with predefined selection
criteria (e.g., tweets published by users from a geographical region, tweets
that match one or more predefined keywords).
Due to its invaluable business value of timely information from these
tweets, it is imperative to understand tweets’ language for a large body of
downstream applications, such as named entity recognition (NER) event
detection and summarization opinion mining sentiment analysis and many
others.
Given the limited length of a tweet (i.e., 140 characters) and no restrictions
on its writing styles, tweets often contain grammatical errors, misspellings,
and informal abbreviations. The error-prone and short nature of tweets
often make the word-level language models for tweets less reliable. For
example, given a tweet “I call her, no answer.
Proposed System:
To achieve high quality tweet segmentation, we propose a generic tweet
segmentation framework, named HybridSeg. HybridSeg learns from both
global and local contexts, and has the ability of learning from pseudo
feedback.
Global context: Tweets are posted for information sharing and
communication. The named entities and semantic phrases are well
preserved in tweets. The global context derived from Web pages (e.g.,
Microsoft Web N-Gram corpus) or Wikipedia therefore helps identifying
the meaningful segments in tweets.
Hardware Requirements:
• System : Pentium IV 2.4 GHz.
• Hard Disk : 40 GB.
• Floppy Drive : 1.44 Mb.
• Monitor : 15 VGA Colour.
• Mouse : Logitech.
• RAM : 256 Mb.
Software Requirements:
• Operating system : - Windows XP.
• Front End : - JSP
• Back End : - SQL Server
Software Requirements:
• Operating system : - Windows XP.
• Front End : - .Net
• Back End : - SQL Server

More Related Content

What's hot

Odsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graphOdsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graph
venkatramanJ4
 
Detection of cyber-bullying
Detection of cyber-bullying Detection of cyber-bullying
Detection of cyber-bullying
Ziar Khan
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Ruchika Sharma
 
Final Poster for Engineering Showcase
Final Poster for Engineering ShowcaseFinal Poster for Engineering Showcase
Final Poster for Engineering ShowcaseTucker Truesdale
 
Social Trust-aware Recommendation System: A T-Index Approach
Social Trust-aware Recommendation System: A T-Index ApproachSocial Trust-aware Recommendation System: A T-Index Approach
Social Trust-aware Recommendation System: A T-Index Approach
Nima Dokoohaki
 
IRJET- Fake News Detection
IRJET- Fake News DetectionIRJET- Fake News Detection
IRJET- Fake News Detection
IRJET Journal
 
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET Journal
 
Learning to detect phishing ur ls
Learning to detect phishing ur lsLearning to detect phishing ur ls
Learning to detect phishing ur ls
eSAT Publishing House
 
Evaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social NetworkEvaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social Network
International Journal of Science and Research (IJSR)
 
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
IJASRD Journal
 
tweet segmentation
tweet segmentation tweet segmentation
tweet segmentation
prashanttarone
 
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust networkBig Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
Ruchika Sharma
 
Phishing detection in ims using domain ontology and cba an innovative rule ...
Phishing detection in ims using domain ontology and cba   an innovative rule ...Phishing detection in ims using domain ontology and cba   an innovative rule ...
Phishing detection in ims using domain ontology and cba an innovative rule ...
ijistjournal
 
A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...
gerogepatton
 
IRJET- Fake Profile Identification using Machine Learning
IRJET-  	  Fake Profile Identification using Machine LearningIRJET-  	  Fake Profile Identification using Machine Learning
IRJET- Fake Profile Identification using Machine Learning
IRJET Journal
 
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
IJCNCJournal
 
Fake News Detection using Machine Learning
Fake News Detection using Machine LearningFake News Detection using Machine Learning
Fake News Detection using Machine Learning
ijtsrd
 
Using hash tag graph based topic model to connect semantically-related words ...
Using hash tag graph based topic model to connect semantically-related words ...Using hash tag graph based topic model to connect semantically-related words ...
Using hash tag graph based topic model to connect semantically-related words ...
Shakas Technologies
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic Regression
IRJET Journal
 

What's hot (19)

Odsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graphOdsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graph
 
Detection of cyber-bullying
Detection of cyber-bullying Detection of cyber-bullying
Detection of cyber-bullying
 
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHIBig Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
Big Data Analytics- USE CASES SOLVED USING NETWORK ANALYSIS TECHNIQUES IN GEPHI
 
Final Poster for Engineering Showcase
Final Poster for Engineering ShowcaseFinal Poster for Engineering Showcase
Final Poster for Engineering Showcase
 
Social Trust-aware Recommendation System: A T-Index Approach
Social Trust-aware Recommendation System: A T-Index ApproachSocial Trust-aware Recommendation System: A T-Index Approach
Social Trust-aware Recommendation System: A T-Index Approach
 
IRJET- Fake News Detection
IRJET- Fake News DetectionIRJET- Fake News Detection
IRJET- Fake News Detection
 
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
 
Learning to detect phishing ur ls
Learning to detect phishing ur lsLearning to detect phishing ur ls
Learning to detect phishing ur ls
 
Evaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social NetworkEvaluation Social Ties and Trust in Online Social Network
Evaluation Social Ties and Trust in Online Social Network
 
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
An Approach to Detect and Avoid Social Engineering and Phasing Attack in Soci...
 
tweet segmentation
tweet segmentation tweet segmentation
tweet segmentation
 
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust networkBig Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
Big Data Analysis- Live DATA PRESENTATION- Bitcoin Alpha trust network
 
Phishing detection in ims using domain ontology and cba an innovative rule ...
Phishing detection in ims using domain ontology and cba   an innovative rule ...Phishing detection in ims using domain ontology and cba   an innovative rule ...
Phishing detection in ims using domain ontology and cba an innovative rule ...
 
A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...A Comparative Analysis of Different Feature Set on the Performance of Differe...
A Comparative Analysis of Different Feature Set on the Performance of Differe...
 
IRJET- Fake Profile Identification using Machine Learning
IRJET-  	  Fake Profile Identification using Machine LearningIRJET-  	  Fake Profile Identification using Machine Learning
IRJET- Fake Profile Identification using Machine Learning
 
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
A Deep Learning Technique for Web Phishing Detection Combined URL Features an...
 
Fake News Detection using Machine Learning
Fake News Detection using Machine LearningFake News Detection using Machine Learning
Fake News Detection using Machine Learning
 
Using hash tag graph based topic model to connect semantically-related words ...
Using hash tag graph based topic model to connect semantically-related words ...Using hash tag graph based topic model to connect semantically-related words ...
Using hash tag graph based topic model to connect semantically-related words ...
 
IRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic RegressionIRJET- Fake News Detection using Logistic Regression
IRJET- Fake News Detection using Logistic Regression
 

Similar to Tweet segmentation and its application to named entity recognition

Questions about questions
Questions about questionsQuestions about questions
Questions about questionsmoresmile
 
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
IJET - International Journal of Engineering and Techniques
 
Tweet Summarization and Segmentation: A Survey
Tweet Summarization and Segmentation: A SurveyTweet Summarization and Segmentation: A Survey
Tweet Summarization and Segmentation: A Survey
vivatechijri
 
P1803018289
P1803018289P1803018289
P1803018289
IOSR Journals
 
How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?
George Sam
 
A Baseline Based Deep Learning Approach of Live Tweets
A Baseline Based Deep Learning Approach of Live TweetsA Baseline Based Deep Learning Approach of Live Tweets
A Baseline Based Deep Learning Approach of Live Tweets
ijtsrd
 
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET Journal
 
Event summarization using tweets
Event summarization using tweetsEvent summarization using tweets
Event summarization using tweets
moresmile
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Shailly Saxena
 
Twitter sentiment analysis.pptx
Twitter sentiment analysis.pptxTwitter sentiment analysis.pptx
Twitter sentiment analysis.pptx
Rishita Gupta
 
Harnessing Web Page Directories for Large-Scale Classification of Tweets
Harnessing Web Page Directories for Large-Scale Classification of TweetsHarnessing Web Page Directories for Large-Scale Classification of Tweets
Harnessing Web Page Directories for Large-Scale Classification of Tweets
Gabriela Agustini
 
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET Journal
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-r
Yanchang Zhao
 
Lexicon Based Emotion Analysis on Twitter Data
Lexicon Based Emotion Analysis on Twitter DataLexicon Based Emotion Analysis on Twitter Data
Lexicon Based Emotion Analysis on Twitter Data
ijtsrd
 
Groundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitterGroundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitter
Dan Nguyen
 
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNINGA STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
IRJET Journal
 
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET Journal
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
Sumit Raj
 
Named Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet SegmentationNamed Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet Segmentation
IRJET Journal
 
IRJET - Suicidal Text Detection using Machine Learning
IRJET -  	  Suicidal Text Detection using Machine LearningIRJET -  	  Suicidal Text Detection using Machine Learning
IRJET - Suicidal Text Detection using Machine Learning
IRJET Journal
 

Similar to Tweet segmentation and its application to named entity recognition (20)

Questions about questions
Questions about questionsQuestions about questions
Questions about questions
 
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
[IJET-V2I2P3] Authors:Vijay Choure, Kavita Mahajan ,Nikhil Patil, Aishwarya N...
 
Tweet Summarization and Segmentation: A Survey
Tweet Summarization and Segmentation: A SurveyTweet Summarization and Segmentation: A Survey
Tweet Summarization and Segmentation: A Survey
 
P1803018289
P1803018289P1803018289
P1803018289
 
How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?How Anonymous Can Someone be on Twitter?
How Anonymous Can Someone be on Twitter?
 
A Baseline Based Deep Learning Approach of Live Tweets
A Baseline Based Deep Learning Approach of Live TweetsA Baseline Based Deep Learning Approach of Live Tweets
A Baseline Based Deep Learning Approach of Live Tweets
 
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
IRJET- An Experimental Evaluation of Mechanical Properties of Bamboo Fiber Re...
 
Event summarization using tweets
Event summarization using tweetsEvent summarization using tweets
Event summarization using tweets
 
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation SystemLatent Dirichlet Allocation as a Twitter Hashtag Recommendation System
Latent Dirichlet Allocation as a Twitter Hashtag Recommendation System
 
Twitter sentiment analysis.pptx
Twitter sentiment analysis.pptxTwitter sentiment analysis.pptx
Twitter sentiment analysis.pptx
 
Harnessing Web Page Directories for Large-Scale Classification of Tweets
Harnessing Web Page Directories for Large-Scale Classification of TweetsHarnessing Web Page Directories for Large-Scale Classification of Tweets
Harnessing Web Page Directories for Large-Scale Classification of Tweets
 
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
IRJET- A Survey on Trend Analysis on Twitter for Predicting Public Opinion on...
 
RDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-rRDataMining slides-text-mining-with-r
RDataMining slides-text-mining-with-r
 
Lexicon Based Emotion Analysis on Twitter Data
Lexicon Based Emotion Analysis on Twitter DataLexicon Based Emotion Analysis on Twitter Data
Lexicon Based Emotion Analysis on Twitter Data
 
Groundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitterGroundhog day: near duplicate detection on twitter
Groundhog day: near duplicate detection on twitter
 
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNINGA STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
A STUDY ON TWITTER SENTIMENT ANALYSIS USING DEEP LEARNING
 
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 
Sentiment Analysis of Twitter Data
Sentiment Analysis of Twitter DataSentiment Analysis of Twitter Data
Sentiment Analysis of Twitter Data
 
Named Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet SegmentationNamed Entity Recognition using Tweet Segmentation
Named Entity Recognition using Tweet Segmentation
 
IRJET - Suicidal Text Detection using Machine Learning
IRJET -  	  Suicidal Text Detection using Machine LearningIRJET -  	  Suicidal Text Detection using Machine Learning
IRJET - Suicidal Text Detection using Machine Learning
 

More from ieeepondy

Demand aware network function placement
Demand aware network function placementDemand aware network function placement
Demand aware network function placement
ieeepondy
 
Service description in the nfv revolution trends, challenges and a way forward
Service description in the nfv revolution trends, challenges and a way forwardService description in the nfv revolution trends, challenges and a way forward
Service description in the nfv revolution trends, challenges and a way forward
ieeepondy
 
Secure optimization computation outsourcing in cloud computing a case study o...
Secure optimization computation outsourcing in cloud computing a case study o...Secure optimization computation outsourcing in cloud computing a case study o...
Secure optimization computation outsourcing in cloud computing a case study o...
ieeepondy
 
Spatial related traffic sign inspection for inventory purposes using mobile l...
Spatial related traffic sign inspection for inventory purposes using mobile l...Spatial related traffic sign inspection for inventory purposes using mobile l...
Spatial related traffic sign inspection for inventory purposes using mobile l...
ieeepondy
 
Standards for hybrid clouds
Standards for hybrid cloudsStandards for hybrid clouds
Standards for hybrid clouds
ieeepondy
 
Rfhoc a random forest approach to auto-tuning hadoop's configuration
Rfhoc a random forest approach to auto-tuning hadoop's configurationRfhoc a random forest approach to auto-tuning hadoop's configuration
Rfhoc a random forest approach to auto-tuning hadoop's configuration
ieeepondy
 
Resource and instance hour minimization for deadline constrained dag applicat...
Resource and instance hour minimization for deadline constrained dag applicat...Resource and instance hour minimization for deadline constrained dag applicat...
Resource and instance hour minimization for deadline constrained dag applicat...
ieeepondy
 
Reliable and confidential cloud storage with efficient data forwarding functi...
Reliable and confidential cloud storage with efficient data forwarding functi...Reliable and confidential cloud storage with efficient data forwarding functi...
Reliable and confidential cloud storage with efficient data forwarding functi...
ieeepondy
 
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
ieeepondy
 
Scalable cloud–sensor architecture for the internet of things
Scalable cloud–sensor architecture for the internet of thingsScalable cloud–sensor architecture for the internet of things
Scalable cloud–sensor architecture for the internet of things
ieeepondy
 
Scalable algorithms for nearest neighbor joins on big trajectory data
Scalable algorithms for nearest neighbor joins on big trajectory dataScalable algorithms for nearest neighbor joins on big trajectory data
Scalable algorithms for nearest neighbor joins on big trajectory data
ieeepondy
 
Robust workload and energy management for sustainable data centers
Robust workload and energy management for sustainable data centersRobust workload and energy management for sustainable data centers
Robust workload and energy management for sustainable data centers
ieeepondy
 
Privacy preserving deep computation model on cloud for big data feature learning
Privacy preserving deep computation model on cloud for big data feature learningPrivacy preserving deep computation model on cloud for big data feature learning
Privacy preserving deep computation model on cloud for big data feature learning
ieeepondy
 
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
ieeepondy
 
Protection of big data privacy
Protection of big data privacyProtection of big data privacy
Protection of big data privacy
ieeepondy
 
Power optimization with bler constraint for wireless fronthauls in c ran
Power optimization with bler constraint for wireless fronthauls in c ranPower optimization with bler constraint for wireless fronthauls in c ran
Power optimization with bler constraint for wireless fronthauls in c ran
ieeepondy
 
Performance aware cloud resource allocation via fitness-enabled auction
Performance aware cloud resource allocation via fitness-enabled auctionPerformance aware cloud resource allocation via fitness-enabled auction
Performance aware cloud resource allocation via fitness-enabled auction
ieeepondy
 
Performance limitations of a text search application running in cloud instances
Performance limitations of a text search application running in cloud instancesPerformance limitations of a text search application running in cloud instances
Performance limitations of a text search application running in cloud instances
ieeepondy
 
Performance analysis and optimal cooperative cluster size for randomly distri...
Performance analysis and optimal cooperative cluster size for randomly distri...Performance analysis and optimal cooperative cluster size for randomly distri...
Performance analysis and optimal cooperative cluster size for randomly distri...
ieeepondy
 
Predictive control for energy aware consolidation in cloud datacenters
Predictive control for energy aware consolidation in cloud datacentersPredictive control for energy aware consolidation in cloud datacenters
Predictive control for energy aware consolidation in cloud datacenters
ieeepondy
 

More from ieeepondy (20)

Demand aware network function placement
Demand aware network function placementDemand aware network function placement
Demand aware network function placement
 
Service description in the nfv revolution trends, challenges and a way forward
Service description in the nfv revolution trends, challenges and a way forwardService description in the nfv revolution trends, challenges and a way forward
Service description in the nfv revolution trends, challenges and a way forward
 
Secure optimization computation outsourcing in cloud computing a case study o...
Secure optimization computation outsourcing in cloud computing a case study o...Secure optimization computation outsourcing in cloud computing a case study o...
Secure optimization computation outsourcing in cloud computing a case study o...
 
Spatial related traffic sign inspection for inventory purposes using mobile l...
Spatial related traffic sign inspection for inventory purposes using mobile l...Spatial related traffic sign inspection for inventory purposes using mobile l...
Spatial related traffic sign inspection for inventory purposes using mobile l...
 
Standards for hybrid clouds
Standards for hybrid cloudsStandards for hybrid clouds
Standards for hybrid clouds
 
Rfhoc a random forest approach to auto-tuning hadoop's configuration
Rfhoc a random forest approach to auto-tuning hadoop's configurationRfhoc a random forest approach to auto-tuning hadoop's configuration
Rfhoc a random forest approach to auto-tuning hadoop's configuration
 
Resource and instance hour minimization for deadline constrained dag applicat...
Resource and instance hour minimization for deadline constrained dag applicat...Resource and instance hour minimization for deadline constrained dag applicat...
Resource and instance hour minimization for deadline constrained dag applicat...
 
Reliable and confidential cloud storage with efficient data forwarding functi...
Reliable and confidential cloud storage with efficient data forwarding functi...Reliable and confidential cloud storage with efficient data forwarding functi...
Reliable and confidential cloud storage with efficient data forwarding functi...
 
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
Rebuttal to “comments on ‘control cloud data access privilege and anonymity w...
 
Scalable cloud–sensor architecture for the internet of things
Scalable cloud–sensor architecture for the internet of thingsScalable cloud–sensor architecture for the internet of things
Scalable cloud–sensor architecture for the internet of things
 
Scalable algorithms for nearest neighbor joins on big trajectory data
Scalable algorithms for nearest neighbor joins on big trajectory dataScalable algorithms for nearest neighbor joins on big trajectory data
Scalable algorithms for nearest neighbor joins on big trajectory data
 
Robust workload and energy management for sustainable data centers
Robust workload and energy management for sustainable data centersRobust workload and energy management for sustainable data centers
Robust workload and energy management for sustainable data centers
 
Privacy preserving deep computation model on cloud for big data feature learning
Privacy preserving deep computation model on cloud for big data feature learningPrivacy preserving deep computation model on cloud for big data feature learning
Privacy preserving deep computation model on cloud for big data feature learning
 
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
Pricing the cloud ieee projects, ieee projects chennai, ieee projects 2016,ie...
 
Protection of big data privacy
Protection of big data privacyProtection of big data privacy
Protection of big data privacy
 
Power optimization with bler constraint for wireless fronthauls in c ran
Power optimization with bler constraint for wireless fronthauls in c ranPower optimization with bler constraint for wireless fronthauls in c ran
Power optimization with bler constraint for wireless fronthauls in c ran
 
Performance aware cloud resource allocation via fitness-enabled auction
Performance aware cloud resource allocation via fitness-enabled auctionPerformance aware cloud resource allocation via fitness-enabled auction
Performance aware cloud resource allocation via fitness-enabled auction
 
Performance limitations of a text search application running in cloud instances
Performance limitations of a text search application running in cloud instancesPerformance limitations of a text search application running in cloud instances
Performance limitations of a text search application running in cloud instances
 
Performance analysis and optimal cooperative cluster size for randomly distri...
Performance analysis and optimal cooperative cluster size for randomly distri...Performance analysis and optimal cooperative cluster size for randomly distri...
Performance analysis and optimal cooperative cluster size for randomly distri...
 
Predictive control for energy aware consolidation in cloud datacenters
Predictive control for energy aware consolidation in cloud datacentersPredictive control for energy aware consolidation in cloud datacenters
Predictive control for energy aware consolidation in cloud datacenters
 

Recently uploaded

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
EduSkills OECD
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
Col Mukteshwar Prasad
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
Special education needs
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
Steve Thomason
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)
rosedainty
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
Nguyen Thanh Tu Collection
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
PedroFerreira53928
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
AzmatAli747758
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 

Recently uploaded (20)

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
How to Break the cycle of negative Thoughts
How to Break the cycle of negative ThoughtsHow to Break the cycle of negative Thoughts
How to Break the cycle of negative Thoughts
 
special B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdfspecial B.ed 2nd year old paper_20240531.pdf
special B.ed 2nd year old paper_20240531.pdf
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
The Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve ThomasonThe Art Pastor's Guide to Sabbath | Steve Thomason
The Art Pastor's Guide to Sabbath | Steve Thomason
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)Template Jadual Bertugas Kelas (Boleh Edit)
Template Jadual Bertugas Kelas (Boleh Edit)
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
GIÁO ÁN DẠY THÊM (KẾ HOẠCH BÀI BUỔI 2) - TIẾNG ANH 8 GLOBAL SUCCESS (2 CỘT) N...
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
PART A. Introduction to Costumer Service
PART A. Introduction to Costumer ServicePART A. Introduction to Costumer Service
PART A. Introduction to Costumer Service
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 

Tweet segmentation and its application to named entity recognition

  • 1. Tweet Segmentation and Its Application to Named Entity Recognition Abstract: Twitter has attracted millions of users to share and disseminate most up-to- date information, resulting in large volumes of data produced everyday. However, many applications in Information Retrieval (IR) and Natural Language Processing (NLP) suffer severely from the noisy and short nature of tweets. In this paper, we propose a novel framework for tweet segmentation in a batch mode, called HybridSeg. By splitting tweets into meaningful segments, the semantic or context information is well preserved and easily extracted by the downstream applications. HybridSeg finds the optimal segmentation of a tweet by maximizing the sum of the stickiness scores of its candidate segments. The stickiness score considers the probability of a segment being a phrase in English (i.e., global context) and the probability of a segment being a phrase within the batch of tweets (i.e., local context). For the latter, we propose and evaluate two models to derive local context by considering the linguistic features and term- dependency in a batch of tweets, respectively. HybridSeg is also designed to iteratively learn from confident segments as pseudo feedback. Experiments on two tweet data sets show that tweet segmentation quality is significantly improved by learning both global and local contexts compared with using global context alone. Through analysis and
  • 2. comparison, we show that local linguistic features are more reliable for learning local context compared with term-dependency. As an application, we show that high accuracy is achieved in named entity recognition by applying segment-based part-of-speech (POS) tagging. Existing System: Many organizations have been reported to create and monitor targeted Twitter streams to collect and understand users’ opinions. Targeted Twitter stream is usually constructed by filtering tweets with predefined selection criteria (e.g., tweets published by users from a geographical region, tweets that match one or more predefined keywords). Due to its invaluable business value of timely information from these tweets, it is imperative to understand tweets’ language for a large body of downstream applications, such as named entity recognition (NER) event detection and summarization opinion mining sentiment analysis and many others. Given the limited length of a tweet (i.e., 140 characters) and no restrictions on its writing styles, tweets often contain grammatical errors, misspellings, and informal abbreviations. The error-prone and short nature of tweets often make the word-level language models for tweets less reliable. For example, given a tweet “I call her, no answer. Proposed System:
  • 3. To achieve high quality tweet segmentation, we propose a generic tweet segmentation framework, named HybridSeg. HybridSeg learns from both global and local contexts, and has the ability of learning from pseudo feedback. Global context: Tweets are posted for information sharing and communication. The named entities and semantic phrases are well preserved in tweets. The global context derived from Web pages (e.g., Microsoft Web N-Gram corpus) or Wikipedia therefore helps identifying the meaningful segments in tweets. Hardware Requirements: • System : Pentium IV 2.4 GHz. • Hard Disk : 40 GB. • Floppy Drive : 1.44 Mb. • Monitor : 15 VGA Colour. • Mouse : Logitech. • RAM : 256 Mb. Software Requirements: • Operating system : - Windows XP. • Front End : - JSP • Back End : - SQL Server
  • 4. Software Requirements: • Operating system : - Windows XP. • Front End : - .Net • Back End : - SQL Server