Tweet are being created short text message and shared for both users and data analysts. Twitter which receive
over 400 million tweets per day has emerged as an invaluable source of news, blogs, opinions and more. our
proposed work consists three components tweet stream clustering to cluster tweet using k-means cluster
algorithm and second tweet cluster vector technique to generate rank summarization using greedy algorithm,
therefore requires functionality which significantly differ from traditional summarization . in general, tweet
summarization and third to detect and monitors the summary-based and volume based variation to produce
timeline automatically from tweet stream. Implementing continuous tweet stream reducing a text document is
however not an simple task, since a huge number of tweets are worthless, unrelated and raucous in nature, due
to the social nature of tweeting. Further, tweets are strongly correlated with their posted instance and up-to-theminute
tweets tend to arrive at a very fast rate. Efficiency—tweet streams are always very big in level, hence the
summarization algorithm should be greatly capable; Flexibility—it should provide tweet summaries of random
moment durations. (3) Topic evolution—it should routinely detect sub-topic changes and the moments that they
happen.
On Summarization and Timeline Generation for Evolutionary Tweet Streams1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
Generating event storylines from microblogsmoresmile
The document presents a study that analyzes factors influencing tweet rates about weather events. It finds that tweet rates correlate most strongly with weather extremeness, followed by change in weather and expectation. Data on tweets and weather from 2010-2011 across 56 North American cities were analyzed using linear regression models to correlate daily tweet volumes with measures of weather expectation, extremeness, and change from historical records. The study provides quantitative insight into how inherent biases in self-reported social media data relate to characteristics of real-world events.
Real-time Classification of Malicious URLs on Twitter using Machine Activity ...Pete Burnap
This document summarizes research on classifying malicious URLs on Twitter in real-time using machine activity data. The researchers collected data on URLs shared on Twitter during sporting events and used a honeypot to identify malicious ones. They built machine learning models to predict maliciousness based on metrics like CPU usage, network traffic, and processes when a URL was clicked. The best model was a multi-layer perceptron that achieved up to 72% accuracy. It showed network activity, CPU usage, and processes were predictive. Testing on a new dataset showed some independence between events. Using only 1% of training data caused a small 5% drop in performance, alleviating concerns over data requirements.
Kelas 9Bab 5 : Dasar Jaringan Komputeranggraeni618
Dokumen tersebut memberikan penjelasan singkat tentang jaringan komputer, termasuk pengertian jaringan komputer, macam-macam jaringan komputer seperti LAN, MAN, WAN, dan wireless, pengertian topologi jaringan beserta contoh topologi seperti ring, bus, star, mesh dan tree. Dokumen tersebut juga menjelaskan pengertian internet dan intranet serta beberapa aplikasi internet populer seperti email, world wide web, dan file transfer protocol.
Dokumen tersebut memberikan penjelasan tentang perangkat keras utama dan pendukung yang dibutuhkan untuk mengakses internet, termasuk modem, komputer, saluran telepon, switch, repeater, bridge, dan router. Dokumen tersebut juga menjelaskan fungsi setiap perangkat tersebut.
The document discusses database and application security. It covers securing the database layer from unauthorized access and data compromise. Specific risks like SQL injection and privilege issues for database administrators are addressed. The document also discusses securing web servers and applications from attacks. Techniques like implementing profiles to restrict users, auditing, and using a reverse proxy for authentication are proposed to enhance security.
On Summarization and Timeline Generation for Evolutionary Tweet Streams1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
Generating event storylines from microblogsmoresmile
The document presents a study that analyzes factors influencing tweet rates about weather events. It finds that tweet rates correlate most strongly with weather extremeness, followed by change in weather and expectation. Data on tweets and weather from 2010-2011 across 56 North American cities were analyzed using linear regression models to correlate daily tweet volumes with measures of weather expectation, extremeness, and change from historical records. The study provides quantitative insight into how inherent biases in self-reported social media data relate to characteristics of real-world events.
Real-time Classification of Malicious URLs on Twitter using Machine Activity ...Pete Burnap
This document summarizes research on classifying malicious URLs on Twitter in real-time using machine activity data. The researchers collected data on URLs shared on Twitter during sporting events and used a honeypot to identify malicious ones. They built machine learning models to predict maliciousness based on metrics like CPU usage, network traffic, and processes when a URL was clicked. The best model was a multi-layer perceptron that achieved up to 72% accuracy. It showed network activity, CPU usage, and processes were predictive. Testing on a new dataset showed some independence between events. Using only 1% of training data caused a small 5% drop in performance, alleviating concerns over data requirements.
Kelas 9Bab 5 : Dasar Jaringan Komputeranggraeni618
Dokumen tersebut memberikan penjelasan singkat tentang jaringan komputer, termasuk pengertian jaringan komputer, macam-macam jaringan komputer seperti LAN, MAN, WAN, dan wireless, pengertian topologi jaringan beserta contoh topologi seperti ring, bus, star, mesh dan tree. Dokumen tersebut juga menjelaskan pengertian internet dan intranet serta beberapa aplikasi internet populer seperti email, world wide web, dan file transfer protocol.
Dokumen tersebut memberikan penjelasan tentang perangkat keras utama dan pendukung yang dibutuhkan untuk mengakses internet, termasuk modem, komputer, saluran telepon, switch, repeater, bridge, dan router. Dokumen tersebut juga menjelaskan fungsi setiap perangkat tersebut.
The document discusses database and application security. It covers securing the database layer from unauthorized access and data compromise. Specific risks like SQL injection and privilege issues for database administrators are addressed. The document also discusses securing web servers and applications from attacks. Techniques like implementing profiles to restrict users, auditing, and using a reverse proxy for authentication are proposed to enhance security.
In Soil, Generally there exists pores which are field with water. These pores are interconnected & hence it
becomes highly complex & intricate network of irregular tubes. Where there is pressure difference, water flows
from high potential to low potential zone. The pores may be large or small, may be irregular in shape. The
quantum of flow depends on the above. The ease with which water can flow through soil is the permeability of
the soil. Geotechnical engineers may come across the problem of soil which may be highly permeable or may
have restricted permeability. Engineers are needed to quantitatively assess the amount of flow likely to occur.
Dokumen tersebut memberikan instruksi cara membuat tabel, grafik, dan diagram organisasi menggunakan Microsoft PowerPoint. Langkah pertamanya adalah memilih menu insert dan memilih jenis objek yang diinginkan, seperti tabel, chart atau smartart. Kemudian mengikuti petunjuk-petunjuk selanjutnya seperti memilih jenis tabel, grafik atau diagram yang diinginkan.
Este documento fornece um perfil socioeconômico das 4.036 famílias beneficiárias do Programa Banco de Alimentos em Santo André que estão cadastradas no Cadastro Único. Mostra que a maioria dessas famílias são chefiadas por mulheres, negras, com renda muito baixa e alta dependência. Muitos trabalhadores dessas famílias atuam na informalidade.
Bab 4 menjelaskan cara menambahkan animasi, video, dan suara pada presentasi PowerPoint. Pertama, klik animations lalu tambahkan efek masuk, penekanan, dan keluar pada teks. Kedua, atur animasi untuk berjalan otomatis dengan mengklik start sebelum atau sesudah yang sebelumnya. Ketiga, tambahkan file suara dan video dengan mengklik transition sound dan menambahkan media clips.
1. The document analyzes the stresses on a differential housing cover manufactured by pressure die casting.
2. Finite element analysis is used to model and analyze stresses on the part, identifying high stress areas that could cause failures.
3. The analysis finds the stresses are within acceptable limits except near an opening, and deformation is minimal.
4. The study optimizes the part thickness to minimize stresses and reduce rejected parts.
Generation Strong Yoga Club for Girls FINAL2Sarah Wolfe
The Yoga Club for Girls document announces a fall and winter 2016 schedule of gatherings focused on inner strength, mindful living, and knowing yourself led by guest teacher Sarah Wolfe. It also advertises a Girl's Night Out event on November 11th from 7-8:30pm that will use yoga, meditation, and mantras to help heal body image, strengthen self-esteem, and cultivate self-love. The event costs $25 and includes crafts, snacks, and activities. Registration can be done online or by phone.
This document summarizes an algorithm for efficiently clustering short messages like tweets into general domains or topics. The algorithm breaks the clustering task into two stages: (1) batch clustering of user-annotated data using hashtags to create dense virtual documents, and (2) online clustering of new tweets using the centroids from stage 1. Experimental results show the algorithm outperforms other clustering methods and can accurately cluster large streams of sparse, short messages efficiently.
Tweet Summarization and Segmentation: A Surveyvivatechijri
The document summarizes various techniques for tweet summarization and segmentation, and discusses how the particle swarm optimization (PSO) algorithm can be used for tweet clustering. It provides an overview of 8 papers that propose different approaches for tweet summarization using techniques like graph clustering, keyword graphs, and segmentation. It also analyzes these techniques and compares the performance of PSO to other clustering algorithms like K-means. The analysis shows that methods combining K-means and PSO have higher accuracy than K-means alone, and that PSO performs better than K-means for real-time tweet clustering, achieving an accuracy of 90.6% versus 62% for K-means.
Twitter is a free social networking microblogging service that allows registered members to broadcast, in real-time, short posts called tweets. Twitter members can broadcast tweets and follow other users’ tweets by using multiple devices, making this information system one of the fastest in the world. In this chapter, we leverage this characteristic to introduce a novel topic-detection method aimed at informing, in real-time, a specific user about the most emerging arguments expressed by the network around his/her domain interests. With this goal, we aim at formalizing the information spread over the network by studying the topology of the network and by modeling the implicit and explicit connections among the users. Then, we propose an innovative term aging model, based on a biological metaphor, to retrieve the freshest arguments of discussion, represented through a minimal set of terms, expressed by the community within the foci of interest of a specific user. We finally test the proposed model through various experiments and user studies.
This paper describes the use of Storm at Twitter. Storm is a realtime fault-tolerant and distributed stream data processing system.Storm is currently being used to run various critical computations in Twitter at scale, and in real-time. This paper describes the architecture of Storm and its methods for distributed scale-out and fault-tolerance. This paper also describes how queries (aka.topologies) are executed in Storm, and presents some operational stories based on running Storm at Twitter. We also present results
from an empirical evaluation demonstrating the resilience of
Storm in dealing with machine failures. Storm is under active
development at Twitter and we also present some potential
directions for future work.
The document provides an overview of automatic text summarization techniques. It discusses the large amount of online content available and how summarization can help analyze large corpora. The key challenges of automatic summarization include determining sentence importance and eliminating redundancy. Extractive summarization selects existing text snippets while abstractive summarization can generate new text. Summarization has applications like summarizing tweets about sports games or reviews. Advanced approaches use social media metadata and graph-based models.
This document summarizes a final project report for a parallel text mining framework that analyzes tweets in real-time. The project crawls tweets from major news outlets using Twitter's API and analyzes each tweet using hidden Markov models for part-of-speech tagging. The analysis is performed in parallel across 30 machines with 16 cores each using MPI. Bottlenecks include Twitter's API rate limits and scraping news articles, which are addressed through multi-threading and batch processing tweets.
Social Sensor for Real Time Event DetectionIJERA Editor
Social sites play an important role in our day to day life. Peoples are spending at most of the time on social
networking sites and for this reason social networking sites have received a more attention in last few years.
Social networking websites have received greater attention in the past few years .Twitter is one of the most
popular networking site which received attention from peoples by providing its real time nature. The important
characteristic of our system is its real-time nature. We create a system to collects the tweets from various groups
of users and detect real time events like earthquake, traffic etc. To detect real time events, we build a classifier
of comments based on features such as the keywords in a comments, the number of words. With the help of
comments we will approximately find out the location of the target event. We assume each user as a sensor and
apply priority base algorithm which is used for tweet analysis.
To resolve the traffic congestion problem, we have to consider the volume of the traffic, traffic speed, road
occupancy etc. Classify tweets into a positive and negative class. Produce a probabilistic model for event
detection. We create group of users on the basis of their interest such as peoples from traffic police department
are post the tweets related to traffic event, also peoples form whether department are post their tweets related to
flood etc. Each tweet is associated with user group, tweet, time and location. By processing time and location
information, we can detect the real time events.
This document discusses a proposed system for segmenting tweets and applying that segmentation to name entity recognition. The proposed system uses k-means clustering to segment tweets into categories like Bollywood, business, education, politics, and sports. This segmentation aims to preserve the semantic meaning of tweets and improve downstream applications like named entity recognition, achieving better accuracy than word-based approaches. The proposed system also adds security features like blocking users and is intended to be more efficient and flexible than prior algorithms.
IRJET - Twitter Spam Detection using CobwebIRJET Journal
This document discusses using Cobweb clustering and Gradient Boosting techniques to detect spam on Twitter. Cobweb clustering creates a classification tree to predict attributes of new objects by summarizing the attribute distributions of existing nodes. Gradient Boosting is an ensemble method that uses multiple weak learners (decision trees) to create a stronger predictive model. The paper aims to combine these techniques to create an enhanced spam detection system. It also reviews several existing approaches for Twitter spam detection using techniques like Hidden Markov Models, Random Forests, and asynchronous link-based algorithms.
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...IRJET Journal
This document discusses an improved machine learning approach for extracting breaking news from Twitter based on trending topics. The approach aims to filter tweets to remove irrelevant information, cluster similar tweets that relate to real-world events to identify breaking news stories, and dynamically rank the identified news stories over time for tracking. The approach is evaluated using different supervised text classification algorithms to classify tweets as news or not and a density-based clustering algorithm to group related tweets.
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
In Soil, Generally there exists pores which are field with water. These pores are interconnected & hence it
becomes highly complex & intricate network of irregular tubes. Where there is pressure difference, water flows
from high potential to low potential zone. The pores may be large or small, may be irregular in shape. The
quantum of flow depends on the above. The ease with which water can flow through soil is the permeability of
the soil. Geotechnical engineers may come across the problem of soil which may be highly permeable or may
have restricted permeability. Engineers are needed to quantitatively assess the amount of flow likely to occur.
Dokumen tersebut memberikan instruksi cara membuat tabel, grafik, dan diagram organisasi menggunakan Microsoft PowerPoint. Langkah pertamanya adalah memilih menu insert dan memilih jenis objek yang diinginkan, seperti tabel, chart atau smartart. Kemudian mengikuti petunjuk-petunjuk selanjutnya seperti memilih jenis tabel, grafik atau diagram yang diinginkan.
Este documento fornece um perfil socioeconômico das 4.036 famílias beneficiárias do Programa Banco de Alimentos em Santo André que estão cadastradas no Cadastro Único. Mostra que a maioria dessas famílias são chefiadas por mulheres, negras, com renda muito baixa e alta dependência. Muitos trabalhadores dessas famílias atuam na informalidade.
Bab 4 menjelaskan cara menambahkan animasi, video, dan suara pada presentasi PowerPoint. Pertama, klik animations lalu tambahkan efek masuk, penekanan, dan keluar pada teks. Kedua, atur animasi untuk berjalan otomatis dengan mengklik start sebelum atau sesudah yang sebelumnya. Ketiga, tambahkan file suara dan video dengan mengklik transition sound dan menambahkan media clips.
1. The document analyzes the stresses on a differential housing cover manufactured by pressure die casting.
2. Finite element analysis is used to model and analyze stresses on the part, identifying high stress areas that could cause failures.
3. The analysis finds the stresses are within acceptable limits except near an opening, and deformation is minimal.
4. The study optimizes the part thickness to minimize stresses and reduce rejected parts.
Generation Strong Yoga Club for Girls FINAL2Sarah Wolfe
The Yoga Club for Girls document announces a fall and winter 2016 schedule of gatherings focused on inner strength, mindful living, and knowing yourself led by guest teacher Sarah Wolfe. It also advertises a Girl's Night Out event on November 11th from 7-8:30pm that will use yoga, meditation, and mantras to help heal body image, strengthen self-esteem, and cultivate self-love. The event costs $25 and includes crafts, snacks, and activities. Registration can be done online or by phone.
This document summarizes an algorithm for efficiently clustering short messages like tweets into general domains or topics. The algorithm breaks the clustering task into two stages: (1) batch clustering of user-annotated data using hashtags to create dense virtual documents, and (2) online clustering of new tweets using the centroids from stage 1. Experimental results show the algorithm outperforms other clustering methods and can accurately cluster large streams of sparse, short messages efficiently.
Tweet Summarization and Segmentation: A Surveyvivatechijri
The document summarizes various techniques for tweet summarization and segmentation, and discusses how the particle swarm optimization (PSO) algorithm can be used for tweet clustering. It provides an overview of 8 papers that propose different approaches for tweet summarization using techniques like graph clustering, keyword graphs, and segmentation. It also analyzes these techniques and compares the performance of PSO to other clustering algorithms like K-means. The analysis shows that methods combining K-means and PSO have higher accuracy than K-means alone, and that PSO performs better than K-means for real-time tweet clustering, achieving an accuracy of 90.6% versus 62% for K-means.
Twitter is a free social networking microblogging service that allows registered members to broadcast, in real-time, short posts called tweets. Twitter members can broadcast tweets and follow other users’ tweets by using multiple devices, making this information system one of the fastest in the world. In this chapter, we leverage this characteristic to introduce a novel topic-detection method aimed at informing, in real-time, a specific user about the most emerging arguments expressed by the network around his/her domain interests. With this goal, we aim at formalizing the information spread over the network by studying the topology of the network and by modeling the implicit and explicit connections among the users. Then, we propose an innovative term aging model, based on a biological metaphor, to retrieve the freshest arguments of discussion, represented through a minimal set of terms, expressed by the community within the foci of interest of a specific user. We finally test the proposed model through various experiments and user studies.
This paper describes the use of Storm at Twitter. Storm is a realtime fault-tolerant and distributed stream data processing system.Storm is currently being used to run various critical computations in Twitter at scale, and in real-time. This paper describes the architecture of Storm and its methods for distributed scale-out and fault-tolerance. This paper also describes how queries (aka.topologies) are executed in Storm, and presents some operational stories based on running Storm at Twitter. We also present results
from an empirical evaluation demonstrating the resilience of
Storm in dealing with machine failures. Storm is under active
development at Twitter and we also present some potential
directions for future work.
The document provides an overview of automatic text summarization techniques. It discusses the large amount of online content available and how summarization can help analyze large corpora. The key challenges of automatic summarization include determining sentence importance and eliminating redundancy. Extractive summarization selects existing text snippets while abstractive summarization can generate new text. Summarization has applications like summarizing tweets about sports games or reviews. Advanced approaches use social media metadata and graph-based models.
This document summarizes a final project report for a parallel text mining framework that analyzes tweets in real-time. The project crawls tweets from major news outlets using Twitter's API and analyzes each tweet using hidden Markov models for part-of-speech tagging. The analysis is performed in parallel across 30 machines with 16 cores each using MPI. Bottlenecks include Twitter's API rate limits and scraping news articles, which are addressed through multi-threading and batch processing tweets.
Social Sensor for Real Time Event DetectionIJERA Editor
Social sites play an important role in our day to day life. Peoples are spending at most of the time on social
networking sites and for this reason social networking sites have received a more attention in last few years.
Social networking websites have received greater attention in the past few years .Twitter is one of the most
popular networking site which received attention from peoples by providing its real time nature. The important
characteristic of our system is its real-time nature. We create a system to collects the tweets from various groups
of users and detect real time events like earthquake, traffic etc. To detect real time events, we build a classifier
of comments based on features such as the keywords in a comments, the number of words. With the help of
comments we will approximately find out the location of the target event. We assume each user as a sensor and
apply priority base algorithm which is used for tweet analysis.
To resolve the traffic congestion problem, we have to consider the volume of the traffic, traffic speed, road
occupancy etc. Classify tweets into a positive and negative class. Produce a probabilistic model for event
detection. We create group of users on the basis of their interest such as peoples from traffic police department
are post the tweets related to traffic event, also peoples form whether department are post their tweets related to
flood etc. Each tweet is associated with user group, tweet, time and location. By processing time and location
information, we can detect the real time events.
This document discusses a proposed system for segmenting tweets and applying that segmentation to name entity recognition. The proposed system uses k-means clustering to segment tweets into categories like Bollywood, business, education, politics, and sports. This segmentation aims to preserve the semantic meaning of tweets and improve downstream applications like named entity recognition, achieving better accuracy than word-based approaches. The proposed system also adds security features like blocking users and is intended to be more efficient and flexible than prior algorithms.
IRJET - Twitter Spam Detection using CobwebIRJET Journal
This document discusses using Cobweb clustering and Gradient Boosting techniques to detect spam on Twitter. Cobweb clustering creates a classification tree to predict attributes of new objects by summarizing the attribute distributions of existing nodes. Gradient Boosting is an ensemble method that uses multiple weak learners (decision trees) to create a stronger predictive model. The paper aims to combine these techniques to create an enhanced spam detection system. It also reviews several existing approaches for Twitter spam detection using techniques like Hidden Markov Models, Random Forests, and asynchronous link-based algorithms.
IRJET- An Improved Machine Learning for Twitter Breaking News Extraction ...IRJET Journal
This document discusses an improved machine learning approach for extracting breaking news from Twitter based on trending topics. The approach aims to filter tweets to remove irrelevant information, cluster similar tweets that relate to real-world events to identify breaking news stories, and dynamically rank the identified news stories over time for tracking. The approach is evaluated using different supervised text classification algorithms to classify tweets as news or not and a density-based clustering algorithm to group related tweets.
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
This master's thesis analyzes the structure of online news articles to identify the presence of single and stacked news triangles. The study analyzes annotations from 68 Danish journalists on 31 online news articles. It finds that 30 of 31 articles contain an overall news triangle, while 14 contain stacked news triangles. It also analyzes the use of named entities in articles and annotations, finding differences in entity types across categories but no clear influences on news triangle presence. The thesis aims to help automate news annotation by identifying the structure of online news, but finds that stacked news triangles are difficult to identify consistently. More research is needed to understand influences on article structure.
Pit Overload Analysis in Content Centric NetworksMatteo Virgilio
Content Centric Networking represents a paradigm shift in the evolution and definition of modern network protocols. Many research efforts have been made with the purpose of proving the feasibility and the scalability of this proposal. Our main contribution is to provide an analysis of the Pending Interest Table memory requirements in real deployment scenarios, especially considering the impact of distributed denial of service attacks. In fact, the state that the protocol maintains for each resource request makes the routers more prone to resources exhaustion issues than in traditional stateless solutions. Our results are derived by using a full custom simulator and considering the different node architectures that have been proposed as valid reference models. The main outcomes point out differentiated weaknesses in each architecture we investigated and underline the need for improvements in terms of security and scalability.
IRJET- Finding Related Forum Posts through Intention-Based SegmentationIRJET Journal
This document presents a novel technique for finding related discussion posts on forums by segmenting each post into sections based on the intention of the author. Each section aims to convey a different message or objective. Relatedness between posts is determined by comparing sections that share the same intention, rather than comparing the full text of posts. The technique involves identifying sections within each post using linguistic and semantic cues. Sections with the same intention are then clustered together. The effectiveness of this intention-based segmentation for suggesting related forum posts is evaluated on real user data from different domains. The proposed approach is found to be more effective at determining post relatedness than direct text comparisons of full posts.
Applying Clustering Techniques for Efficient Text Mining in Twitter Dataijbuiiir1
Knowledge is the ultimate output of decisions on a dataset. The revolution of the Internet has made the global distance closer with the touch on the hand held electronic devices. Usage of social media sites have increased in the past decades. One of the most popular social media micro blog is Twitter. Twitter has millions of users in the world. In this paper the analysis of Twitter data is performed through the text contained in hash tags. After Preprocessing clustering algorithms are applied on text data. The different clusters formed are compared through various parameters. Visualization techniques are used to portray the results from which inferences like time series and topic flow can be easily made. The observed results show that the hierarchical clustering algorithm performs better than other algorithms.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Text categorization is a term that has intrigued researchers for quite some time now. It is the concept
in which news articles are categorized into specific groups to cut down efforts put in manually categorizing
news articles into particular groups. A growing number of statistical classification and machine learning
technique have been applied to text categorization. This paper is based on the automatic text categorization
of news articles based on clustering using k-mean algorithm. The goal of this paper is to automatically
categorize news articles into groups. Our paper mostly concentrates on K-mean for clustering and for term
frequency we are going to use TF-IDF dictionary is applied for categorization. This is done using mahaout
as platform.
IRJET- Identification of Prevalent News from Twitter and Traditional Media us...IRJET Journal
This document describes a study that uses community detection models to identify prevalent news topics discussed on both Twitter and traditional media like BBC. It collects tweets and news articles about sports over a one-month period. Keywords are extracted from the data and a graph is constructed to represent relationships between words. Three community detection models - Girvan-Newman clustering, CLIQUE, and Louvain - are used to cluster similar content and detect communities of keywords representing news topics. The number of unique Twitter users engaged with each topic is also calculated to rank topics by user attention. The goal is to analyze how information is distributed between social and traditional media and identify emerging topics with low coverage in traditional sources.
Similar to Topic Evolutionary Tweet Stream Clustering Algorithm and TCV Rank Summarization (20)
Discover the latest insights on Data Driven Maintenance with our comprehensive webinar presentation. Learn about traditional maintenance challenges, the right approach to utilizing data, and the benefits of adopting a Data Driven Maintenance strategy. Explore real-world examples, industry best practices, and innovative solutions like FMECA and the D3M model. This presentation, led by expert Jules Oudmans, is essential for asset owners looking to optimize their maintenance processes and leverage digital technologies for improved efficiency and performance. Download now to stay ahead in the evolving maintenance landscape.
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
Manufacturing Process of molasses based distillery ppt.pptx
Topic Evolutionary Tweet Stream Clustering Algorithm and TCV Rank Summarization
1. K.Selvaraj Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 11, (Part - 4) November 2015, pp.11-14
www.ijera.com 11 | P a g e
Topic Evolutionary Tweet Stream Clustering Algorithm and TCV
Rank Summarization
K.Selvaraj1
, S.Balaji2
Department of computer science and engineering, Akshaya college of engineering and technology, India
Abstract
Tweet are being created short text message and shared for both users and data analysts. Twitter which receive
over 400 million tweets per day has emerged as an invaluable source of news, blogs, opinions and more. our
proposed work consists three components tweet stream clustering to cluster tweet using k-means cluster
algorithm and second tweet cluster vector technique to generate rank summarization using greedy algorithm,
therefore requires functionality which significantly differ from traditional summarization . in general, tweet
summarization and third to detect and monitors the summary-based and volume based variation to produce
timeline automatically from tweet stream. Implementing continuous tweet stream reducing a text document is
however not an simple task, since a huge number of tweets are worthless, unrelated and raucous in nature, due
to the social nature of tweeting. Further, tweets are strongly correlated with their posted instance and up-to-the-
minute tweets tend to arrive at a very fast rate. Efficiency—tweet streams are always very big in level, hence the
summarization algorithm should be greatly capable; Flexibility—it should provide tweet summaries of random
moment durations. (3) Topic evolution—it should routinely detect sub-topic changes and the moments that they
happen.
Keywords: Tweet Stream, summarization, Timeline, Topic evolution,summary.
I. INTRODUCTION
Growing attractiveness of microblogging
services such as Twitter, Weibo, and Tumblr has
resulted in the explosion of the amount of short-text
messages. Twitter, for instance, which receives over
400 million tweets per day1 has emerged as an
invaluable source of news, blogs, opinions, and more.
Tweets, in their raw form, while being
informative, can also be overwhelming. For instance,
search for a hot topic in Twitter may yield millions of
tweets, spanning weeks. Even if filtering is allowed,
plowing through so many tweets for important
contents would be a nightmare, not to mention the
enormous amount of noise and redundancy that one
might encounter. To make things worse, new tweets
satisfying the filtering criteria may arrive
continuously, at an unpredictable rate. One possible
solution to information overload problem is
summarization. Summarization represents restating
of the main ideas of the text in as few words as
possible Intuitively, a good summary should cover
the main topics (or subtopics) and have diversity
among the sentences to reduce redundancy.
Summarization is widely used in comfortable
arrangement, specially when users surf the internet
with their mobile devices which have much lesser
screens than PCs. Traditional document
summarization approaches, however, are not as
effective in the situation of tweets given both the big
size of tweets as well as the fast and continuous
nature of their arrival. Tweet summarization,
therefore, requires functionalities which significantly
differ from traditional summarization. In general,
tweet summarization has to take into consideration
the temporal feature of the arriving tweets. Consider
a user interested in a topic-related tweet stream, for
example, tweets about ―Apple‖. A tweet
summarization system will continuously monitor
―Apple‖ related tweets producing a real-time timeline
of the tweet stream. a user may explore tweets based
on a timeline (e.g., ―Apple‖ tweets posted between
October to November). Given a timeline range, the
document system may generate a series of current
time summaries to highlight points where the
topic/subtopics evolved in the stream. Such a system
will effectively enable the user to learn major news/
discussion related to ―Apple‖ without having to read
through the entire tweet stream. Given the big picture
about topic evolution about ―Apple‖, a user may
decide to zoom in to get a more detailed report for a
smaller duration (e.g., from three hour) system may
provide a drill-down summary of the duration that
enables the user to get additional details for that
duration. Such application would not only facilitate
easy navigation in topic-relevant tweets, but also
support a range of data analysis tasks such as instant
reports or historical survey.
II. Twitter Summarization
2.1 Stream Data Clustering
The tweet stream clustering module maintains
the online statistical data. Given a topic-based tweet
RESEARCH ARTICLE OPEN ACCESS
2. K.Selvaraj Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 11, (Part - 4) November 2015, pp.11-14
www.ijera.com 12 | P a g e
stream, it is able to efficiently cluster the tweets and
maintain compact cluster information a scalable
clustering framework which selectively stores
important portions of the data, and compresses or
discards other portions. CluStream is one of the most
classic stream clustering methods. It consists of an
online micro-clustering component and an offline
macro-clustering component. A variety of services on
the Web such as news filtering, text crawling, and
topic detecting etc. have posed requirements for text
stream clustering CluStream to generate duration-
based clustering results for text and categorical data
streams. However, this algorithm relies on an online
phase to generate a large number of ―micro-clusters‖
and an offline phase to re-cluster them. In contrast,
our tweet stream clustering algorithm is an online
procedure without extra offline clustering. And in the
context of tweet summarization, we adapt the online
clustering phase by incorporating the new structure
TCV, and restricting the number of clusters to
guarantee efficiency and the quality of TCVs.
2.1 Tweet Stream Initialization
At the start of the stream, we collect a small
number of tweets and use a k-means clustering
algorithm to create the initial clusters. The
corresponding TCVs are initialized according to
Definition 1. Next, the stream clustering process
starts to incrementally update the TCVs whenever a
new tweet arrives.
2.2 Incremental Clustering
Suppose a tweet t arrives at time ts, and there are
N active clusters at that time. The key problem is to
decide whether to attract into one of the in progress
clusters or advance t as a new cluster. We first find
the cluster whose centroid is the closest to t.
Specifically, we get the centroid of each cluster based
on Equation (1), compute its cosine similarity to t,
and find the cluster Cp with the largest similarity.
2.3 Deleting Outdated Clusters
For most events (such as news, football matches
and concerts) in tweet streams, timeliness is
important because they usually do not last for a long
time. Therefore it is safe to delete the clusters
representing these sub-topics when they are rarely
discussed. To find out such clusters, an intuitive way
is to estimate the average arrival time (denoted as
Avgp) of the last p percent of tweets in a cluster.
However, storing p percent of tweets for every cluster
will increase memory costs, especially when clusters
grow big. Thus, we employ an approximate method
to get Avgp.
2.4 Merging Clusters
If the number of clusters keeps increasing with
few deletions, system memory will be exhausted. To
avoid this, we specify an upper limit for the number
of clusters as Nmax. When the limit is reached, a
merging process starts. The process merges clusters
in a greedy way. First, we sort all cluster pairs by
their centroid similarities in a descending order.
Then, starting with the most similar pair, we try to
merge two clusters in it. When both clusters are
single clusters which have not been merged with
other clusters, they are merged into a new composite
cluster. When one of them belongs to a composite
cluster (it has been merged with others before), the
other is also merged into that composite cluster.
When both of them have been merged, if they belong
to the same composite cluster, this pair is skipped;
otherwise, the two composite clusters are merged
together. This process continues until there are only
mc percentage of the original clusters left (mc is a
merging coefficient which provides a balance
between available memory space and the quality of
remaining clusters).
Fig.1. The Frame Work Of Sumblr
III. Related Work
3.1 High-Level Summarization
The high-level summarization module provides
two types of summaries: online and historical
summaries. An online summary describes what is
currently discussed among the public. Thus, the input
for generating online summaries is retrieved directly
from the current clusters maintained in memory. On
the other hand, a historical summary helps people
understand the main happenings during a specific
period,which means we need to eliminate the
influence of tweet contents from the outside of that
period. As a result, retrieval of the required
information for generating historical summaries is
3. K.Selvaraj Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 11, (Part - 4) November 2015, pp.11-14
www.ijera.com 13 | P a g e
more complicated, and this shall be our focus in the
following discussion. Suppose the length of a user-
defined time duration is H, and the ending timestamp
of the duration is tse.
IV. Document/Microblog Summarization
Document summarization can be categorized as
extractive and abstractive. The former selects
sentences from the documents, while the latter may
generate phrases and sentences that do not appear in
the original documents. In this paper, we focus on
extractive summarization. Extractive document
summarization has received a lot of recent attention.
Most of them assign salient scores to sentences of the
documents, and select the top-ranked sentences.
Some works try to extract summaries without such
salient scores. the symmetric non-negative matrix
factorization to cluster sentences and choose
sentences in each cluster for summarization.
proposed to summarize documents from the
perspective of data reconstruction, and select
sentences that can best reconstruct the original
documents. In modeled documents (hotel reviews) as
multi-attribute uncertain data and optimized a
probabilistic coverage problem of the summary There
have also been studies on summarizing microblogs
for some specific types of events, e.g., sports events.
proposed to identify the participants of events, and
generate summaries based on sub-events detected
from each participant. introduced a solution by
learning the underlying hidden state representation of
the event, which needs to learn from previous events
(football games) with similar structure. In
summarized events by exploiting ―good reporters‖,
depending on event-specific keywords which need to
be given in advance. In contrast, we aim to deal with
general topic-relevant tweet streams without such
prior knowledge. Moreover, their method stores all
the tweets in each segment and selects a single tweet
as the summary, while our method maintains distilled
information in TCVs to reduce storage/ computation
cost, and generates multiple tweet summaries in
terms of content coverage and novelty. In addition to
online summarization, our method also supports
historical summarization by maintaining TCV
snapshots.
V. Timeline Detection
The demand for analyzing massive contents in
social medias fuels the developments in visualization
techniques. Timeline is one of these techniques
which can make analysis tasks easier and faster.
presented a timeline-based backchannel for
conversations around events. proposed the
evolutionary timeline summarization (ETS) to
compute evolution timelines similar to ours, which
consists of a series of time-stamped summaries. the
dates of summaries are determined by a pre-defined
timestamp set. In contrast, our method discovers the
changing dates and generates timelines dynamically
during the process of continuous summarization.
Moreover, ETS does not focus on efficiency and
scalability issues, which are very important in our
streaming context. Several systems detect important
moments when rapid increases or ―spikes‖ in status
update volume happen. Developed an algorithm
based on TCP congestion detection, employed a
slope-based method to find spikes. After that, tweets
from each moment are identified, and word clouds or
summaries are selected. Different from this two-step
approach, our method detects topic evolution and
produces summaries/timelines in an online fashion.
TABLE 1
BASIC INFORMATION OF DATASET
5.1 Summary-Based Variation
As tweets arrive from the stream, online
summaries are produced continuously by utilizing
online cluster statistics in TCVs. This allows for
generation of a real-time timeline. Generally, when
an obvious variation occurs in the main contents
discussed in tweets (in the form of summary), we can
expect a change of sub-topic (i.e., a time node on the
timeline). To quantify the variation, we use the
divergence to measure the distance between two
word distributions in two successive summaries Sc
and Sp (Sc is the distribution of the current summary
and Sp is that of the previous one)
5.2 Volume-Based Variation
Though the summary-based variation can reflect
sub-topic changes, some of them may not be
influential enough. Since many tweets are related to
users’ daily life or trivial events, a sub-topic change
detected from textual contents may not be significant
enough. To this end, we consider the use of rapid
increases (or ―spikes‖) in the volume of tweets over
time, which is a common technique in existing online
event detection systems . A spike suggests that
something essential in a minute happened because a
lot of people found the need to comment on it. In this
part, we develop a spike-finding method. As the
input, the binning process in Algorithm needs to
count the tweet arrival volume in each time unit.
5.1 Experiments for Summarization
5.1.1 Over All Performance Comparison
We construct five data sets to evaluate
summarization. One is obtained by conducting
keyword filtering on a large Twitter data set. The
4. K.Selvaraj Int. Journal of Engineering Research and Applications www.ijera.com
ISSN: 2248-9622, Vol. 5, Issue 11, (Part - 4) November 2015, pp.11-14
www.ijera.com 14 | P a g e
other four include tweets acquired during one month
in 2015 via Twitter’s keyword tracking.
Baseline methods: In Existing summarization
methods have not been designed to handle continuous
summarization. In this way, here implement the
sliding window version of the above three
algorithms, namely Cluster Sum, LexRank, and
DSDR.
In our experiments, we find similar trends in the
comparison of precision, recall and F-score between
the proposed approach and the baseline methods.
Fig .2.Scalability on data size
VI. CONCLUSION
We proposed a prototype called Sumblr which
supported continuous tweet stream summarization.
Sumblr employs a tweet stream clustering algorithm
to compress tweets into TCVs and maintains them in
an online fashion. Then, it uses a TCV-Rank
summarization algorithm for generating online
summaries and historical summaries with arbitrary
time durations. The topic evolution can be detected
automatically, allowing Sumblr to produce dynamic
timelines for tweet streams. The experimental results
make obvious the competence and success of our
method. For future work, we aim to develop a multi-
topic version of Sumblr in a spread system, and
estimate it on more complete and large-scale data
sets.
References
[1] C. C. Aggarwal, J. Han, J. Wang, and P. S.
Yu, ―A framework for clustering evolving
data streams,‖ in Proc. 29th Int. Conf.
VeryLarge Data Bases, 2003, pp. 81–92.
[2] T. Zhang, R. Ramakrishnan, and M. Livny,
―BIRCH: An efficient data clustering
method for very large databases,‖ in Proc.
ACM SIGMOD Int. Conf. Manage. Data,
1996, pp. 103–114.
[3] P. S. Bradley, U. M. Fayyad, and C. Reina,
―Scaling clustering algorithms to large
databases,‖ in Proc. Knowl. Discovery Data
Mining, 1998, pp. 9–15.
[4] L. Gong, J. Zeng, and S. Zhang, ―Text
stream clustering algorithm based on
adaptive feature selection,‖ Expert Syst.
Appl., vol. 38, no. 3, pp. 1393–1399, 2011.
[5] Q. He, K. Chang, E.-P. Lim, and J. Zhang,
―Bursty feature representation for clustering
text streams,‖ in Proc. SIAM Int. Conf. Data
Mining, 2007, pp. 491–496.
[6] J. Zhang, Z. Ghahramani, and Y. Yang, ―A
probabilistic model for online document
clustering with application to novelty
detection,‖ in Proc. Adv. Neural Inf.
Process. Syst., 2004, pp. 1617–1624.
[7] S. Zhong, ―Efficient streaming text
clustering,‖ Neural Netw., vol. 18, nos. 5/6,
pp. 790–798, 2005.
[8] C. C. Aggarwal and P. S. Yu, ―On clustering
massive text and categorical data streams,‖
Knowl. Inf. Syst., vol. 24, no. 2, pp. 171–
196, 2010.
[9] R. Barzilay and M. Elhadad, ―Using lexical
chains for text summarization,‖ in Proc.
ACL Workshop Intell. Scalable Text
Summarization, 1997, pp. 10–17.
[10] W.-T. Yih, J. Goodman, L. Vanderwende,
and H. Suzuki, ―Multidocument
summarization by maximizing informative
contentwords,‖ in Proc. 20th Int. Joint Conf.
Artif. Intell., 2007, pp. 1776–1782.
Balaji Subramani currently working as
Assistant Professor in Akshaya College of
Engineering and Technology. He has 4 years of
Teaching Experience. He did his Master Degree
M.Tech (IT) in K.S.Rangasamy College of
Technology. He has participated in various
International Conferences and workshops held at
different places. His area of interest includes Word
sense disambiguation, web mining, Information
retrieval and social network analysis.
Selvaraj Kumarasamy currently
pursuing Master degree in computer science and
engineering in Akshaya College of Engineering and
Technology. I Was participated in various
International Conferences and workshops held at
different places. I did my research work in data
mining .