The rapid growth of the Internet has revolutionized online news reporting, and many users now rely on online
news websites for information. In Sri Lanka there are numerous news websites,
which are subscribed to on a daily basis. With the rise in the number of news websites, the Sri Lankan
media authorities face the problem of lacking a proper methodology or tool capable of tracking
and regulating publications made by different disseminators of news.
This paper proposes a News Agent toolbox that periodically extracts news articles and associated
comments with the aid of a concept called Mapping Rules, and classifies them into Personalized Categories
defined in terms of keyword-based Category Profiles. The proposed tool also analyzes comments made by
readers, using simple statistical techniques, to discover the most popular news articles and
fluctuations in the popularity of news stories.
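As a rough sketch of how such keyword-based Category Profiles could drive classification (the category names, keywords, and ranking rule below are illustrative assumptions, not the paper's implementation):

```python
# Illustrative sketch: classify an article into personalized categories
# using keyword-based category profiles. Profile contents are invented.

def classify_article(text, category_profiles):
    """Return categories whose profile keywords appear in the text,
    ranked by the number of matching keywords (best first)."""
    words = set(text.lower().split())
    matches = {}
    for category, keywords in category_profiles.items():
        hits = words.intersection(k.lower() for k in keywords)
        if hits:
            matches[category] = len(hits)
    return sorted(matches, key=matches.get, reverse=True)

profiles = {
    "Politics": ["election", "parliament", "minister"],
    "Sports": ["cricket", "match", "tournament"],
}
print(classify_article("The minister addressed parliament today", profiles))
```

An article is assigned to every category whose keywords it mentions; a real Category Profile would likely weight keywords rather than simply count them.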
Ins and Outs of News: Twitter as a Real-Time News Analysis Service (Arjumand Younus)
Paper accepted to the Workshop on Visual Interfaces to the Social and Semantic Web (VISSW 2011), held in conjunction with the International Conference on Intelligent User Interfaces (IUI 2011) at Stanford University, Palo Alto.
PESTLE Based Event Detection and Classification (eSAT Journals)
Abstract: Organizations use PESTLE classification as a tool for tracking the environment in which they operate and for planning the launch of a new product or service. It helps to give a true view of the environment from different aspects. These aspects are essential for any business the organization may be in, as they give a clear picture of what one wishes to check and observe while contemplating a certain idea or plan. The PESTLE framework helps in understanding market dynamics and is also one of the pillars of the strategic management of an enterprise, driving its goals and strategy. The PESTLE-based event detection approach proposed in this paper helps with the PESTLE analysis of any organization: it puts all relevant factors, in terms of detected events, in one place and classifies them into separate buckets while taking the current market situation into consideration. We accomplish this by applying a clustering technique and then training a classifier to classify the events in PESTLE format. Keywords: Event Detection, PESTLE Analysis, Twitter
Social Media Influence Analysis Using Data Science Techniques (Muhammad Bilal)
The major purpose of this literature survey is to demonstrate how different data science techniques can be used to investigate the impact of social media, considering the interaction between influencers and their followers.
In this contribution, we develop an accurate and effective event detection method to detect events from a
Twitter stream, using visual and textual information to improve the performance of the mining
process. The method monitors a Twitter stream to pick up tweets containing text and images and stores them
in a database, then applies a mining algorithm to detect an event. The procedure starts
by detecting events from text only, using bag-of-words features weighted with the
term frequency-inverse document frequency (TF-IDF) method. It then detects the event from images
only, using visual features including histogram of oriented gradients (HOG) descriptors, the grey-level co-occurrence
matrix (GLCM), and the color histogram. K-nearest-neighbours (KNN) classification is used in the
detection. The final decision of the event detection is made based on the reliabilities of the text-only
and image-only detections. Experimental results showed that the proposed method achieved a high accuracy
of 0.94, compared with 0.89 for text only and 0.86 for images only.
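The text-only branch described above (TF-IDF bag-of-words plus KNN) can be sketched with the standard formulas; the tiny corpus, tokenizer, and smoothing choices below are illustrative, not the paper's actual pipeline:

```python
# Stdlib-only sketch: TF-IDF vectors plus a cosine-similarity KNN vote.
# The corpus and labels are invented for illustration.
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * math.log((1 + n) / (1 + df[t]))
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_label(query, vecs, labels, k=3):
    """Majority vote over the k training vectors most similar to the query."""
    ranked = sorted(range(len(vecs)), key=lambda i: cosine(query, vecs[i]),
                    reverse=True)
    return Counter(labels[i] for i in ranked[:k]).most_common(1)[0][0]

train = [("flood hits city streets", "flood"),
         ("river flood warning issued", "flood"),
         ("team wins cricket match", "sport"),
         ("match ends in draw", "sport")]
query = "flood warning in city"
docs = [t.split() for t, _ in train] + [query.split()]
vecs = tfidf_vectors(docs)   # query shares the same idf statistics here
label = knn_label(vecs[-1], vecs[:-1], [l for _, l in train])
print(label)
```

In the full method sketched by the abstract, this label and a corresponding image-based label would then be fused according to each branch's reliability.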
SOCIAL MEDIA ANALYSIS ON SUPPLY CHAIN MANAGEMENT IN FOOD INDUSTRY (Kaustubh Nale)
This paper discusses the importance of
social media analysis in supply chain management in the
food industry. In this analysis, the social media platform
Twitter is used to obtain information, and
two software tools (NodeXL and NVivo) are used to
conduct data mining and text analysis. The outcome of this
analysis will help researchers make decisions based on
customer feedback.
INCREASING THE INVESTMENT’S OPPORTUNITIES IN KINGDOM OF SAUDI ARABIA BY STUDY... (ijcsit)
Social networking sites are a significant source of information for understanding the behavior of users and
what is occupying society across all ages, so that helpful information can be provided to specialists
and decision-makers. According to official sources, 98.43% of Saudi youth use social networking sites.
Social media data are studied and analyzed to provide the information needed to increase
investment opportunities within the Kingdom of Saudi Arabia, by examining what occupies people
on these sites through their tweets about the labor market and investment. Given the
huge volume of data and its randomness, the data will be surveyed, collected through
keywords, prioritized, and recorded as positive, negative, or mixed. The
analysis and conclusions will be based on data mining and its techniques of analysis and deduction.
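The keyword collection and positive/negative/mixed recording step might look like the following sketch; the keyword list and sentiment lexicons are invented placeholders, and a neutral fallback is added for tweets matching no cue words:

```python
# Illustrative sketch: filter tweets by topic keywords, then record each
# one as positive, negative, or mixed. All word lists are placeholders.

KEYWORDS = {"investment", "jobs", "labor"}
POSITIVE = {"growth", "opportunity", "improve"}
NEGATIVE = {"unemployment", "decline", "crisis"}

def label_tweet(text):
    words = set(text.lower().split())
    if not words & KEYWORDS:
        return None  # tweet is not about the labor market or investment
    pos, neg = bool(words & POSITIVE), bool(words & NEGATIVE)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"  # added fallback, not one of the paper's three labels

print(label_tweet("investment growth despite unemployment"))
```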
Quantum Criticism: an Analysis of Political News Reporting (mlaij)
In this project, we continuously collect data from the RSS feeds of traditional news sources. We apply
several pre-trained implementations of named entity recognition (NER) tools, quantifying the success of
each implementation. We also perform sentiment analysis of each news article at the document, paragraph
and sentence level, with the goal of creating a corpus of tagged news articles that is made available to the
public through a web interface. We show how the data in this corpus could be used to identify bias in news
reporting, and also establish different quantifiable publishing patterns of left-leaning and right-leaning
news organisations.
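Scoring at sentence, paragraph, and document level, as described above, is essentially an averaging hierarchy. In this sketch a trivial word-list scorer stands in for whatever sentiment model the project actually uses:

```python
# Illustrative sketch: per-sentence sentiment scores averaged upward to
# paragraph and document level. The cue-word scorer is a placeholder.

POS = {"progress", "win", "support"}
NEG = {"scandal", "crisis", "attack"}

def sentence_score(sentence):
    """Count positive minus negative cue words in one sentence."""
    return sum((w in POS) - (w in NEG) for w in sentence.lower().split())

def article_scores(paragraphs):
    """Score every sentence, then average up to paragraph and document."""
    sent_scores = [[sentence_score(s) for s in p] for p in paragraphs]
    para_scores = [sum(ss) / len(ss) for ss in sent_scores]
    doc_score = sum(para_scores) / len(para_scores)
    return doc_score, para_scores, sent_scores

article = [["a big win for the city", "critics cite a scandal"],
           ["steady progress continues"]]
doc, paras, sents = article_scores(article)
print(doc, paras)
```

Keeping all three levels in the corpus lets a reader compare, say, a neutral headline paragraph against strongly slanted body sentences.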
SOCIAL MEDIA NEWS: MOTIVATION, PURPOSE AND USAGE (ijcsit)
This paper presents the results of an online survey conducted to analyse the use of the social web in
the context of daily news. The focus was on users’ motivation and habits in news consumption. Moreover,
users’ news behaviour was distinguished by three purposes, namely news consumption, news production and
news dissemination, to determine whether usage has a passive or active character. In a second step it was
examined which social software is used for which purpose. In conclusion, users appreciate social software
for features such as interactivity and information that traditional media do not provide. Among the social
web platforms, users prefer social networking sites as well as video-sharing platforms. Social networking sites
also rank first in news production and dissemination.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Extracting Intelligence from Online News Sources (eSAT Journals)
Abstract: This paper summarizes an initiative for news extraction, investigating a simple approach for visualizing a range of content. To find specific information easily, the 5W1H approach is the simplest and best suited. Here we are extracting intelligence from various online news sources, where intelligence means detection, tracking, and visualization. Our objective is therefore not only to extract the news events that occurred but to visualize them as well. This paper presents a relatively lightweight approach to mapping the extracted news events. We present results of our work in news event extraction, relevancy visualization, and visualization of extracted events, to enhance user interaction in information access and exploitation tasks. News event extraction is done with the 5W1H approach for detecting and tracking news events, and its output is then used to visualize those events on personalized maps. Index Terms: Event extraction, Visualization, Detecting & tracking, NER, NEXUS
NEW ONTOLOGY RETRIEVAL IMAGE METHOD IN 5K COREL IMAGES (ijcax)
Semantic annotation of images is an important research topic in both image understanding and database
or web image search. Image annotation is a technique for choosing appropriate labels for images by
extracting effective and hidden features in pictures. In the feature extraction step of the proposed method, we
present a model that combines effective features of visual topics (global features over an image) and
regional contexts (relationships between the regions within an image and the regions of other images) for automatic
image annotation. In the annotation step of the proposed method, we create a new ontology (based on the WordNet
ontology) for the semantic relationships between tags, to improve classification and reduce the semantic gap
in automatic image annotation. Experimental results on the 5k Corel dataset show that the proposed
image annotation method, in addition to reducing the complexity of classification, increases accuracy
compared with other methods.
THE STUDY OF CUCKOO OPTIMIZATION ALGORITHM FOR PRODUCTION PLANNING PROBLEM (ijcax)
Constrained nonlinear programming problems are hard problems, and production planning is one of the most
widely studied problems to optimize. In this study, one of the mathematical
models of production planning is surveyed and the problem is solved with the Cuckoo Algorithm, an
efficient method for solving continuous nonlinear problems. Moreover, the mentioned production
planning models are also solved with a genetic algorithm and with Lingo software, and the results are compared. The Cuckoo
Algorithm is a suitable choice for optimization in terms of convergence of the solution.
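A heavily simplified sketch of cuckoo search on a toy objective is shown below; real cuckoo search uses Levy-flight steps, for which Gaussian steps are substituted here to stay dependency-free, and the sphere function stands in for the production-planning model:

```python
# Simplified cuckoo-search sketch for continuous minimization.
# Gaussian steps replace Levy flights; all parameters are illustrative.
import random

def cuckoo_search(objective, dim, n_nests=15, pa=0.25, iters=200, seed=1):
    rng = random.Random(seed)
    nests = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_nests)]
    fitness = [objective(x) for x in nests]
    for _ in range(iters):
        # New candidate: a random step away from a randomly chosen nest.
        i = rng.randrange(n_nests)
        new = [x + rng.gauss(0, 0.5) for x in nests[i]]
        j = rng.randrange(n_nests)
        if objective(new) < fitness[j]:  # replace a worse random nest
            nests[j], fitness[j] = new, objective(new)
        # Abandon a fraction pa of the worst nests (best nests survive).
        order = sorted(range(n_nests), key=fitness.__getitem__, reverse=True)
        for k in order[:int(pa * n_nests)]:
            nests[k] = [rng.uniform(-5, 5) for _ in range(dim)]
            fitness[k] = objective(nests[k])
    best = min(range(n_nests), key=fitness.__getitem__)
    return nests[best], fitness[best]

sphere = lambda x: sum(v * v for v in x)  # toy stand-in objective
best, value = cuckoo_search(sphere, dim=2)
print(best, value)
```

Because only the worst nests are abandoned, the best solution found is never discarded, which is the elitism property the abstract's convergence claim relies on.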
Similar to INTELLIGENT AGENT FOR PUBLICATION AND SUBSCRIPTION PATTERN ANALYSIS OF NEWS WEBSITES:
COMPARATIVE ANALYSIS OF ROUTING PROTOCOLS IN MOBILE AD HOC NETWORKS (ijcax)
A Mobile Ad Hoc Network (MANET) is a collection of mobile nodes that want to communicate without any
pre-determined infrastructure and fixed organization of available links. Each node in MANET operates as
a router, forwarding information packets for other mobile nodes. There are many routing protocols that
possess different performance levels in different scenarios. The main task is to evaluate the existing routing
protocols and, by comparing them, to find the best one. In this article we compare the AODV, DSR, DSDV,
OLSR and DYMO routing protocols in mobile ad hoc networks (MANETs) to specify the best operating
conditions for each protocol. We study these five MANET routing protocols through different
simulations in the NS-2 simulator and show how the pause-time parameter affects their performance. This
performance analysis is measured in terms of Packet Delivery Ratio, Average End-to-End Delay,
Normalized Routing Load and Average Throughput.
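The reported metrics are simple ratios over simulation traces. The sketch below computes Packet Delivery Ratio and Average End-to-End Delay from (send_time, receive_time) pairs; the trace format is an illustrative simplification, not NS-2's actual trace format:

```python
# Illustrative metric computation over a simplified simulation trace:
# each packet is a (send_time, receive_time) pair, receive_time None if dropped.

def pdr_and_delay(packets):
    """Return (Packet Delivery Ratio, Average End-to-End Delay in seconds)."""
    delivered = [(s, r) for s, r in packets if r is not None]
    pdr = len(delivered) / len(packets)
    avg_delay = (sum(r - s for s, r in delivered) / len(delivered)
                 if delivered else float("inf"))
    return pdr, avg_delay

trace = [(0.0, 0.2), (0.5, 0.9), (1.0, None), (1.5, 1.6)]
pdr, delay = pdr_and_delay(trace)
print(pdr, delay)  # 3 of 4 packets delivered -> PDR 0.75
```

Normalized Routing Load and Average Throughput follow the same pattern: counts of routing packets and delivered bytes divided by delivered data packets and simulation time respectively.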
PREDICTING ACADEMIC MAJOR OF STUDENTS USING BAYESIAN NETWORKS TO THE CASE OF ... (ijcax)
In this study, which took place this year in the city of Maragheh in Iran, high school
students in the fields of mathematics, experimental sciences, humanities, vocational studies, business and
science were studied and compared. The purpose of this research is to predict the academic major of high
school students using Bayesian networks. The factors that influence academic major selection
are used for the first time as indicators in the Bayesian networks. Evaluation of the indicators' impacts on
each other, discretization of the data and its processing were performed with GeNIe. Students can then be
advised on the proper course to continue their education.
A Multi Criteria Decision Making Based Approach for Semantic Image Annotation (ijcax)
Automatic image annotation has emerged as an important research topic due to its potential application on
both image understanding and web image search. This paper presents a model, which integrates visual
topics and regional contexts to automatic image annotation. Regional contexts model the relationship
between the regions, while visual topics provide the global distribution of topics over an image. Previous
image annotation methods neglected the relationship between the regions in an image; since these regions
are exactly what explains the image semantics, considering the relationship between them is helpful
for annotating the images. Regional contexts and visual topics are learned by PLSA (Probabilistic
Latent Semantic Analysis) from the training data. The proposed model incorporates these two types of
information by MCDM (Multi Criteria Decision Making) approach based on WSM (Weighted Sum
Method). Experiments conducted on the 5k Corel dataset demonstrate the effectiveness of the proposed
model.
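The Weighted Sum Method combination at the heart of the model can be sketched as follows; the candidate labels, per-criterion scores, and weights are illustrative values, not learned PLSA outputs:

```python
# Illustrative WSM fusion: each candidate annotation label has one score
# from regional contexts and one from visual topics; a fixed weighted sum
# combines them. Labels, scores, and weights are invented.

def weighted_sum_rank(regional, visual, w_regional=0.6, w_visual=0.4):
    """Rank candidate annotation labels by their weighted combined score."""
    combined = {label: w_regional * regional[label] + w_visual * visual[label]
                for label in regional}
    return sorted(combined, key=combined.get, reverse=True)

regional = {"tiger": 0.7, "grass": 0.5, "water": 0.2}
visual = {"tiger": 0.4, "grass": 0.6, "water": 0.9}
print(weighted_sum_rank(regional, visual))
```

The weights encode how much each information source is trusted; here the regional-context score dominates, so "tiger" outranks "water" even though the visual-topic score favors "water".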
On Fuzzy Soft Multi Set and Its Application in Information Systems (ijcax)
Research on information and communication technologies has developed rapidly, since it can be
applied easily to several areas such as computer science, medical science, economics, environmental science
and engineering, among others. Applications of soft set theory, especially in information systems, have been
found to be of paramount importance. Recently, Mukherjee and Das defined some new operations in fuzzy soft
multi set theory and showed that De Morgan-type results hold in fuzzy soft multi set theory with
respect to these newly defined operations. In this paper, we extend their work and study some more basic
properties of their defined operations. We also define some basic supporting tools in information systems,
and an application of fuzzy soft multi sets in information systems is presented and discussed. We define
the notion of a fuzzy multi-valued information system in fuzzy soft multi set theory and show that every fuzzy
soft multi set is a fuzzy multi-valued information system.
The web is a collection of inter-related files on one or more web servers, while web mining means extracting
valuable information from web databases. Web mining is one of the data mining domains in which data
mining techniques are used to extract information from web servers. Web data include web
pages, web links, objects on the web and web logs. Web mining is used to understand customer
behaviour and to evaluate a particular website based on the information stored in web log files. Web
mining relies on data mining techniques, namely classification, clustering, and association
rules. It has beneficial applications in areas such as electronic commerce, e-learning, e-government, e-policies, e-democracy, electronic business, security, crime investigation and digital libraries.
Retrieving a required web page from the web efficiently and effectively is a challenging task,
because the web is made up of unstructured data, which delivers a large amount of information and increases
the complexity of handling information from different web service providers. Within this collection of information
it becomes very hard to find, extract, filter or evaluate the information relevant to users. In this paper,
we study the basic concepts, classification, processes and issues of web mining. In addition,
the paper analyzes the research challenges of web mining.
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR (ijcax)
The pattern classification system classifies patterns into a feature space within a boundary. Pattern
classification systems are used in adversarial applications such as spam filtering, network intrusion
detection systems (NIDS) and biometric authentication. Spam filtering is an adversarial
application in which data can be manipulated by humans to undermine the intended operation. Numerous
machine learning systems are used to appraise the security issues related to spam filtering.
We present a framework for the experimental evaluation of classifier security in adversarial
environments that combines and builds on the arms race and security-by-design principles, adversary
modelling and the data distribution under attack. Furthermore, we present SVM, LR and MILR classifiers to
categorize emails as legitimate (ham) or spam on the basis of their text samples.
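Of the three classifiers, plain logistic regression (LR) is the easiest to sketch from scratch; the bag-of-words features, tiny dataset, and training settings below are illustrative, not the paper's experimental setup:

```python
# Toy LR spam classifier: binary bag-of-words features, trained with
# plain stochastic gradient descent. Dataset and vocabulary are invented.
import math

def featurize(text, vocab):
    """Binary word-presence features plus a constant bias feature."""
    return [1.0 if w in text.lower().split() else 0.0 for w in vocab] + [1.0]

def train_lr(samples, vocab, lr=0.5, epochs=200):
    w = [0.0] * (len(vocab) + 1)  # +1 for the bias weight
    for _ in range(epochs):
        for text, y in samples:  # y = 1 for spam, 0 for ham
            x = featurize(text, vocab)
            p = 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
            w = [wi + lr * (y - p) * xi for wi, xi in zip(w, x)]
    return w

def predict(text, vocab, w):
    x = featurize(text, vocab)
    p = 1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
    return "spam" if p > 0.5 else "ham"

vocab = ["free", "winner", "meeting", "report"]
data = [("free winner prize", 1), ("claim free cash", 1),
        ("project meeting today", 0), ("quarterly report attached", 0)]
w = train_lr(data, vocab)
print(predict("free winner offer", vocab, w))
```

The adversarial point of the abstract is that an attacker who knows such weights can evade the filter by dropping high-weight spam words, which is what the evaluation framework is designed to measure.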
Visually impaired people face many problems in their day to day lives. Among them, outdoor navigation is
one of the major concerns. The existing solutions based on Wireless Sensor Networks(WSN) and Global
Positioning System (GPS) track ZigBee units or RFID (Radio Frequency Identification) tags fixed on the
navigation system. The issues pertaining to these solutions are as follows: (1) It is suitable only when the
visually impaired person is commuting in a familiar environment; (2) The device provides only a one way
communication; (3) Most of these instruments are heavy and sometimes costly. A preferable solution
would be a system that is cheap and easy to carry.
The objective of this paper is to break down these technological barriers and to propose a system,
an Android app, that helps a visually impaired person while traveling via public
transport such as buses. The proposed system uses built-in smartphone features such as the GPS
location tracker, to track the location of the user, and a text-to-speech converter. The system also integrates
the Google speech-to-text converter to capture voice input and convert it to text. The system
recommends installing a GPS module in buses for real-time tracking. With minor
modifications, this app can also help older people navigate independently.
ADVANCED E-VOTING APPLICATION USING ANDROID PLATFORM (ijcax)
Advancements in mobile devices and in wireless and web technologies have given rise to new applications
that make the voting process easy and efficient. E-voting promises a
convenient, easy and safe way to capture and count the votes in an election [1]. This research project
provides the specification and requirements for e-voting using the Android platform. E-voting means
conducting the voting process in an election using electronic devices, and the Android platform is used to
develop the e-voting application. First, an introduction to the system is presented. Sections II and III
describe all the concepts (survey, design and implementation) used in this work. Finally, the proposed
e-voting system is presented. This technology helps users cast their vote without visiting a polling
booth. The application follows proper authentication measures in order to prevent fraudulent voters from
using the system. Once the voting session is completed, the results can be available within a fraction of a
second. Every candidate's vote count is encrypted and stored in the database in order to prevent attacks and
disclosure of the results by anyone other than the administrator. Once the session is completed, the admin
can decrypt the vote count, publish the results, and complete the voting process.
The design of silicon chips in the semiconductor industry involves testing these chips with other
components on a board. The platform developed acts as a power-on vehicle for the silicon chips. This
printed circuit board design, which serves as a validation platform, is foundational to the semiconductor
industry.
The manual, repetitive design activities that accompany the development of this board must be minimized to
achieve high quality, improve design efficiency, and eliminate human error. One of the most time-consuming
tasks in board design is trace-length matching. This paper aims to reduce the length-matching time
by automating it using SKILL scripts.
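The length-matching step that such SKILL scripts automate reduces to comparing each routed net against the longest net in its matched group; the net names, lengths, and tolerance below are illustrative, and no actual SKILL code is shown:

```python
# Illustrative length-matching check: given routed lengths (in mils) for
# one matched group of nets, report how much serpentine each net needs to
# come within tolerance of the longest net. All values are invented.

def length_match(net_lengths, tolerance=25.0):
    """Return {net: extra length needed} for nets outside the tolerance."""
    target = max(net_lengths.values())
    return {net: round(target - length, 3)
            for net, length in net_lengths.items()
            if target - length > tolerance}

group = {"DDR_DQ0": 1480.0, "DDR_DQ1": 1502.5, "DDR_DQ2": 1390.0}
print(length_match(group))  # only DDR_DQ2 is outside the 25-mil tolerance
```

A production script would additionally query the design database for routed lengths and insert the serpentine geometry itself, which is where SKILL's access to the layout tool comes in.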
RESEARCH TRENDS IN EDUCATIONAL TECHNOLOGY IN TURKEY: 2010-2018 YEAR THESIS AN... (ijcax)
The purpose of this research is to analyze, using meta-analysis, studies in the field of educational
technology in Turkey and to demonstrate the trends in the field. For this purpose, a total of
263 studies were analyzed, including 98 theses and 165 articles published between 2010 and 2018. Purposive
sampling was used when selecting publications. The articles and theses were drawn from
the YOK Tez Tarama Database, the Journal of Hacettepe University Faculty of Education,
Educational Sciences: Theory & Practice, the Education and Science Journal, the Elementary Education
Online Journal, The Turkish Online Journal of Education and The Turkish Online Journal of Educational
Technology. Publications were reviewed under 11 criteria: index, year of
publication, research scope, method, education level, sample, number of samples, data collection methods,
analysis techniques, and research tendency. This revealed the research topics of educational technology research in Turkey.
The data are interpreted based on percentage and frequency, and the results are shown in
tables.
RESEARCH TRENDS İN EDUCATIONAL TECHNOLOGY İN TURKEY: 2010-2018 YEAR THESIS AN...ijcax
The purpose of this research is the analysis using meta-analysis of studies in the field of Educational
Technology in Turkey and in the field is to demonstrate how to get to that trend. For this purpose, a total of
263 studies were analyzed including 98 theses and 165 articles published between 2010-2018. Purpose
sampling method was used when selecting publications. In the research, while selecting articles and theses;
Turkey addressed; YOK Tez Tarama Database, Journal of Hacettepe University Faculty of Education,
Educational Sciences : Theory & Practice Journal, Education and Science Journal, Elementary Education
Online Journal, The Turkish Online Journal of Education and The Turkish Online Journal of Educational
Technology used in journals. Publications have been reviewed under 11 criteria. Index, year of
publication, research scope, method, education level, sample, number of samples, data collection methods,
analysis techniques, and research tendency, research topics in Educational Technology Research in Turkey
has revealed. The data is interpreted based on percentage and frequency and the results are shown using
the table
IMPACT OF APPLYING INTERNATIONAL QUALITY STANDARDS ON MEDICAL EQUIPMENT IN SA...ijcax
With the great development that, modern medical technology is witnessing today, medical devices and
equipment have become a basic pillar of any healthcare system in the world and cannot be dispensed with,
so we find competition between the major companies that manufacture medical devices and equipment
resulting in a huge variety of complex modern medical technologies. These medical devices and equipment
require high accuracy in manufacturing and packaging in addition to operation, maintenance, and followup, because any error in any of the previous stages will have bad consequences for the patients and the
health system, there are many accidents that have led to some deaths. Therefore, we find that many medical
device producers and medical companies in addition to health service providers seek to find systems and
protocols to reduce accidents resulting from medical devices. As a result, many systems have recently
appeared that seek to protect from the dangers of medical devices and equipment. This research aims to
conduct a study of the effects of international standards on the safety of medical devices and equipment
and reduce their risks. By counting the international standards in force in the Kingdom of Saudi Arabia
that are applied by the Saudi Food and Drug Authority, making questionnaires, and distributing them to
health service providers and regulatory bodies for medical devices and equipment, considering the data,
these data will be analysed and evaluated the effectiveness of quality systems and standards in maintaining
Effectiveness and quality of medical devices and equipment. The study will include governmental and
private health services sectors.
SPAM FILTERING SECURITY EVALUATION FRAMEWORK USING SVM, LR AND MILR ijcax
The Pattern classification system classifies the pattern into feature space within a boundary. In case adversarial applications use, for example Spam Filtering, the Network Intrusion Detection System (NIDS), Biometric Authentication, the pattern classification systems are used. Spam filtering is an adversary
application in which data can be employed by humans to attenuate perspective operations. To appraise the
security issue related Spam Filtering voluminous machine learning systems. We presented a framework for the experimental evaluation of the classifier security in an adversarial environments, that combines and constructs on the arms race and security by design, Adversary modelling and Data distribution under
attack. Furthermore, we presented a SVM, LR and MILR classifier for classification to categorize email as legitimate (ham) or spam emails on the basis of thee text samples
Developing Product Configurator Tool Using CADs’ API with the help of Paramet...ijcax
Order placingis a crucial phase of lifecycle of a Mass-customizable product and seeks improvement in
Mechanical industry. ‘Product Configurator’ is a good solution to bring in data transparency and speed
up the process. Configuration tools arebeing used on a very small scale,reasons being lack of awareness
and dearer costs of existing tools. In this research work a product configurator is developedfor
Hydraulic Actuator (HA).This method uses Applicable Programing Interface (API) of a CAD tool coupled
with Visual Basics (VB) and MS Excel.Itis a standaloneapplication of VB and its integration into web
portal can be the future scope. The final aim was to reduce time delay at CRM phase,bring more
transparency in the ordering system and to establish a method which, small and medium scale enterprises
canafford. Trails on the tool developed generated Part-Assembly drawings, BOM and JT files in
moments.
DESIGN AND DEVELOPMENT OF CUSTOM CHANGE MANAGEMENT WORKFLOW TEMPLATES AND HAN...ijcax
A large no. of automobile companies finding a convinient way to manage design changes with the use of
various PLM techniques. Change in any product is something that should occur on timely basis to match
up with customer requirement and cost reduction. The change made in the vehicle designs directly affects
various concerned agencies. Automobile Vehicle structures contains thousands of parts and if there is any
change is occurring in child parts then it becomes important to track that impacted part, propose a solution
on that part and release a new assembly structure with feasible changes such that all efforts need to be
done for cost reduction.
Visually impaired people face many problems in their day to day lives. Among them, outdoor navigation is
one of the major concerns. The existing solutions based on Wireless Sensor Networks(WSN) and Global
Positioning System (GPS) track ZigBee units or RFID (Radio Frequency Identification) tags fixed on the
navigation system. The issues pertaining to these solutions are as follows: (1) It is suitable only when the
visually impaired person is commuting in a familiar environment; (2) The device provides only a one way
communication; (3) Most of these instruments are heavy and sometimes costly. Preferable solution would
be to make a system which is easy to carry and cheap.
The objective of this paper is to break down the technological barriers, and to propose a system by
developing an Android App which would help a visually impaired person while traveling via the public
transport system like Bus. The proposed system uses an inbuilt feature of smart phone such as GPS
location tracker to track the location of the user and Text to Speech converter. The system also integrates
Google Speech to Text converter for capturing the voice input and converts them to text. This system
recommends the requirement of installing a GPS module in buses for real time tracking. With minor
modification, this App can also help older people for independent navigation.
TEACHER’S ATTITUDE TOWARDS UTILISING FUTURE GADGETS IN EDUCATION ijcax
Today’s era is an era of modernization and globalization. Everything is happening at a very fast rate
whether it is politics, societal reforms, commercialization, transportation, or educational innovations. In
every few second, technology grows either in the form of arrival of the new devices/gadgets with millions of
apps and these latest technological objects may be in the form of hardware/software devices. We are the
educationists, teachers, students and stakeholders of present Indian educational system. These
gadgets/devices are partly being used by us or most of them are still unaware of these innovative
technologies due to the mass media or economical factor. So, there is a need to improvise ourselves
towards utilizing the future gadgets in order to explore the educational uses, barriers and preparatoryneeds of these available devices for educational purposes. This paper aims to study the opinion of the
teacher-educators about the usage of future gadgets in higher education. It will also contribute towards
establishing the list of latest technological devices, and how it can enhances the process of teachinglearning system.
IMPROVING REAL TIME TASK AND HARNESSING ENERGY USING CSBTS IN VIRTUALIZED CLOUDijcax
Cloud computing provides the facility for the business customers to scale up and down their resource usage
based on needs. This is because of the virtualization technology. The scheduling objectives are to improve
the system’s schedule ability for the real-time tasks and to save energy. To achieve the objectives, we
employed the virtualization technique and rolling-horizon optimization with vertical scheduling operation.
The project considers Cluster Scoring Based Task Scheduling (CSBTS) algorithm which aims to decrease
task’s completion time and the policies for VM’s creation, migration and cancellation are to dynamically
adjust the scale of cloud in a while meets the real-time requirements and to save energy
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two type of water scarcity. One is physical. The other is economic water scarcity.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
International Journal of Computer-Aided Technologies (IJCAx) Vol.2, No.1, January 2015
INTELLIGENT AGENT FOR PUBLICATION AND
SUBSCRIPTION PATTERN ANALYSIS OF NEWS
WEBSITES
W.D.R Wijedasa and Chathura De Silva
Department of Computer Science and Engineering, Faculty of Engineering, University of
Moratuwa, Sri Lanka
ABSTRACT
The rapid growth of the Internet has revolutionized online news reporting, and many users now
rely on online news websites for news information. In Sri Lanka there are numerous news
websites, which are subscribed to on a daily basis. With the rise in the number of news websites,
the Sri Lankan media authorities lack a proper methodology or tool capable of tracking and
regulating the publications made by different news disseminators.
This paper proposes a News Agent toolbox which periodically extracts news articles and
associated comments with the aid of a concept called Mapping Rules, and classifies them into
Personalized Categories defined in terms of keyword-based Category Profiles. The proposed tool
also analyzes comments made by readers, using simple statistical techniques, to discover the most
popular news articles and fluctuations in the popularity of news stories.
KEYWORDS
News Articles; Mapping Rules; Personalized Classification; Category Profiles
1. INTRODUCTION
The rapid growth of the Internet and increased confidence in digitized information have
revolutionized online news reporting. These developments provide efficient means, such as news
websites and news blogs, for disseminating news much more quickly than ever before. The main
objectives of online news reporting are to provide up-to-date news and to increase news
consumption and usage.
In Sri Lanka there is a large number of news websites and news blogs in all three mediums:
Sinhala, English and Tamil. These news websites and blogs publish news related to local issues,
politics, events, celebrations, people, business, weather and entertainment, and are subscribed to
on a daily basis for up-to-date information. With the growth in the number of news websites and
news blogs, the Sri Lankan media authorities lack a proper methodology or tool capable of
carrying out the following: identifying frequently published news topics; finding topics related to
a given set of keywords; identifying topics whose popularity increases over time; and identifying
the news subscription patterns of users, in terms of spotting regular users, the topics they
subscribe to most, and whether they actively express their ideas via commenting.
The lack of a proper methodology or tool with the previously mentioned capabilities has caused
the responsible media parties to face several issues. Among these, the difficulty of tracking and
regulating the publications made by different news disseminators takes a major place. This major
issue has led to several sub-issues, such as the difficulty of identifying incidents which require
more attention and the difficulty of maintaining the fairness, accuracy and balance of coverage of
each publication. Moreover, the difficulty of regulating publications made by different parties in
a way that meets the demands of readers can be considered another major issue.
A media system within a country directly influences its society in various ways and plays an
important role in the country's economy. It has been argued that a person's view is influenced
more by the media system within the country than by his or her personal experiences.
Additionally, public opinion and awareness are affected by the means by which news is reported
[1].
Within the media system of Sri Lanka, online news reporting plays a role equivalent to that of
traditional paper-based news media. This significant role highlights that online news websites
and news blogs need to balance their reporting of news while providing equal coverage of each
and every view. Fairness and accuracy of the covered topics are considered equally important
qualities. Such qualities can be maintained by solving the previously mentioned issues faced by
the Sri Lankan media authorities.
This paper proposes an Intelligent Agent for publication and subscription pattern analysis of
news websites and associated reader comments, using Personalized Classification [2] and simple
statistical techniques.
2. PREVIOUS WORK
Media analysis has been a vast research domain in the social sciences for several decades.
Traditional media analysis has been based on a process known as "coding", which involves
manual analysis and annotation of small sets of news articles [1][3].
As a result of the exponential growth of the World Wide Web and the Internet, online news
reporting has been modernized with highly efficient ways of reporting news. Users and media
authorities now experience overwhelming quantities of readily available news content, which
grows day by day. This has made it very difficult for the responsible media parties to analyze
news content via coding.
A variety of research efforts have been carried out to provide an opportunity to monitor a vast
number of news outlets, constantly and in an automated way [3][4][5][6]. The required
automation of analysis has been realized by the utilization of Artificial Intelligence (AI)
techniques from the fields of Natural Language Processing (NLP), Text Mining, Text Analytics
and Machine Learning (ML) [3][4].
Some of the existing automated analysis systems are News Outlets Analysis and Monitoring
(NOAM) [3], Lynda [4], the News Analysis System (NAS) [5], NewsBlaster [6] and Google
News. NOAM [3] is a data management system which gathers and monitors multi-lingual news
content from 21 countries. Lynda [4] is a multi-purpose system which focuses mainly on the US
media system; it has been developed to detect spatial and temporal distributions of Named
Entities. NAS [5] is a prototype news analysis system which classifies and indexes news stories
in real time, while NewsBlaster [6] is a robust news tracking and summarization system from
Columbia University.
Extraction of news articles from various semi-structured and heterogeneous online sources is a
major activity and a huge challenge for the above-mentioned automated systems. Various
research efforts have taken place to come up with algorithms and approaches for extracting news
articles from online sources. Among such approaches, the work in [7] proposed a system called
HTML2RSS, which extracts content from HTML web pages based on the Document Object
Model (DOM) structure to generate RSS files. The authors introduced a concept called Mapping
Rules [7] for extraction purposes. A Mapping Rule is defined as an XML file containing XPath
information, which specifies from which node of the DOM tree the content should be extracted.
These Mapping Rules are generated with manual assistance: a GUI is provided for users to mark
HTML elements such as text and links, and the XPath information of the selected elements is
then extracted automatically by the system.
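As a hypothetical illustration of such a rule (the exact schema used in [7] may differ, and the element and attribute names below are invented for this sketch), a Mapping Rule file could record the XPath of the target node as follows:

```xml
<!-- Hypothetical Mapping Rule: schema and names are illustrative only -->
<mappingRule site="http://news.example.lk">
  <target name="headline-link">
    <xpath>/html/body/div/div/h5/a</xpath>
  </target>
</mappingRule>
```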
Most of the automated systems, such as NOAM [3], NAS [5], NewsBlaster [6] and Google
News, have utilized clustering or classification for analyzing the extracted news articles.
Grouping news articles into stories describing a similar event is a major activity in such systems,
and various research efforts have taken place to identify the benefits of existing algorithms and
to propose new approaches for grouping news articles. Among such efforts, a novel classification
approach named Personalized Classification has been presented in [2][8] for classifying news
documents. Personalized Classification enables users to define their own categories of interest on
the fly, with the aim of automating the assignment of documents to such categories. The authors
used a set of keywords to define each Personalized Category and utilized Support Vector
Machines (SVM) to perform the classification. The training documents for the Personalized
Categories were obtained from a pool of training documents with the use of a search engine and
the set of keywords defining each category. A classifier was then created and trained per
Personalized Category for grouping the articles into stories.
3. METHODOLOGY
This section describes the approach taken by this research to address the important issues
identified in Section 1, by developing a News Agent Toolbox. The proposed tool has been
fundamentally developed to periodically analyze patterns and visualize the results for
surveillance purposes.
The news content analysis process of the tool follows a three-step methodology, namely news
content acquisition, pre-processing and analysis, as in existing systems such as NOAM. A probe
was developed for the periodic acquisition of news content, associated reader comments and
news article sharing information from social networks, from a set of pre-defined Sri Lankan
news websites. The pre-processing step involves tokenizing, stop-word removal, stemming and
representing the news content in a suitable form. The content analysis step involves major
activities such as classifying news articles into Personalized Categories [2][8] and analyzing
reader comments and sharing information associated with news articles. The high level
architecture of the proposed tool is depicted in Figure 1.
Figure 1. High level architecture of the proposed system
The tool has been designed using a modular architecture where each module is specialized for a
particular task. Following the three-step methodology, the tool has been designed with four
major modules, namely the Content Acquisitioner, Pre-processor, News Analyzer and Comments
Analyzer. Moreover, a user-friendly GUI and a visualization tool have been attached to the
system to support user interaction and visualization of results.
3.1. Content Acquisitioner
The Content Acquisitioner module was developed to extract news articles, associated user
comments and sharing information on social networks, from a set of predefined English news
websites and news blogs.
Several challenges were encountered while developing the extraction functionality of the Content
Acquisitioner module. The non-uniform structures of news websites and variations in the
organization of news articles are chief among these challenges: each news website has its own
structure and its own way of organizing news articles, which makes it difficult to come up with
an extraction methodology that suits all news websites. The existence of news websites which do
not follow specifications, and of websites with malformed HTML content, is also challenging.
Moreover, many news websites place additional content, such as videos, advertisements and
navigational elements, in between the news content, making the extraction of full-text news
content a complicated task.
To minimize the impact of the above-mentioned challenges and to arrive at an extraction
methodology which suits many sites with different arrangements, the Mapping Rules proposed in
[7] were used for each news website to extract the navigational links that point to the full stories
of news articles.
3.1.1. News Content Extraction
A four-step methodology was used to extract news articles from a set of user-provided news
websites.
Step 1: XPath-based Mapping Rule generation for each website
As the first step of the extraction process, Mapping Rules [7] are generated per news website
with manual assistance. The desired news website is loaded via the tool, and the user is provided
with functionality to select sample links pointing to news articles within the web pages from
which news articles are to be extracted. These selected sample links are used to distinguish links
pointing to news articles from links pointing to additional content such as videos and
advertisements. The tool then determines the XPath of each user-selected sample link, and the
identified XPath is processed into a form that can be used to extract all news links appearing in
similar paths within a given web page.
Sample XPath expression identified for a selected news link:
/html/body/div[3]/div[10]/div[2]/div[4]/div[2]/h5/a
The same expression processed to aid the extraction of news links appearing in similar paths:
/html/body/div/div/div/div/div/h5/a
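The processing applied to the recorded XPath amounts to removing its positional predicates, so that it matches every element at the same structural depth. A minimal sketch (the function name is illustrative, not from the paper):

```python
import re

def generalize_xpath(xpath: str) -> str:
    """Strip positional predicates such as [3] from an XPath so it
    matches all elements that appear along the same tag path."""
    return re.sub(r"\[\d+\]", "", xpath)

specific = "/html/body/div[3]/div[10]/div[2]/div[4]/div[2]/h5/a"
print(generalize_xpath(specific))
# /html/body/div/div/div/div/div/h5/a
```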
Step 2: Creation of DOM trees for web pages containing news article links
A DOM tree structure is generated, using the HtmlAgilityPack [9] parser, for each web page
from which articles are to be extracted.
Step 3: Extraction of links pointing to news articles based on Mapping Rules
Links pointing to news articles which fall under the given XPath-based Mapping Rules are
extracted by traversing the DOM trees of the web pages.
Step 4: Extraction of full-text news content from news articles
The NBoilerpipe [10] library, a .NET port of Christian Kohlschütter's Boilerpipe, was used to
extract the textual news content from the news articles pointed to by the links extracted in the
previous step.
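Steps 2 and 3 can be sketched as parsing a page and collecting only the links whose open-tag path matches the index-free Mapping Rule. The paper's implementation uses HtmlAgilityPack in .NET; the standard-library Python sketch below (all names and the sample page are invented for illustration) shows the same matching idea:

```python
from html.parser import HTMLParser

class MappingRuleLinkExtractor(HTMLParser):
    """Collect href values of <a> tags whose open-tag path matches a
    generalized (index-free) path such as /html/body/div/h5/a."""

    def __init__(self, rule: str):
        super().__init__()
        self.rule = [t for t in rule.split("/") if t]
        self.path = []      # stack of currently open tag names
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)
        if tag == "a" and self.path == self.rule:
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        # pop back to the matching open tag (tolerates unclosed tags)
        if tag in self.path:
            while self.path and self.path.pop() != tag:
                pass

html_page = """
<html><body>
  <div><h5><a href="/news/1">Story one</a></h5></div>
  <div><h5><a href="/news/2">Story two</a></h5></div>
  <div><p><a href="/ads/banner">Advert</a></p></div>
</body></html>
"""

extractor = MappingRuleLinkExtractor("/html/body/div/h5/a")
extractor.feed(html_page)
print(extractor.links)   # ['/news/1', '/news/2']
```

Note how the advertisement link is skipped because its path (/html/body/div/p/a) does not match the rule, mirroring how the Mapping Rule separates article links from additional content.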
3.1.2. Reader Comments Extraction
A three-step methodology was used to extract the author, date and text of the comments
associated with the extracted news articles.
Step 1: XPath-based Mapping Rule generation
As the first step of the comments extraction process, three separate Mapping Rules are generated
for each website, to aid the extraction of the comment authors, the comment dates and the
comments themselves.
Step 2: Creation of DOM trees for news articles containing user comments
A DOM tree structure is generated for each news article webpage.
Step 3: Extraction of comments from news articles based on Mapping Rules
The comment authors, together with their corresponding comments and dates, are extracted from
each article web page by traversing the DOM trees.
Many news websites allow users to share news articles via social networks such as Facebook,
Twitter, LinkedIn and Google Plus. Social sharing allows users to explicitly publish or like the
URL of a news article. The statistics of such social sharing on Facebook, Twitter, LinkedIn and
Google Plus were extracted using the two-step methodology described below [11].
For each extracted news article, the corresponding social network API call is made to the given
endpoint, incorporating the article's URL. In the next step, the retrieved response, in the form of
JSON objects, is queried to extract the counts of likes and shares made for the extracted news
article.
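The two steps can be sketched as building the API request URL and then querying the JSON response. The endpoint and the response field names below are hypothetical placeholders; each real social network exposes its own API and JSON layout:

```python
import json
import urllib.parse

# Hypothetical endpoint and response fields, for illustration only.
SHARE_COUNT_ENDPOINT = "https://api.example-social.com/counts?url={url}"

def build_request_url(article_url: str) -> str:
    """Step 1: form the API call incorporating the article's URL."""
    return SHARE_COUNT_ENDPOINT.format(
        url=urllib.parse.quote(article_url, safe=""))

def parse_counts(json_text: str) -> dict:
    """Step 2: query the JSON response for the like and share counts."""
    payload = json.loads(json_text)
    return {"likes": payload.get("like_count", 0),
            "shares": payload.get("share_count", 0)}

print(build_request_url("http://news.example.lk/story/42"))
print(parse_counts('{"like_count": 17, "share_count": 5}'))
# {'likes': 17, 'shares': 5}
```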
3.2. Pre-processor
The major functions of the pre-processor module are to perform an initial cleansing of each
article, tokenize the content, remove stop-words and stem the content of news articles. The Porter
stemming algorithm by Martin Porter [12], which is based on an explicit list of suffixes together
with the conditions under which each suffix may be removed from a word, was used for stemming
the news articles.
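A minimal sketch of this pipeline is shown below. The stop-word list is a tiny illustrative sample, and crude_stem is a toy suffix stripper standing in for the full Porter algorithm, which applies its suffix list under measure-based conditions rather than a bare length check.

```python
import re

# Tiny illustrative stop-word list; a real pipeline uses a full one.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "on"}

def crude_stem(word: str) -> str:
    """Toy suffix stripper standing in for the Porter stemmer."""
    for suffix in ("ingly", "edly", "ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(article_text: str) -> list:
    """Cleanse, tokenize, drop stop-words, then stem each token."""
    tokens = re.findall(r"[a-z]+", article_text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]
```

The token lists this produces are what the News Analyzer scores against the keyword profiles below.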
3.3. News Analyzer
The main functionality of the News Analyzer component is to classify news articles into a set of
predefined news stories defined under general categories such as politics, business and health.
The approach adopted for the classification process is the Personalized Classification presented
in [2] and [8].
3.3.1. Personalized Classification of News Articles
Personalized Classification is a process which enables users to create their own classification
categories on the fly. This classification methodology is capable of classifying documents under
diverse user interests, compared to general classification process with a fixed set of pre-defined
categories.
The News Analyzer component, which utilizes the Personalized Classification approach, follows a
three-step methodology for creating personalized categories. First, it enables an
administrative user of the tool to define general categories such as politics, business, health
and sports. The user then defines more specific news stories (Personalized
Categories) under each general category. Finally, the user defines a set
of keywords to describe each personalized news category. A personalized category defined in
terms of keywords is called a keyword based Category Profile [2] [8]. Keyword based
Category Profiles reduce the user effort compared to selecting an adequate number of training
documents for each Personalized Category.
3.3.2. Keyword based Category Profiles
In this research, each Personalized Category is defined in terms of a list of user specified
keywords instead of a set of training documents. The Personalized Classification process
based on keyword Category Profiles works best when the provided keywords have a
high discriminatory power, i.e. they distinguish news articles of the desired category from
articles of the other categories.
To give keywords a higher discriminating power, the weighting scheme tabulated in
Table 1 was introduced. The proposed system lets users assign a higher weight to
keywords that occur infrequently and a lower weight to keywords that occur
frequently across many news articles.
Table 1. Weighting scheme used for keywords

Weight      0.2        0.4    0.6      0.8     1.0
Indication  Very Low   Low    Medium   High    Very High
A sample Personalized Classification category is illustrated below.
General Category: HEALTH
Personalized Category: DCD and Whey Protein products detected
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Dicyandiamide 1.0
DCD 1.0
Whey Protein 1.0
Botulism 1.0
Clostridium 1.0
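The sample profile above can be represented with a simple data structure such as the following sketch (the class and field names are illustrative, not taken from the paper):

```python
from dataclasses import dataclass, field

@dataclass
class CategoryProfile:
    """A Personalized Category described by weighted keywords,
    with weights drawn from the 0.2-1.0 scheme of Table 1."""
    general_category: str
    name: str
    keywords: dict = field(default_factory=dict)  # keyword -> weight

# The HEALTH profile shown above, as data.
health_profile = CategoryProfile(
    general_category="HEALTH",
    name="DCD and Whey Protein products detected",
    keywords={
        "dicyandiamide": 1.0,
        "dcd": 1.0,
        "whey protein": 1.0,
        "botulism": 1.0,
        "clostridium": 1.0,
        "milk powder": 0.4,
    },
)
```

Profiles kept as plain data like this can be created and edited on the fly, which is the point of Personalized Classification.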
Milk powder 0.4
3.3.3. Classification Rank Calculation
This research utilizes a classification rank calculation methodology for the classification
of news articles. The classification rank indicates how well a given news article matches
the classified Personalized Category. According to this methodology, the news articles are
initially classified into two categories: the user desired Personalized Category or the
OTHER category. All news articles with a classification rank above zero are assigned to
the user desired Personalized Category, and articles with a zero classification rank are
assigned to the OTHER category.
The proposed classification process is depicted in Figure 2.
Figure 2. Personalized Classification process
The classification rank of each news article is calculated based on the tf-idf values of the
keywords describing the desired Personalized Category. The tf-idf weight of a keyword ki in a
news article aj is computed from the term frequency Freq (ki, aj) and the inverse document
frequency as provided in Eq. 1 [2].
tfidf(ki, aj) = Freq(ki, aj) × log2(N / DF(ki)) (1)
In Eq. 1, N is the number of news articles extracted on the user specified date and
DF(ki) is the number of articles in that collection of N articles in which keyword ki occurs
at least once.
The classification rank of an article is defined in terms of the summation of the tf-idf weights of
all keywords describing the desired Personalized Category denoted by cp [2].
rank(aj) = ∑ki∈cp tfidf(ki, aj) (2)
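Eq. 1 and Eq. 2 can be implemented directly, as in the sketch below; it assumes DF(ki) is positive for every scored keyword, so keywords absent from the collection are simply skipped.

```python
import math

def tfidf(keyword, article_tokens, n_articles, doc_freq):
    """Eq. 1: Freq(ki, aj) * log2(N / DF(ki)).
    doc_freq maps each keyword to its document frequency DF(ki)."""
    freq = article_tokens.count(keyword)
    return freq * math.log2(n_articles / doc_freq[keyword])

def rank(article_tokens, profile_keywords, n_articles, doc_freq):
    """Eq. 2: sum of tf-idf over the keywords of the Personalized
    Category; keywords with DF = 0 contribute nothing."""
    return sum(
        tfidf(k, article_tokens, n_articles, doc_freq)
        for k in profile_keywords
        if doc_freq.get(k, 0) > 0
    )
```

For example, with N = 4 articles, an article containing "dengue" twice (DF = 1) and "fever" once (DF = 2) gets rank 2·log2(4) + 1·log2(2) = 5.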
Three additional rank calculation equations were derived from Eq. 1 and Eq. 2 with the
aim of increasing the accuracy of the classification process and reducing the number of
incorrect articles classified into the desired categories with high rank values.
Eq. 3 calculates the weighted tf-idf value of keyword ki in news article aj by incorporating
the weights defined for the keywords describing the desired Personalized Category. This aims to
reduce the rank values of articles containing one or two high frequency keywords that do
not accurately describe the desired news story.
wgt_tfidf(ki, aj) = [Wi × Freq(ki, aj)] × log2(N / DF(ki)) (3)
In Eq. 3, Wi denotes the user defined weight given to keyword ki describing the
Personalized Category. Based on Eq. 3, weighted summation and weighted average based rank
calculations were derived, denoted by Eq. 4 and Eq. 5 respectively.
wgt_rank(aj) = ∑ki∈cp wgt_tfidf(ki, aj) (4)
wgt_avg_rank(aj) = ∑ki∈cp wgt_tfidf(ki, aj) / total(Wi) (5)
In Eq. 5, total(Wi) denotes the sum of the user defined weights of the keywords
describing the desired Personalized Category. The classification process of the proposed system
was evaluated by calculating the classification rank of news articles using Eq. 5.
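A sketch of the weighted average rank of Eq. 5, which internally applies the weighted tf-idf of Eq. 3, follows; the inputs mirror those of Eq. 1 and Eq. 2.

```python
import math

def wgt_avg_rank(article_tokens, keyword_weights, n_articles, doc_freq):
    """Eq. 5: weighted tf-idf sum over the profile's keywords (Eq. 3
    and Eq. 4), normalized by total(Wi), the sum of keyword weights."""
    total_w = sum(keyword_weights.values())
    s = 0.0
    for k, w in keyword_weights.items():
        df = doc_freq.get(k, 0)
        if df:  # keywords absent from the collection contribute nothing
            s += (w * article_tokens.count(k)) * math.log2(n_articles / df)
    return s / total_w
```

Normalizing by total(Wi) keeps ranks comparable across profiles with different numbers of keywords.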
3.4. Comment Analyzer
The main functionality of the Comments Analyzer component is to identify the most popular
news articles published within a given period, graphically plot sudden fluctuations in the
popularity of news stories and provide a detailed profile for each commenter.
3.4.1. Popularity Score for News Articles
The Comments Analyzer component enables users to obtain a list of the most popular news
articles published within a user desired period. The popularity of each article was calculated
using statistics on the following factors.
• Number of unique users who have made comments.
• Number of days the article has received comments.
• Number of times that the news article has been shared, liked and commented via
Facebook.
• Total number of times that the article has been shared via Twitter.
• Total number of times that the article has been shared via Google plus.
• Total number of times that the article has been shared via LinkedIn.
A popularity score was calculated for each news article using a four-step methodology. First,
statistics for the above listed factors were calculated using the comments and sharing
information extracted via the Content Acquisitioner module. The calculated statistics were
then analyzed with the IBM SPSS Statistics [13] toolkit to derive a weight for each factor.
Third, the obtained weights were used to calculate a popularity score for each article using
Eq. 6.
popularity_score = ∑i (Weight[i] × Factor[i]) / age (6)
In Eq. 6, Weight[i] denotes the weight of the i-th factor, Factor[i] denotes the total number
of cases available for the i-th factor for the given news article, and age denotes the number
of days since the article was published.
As the final step, the articles are ranked according to their popularity scores and the top k
articles are presented to the users of the News Agent toolbox.
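Eq. 6 can be sketched as below, assuming the per-factor weights have already been obtained from the SPSS analysis step; the summation over factors is implied by the indexed notation.

```python
def popularity_score(factors, weights, age_days):
    """Eq. 6: sum of Weight[i] * Factor[i] over all factors,
    normalized by the article's age in days. `factors` and `weights`
    are parallel lists; the weights are inputs here, having come
    out of the earlier statistical analysis step."""
    if age_days <= 0:
        raise ValueError("age must be at least one day")
    return sum(w * f for w, f in zip(weights, factors)) / age_days
```

Dividing by age prevents older articles from dominating purely because they have had longer to accumulate comments and shares.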
3.4.2. Profile for Commenters
The Comments Analyzer module maintains user profiles for commenters of news websites. The
main purpose of maintaining these profiles is to present various statistics about the comments
made by users and their commenting behaviour.
3.4.3. Popularity Analysis of News Stories (Personalized Categories)
The main purpose of analyzing the popularity of news stories is to provide a graphical
visualization of sudden increases in the popularity of news articles belonging to a specific
news story (Personalized Category). The popularity analysis of news stories follows a
four-step methodology, described below.
First, the tool classifies the news articles based on the user provided period and the
news story (Personalized Category). Next, the tool calculates the total number of
comments, Tweets, Facebook shares, LinkedIn shares and Google Plus shares that the
classified articles belonging to the given news story have received. The tool then
calculates sudden increases in popularity by considering the slope (the rate of change)
between consecutive data points representing the total comment and sharing statistics. As
the final step, the tool plots how the popularity of the news story has changed over the
user given period.
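The slope-based detection of sudden popularity increases (the third step above) can be sketched as follows; the threshold is a tuning parameter, not a value specified in the paper.

```python
def popularity_spikes(daily_totals, threshold):
    """Flag days where the slope, i.e. the change between consecutive
    daily totals of comments plus shares, meets or exceeds a threshold.
    Returns the indices of the spiking days."""
    spikes = []
    for day in range(1, len(daily_totals)):
        slope = daily_totals[day] - daily_totals[day - 1]
        if slope >= threshold:
            spikes.append(day)
    return spikes
```

The flagged days are the ones the visualization would highlight as sudden increases in the news story's popularity.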
4. RESULTS
The proposed system was evaluated on the accuracy of the news article extraction process of
the Content Acquisitioner and on the classification results of the News Analyzer module. The
extraction accuracy was evaluated using the percentages of correctly and incorrectly
extracted articles. The classification results were evaluated by comparing the results obtained
via the system with those obtained via a manual classification process.
The Weighted Average rank calculation given by Eq.5, was utilized for the classification
evaluation process.
4.1. Article Extraction Accuracy
The accuracy of the News Article Extraction process was evaluated using data extracted
during the months of October and November. Around six thousand full text news articles
were manually examined to identify news articles and links that had not been extracted as
expected. News articles containing only the user commenting sections, text content from
navigation sections, text content from footer sections, and news links pointing to
advertisements, leave-a-comment pages and contact-us pages were counted as incorrectly
extracted articles and links.
Figure 3 illustrates the number of news articles extracted accurately and inaccurately,
against the total number of news articles extracted per day during the chosen period.
Figure 3. Accuracy of article extraction process
4.2. Article Classification Accuracy
The classification process of the News Analyzer component was evaluated by creating several
timely news stories (Personalized Categories) under three main categories: Politics, Health
and Business. The classification accuracy of each news story was measured over the period
during which multiple news websites reported the related news event.
The following keyword based Category Profile was utilized to obtain test results for news story
“Commonwealth (CHOGM) 2013”.
General Category: POLITICS
Personalized Category: Commonwealth (CHOGM) 2013
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Commonwealth 1.0
CHOGM 1.0
Commonwealth Heads Government Meeting 1.0
Commonwealth Secretariat 1.0
Commonwealth Leaders 1.0
Summit 0.4
Figure 4 illustrates the test results obtained for the classification of news articles
related to the news story “Commonwealth (CHOGM) 2013”. The graph compares the number of
news articles classified by the News Agent toolbox with the number of news articles
classified manually. The gap between the two indicates the number of incorrect articles
classified by the proposed tool.
Figure 4. Test results obtained for Commonwealth (CHOGM) 2013
The following keyword based Category Profile was utilized to obtain test results for news story
“Whey Protein detected in dairy products”.
General Category: HEALTH
Personalized Category: Whey Protein detected in dairy products
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Whey Protein 1.0
Botulism 1.0
Clostridium 1.0
Milk powder 0.4
Dairy products 0.4
Bacteria 0.4
Figure 5 illustrates the test results obtained for the classification of news articles
related to the news story “Whey Protein detected in dairy products”.
Figure 5. Test results obtained for Whey Protein detected in dairy products
The following keyword based Category Profile was utilized to obtain test results for news story
“Central Bank related news”.
General Category: BUSINESS
Personalized Category: Central Bank related news
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Central Bank 1.0
CB 1.0
Monitor 0.2
Report 0.4
Figure 6 illustrates the test results obtained for the classification of news articles
related to the news story “Central Bank related news”.
Figure 6. Test results obtained for Central Bank related news
The test results obtained for classifying news articles into the above mentioned news
stories show that the results of the proposed system closely match those of the manual
classification process. Based on the conducted evaluations, the News Analyzer module
achieves an approximate accuracy of 81% to 82%.
4.2.1 Discriminative Power of Keywords vs. Accuracy
The accuracy of the classification process depends on the discriminating power of the
keywords defined under the Category Profiles of the news stories. The better the keywords
distinguish news articles of one news story from another, the higher the accuracy obtained
by the proposed classification process.
The test results obtained from the following sample test scenarios show that Category Profiles
with keywords of high discriminating power yield higher accuracy, while Category Profiles with
keywords of low discriminating power yield lower accuracy. Figure 7 and Figure 8
illustrate how the classification accuracy changes as the discriminating power of the
keywords is varied.
The following keyword based Category Profile was utilized to obtain test results for news story
“Dengue outbreaks in Sri Lanka” with high discriminating power keywords.
General Category: HEALTH
Personalized Category: Dengue outbreaks in Sri Lanka
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Dengue 1.0
Dengue mosquito 1.0
Dengue virus 1.0
Dengue Hemorrhagic Fever 1.0
Fever 0.4
Mosquito breeding 0.4
Figure 7. Results for keywords with high discriminating power
The following keyword based Category Profile was utilized to obtain test results for news story
“Dengue outbreaks in Sri Lanka” with low discriminating power keywords.
General Category: HEALTH
Personalized Category: Dengue outbreaks in Sri Lanka
Occurrence of Keywords: Multiple Occurrences
Keywords and Weights:
Dengue 1.0
Mosquito 1.0
Virus 1.0
Fever 0.4
Breeding 0.4
Figure 8. Results for keywords with low discriminating power
5. CONCLUSIONS
In summary, this research develops an Intelligent Agent for analyzing the readily
available news content and the subscription patterns of Sri Lankan news websites and news blogs.
The aim of this research is to mine the abundantly available online news content
and subscription data in order to provide an efficient way to monitor different publication
and subscription patterns. The main objectives are to assist the Sri Lankan media
authorities in tracking and regulating publications made by different disseminators
of news, to help identify incidents that require more attention and news events that
are increasing in popularity over time, and to assure proper balance, accuracy and equal
coverage in news reporting.
The proposed system has been evaluated on the accuracy of the news article extraction
process and of the classification process. The extraction accuracy has been evaluated using
the percentages of correctly and incorrectly extracted articles. The classification results
have been evaluated using several news stories together with keyword profiles for general
categories such as Politics, Health and Business. The test results show that the extraction
process achieves about 98% accuracy and the classification process an approximate accuracy
of 81%.
ACKNOWLEDGEMENTS
I would like to convey my sincere gratitude to my supervisor, Dr. Chathura De Silva, Senior
Lecturer, Department of Computer Science and Engineering, University of Moratuwa for
providing the valuable research idea, guidance, constructive comments and extensive support
throughout this research.
I also express my sincere gratitude to our MSc Project Coordinator Dr A. Shehan Perera, Senior
Lecturer, Department of Computer Science and Engineering, University of Moratuwa for
providing extensive support and encouragement by conducting weekly sessions, forums and
progress presentations throughout the year.
REFERENCES
[1] I. Flaounas, ‘Pattern analysis of news media content’, University of Bristol, 2011.
[2] A. Sun, E.-P. Lim, and W.-K. Ng, ‘Personalized classification for keyword-based category profiles’, in
Research and Advanced Technology for Digital Libraries, Springer, 2002, pp. 61–74.
[3] I. Flaounas, O. Ali, M. Turchi, T. Snowsill, F. Nicart, T. De Bie, and N. Cristianini, ‘NOAM: news
outlets analysis and monitoring system’, in Proc. of the 2011 ACM SIGMOD international conference
on Management of data, 2011, pp. 1275–1278.
[4] L. Lloyd, D. Kechagias, and S. Skiena, ‘Lydia: A system for large-scale news analysis’, in String
Processing and Information Retrieval, 2005, pp. 161–166.
[5] R. J. Kuhns, ‘A news analysis system’, in Proceedings of the 12th conference on Computational
linguistics-Volume 1, 1988, pp. 351–355.
[6] K. R. McKeown, R. Barzilay, D. Evans, V. Hatzivassiloglou, J. L. Klavans, A. Nenkova, C. Sable, B.
Schiffman, and S. Sigelman, ‘Tracking and summarizing news on a daily basis with Columbia’s
Newsblaster’, in Proceedings of the Human Language Technology Conference, San Diego, USA, 2002.
[7] Geng, Q. Gao, and J. Pan, ‘Extracting content for news web pages based on DOM’, IJCSNS
International Journal of Computer Science and Network Security, vol. 7, no. 2, pp. 124–129, 2007.
[8] C.-H. Chan, A. Sun, and E.-P. Lim, ‘Automated online news classification with personalization’, in 4th
International Conference on Asian Digital Libraries, 2001.
[9] “Parsing HTML Documents with the Html Agility Pack” [Online]. Available:
http://www.4guysfromrolla.com/articles/011211-1.aspx. [Accessed: 29-Apr-2013].
[10] C. Kohlschütter, P. Fankhauser, and W. Nejdl, ‘Boilerplate detection using shallow text features’, in
Proceedings of the third ACM international conference on Web search and data mining, New York,
NY, USA, 2010, pp. 441–450.
[11]‘Get Social Share Counts - A Complete Guide’, CUBE3X. [Online]. Available:
http://cube3x.com/2013/01/get-social-share-counts-a-complete-guide/. [Accessed: 10-Oct-2013].
[12] ‘Porter Stemming Algorithm’. [Online]. Available: http://tartarus.org/martin/PorterStemmer/index.html. [Accessed: 03-Jun-2013].
[13] ‘IBM SPSS Statistics’, 29-Oct-2013. [Online]. Available: http://www-01.ibm.com/software/analytics/spss/products/statistics/. [Accessed: 14-Oct-2013].
Authors
W.D.R Wijedasa received her M.Sc. Degree in Computer Science and Engineering from the
University of Moratuwa, Sri Lanka in 2014. She has been working as a Software
Engineer at IFS RnD Pvt Limited since 2010. Her research interests are Data Mining,
Artificial Intelligence and Agent Technologies.
Dr. Chathura De Silva received his M.Eng. Degree and Ph.D. Degree from the National
University of Singapore. He has been working as a Senior Lecturer at the Department of
Computer Science and Engineering, University of Moratuwa, Sri Lanka. He is the current
Head of the Department of Computer Science and Engineering, University of Moratuwa.