In this paper, nature inspired methods are proposed for solving problems in the field of Semantic Web mining, namely the clustering of Web resources based on their metadata, as well as the automatic classification of Web pages.
An introductory-to-intermediate presentation on complex network analysis: network metrics, analysis of online social networks, approximate algorithms, memory issues, and storage.
Penalty Function Method For Solving Fuzzy Nonlinear Programming Problem – paperpublications3
Abstract: In this work, the fuzzy nonlinear programming problem (FNLPP) is formulated and its results are discussed. Numerical solutions of the corresponding crisp problems are compared with the fuzzy solution, and the effectiveness of the fuzzy approach is presented and discussed. A penalty function method is developed and combined with the Nelder-Mead direct-search optimization algorithm, and the two are used together to solve this FNLPP.
Keywords: fuzzy set theory, fuzzy numbers, decision making, nonlinear programming, Nelder-Mead algorithm, penalty function method.
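The penalty-plus-Nelder-Mead combination described above can be sketched on a crisp stand-in problem. The FNLPP itself is not specified here, so the objective, constraint, and penalty schedule below are illustrative assumptions, and SciPy's simplex implementation stands in for the paper's algorithm:

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    # Crisp stand-in objective: minimize the distance to (1, 2).
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

def violation(x):
    # Inequality constraint x0 + x1 <= 2, expressed as its violation amount.
    return max(0.0, x[0] + x[1] - 2.0)

def penalized(x, mu):
    # Quadratic exterior penalty added to the objective.
    return objective(x) + mu * violation(x) ** 2

# Solve a sequence of unconstrained problems with a growing penalty weight,
# each with the derivative-free Nelder-Mead simplex method, warm-starting
# from the previous solution.
x = np.array([0.0, 0.0])
for mu in [1.0, 10.0, 100.0, 1000.0]:
    res = minimize(penalized, x, args=(mu,), method="Nelder-Mead",
                   options={"xatol": 1e-9, "fatol": 1e-9})
    x = res.x

print(x)  # approaches the constrained optimum (0.5, 1.5)
```

As the penalty weight grows, the unconstrained minimizer is pushed onto the feasible boundary, which is the usual exterior-penalty behaviour.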
Kernel based models for geo- and environmental sciences - Alexei Pozdnoukhov... – Beniamino Murgante
Kernel based models for geo- and environmental sciences - Alexei Pozdnoukhov – National Centre for Geocomputation, National University of Ireland, Maynooth (Ireland)
Intelligent Analysis of Environmental Data (S4 ENVISA Workshop 2009)
Using Alpha-cuts and Constraint Exploration Approach on Quadratic Programming... – TELKOMNIKA JOURNAL
In this paper, we propose a computational procedure for finding the optimal solution of quadratic programming problems using fuzzy α-cuts and a constraint exploration approach. We solve the problems in their original form without using any additional information such as Lagrange multipliers or slack, surplus, and artificial variables. To find the optimal solution, we divide the calculation into two stages. In the first stage, we determine the unconstrained minimum of the quadratic programming problem (QPP) and check its feasibility. From the unconstrained minimum we identify the violated constraints and focus our search on them. In the second stage, we explore the feasible region along the violated constraints until the optimal point is reached. A numerical example is included to illustrate the ability of α-cuts and constraint exploration to find the optimal solution of a QPP.
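The two-stage idea can be illustrated on a tiny convex QP. The matrices below are invented for illustration, and the paper's fuzzy α-cut machinery is omitted; stage 2 is shown for the simple single-active-constraint case via the KKT system:

```python
import numpy as np

# Convex QP: minimize 0.5*x^T Q x + c^T x  subject to  A x <= b.
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])

# Stage 1: unconstrained minimizer, solving Q x = -c.
x_free = np.linalg.solve(Q, -c)          # -> [1, 2]
violated = A @ x_free > b                # which constraints are violated

# Stage 2: search along the violated constraint boundary. For a single
# active constraint a^T x = b this is an equality-constrained QP, solved
# here via its KKT system [[Q, a], [a^T, 0]] [x; lam] = [-c; b].
a = A[violated][0]
kkt = np.block([[Q, a[:, None]], [a[None, :], np.zeros((1, 1))]])
rhs = np.concatenate([-c, b[violated]])
x_opt = np.linalg.solve(kkt, rhs)[:2]
print(x_free, violated, x_opt)  # unconstrained [1, 2] violates x0+x1<=2; optimum [0.5, 1.5]
```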
Slides from our PacificVis 2015 presentation.
The paper tackles the problem of “giant hairballs”, the dense and tangled structures that often result from visualization of large social graphs. We propose a high-dimensional rotation technique called AGI3D, combined with the ability to filter elements based on social centrality values. AGI3D targets a high-dimensional embedding of a social graph and its projection onto 3D space. It allows the user to rotate the social graph layout in the high-dimensional space by mouse-dragging a vertex. Its high-dimensional rotation effects give the user the illusion of destructively reshaping the social graph layout; in reality, it helps the user find a preferred position and direction in the high-dimensional space from which to view the internal structure of the layout, which remains unmodified. A prototype implementation of the proposal, called Social Viewpoint Finder, is tested with about 70 social graphs, and this paper reports four of the analysis results.
A wide variety of combinatorial problems can be viewed as Weighted Constraint Satisfaction Problems (WCSPs). All resolution methods have exponential time complexity on large instances; moreover, they combine several techniques and use a wide variety of concepts and notations that are difficult to understand and implement. In this paper, we model this problem as an original 0-1 quadratic program subject to linear constraints. The model is validated by a theorem that we state and prove. To evaluate its performance, we use the Hopfield neural network to solve the obtained model, based on an original energy function. To validate our model, we solve several WCSP benchmark instances. Our approach has the same memory complexity as the HNN and the same time complexity as the Euler-Cauchy method, and it finds the optimal solution of the said instances.
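As a rough illustration of Hopfield-style energy minimization, here is a simplified discrete network with asynchronous sign updates, under which the energy provably never increases. This is a generic stand-in, not the paper's continuous HNN, its WCSP energy function, or the Euler-Cauchy integration:

```python
import numpy as np

def energy(W, theta, x):
    # Hopfield energy E(x) = -0.5 x^T W x + theta^T x.
    return -0.5 * x @ W @ x + theta @ x

def hopfield_descend(W, theta, x, sweeps=10):
    # Asynchronous +/-1 updates; with symmetric W and zero diagonal,
    # each single-neuron update can only lower (or keep) the energy.
    x = x.copy()
    for _ in range(sweeps):
        for i in range(len(x)):
            h = W[i] @ x - theta[i]
            x[i] = 1.0 if h >= 0 else -1.0
    return x

rng = np.random.default_rng(0)
n = 8
W = rng.standard_normal((n, n))
W = (W + W.T) / 2.0                      # symmetric weights
np.fill_diagonal(W, 0.0)                 # no self-connections
theta = rng.standard_normal(n)
x0 = np.where(rng.standard_normal(n) >= 0, 1.0, -1.0)

x_final = hopfield_descend(W, theta, x0)
print(energy(W, theta, x0), "->", energy(W, theta, x_final))
```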
A new transformation into State Transition Algorithm for finding the global m... – Michael_Chou
To promote the global search ability of the original state transition algorithm, a new operator called axesion is proposed, which searches along the axes and strengthens single-dimensional search. Several benchmark minimization problems are used to illustrate the advantages of the improved algorithm over other random search methods. The results of numerical experiments show that the new transformation enhances the performance of the state transition algorithm and that the new strategy is effective and reliable.
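A minimal sketch of the axesion idea: perturb one coordinate at a time with a Gaussian step and keep improvements. The operator form, the extra jitter off zero coordinates, and all parameters here are simplified assumptions, not the paper's exact formulation:

```python
import numpy as np

def sphere(x):
    # Classic benchmark: sum of squares, minimum 0 at the origin.
    return float(np.sum(x ** 2))

def axesion_search(f, x0, gamma=1.0, iters=200, seed=0):
    # Axesion-style move: only one randomly chosen coordinate is perturbed
    # per step (single-dimensional search along an axis), with greedy
    # acceptance of the candidate.
    rng = np.random.default_rng(seed)
    x, fx = x0.copy(), f(x0)
    for _ in range(iters):
        cand = x.copy()
        i = rng.integers(len(x))
        cand[i] = cand[i] + gamma * rng.standard_normal() * cand[i]
        cand[i] += 0.1 * rng.standard_normal()  # allow moves off zero (assumption)
        if f(cand) < fx:                        # greedy acceptance
            x, fx = cand, f(cand)
    return x, fx

x_best, f_best = axesion_search(sphere, np.array([3.0, -4.0, 5.0]))
print(f_best)  # never worse than the starting value 50.0
```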
Image Super-Resolution Reconstruction Based On Multi-Dictionary Learning – IJRESJOURNAL
ABSTRACT: To overcome the problems that a single dictionary cannot adapt to various types of images and that the reconstruction quality does not meet application requirements, we propose a novel multi-dictionary learning algorithm for feature classification. The algorithm uses the orientation information of the low-resolution image to guide the classification of the image patches in the database, and designs classification dictionaries that can effectively represent the reconstructed image patches. Considering the nonlocal similarity of the image, we construct a combined nonlocal mean (C-NLM) regularizer, adopt steering kernel regression (SKR) to formulate a local regularizer, and establish a unified reconstruction framework. Extensive experiments on single images validate that the proposed method, compared with several other state-of-the-art learning-based algorithms, improves image quality and recovers more details.
CHN and Swap Heuristic to Solve the Maximum Independent Set Problem – IJECEIAES
We describe a new approach to the problem of finding the maximum independent set in a given graph, also known as the Max-Stable Set Problem (MSSP). In this paper, we show how the Max-Stable problem can be reformulated as a linear problem under quadratic constraints, and we then solve the resulting QP with a hybrid approach based on a Continuous Hopfield Neural Network (CHN) and local search, in which the solution given by the CHN is the starting point of the local search. The new approach shows better performance than the original one, which executes a suite of CHN runs and adds a new linear constraint to the resolved model at each execution. To demonstrate the efficiency of our approach, we present computational experiments on randomly generated problems and on typical MSSP instances from real-life problems.
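The CHN stage is beyond a short sketch, but the local-search half of such a hybrid can be illustrated: a greedy construction (standing in for the CHN starting solution) followed by a (1,2)-swap improvement that trades one set vertex for two non-adjacent outside vertices. The swap rule here is a common generic heuristic, assumed for illustration:

```python
def is_independent(S, adj):
    # No two vertices of S may be adjacent.
    return all(v not in adj[u] for u in S for v in S if u != v)

def greedy_mis(adj):
    # Lowest-degree-first greedy construction (stand-in for the CHN solution).
    S = set()
    for v in sorted(adj, key=lambda v: len(adj[v])):
        if all(u not in adj[v] for u in S):
            S.add(v)
    return S

def swap_improve(S, adj):
    # (1,2)-swap: remove one vertex, insert two compatible outside vertices.
    improved = True
    while improved:
        improved = False
        outside = [v for v in adj if v not in S]
        for u in list(S):
            T = S - {u}
            free = [v for v in outside
                    if v != u and all(w not in adj[v] for w in T)]
            for i in range(len(free)):
                for j in range(i + 1, len(free)):
                    if free[j] not in adj[free[i]]:
                        S = T | {free[i], free[j]}
                        improved = True
                        break
                if improved:
                    break
            if improved:
                break
    return S

# 6-cycle: the optimum independent set has 3 vertices.
adj = {0: {1, 5}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 0}}
S = swap_improve(greedy_mis(adj), adj)
print(len(S), is_independent(S, adj))  # 3 True
```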
EFFICIENT KNOWLEDGE BASE MANAGEMENT IN DCSP – ijasuc
DCSP (Distributed Constraint Satisfaction Problem) has been a very important research area in AI (Artificial Intelligence). Many application problems in distributed AI can be formalized as DCSPs. With the increasing complexity and problem size of these applications, the storage required during search and the average search time increase as well. Thus, using limited storage efficiently when solving a DCSP becomes a very important problem, and it can help reduce search time as well. This paper provides an efficient knowledge base management approach based on the general usage of the hyper-resolution rule in consistency algorithms. The approach minimizes the growth of the knowledge base by eliminating sufficient constraints and false nogoods; these eliminations do not change the completeness of the enlarged knowledge base. Proofs are given as well. An example shows that this approach greatly decreases both the number of new nogoods generated and the size of the knowledge base, thereby decreasing the required storage and simplifying the search process.
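The flavor of nogood elimination can be sketched with a simple subsumption filter: a nogood (a forbidden set of assignments) is redundant if a shorter nogood it contains is already stored. This is a generic illustration; the paper's hyper-resolution-based criteria for sufficient constraints and false nogoods are more involved:

```python
def prune_nogoods(nogoods):
    # Keep only minimal nogoods: drop any nogood that is a superset of
    # another, since the shorter one already forbids those assignments.
    # Nogoods are frozensets of (variable, value) pairs.
    kept = []
    for ng in sorted(nogoods, key=len):   # shortest first
        if not any(k <= ng for k in kept):
            kept.append(ng)
    return kept

store = [
    frozenset({("x", 1)}),
    frozenset({("x", 1), ("y", 2)}),            # subsumed by {x=1}
    frozenset({("y", 2), ("z", 3)}),
    frozenset({("x", 1), ("y", 2), ("z", 3)}),  # subsumed by {x=1}
]
print([sorted(ng) for ng in prune_nogoods(store)])  # 2 minimal nogoods survive
```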
European Digital Business Culture Online #bigitalBusiness #P2T2 KEDGE Business School
http://businessculture.org Digital Business Culture online- preliminary results from Passport to Trade 2.0 project on Business Culture in 31 European countries.
Identifying a “research problem”
Refining a research aim and objectives
Selecting a data collection method
Conducting interviews
Critical review of data collection options
Slides on the effective use of social media in business, from an open evening of the Search & Social Media Marketing #SSMM course at Salford Business School. http://www.searchmarketing.salford.ac.uk/ Start from slide 22 if you want to see the main points discussed as part of the open evening.
Can we make higher education relevant to Search & Social Media Marketing indu... – KEDGE Business School
Can we make higher education relevant to Search & Social Media Marketing industry needs?
Abstract
Higher education institutions are often criticised for lacking relevant educational courses that equip students with the skills to meet specific industry needs. The issue of relevant education has been of particular significance in the highly dynamic business and information technology related subjects. This paper presents a discussion that outlines the benefits and advantages of including Search & Social Media Marketing as a taught subject within higher education.
The key argument presented here is that search and social media marketing not only provides relevance to an emerging commercial industry, but also represents an opportunity for delivering cutting-edge education that crosses a range of disciplinary boundaries by having the topic itself provide context and content. Search and social media marketing is a topic largely defined by the emerging need of marketing professionals to engage and apply their pre-existing knowledge and strategies to the context of search engines and social media.
The data gathered in this case study is based on four action research cycles conducted during the academic years 2008/09 and 2010/11. Additionally, data was collected using an industry survey of 112 respondents who attended the Search Analytics and Social Conference (SASCon 2010), short course participants and UK-based marketing agencies.
The key findings of this study are that a) the Search & Social Media Marketing industry is growing, but is still in its infancy and offers a great opportunity for collaboration between the industry and higher education but b) despite the healthy and growing career opportunities within the discipline, there is a lack of higher education provision, demonstrating the need for academics to engage in this subject area.
http://www.searchmarketing.salford.ac.uk/
Colorado Annual Attorney Registration Process 2010 – Kelli Adams
Changes have been made to the annual attorney registration process. This year will be easier than last year. Attorneys will access their annual statement on our website. New features have been added. Firms will be able to pay for their entire firm in one transaction. New forms of payments have been added, including AMEX and Discover. Important notification dates will be provided.
Research projects – the process
Standard activities in research projects
Creating a GANTT Chart
Risk management
Project tracking
Research projects – the outputs
Documentation – classic structure
Basic writing skills
Harvard referencing
Plagiarism
Evaluation of a hybrid method for constructing multiple SVM kernels – infopapers
Dana Simian, Florin Stoica, Evaluation of a hybrid method for constructing multiple SVM kernels, Recent Advances in Computers, Proceedings of the 13th WSEAS International Conference on Computers, Recent Advances in Computer Engineering Series, WSEAS Press, Rodos, Greece, July 23-25, 2009, ISSN: 1790-5109, ISBN: 978-960-474-099-4, pp. 619-623
A COMPREHENSIVE ANALYSIS OF QUANTUM CLUSTERING: FINDING ALL THE POTENTIAL MI... – IJDKP
Quantum clustering (QC) is a data clustering algorithm based on quantum mechanics, accomplished by substituting each point in a given dataset with a Gaussian. The width of the Gaussian is a σ value, a hyper-parameter that can be manually defined and manipulated to suit the application. Numerical methods are used to find all the minima of the quantum potential, as they correspond to cluster centers. Herein, we investigate the mathematical task of expressing and finding all the roots of the exponential polynomial corresponding to the minima of a two-dimensional quantum potential. This is an outstanding task because such expressions are normally impossible to solve analytically. However, we prove that if the points are all included in a square region of size σ, there is only one minimum. This bound is useful not only for limiting the number of solutions to look for by numerical means; it also allows us to propose a new numerical approach “per block”. This technique decreases the number of particles by approximating some groups of particles with weighted particles. These findings are useful not only for the quantum clustering problem but also for the exponential polynomials encountered in quantum chemistry, solid-state physics, and other applications.
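A minimal sketch of the quantity whose minima QC seeks, assuming the standard QC definitions of the Parzen-style wave function and the (unshifted) quantum potential; the dataset and σ below are invented for illustration:

```python
import numpy as np

def quantum_potential(x, data, sigma=1.0):
    # Wave function psi(x) = sum_i exp(-||x - xi||^2 / (2 sigma^2));
    # the potential evaluated here is
    # V(x) = sum_i ||x - xi||^2 exp(-||x - xi||^2 / (2 sigma^2))
    #        / (2 sigma^2 psi(x)),
    # whose minima correspond to cluster centers.
    d2 = np.sum((data - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    return float(np.sum(d2 * w) / (2.0 * sigma ** 2 * np.sum(w)))

rng = np.random.default_rng(1)
cluster = rng.normal(loc=0.0, scale=0.3, size=(50, 2))  # one tight 2-D cluster

v_center = quantum_potential(np.zeros(2), cluster)
v_far = quantum_potential(np.array([10.0, 10.0]), cluster)
print(v_center, v_far)  # the potential is far lower at the cluster centre
```

Consistent with the paper's bound, a single tight cluster (all points within a σ-sized region) produces a single potential well around its centre.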
Joint3DShapeMatching - a fast approach to 3D model matching using MatchALS 3... – Mamoon Ismail Khalid
We extend the global optimization-based approach of jointly matching a set of images to jointly matching a set of 3D meshes. The estimated correspondences simultaneously maximize pairwise feature affinities and cycle consistency across multiple models. We show that the low-rank matrix recovery problem can be efficiently applied to 3D meshes as well. The fast alternating minimization algorithm helps to handle real-world practical problems with thousands of features. Experimental results show that, unlike state-of-the-art algorithms that rely on semi-definite programming, our algorithm provides an order-of-magnitude speed-up along with competitive performance. Along with joint shape matching, we propose an approach to apply a distortion term in pairwise matching, which helps in successfully matching the reflexive sub-parts of two models distinctively. Finally, we demonstrate the applicability of the algorithm by matching a set of 3D meshes from the SCAPE benchmark database.
RunPool: A Dynamic Pooling Layer for Convolution Neural Network – Putra Wanda
Deep learning (DL) has achieved significant performance in computer vision problems, mainly in automatic feature extraction and representation. However, it is not easy to determine the best pooling method across different case studies. For instance, experts can select the best type of pooling for image processing cases, yet that choice might not be optimal for other tasks; thus, it is necessary to keep in line with the philosophy of DL. In a dynamic neural network architecture, it is not practically possible to hand-pick a proper pooling technique for every layer, which is the primary reason why a fixed pooling cannot be applied to dynamic and multidimensional datasets. To deal with these limitations, an optimal pooling method is needed as a better option than max pooling and average pooling. Therefore, we introduce a dynamic pooling layer called RunPool to train convolutional neural network (CNN) architectures. RunPool pooling is proposed to regularize the neural network by replacing the deterministic pooling functions. In the final section, we test the proposed pooling layer on classification problems with an online social network (OSN) dataset.
FAST ALGORITHMS FOR UNSUPERVISED LEARNING IN LARGE DATA SETS – csandit
The ability to automatically mine and extract useful information from large datasets has been a common concern for organizations holding such datasets over the last few decades. Data on the internet is growing steadily, and consequently the capacity to collect and store very large data is increasing significantly. Existing clustering algorithms are not always efficient and accurate in solving clustering problems for large datasets, and the development of accurate and fast data classification algorithms for very large-scale datasets remains a challenge. In this paper, various algorithms and techniques, especially an approach using a non-smooth optimization formulation of the clustering problem, are proposed for solving minimum sum-of-squares clustering problems in very large datasets. This research also develops an accurate, real-time L2-DC algorithm with an incremental approach to solve the minimum sum-of-squares clustering problem.
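The minimum sum-of-squares objective itself can be illustrated with plain Lloyd iterations, a baseline sketch on invented data; the paper's non-smooth L2-DC formulation and incremental scheme are not reproduced here:

```python
import numpy as np

def kmeans_ssq(X, k, iters=50, seed=0):
    # Lloyd's algorithm: alternately assign each point to its nearest
    # centre and move each centre to the mean of its points; each step
    # can only lower the sum-of-squares objective.
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    ssq = float(((X - centres[labels]) ** 2).sum())
    return centres, labels, ssq

# Two well-separated synthetic blobs.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.2, (30, 2)), rng.normal(5, 0.2, (30, 2))])
centres, labels, ssq = kmeans_ssq(X, k=2)
print(ssq)  # small compared with the single-cluster baseline
```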
WEIGHTED CONSTRAINT SATISFACTION AND GENETIC ALGORITHM TO SOLVE THE VIEW SELE... – ijdms
A data warehouse is a tool used by big companies to gather data coming from different sources. The main goal of a data warehouse is not only to store data, but also to help companies make decisions. The huge volume of data makes processing queries complex and time-consuming; to solve this problem, the materialization of views is suggested as a way to improve query processing. View materialization aims to optimize an objective function that is a compromise between the cost of processing queries and the cost of maintenance, under a storage space constraint. In this work, we model the view selection problem as a weighted constraint satisfaction problem. In addition, we use the multiple view processing plans (MVPP) framework as a search space, and we apply a genetic algorithm to select the views to be materialized. Experimental results show the quality of the materialized views selected by the proposed algorithm.
COMPARISON OF WAVELET NETWORK AND LOGISTIC REGRESSION IN PREDICTING ENTERPRIS... – ijcsit
Enterprise financial distress or failure prediction includes bankruptcy prediction, financial distress, corporate performance prediction, and credit risk estimation. The aim of this paper is to use wavelet networks in non-linear combination prediction to address the shortcomings of the ARMA (Auto-Regressive and Moving Average) model, which requires estimating the value of every parameter in the model and therefore involves a large amount of computation. To this end, the paper provides an extensive review of wavelet networks and logistic regression. It discusses the wavelet neural network structure, the wavelet network training algorithm, and accuracy and error rates (accuracy of classification, Type I error, and Type II error). The main research contribution is a proposed business failure prediction model (a wavelet network model and a logistic regression model). In an empirical comparison of the wavelet network and logistic regression on training and forecasting samples, the results show that the wavelet network model is highly accurate: in overall prediction accuracy, Type I error, and Type II error, the wavelet network model is better than the logistic regression model.
MULTIPROCESSOR SCHEDULING AND PERFORMANCE EVALUATION USING ELITIST NON DOMINA... – ijcsa
Task scheduling plays an important part in the improvement of parallel and distributed systems. The task scheduling problem has been shown to be NP-hard, and solving it with deterministic techniques is very time-consuming. Algorithms have been developed to schedule tasks in distributed environments, but they focus on a single objective; the problem becomes more complex when two objectives are considered. This paper presents a bi-objective independent task scheduling algorithm using the elitist Non-dominated Sorting Genetic Algorithm (NSGA-II) to minimize makespan and flowtime. The algorithm generates Pareto-optimal solutions for this bi-objective task scheduling problem. NSGA-II is evaluated on a set of benchmark instances, and the experimental results show that it generates efficient optimal schedules.
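The two objectives can be made concrete: the sketch below computes makespan and flowtime for a fixed assignment of independent tasks to machines (the NSGA-II machinery itself is omitted; the task times and assignment are invented for illustration):

```python
# Makespan = the latest machine finishing time; flowtime = the sum of the
# individual task completion times. For independent tasks on a machine,
# processing in shortest-processing-time (SPT) order minimizes flowtime.
def makespan_flowtime(times, assignment, n_machines):
    per_machine = [[] for _ in range(n_machines)]
    for t, m in zip(times, assignment):
        per_machine[m].append(t)
    loads, flowtime = [0.0] * n_machines, 0.0
    for m, tasks in enumerate(per_machine):
        finish = 0.0
        for t in sorted(tasks):          # SPT order on each machine
            finish += t
            flowtime += finish           # completion time of this task
        loads[m] = finish
    return max(loads), flowtime

times = [3.0, 1.0, 4.0, 2.0]
assignment = [0, 0, 1, 1]                # tasks 0,1 -> machine 0; tasks 2,3 -> machine 1
ms, ft = makespan_flowtime(times, assignment, 2)
print(ms, ft)  # makespan 6.0; flowtime (1+4) + (2+6) = 13.0
```

An NSGA-II search would evolve `assignment` vectors and rank them by Pareto dominance over exactly these two values.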
An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing S... – Pooyan Jamshidi
https://arxiv.org/abs/1606.06543
Finding optimal configurations for Stream Processing Systems (SPS) is a challenging problem due to the large number of parameters that can influence their performance and the lack of analytical models to anticipate the effect of a change. To tackle this issue, we consider tuning methods where an experimenter is given a limited budget of experiments and needs to carefully allocate this budget to find optimal configurations. We propose in this setting Bayesian Optimization for Configuration Optimization (BO4CO), an auto-tuning algorithm that leverages Gaussian Processes (GPs) to iteratively capture posterior distributions of the configuration spaces and sequentially drive the experimentation. Validation based on Apache Storm demonstrates that our approach locates optimal configurations within a limited experimental budget, with an improvement of SPS performance typically of at least an order of magnitude compared to existing configuration algorithms.
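A toy version of the GP-driven loop, with a hand-rolled RBF-kernel GP and a lower-confidence-bound acquisition over a single 1-D "configuration knob". The response surface, kernel settings, and acquisition here are invented for illustration and differ from BO4CO's actual design:

```python
import numpy as np

def rbf(a, b, ls=0.15):
    # Squared-exponential kernel for 1-D inputs.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Standard GP regression equations: posterior mean and std at Xs.
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    cov = rbf(Xs, Xs) - Ks.T @ np.linalg.solve(K, Ks)
    return mu, np.sqrt(np.clip(np.diag(cov), 0.0, None))

def latency(c):
    # Hypothetical response surface: system latency vs. one knob in [0, 1].
    return (c - 0.3) ** 2 + 0.1 * np.sin(8.0 * c)

# BO-style loop: fit a GP to the configurations measured so far, then
# spend the next experiment on the lower-confidence-bound minimizer.
candidates = np.linspace(0.0, 1.0, 101)
X = np.array([0.0, 0.5, 1.0])            # initial design
y = latency(X)
for _ in range(10):                      # limited experimental budget
    mu, sd = gp_posterior(X, y, candidates)
    nxt = candidates[np.argmin(mu - 2.0 * sd)]
    X = np.append(X, nxt)
    y = np.append(y, latency(nxt))

best = X[np.argmin(y)]
print(best, latency(best))
```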
A simple framework for contrastive learning of visual representations – Devansh16
Link: https://machine-learning-made-simple.medium.com/learnings-from-simclr-a-framework-contrastive-learning-for-visual-representations-6c145a5d8e99
This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
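The contrastive loss at the heart of SimCLR, NT-Xent (normalized temperature-scaled cross entropy), can be sketched in a few lines; the batch layout and temperature below are illustrative:

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    # z1[i] and z2[i] are embeddings of two augmented views of example i.
    # Each view treats its partner as the positive and the other 2N - 2
    # views in the batch as negatives.
    z = np.vstack([z1, z2])
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    logits = sim - sim.max(axis=1, keepdims=True)      # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(2 * n), pos].mean())

# Aligned positive pairs give a lower loss than mismatched ones.
a = np.eye(4)                        # 4 mutually orthogonal embeddings
loss_aligned = nt_xent(a, a)         # each view's positive equals itself
loss_shuffled = nt_xent(a, a[::-1])  # positives point in different directions
print(loss_aligned, loss_shuffled)
```

This is the sense in which the learnable projection head mentioned in the abstract matters: it shapes the space in which these cosine similarities are computed.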
Comments: ICML'2020. Code and pretrained models at this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as: arXiv:2002.05709 [cs.LG]
(or arXiv:2002.05709v3 [cs.LG] for this version)
Submission history
From: Ting Chen [view email]
[v1] Thu, 13 Feb 2020 18:50:45 UTC (5,093 KB)
[v2] Mon, 30 Mar 2020 15:32:51 UTC (5,047 KB)
[v3] Wed, 1 Jul 2020 00:09:08 UTC (5,829 KB)
Similar to Semantic Web mining using nature inspired optimization methods (20)
Biological screening of herbal drugs: introduction and need for phyto-pharmacological screening; new strategies for evaluating natural products; in vitro evaluation techniques for antioxidant, antimicrobial and anticancer drugs; in vivo evaluation techniques for anti-inflammatory, antiulcer, anticancer, wound healing, antidiabetic, hepatoprotective, cardioprotective, diuretic and antifertility activity; toxicity studies as per OECD guidelines.
The French Revolution, which began in 1789, was a period of radical social and political upheaval in France. It marked the decline of absolute monarchies, the rise of secular and democratic republics, and the eventual rise of Napoleon Bonaparte. This revolutionary period is crucial in understanding the transition from feudalism to modernity in Europe.
For more information, visit-www.vavaclasses.com
Palestine last event orientationfvgnh .pptxRaedMohamed3
An EFL lesson about the current events in Palestine. It is intended to be for intermediate students who wish to increase their listening skills through a short lesson in power point.
Francesca Gottschalk - How can education support child empowerment.pptxEduSkills OECD
Francesca Gottschalk from the OECD’s Centre for Educational Research and Innovation presents at the Ask an Expert Webinar: How can education support child empowerment?
Embracing GenAI - A Strategic ImperativePeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
How to Make a Field invisible in Odoo 17Celine George
It is possible to hide or invisible some fields in odoo. Commonly using “invisible” attribute in the field definition to invisible the fields. This slide will show how to make a field invisible in odoo 17.
Model Attribute Check Company Auto PropertyCeline George
In Odoo, the multi-company feature allows you to manage multiple companies within a single Odoo database instance. Each company can have its own configurations while still sharing common resources such as products, customers, and suppliers.
A Strategic Approach: GenAI in EducationPeter Windle
Artificial Intelligence (AI) technologies such as Generative AI, Image Generators and Large Language Models have had a dramatic impact on teaching, learning and assessment over the past 18 months. The most immediate threat AI posed was to Academic Integrity with Higher Education Institutes (HEIs) focusing their efforts on combating the use of GenAI in assessment. Guidelines were developed for staff and students, policies put in place too. Innovative educators have forged paths in the use of Generative AI for teaching, learning and assessments leading to pockets of transformation springing up across HEIs, often with little or no top-down guidance, support or direction.
This Gasta posits a strategic approach to integrating AI into HEIs to prepare staff, students and the curriculum for an evolving world and workplace. We will highlight the advantages of working with these technologies beyond the realm of teaching, learning and assessment by considering prompt engineering skills, industry impact, curriculum changes, and the need for staff upskilling. In contrast, not engaging strategically with Generative AI poses risks, including falling behind peers, missed opportunities and failing to ensure our graduates remain employable. The rapid evolution of AI technologies necessitates a proactive and strategic approach if we are to remain relevant.
Semantic Web mining using nature inspired optimization methods

Diana Andreea Gorea, Lucian Bentea
Faculty of Computer Science, “A.I. Cuza” University, Iaşi, Romania
Abstract. In this paper, nature inspired methods are proposed for solving problems in the field of Semantic Web mining, namely the clustering of Web resources based on their metadata, as well as the automatic classification of Web pages.
1 Introduction
This paper proposes the use of nature inspired methods for solving the problem of RDF clustering, as well as that of the automatic classification of Web pages. The most promising methods that the authors found are those belonging to the Ant Colony Optimization (ACO) framework. While this paper does not aim to give an introduction to ACO, the interested reader can refer to [3] for further information.
The paper is organized as follows. Section 2 describes efficient heuristics in
two different cases - when the number of clusters is predetermined, or when it is
unknown and is part of the solution. By clustering Semantic Web resources, it is
possible to find representatives for a set of similar resources and thus be able to
reduce the size of large ontologies. This would also bring insight into the main
concepts that an ontology contains. Section 3 summarizes the paper [6] and also
brings further insight into how ACO heuristics can be used to find classification
rules for Web pages. Section 4 draws the conclusions and suggests subjects for
further research.
2 Clustering of Semantic Web data
The data clustering problem refers to grouping a set of data into several nonempty
subsets whose members are considered similar, with respect to some similarity
measure. In the context of Semantic Web data, which can be represented through
RDF graphs, the clustering problem becomes that of grouping individuals in the
graph. An individual, also called an instance in [5], is a single resource node together with some of its neighbouring nodes, forming a subgraph that is relevant to that resource node. Several instance extraction methods are proposed in [5]: Immediate Properties, Concise Bounded Description (CBD, see http://www.w3.org/Submission/CBD/), or Depth Limited Crawling. The optimal method to use depends on the type of data to be processed, e.g. RDF data converted from a relational database, FOAF documents, etc., and the structure of its associated RDF graph. The same criterion holds when choosing the optimal similarity measure; the authors of [5] also propose three distance measures: one based on feature vectors (denoted simFV), one based on conceptual graphs, inspired by the similarity measure of conceptual graphs introduced in [10], and another being an ontology based measure (denoted simOnt).
2.1 Predetermined number of clusters (the ACOC algorithm)
Assuming a set Ω := {X_1, X_2, ..., X_m} of individuals is extracted from an RDF graph G and without giving an explicit formula for the above similarity measures, the RDF data clustering problem can be formally described as the following discrete optimization problem. Let sim be a similarity measure, e.g. simFV or simOnt above. Also let n ≥ 1 be the predetermined number of clusters into which the data is to be grouped and denote by C_1, C_2, ..., C_n ∈ Ω the variables to be determined as the centers of each cluster. By defining the variable w_{ij} through

    w_{ij} := \begin{cases} 1, & \text{the individual } X_i \text{ belongs to cluster } j, \\ 0, & \text{otherwise,} \end{cases}    (1)

for i = 1, ..., m and j = 1, ..., n, the aim is to

    \text{Maximize } \sum_{i=1}^{m} \sum_{j=1}^{n} w_{ij} \, \mathrm{sim}(X_i, C_j),    (2)

such that each individual belongs to only one cluster,

    \sum_{j=1}^{n} w_{ij} = 1, \quad i = 1, \dots, m,    (3)

and there are no empty clusters,

    \sum_{i=1}^{m} w_{ij} \ge 1, \quad j = 1, \dots, n.    (4)
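To make this formulation concrete, the objective (2) and the constraints (3)-(4) are easy to state in code. The following is a minimal sketch; the function names and the toy similarity measure are ours, not from the paper (in practice, sim would be simFV or simOnt):

```python
def clustering_objective(w, sim, individuals, centers):
    """Objective (2): total similarity of individuals to their assigned
    cluster centers, where w[i][j] = 1 iff X_i belongs to cluster j."""
    return sum(w[i][j] * sim(individuals[i], centers[j])
               for i in range(len(individuals))
               for j in range(len(centers)))

def is_feasible(w, m, n):
    """Constraints (3)-(4): each of the m individuals lies in exactly one
    of the n clusters, and no cluster is empty."""
    one_cluster_each = all(sum(w[i][j] for j in range(n)) == 1 for i in range(m))
    no_empty_cluster = all(sum(w[i][j] for i in range(m)) >= 1 for j in range(n))
    return one_cluster_each and no_empty_cluster
```

For instance, with individuals [1.0, 1.1, 5.0], centers [1.0, 5.0] and sim(x, c) = -|x - c|, the assignment grouping the first two individuals around the first center is feasible and scores about -0.1.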
To the best of the authors’ knowledge, there is no proof of the NP-hardness of this general clustering problem. The most recent result on this subject is the article [8], which proves that the clustering problem, also known as the k-means problem, is NP-hard in the restricted case of planar graphs. However, as is the case with most discrete optimization problems, clustering of RDF data is also computationally expensive and solution approximation methods are preferred.
One of the most promising algorithms for solving the previous optimization problem is Ant Colony Optimization for Clustering (ACOC), introduced in [7], which is an alternative to the classic k-means algorithm, known to have several drawbacks. The numerical results in [7] show that ACOC obtains the best results, on several test cases, among various approximation methods, including the k-means algorithm. It also achieves this with the highest convergence rate, requiring only a few iteration steps to detect the optimum. Since ACOC is part of the Ant Colony Optimization framework, the idea is to have several ants “foraging” for the optimum, thus avoiding premature convergence due to local optima. Apart from using pheromone trails, each node to be explored also carries a heuristic value, representing the estimated global gain from picking that node; this is used to accelerate the convergence of the algorithm. Eventually, ants are grouped into clusters and a solution to the original RDF clustering problem can be obtained through a decoding algorithm.
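As a rough illustration of the ACO transition rule underlying such algorithms, one ant could build a candidate assignment of individuals to clusters as follows. This is a simplified sketch under our own naming, not the actual ACOC procedure from [7], which uses a different solution encoding:

```python
import random

def construct_assignment(m, n, pheromone, heuristic, alpha=1.0, beta=2.0):
    """One ant builds a candidate solution: for each of the m individuals it
    picks one of the n clusters with probability proportional to
    pheromone[i][j]**alpha * heuristic[i][j]**beta, the standard ACO
    transition rule."""
    assignment = []
    for i in range(m):
        weights = [(pheromone[i][j] ** alpha) * (heuristic[i][j] ** beta)
                   for j in range(n)]
        # random.choices normalizes the weights into probabilities
        assignment.append(random.choices(range(n), weights=weights)[0])
    return assignment
```

In a full algorithm, many ants would construct assignments this way, the best assignments would deposit extra pheromone, and the process would iterate until convergence.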
2.2 Variable number of clusters (from SSCFL to RDF clustering)
In the case when the number of clusters is not predetermined, but only a fixed number of individuals are allowed to live in each cluster, the previous problem can be formulated as a Single Source Capacitated Facility Location (SSCFL) problem, which can be described as follows. Consider several facilities (e.g. medical or telecommunications facilities) that are installed at different locations in a city. These facilities provide goods to a number of customers, whose demands are known beforehand. Each facility comes with the necessary logistics to create a physical network that would allow customers to connect to the facility. However, each facility only provides a fixed amount of resources to the customers who connect to it. The available amount of resources corresponding to a facility is also called its capacity; hence the adjective capacitated in the name of this optimization problem. The question is which of the facilities to open and which customers should be assigned to each open facility, so that the total costs of opening the facilities and of creating the physical networks are minimized, while making sure that each customer’s demand is satisfied by exactly one facility.
In Figure 1, a solution to a particular SSCFL problem is represented. The
customers are the light green round rectangles, while the facilities are the light
red circles. The arrows denote assignment relations - the tip of the arrow points
to the facility to which the customer is assigned. The number on each facility
node designates its capacity, while the number on each customer node represents
its demand. Notice that the given solution is feasible, i.e. the total demand of
the customers assigned to a facility does not exceed its maximum capacity and
no customers are left unassigned. Also, in this case, it was decided that three
facilities (having capacities 1, 6, 10) remain closed.
In order to adapt the SSCFL problem to RDF clustering, the customers correspond to the individuals that need to be grouped, while the facilities represent the centers of the clusters, which may or may not be activated. Thus, consider the variable w_{ij} defined as in the previous subsection and let y_j ∈ {0, 1} be the Boolean variable specifying whether the j-th facility is to be opened or not, for all j. Also, denote by α_j the cost of opening the j-th facility, which is the same as the cost of taking the corresponding individual to be a cluster center, and by α_{ij} the
4. 1.5 2.2 1.3 2
3 2.5
10 8
2.5 5 2 1
1.2 6 1.7
Fig. 1. Solution to a particular SSCFL problem
cost of assigning the i-th customer to the j-th facility, for all i, j with 1 ≤ i ≤ m and 1 ≤ j ≤ n. In the case of RDF data clustering, the costs α_{ij} represent the opposite of the similarity measure between the individual X_i and the cluster center C_j, and they are given by:

    \alpha_{ij} = -\mathrm{sim}(X_i, C_j), \quad i = 1, \dots, m, \quad j = 1, \dots, n.    (5)

Provided that the facilities (the potential cluster centers) have corresponding capacities u_1, u_2, ..., u_n ∈ R_+, the aim of this adapted SSCFL problem is then to

    \text{Minimize } \sum_{j=1}^{n} \alpha_j y_j + \sum_{i=1}^{m} \sum_{j=1}^{n} \alpha_{ij} w_{ij},    (6)
subject to the following constraints:
- each customer is assigned to exactly one facility (each individual X_i is assigned to exactly one cluster),

    \sum_{j=1}^{n} w_{ij} = 1, \quad i = 1, \dots, m,    (7)

- provided that a facility is open (a cluster center is activated), the total demand of the customers assigned to it (the demand of a group of individuals to belong to the corresponding cluster) cannot exceed its capacity, where d_i denotes the demand of the i-th customer; also, a customer cannot be assigned to a facility that is closed (an individual cannot be represented by a cluster center that is not activated),

    \sum_{i=1}^{m} d_i w_{ij} \le u_j y_j, \quad j = 1, \dots, n,    (8)
- a customer can either be assigned or not to a facility (an individual can either be included or not in a group),

    w_{ij} \in \{0, 1\}, \quad i = 1, \dots, m, \quad j = 1, \dots, n,    (9)

- facilities can either be open or closed (cluster centers can either be activated or not),

    y_j \in \{0, 1\}, \quad j = 1, \dots, n.    (10)
Note: Before carrying on, notice that in a solution to this problem, there may
be individuals that remain ungrouped, which is not necessarily a drawback. On
the contrary, this may provide more realistic solutions to the clustering problem.
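The formulation above can be checked and scored directly in code. The following is a minimal sketch under our own naming, where d and u are the demand and capacity vectors from constraint (8):

```python
def sscfl_feasible(w, y, d, u):
    """Check constraints (7)-(10): every customer is assigned to exactly one
    facility, and no facility's capacity is exceeded (a closed facility,
    y[j] = 0, has effective capacity zero). w[i][j] = 1 iff customer i is
    served by facility j; y[j] = 1 iff facility j is open."""
    m, n = len(w), len(y)
    for i in range(m):
        if sum(w[i]) != 1:                       # constraint (7)
            return False
    for j in range(n):
        load = sum(d[i] * w[i][j] for i in range(m))
        if load > u[j] * y[j]:                   # constraint (8)
            return False
    return True

def sscfl_cost(w, y, open_cost, assign_cost):
    """Objective (6): facility opening costs plus assignment costs."""
    m, n = len(w), len(y)
    return (sum(open_cost[j] * y[j] for j in range(n)) +
            sum(assign_cost[i][j] * w[i][j]
                for i in range(m) for j in range(n)))
```

For the RDF clustering adaptation, assign_cost[i][j] would be -sim(X_i, C_j) as in (5), so minimizing the cost maximizes the total intra-cluster similarity.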
The previous integer programming problem is proven in [9] to be NP-hard, and therefore heuristic solution techniques are needed to handle its complexity. A survey of the more recent heuristics is given in [1], where the methods of Tabu Search, Simulated Annealing and Genetic Algorithms are compared with respect to their efficiency under different parameters. An alternative solution based on Genetic Algorithms is also the subject of [2], in which two special crossover operators are defined, guaranteeing the feasibility of the approximations. Also, the Particle Swarm Optimization algorithm described in [11] and the Ant Colony Optimization algorithm in [13] have the potential to be adapted to the RDF clustering problem.
3 Web page classification using Ant Colony Optimization
The Semantic Web is a combination of data from different sources integrated in a common format, as opposed to the original Web, which concentrated mainly on the exchange of documents. It also has a format that connects data to objects from the real world. By doing so, the information seeker may jump from one database to another simply because they are linked, sharing knowledge about the same thing [12].
However, these are all made by human knowledge, so we must also take into account the factor of subjectivity and the errors that may occur in the placement, content or classification of knowledge. In the case of web pages without contributing users (like portfolio sites or advertising pages), the responsibility to provide quality content lies only with the site owner, who may or may not be aware of the mistakes; once other users appear (who have rights to upload, tag or write content), the task of keeping the information as accurate as possible becomes harder than ever.

A study we found shows how general web content can be sorted using an Ant Colony algorithm. We will present the study and try to connect its findings with what we know may apply to the Semantic Web as well.
3.1 Preprocessing
The challenge when dealing with web pages is that developers do not always follow a standardized way of creating them. This has many reasons: design implementation issues that may require certain tricks (fully Flash based sites have no <h1> tag), lack of interest in or knowledge of applying the standards, missing or badly chosen <meta> tags (too many, or not related to the page content), or generic <title> tags (all pages have the same title). At least regarding meta tags, things started to improve once everyone realised the advantages of being well ranked on search engines. This generated a higher rate of attention to the content of those tags and a very high interest in SEO (search engine optimisation). In general, this should be less of an issue for the Semantic Web, because its formats are standardized and not yet very popular, so that, in theory at least, exceptions from the rules are few.

The contents of web pages can be filtered using text preprocessing methods to obtain fewer, more relevant words to search for and a more human-like understanding of the given text. The most difficult capability that such methods must provide is handling homographs well (a homograph is one of a group of words that share the same spelling but have different meanings [14]; e.g. stalk (part of a plant) and stalk (to follow/harass a person); left (opposite of right) and left (past tense of leave) [15]).
For the study they used WordNet (a lexical program that offers some relationships between words [4]) to filter the information. From it, they selected:
- the morphological preprocessor (to combine words like make, made, making into the single word make), to reduce the number of words to search for;
- the identification of all nouns in the text, as they may offer some relevant search information; interestingly, however, nouns may have the same spelling as verbs (a large number of examples describing this may be found in [16]);
- the words’ lexical family. If the text has words like roof, window and door, they may all apply to house. This is a questionable technique, as for some associated words the result may not be a real link between them (this is especially the case for homographs), or, in other cases (as the one described above), a significant increase in efficiency.
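To illustrate the three preprocessing steps above, here is a toy sketch in which small hand-made tables (a lemma table, a noun list and a lexical-family table) stand in for the real WordNet database [4]; the names and sample entries are purely illustrative:

```python
# Toy stand-ins for WordNet resources (illustrative only).
LEMMAS = {"make": "make", "made": "make", "making": "make",
          "roof": "roof", "window": "window", "door": "door"}
NOUNS = {"roof", "window", "door", "house"}
FAMILY = {"roof": "house", "window": "house", "door": "house"}

def preprocess(words):
    """Reduce words to lemmas (morphological step), keep only the nouns,
    and add their lexical-family generalizations
    (e.g. roof/window/door -> house)."""
    lemmas = [LEMMAS.get(w.lower(), w.lower()) for w in words]
    nouns = [w for w in lemmas if w in NOUNS]
    family = {FAMILY[w] for w in nouns if w in FAMILY}
    return sorted(set(nouns) | family)
```

For example, preprocess(["Made", "roof", "window", "door"]) drops the verb form and returns the nouns together with the inferred family word house.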
As far as the Semantic Web is concerned, all three methods may offer interesting alternatives for the end results:
- The morphological preprocessor is an interesting option, as a word written in natural language may be linked to another, and only the latter is relevant. However, a word like left, if processed this way, may not remain the same, but become leave. With this in mind, it is probably a good idea to keep both forms when dealing with the Semantic Web.
- The distinction between nouns and verbs is also not so relevant when searching for a word in the Semantic Web, but it becomes significant in terms of SPARQL queries. These have, however, the advantage that the way the syntax is formed indicates which word is the noun and which is the verb.
- The connections between different types of words are relevant only if multiple words are searched for at the same time; some common denominators may then be used to provide results that better match as many of the provided items as possible.

For both search types, the end result should be a list of search words, with the note that for web mining it should only contain the most relevant words, while for the Semantic Web it should contain first the words obtained by joining the semantics, then the morphologically obtained values (if any), and then the words themselves. This may seem an unnecessary overload, but it may help the end user to better understand the given results, the first words being the most relevant.
3.2 Algorithm
The Ant-Miner algorithm is a variation of the Ant Colony paradigm used in data mining. In the beginning, it initialises the training set with all available training cases (web pages) and creates an empty rule list. In a Repeat-Until loop, one classification rule at a time is discovered: first, all trails are initialised with the same quantity of pheromone (giving them the same chance to be selected) and an inner rule lets the ants select the best option. Each ant selects the path to follow based on the paths followed by the previous ants, due to the presence of pheromone traces: the higher the amount, the better the path. In the second step, the irrelevant terms are removed, so that in step three the pheromone values can be updated. The inner loop continues until a condition is fulfilled (e.g. the maximum number of paths has been generated).

After the processing of the inner loop, the highest-quality rule is chosen and added to the discovered rule list. All training cases that satisfy the rule are removed. This ensures that the next inner loop will run with fewer cases than the previous one. The outer loop continues its execution until a criterion is satisfied (e.g. at most some maximum number of cases remains uncovered). The algorithm returns the rule list found.
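The outer loop described above can be sketched as follows. The entire inner Repeat-Until loop (rule construction, pruning and pheromone update) is abstracted into a discover_rule callback; the names and stopping parameter are ours, not from [6]:

```python
def ant_miner(training_set, discover_rule, covers, max_uncovered=2):
    """Simplified outline of the Ant-Miner outer loop: discover one rule at
    a time and remove the training cases it covers, until only a few
    uncovered cases remain. Assumes every discovered rule covers at least
    one remaining case, otherwise the loop would not terminate."""
    rule_list = []
    remaining = list(training_set)
    while len(remaining) > max_uncovered:
        rule = discover_rule(remaining)   # stands in for the inner ACO loop
        rule_list.append(rule)
        remaining = [case for case in remaining if not covers(rule, case)]
    return rule_list
```

For example, with discover_rule returning the majority label of the remaining cases and covers testing label equality, the loop peels off one rule per dominant class until the uncovered-cases threshold is reached.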
3.3 Experiment
The study took into account the <meta> and <title> contents of the BBC site. They chose this site because of its high code writing standards and its very well structured information, which improved the chance of making very good connections between <meta> tags and content.
4 Conclusions and further research
This paper shows how nature inspired optimization methods can be more efficient than classical, exact methods when implementing Semantic Web mining algorithms. Among them, the Ant Colony Optimization metaheuristic proves to be one of the best solution techniques. As future work, the ideas described in the previous sections need to be implemented and thoroughly tested, as nature inspired methods have rarely been used in the context of mining the Semantic Web. Such an implementation would then allow the clustering of resources based on their associated metadata, e.g. their FOAF description, the microformat information they contain, etc.
References
1. Arostegui, M.A., Jr., Kadipasaoglu, S.N., Khumawala, B.M., An empirical comparison of Tabu Search, Simulated Annealing, and Genetic Algorithms for facilities location problems, International Journal of Production Economics, Vol. 103, No. 2, 742-754, 2006.
2. Cortinhal, M.J., Captivo, M.E., Genetic Algorithms for the Single Source Capacitated Location Problem: a Computational Study, in the Proceedings of the 4th Metaheuristics International Conference, 355-359, Porto, Portugal, 2001.
3. Dorigo, M., Stützle, T., Ant Colony Optimization, MIT Press, 2004.
4. Fellbaum, C. (Ed.), WordNet - an electronic lexical database, MIT Press, 1998.
5. Grimnes, G.A., Edwards, P., Preece, A., Instance based Clustering of Semantic Web Resources, in the Proceedings of the 5th European Semantic Web Conference, LNCS 5021, Springer-Verlag, pp. 303-317, 2008.
6. Holden, N., Freitas, A.A., Web Page Classification with an Ant Colony algorithm, in the Proceedings of the 8th International Conference on Parallel Problem Solving from Nature, LNCS 3242, Springer-Verlag, pp. 1092-1102, 2004.
7. Kao, Y., Cheng, K., An ACO-Based Clustering Algorithm, in the Proceedings of the Ant Colony Optimization and Swarm Intelligence Conference, LNCS 4150, pp. 340-347, 2006.
8. Mahajan, M., Nimbhorkar, P., Varadarajan, K., The Planar k-means Problem is NP-hard, in the Proceedings of the 3rd International Workshop on Algorithms and Computation, LNCS 5431, pp. 274-285, 2009.
9. Mirchandani, P.B., Francis, R.L., Discrete location theory, New York: Wiley, 1990.
10. Montes-y-Gómez, M., Gelbukh, A., López-López, A., Comparison of Conceptual Graphs, in Lecture Notes in Artificial Intelligence, Volume 1793, Springer-Verlag, pp. 548-556, 2000.
11. Sevkli, M., Guner, A.R., A Continuous Particle Swarm Optimization Algorithm for the Uncapacitated Facility Location Problem, in the Proceedings of the 5th International Workshop on Ant Colony Optimization and Swarm Intelligence, ANTS 2006, 316-323, Brussels, Belgium, 2006.
12. The official W3C Semantic Web Activity page at http://www.w3.org/2001/sw/.
13. Venables, H., Moscardini, A., An Adaptive Search Heuristic for the Capacitated Fixed Charge Location Problem, in the Proceedings of the 5th International Workshop on Ant Colony Optimization and Swarm Intelligence, ANTS 2006, 348-355, Brussels, Belgium, 2006.
14. The Wapedia page on homographs at http://wapedia.mobi/en/Homograph.
15. The Wapedia page on homonyms at http://wapedia.mobi/en/Homonyms.
16. Words that can be used both as nouns and verbs, http://www.dailywritingtips.com/careful-with-words-used-as-noun-and-verb/.