Data mining is the process of using technology to identify patterns and prospects from large amount of information. In Data Mining, Clustering is an important research topic and wide range of unverified classification application. Clustering is technique which divides a data into meaningful groups. K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. In this paper, we present the comparison of different K-means clustering algorithms.
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
Face recognition is one of the most unobtrusive biometric techniques that can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is high susceptibility of variations in the face owing to different factors like changes in pose, varying illumination, different expression, presence of outliers, noise etc. This paper explores a novel technique for face recognition by performing classification of the face images using unsupervised learning approach through K-Medoids clustering. Partitioning Around Medoids algorithm (PAM) has been used for performing K-Medoids clustering of the data. The results are suggestive of increased robustness to noise and outliers in comparison to other clustering methods. Therefore the technique can also be used to increase the overall robustness of a face recognition system and thereby increase its invariance and make it a reliably usable biometric modality
K-Means clustering uses an iterative procedure which is very much sensitive and dependent upon the initial centroids. The initial centroids in the k-means clustering are chosen randomly, and hence the clustering also changes with respect to the initial centroids. This paper tries to overcome this problem of random selection of centroids and hence change of clusters with a premeditated selection of initial centroids. We have used the iris, abalone and wine data sets to demonstrate that the proposed method of finding the initial centroids and using the centroids in k-means algorithm improves the clustering performance. The clustering also remains the same in every run as the initial centroids are not randomly selected but through premeditated method.
A brief description of clustering, two relevant clustering algorithms(K-means and Fuzzy C-means), clustering validation, two inner validity indices(Dunn-n-Dunn and Devies Bouldin) .
Step by step operations by which we make a group of objects in which attributes
of all the objects are nearly similar, known as clustering. So, a cluster is a collection of
objects that acquire nearly same attribute values. The property of an object in a cluster is
similar to other objects in same cluster but different with objects of other clusters.
Clustering is used in wide range of applications like pattern recognition, image processing,
data analysis, machine learning etc. Nowadays, more attention has been put on categorical
data rather than numerical data. Where, the range of numerical attributes organizes in a
class like small, medium, high, and so on. There is wide range of algorithm that used to
make clusters of given categorical data. Our approach is to enhance the working on well-
known clustering algorithm k-modes to improve accuracy of algorithm. We proposed a new
approach named “High Accuracy Clustering Algorithm for Categorical datasets”.
New Approach for K-mean and K-medoids AlgorithmEditor IJCATR
K-means and K-medoids clustering algorithms are widely used for many practical applications. Original k
medoids algorithms select initial centroids and medoids randomly that affect the quality of the resulting clusters and sometimes it
generates unstable and empty clusters which are meaningless.
expensive and requires time proportional to the product of the number of data items, number of clusters and the number of iterations.
The new approach for the k mean algorithm eliminates the deficiency of exiting k mean. It first calculates the initial centro
requirements of users and then gives better, effective and stable cluster. It also takes less execution time because it eliminates
unnecessary distance computation by using previous iteration. The new approach for k
systematically based on initial centroids. It generates stable clusters to improve accuracy.
K-MEDOIDS CLUSTERING USING PARTITIONING AROUND MEDOIDS FOR PERFORMING FACE R...ijscmc
Face recognition is one of the most unobtrusive biometric techniques that can be used for access control as well as surveillance purposes. Various methods for implementing face recognition have been proposed with varying degrees of performance in different scenarios. The most common issue with effective facial biometric systems is high susceptibility of variations in the face owing to different factors like changes in pose, varying illumination, different expression, presence of outliers, noise etc. This paper explores a novel technique for face recognition by performing classification of the face images using unsupervised learning approach through K-Medoids clustering. Partitioning Around Medoids algorithm (PAM) has been used for performing K-Medoids clustering of the data. The results are suggestive of increased robustness to noise and outliers in comparison to other clustering methods. Therefore the technique can also be used to increase the overall robustness of a face recognition system and thereby increase its invariance and make it a reliably usable biometric modality
K-Means clustering uses an iterative procedure which is very much sensitive and dependent upon the initial centroids. The initial centroids in the k-means clustering are chosen randomly, and hence the clustering also changes with respect to the initial centroids. This paper tries to overcome this problem of random selection of centroids and hence change of clusters with a premeditated selection of initial centroids. We have used the iris, abalone and wine data sets to demonstrate that the proposed method of finding the initial centroids and using the centroids in k-means algorithm improves the clustering performance. The clustering also remains the same in every run as the initial centroids are not randomly selected but through premeditated method.
A brief description of clustering, two relevant clustering algorithms(K-means and Fuzzy C-means), clustering validation, two inner validity indices(Dunn-n-Dunn and Devies Bouldin) .
Step by step operations by which we make a group of objects in which attributes
of all the objects are nearly similar, known as clustering. So, a cluster is a collection of
objects that acquire nearly same attribute values. The property of an object in a cluster is
similar to other objects in same cluster but different with objects of other clusters.
Clustering is used in wide range of applications like pattern recognition, image processing,
data analysis, machine learning etc. Nowadays, more attention has been put on categorical
data rather than numerical data. Where, the range of numerical attributes organizes in a
class like small, medium, high, and so on. There is wide range of algorithm that used to
make clusters of given categorical data. Our approach is to enhance the working on well-
known clustering algorithm k-modes to improve accuracy of algorithm. We proposed a new
approach named “High Accuracy Clustering Algorithm for Categorical datasets”.
New Approach for K-mean and K-medoids AlgorithmEditor IJCATR
K-means and K-medoids clustering algorithms are widely used for many practical applications. Original k
medoids algorithms select initial centroids and medoids randomly that affect the quality of the resulting clusters and sometimes it
generates unstable and empty clusters which are meaningless.
expensive and requires time proportional to the product of the number of data items, number of clusters and the number of iterations.
The new approach for the k mean algorithm eliminates the deficiency of exiting k mean. It first calculates the initial centro
requirements of users and then gives better, effective and stable cluster. It also takes less execution time because it eliminates
unnecessary distance computation by using previous iteration. The new approach for k
systematically based on initial centroids. It generates stable clusters to improve accuracy.
Classification of common clustering algorithm and techniques, e.g., hierarchical clustering, distance measures, K-means, Squared error, SOFM, Clustering large databases.
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
The k- Means clustering algorithm is an old algorithm that has been intensely researched owing to its ease
and simplicity of implementation. Clustering algorithm has a broad attraction and usefulness in
exploratory data analysis. This paper presents results of the experimental study of different approaches to
k- Means clustering, thereby comparing results on different datasets using Original k-Means and other
modified algorithms implemented using MATLAB R2009b. The results are calculated on some performance
measures such as no. of iterations, no. of points misclassified, accuracy, Silhouette validity index and
execution time
This is very simple introduction to Clustering with some real world example. At the end of lecture I use stackOverflow API to test some clustering. I also wants to try facebook but it has some problem with it's API
It is a data mining technique used to place the data elements into their related groups. Clustering is the process of partitioning the data (or objects) into the same class, The data in one class is more similar to each other than to those in other cluster.
Fuzzy clustering and fuzzy c-means partition cluster analysis and validation ...IJECEIAES
A hard partition clustering algorithm assigns equally distant points to one of the clusters, where each datum has the probability to appear in simultaneous assignment to further clusters. The fuzzy cluster analysis assigns membership coefficients of data points which are equidistant between two clusters so the information directs have a place toward in excess of one cluster in the meantime. For a subset of CiteScore dataset, fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before analysis, clusterability of the dataset was evaluated with Hopkins statistic which resulted in 0.4371, a value < 0.5, indicating that the data is highly clusterable. The optimal clusters were determined using NbClust package, where it is evidenced that 9 various indices proposed 3 cluster solutions as best clusters. Further, appropriate value of fuzziness parameter m was evaluated to determine the distribution of membership values with variation in m from 1 to 2. Coefficient of variation (CV), also known as relative variability was evaluated to study the spread of data. The time complexity of fuzzy clustering (fanny) and fuzzy c-means algorithms were evaluated by keeping data points constant and varying number of clusters.
Frequent pattern mining techniques helpful to find interesting trends or patterns in
massive data. Prior domain knowledge leads to decide appropriate minimum support threshold. This
review article show different frequent pattern mining techniques based on apriori or FP-tree or user
define techniques under different computing environments like parallel, distributed or available data
mining tools, those helpful to determine interesting frequent patterns/itemsets with or without prior
domain knowledge. Proposed review article helps to develop efficient and scalable frequent pattern
mining techniques.
Classification of common clustering algorithm and techniques, e.g., hierarchical clustering, distance measures, K-means, Squared error, SOFM, Clustering large databases.
Experimental study of Data clustering using k- Means and modified algorithmsIJDKP
The k- Means clustering algorithm is an old algorithm that has been intensely researched owing to its ease
and simplicity of implementation. Clustering algorithm has a broad attraction and usefulness in
exploratory data analysis. This paper presents results of the experimental study of different approaches to
k- Means clustering, thereby comparing results on different datasets using Original k-Means and other
modified algorithms implemented using MATLAB R2009b. The results are calculated on some performance
measures such as no. of iterations, no. of points misclassified, accuracy, Silhouette validity index and
execution time
This is very simple introduction to Clustering with some real world example. At the end of lecture I use stackOverflow API to test some clustering. I also wants to try facebook but it has some problem with it's API
It is a data mining technique used to place the data elements into their related groups. Clustering is the process of partitioning the data (or objects) into the same class, The data in one class is more similar to each other than to those in other cluster.
Fuzzy clustering and fuzzy c-means partition cluster analysis and validation ...IJECEIAES
A hard partition clustering algorithm assigns equally distant points to one of the clusters, where each datum has the probability to appear in simultaneous assignment to further clusters. The fuzzy cluster analysis assigns membership coefficients of data points which are equidistant between two clusters so the information directs have a place toward in excess of one cluster in the meantime. For a subset of CiteScore dataset, fuzzy clustering (fanny) and fuzzy c-means (fcm) algorithms were implemented to study the data points that lie equally distant from each other. Before analysis, clusterability of the dataset was evaluated with Hopkins statistic which resulted in 0.4371, a value < 0.5, indicating that the data is highly clusterable. The optimal clusters were determined using NbClust package, where it is evidenced that 9 various indices proposed 3 cluster solutions as best clusters. Further, appropriate value of fuzziness parameter m was evaluated to determine the distribution of membership values with variation in m from 1 to 2. Coefficient of variation (CV), also known as relative variability was evaluated to study the spread of data. The time complexity of fuzzy clustering (fanny) and fuzzy c-means algorithms were evaluated by keeping data points constant and varying number of clusters.
Frequent pattern mining techniques helpful to find interesting trends or patterns in
massive data. Prior domain knowledge leads to decide appropriate minimum support threshold. This
review article show different frequent pattern mining techniques based on apriori or FP-tree or user
define techniques under different computing environments like parallel, distributed or available data
mining tools, those helpful to determine interesting frequent patterns/itemsets with or without prior
domain knowledge. Proposed review article helps to develop efficient and scalable frequent pattern
mining techniques.
Spark Summit EU 2015: Combining the Strengths of MLlib, scikit-learn, and RDatabricks
This talk discusses integrating common data science tools like Python pandas, scikit-learn, and R with MLlib, Spark’s distributed Machine Learning (ML) library. Integration is simple; migration to distributed ML can be done lazily; and scaling to big data can significantly improve accuracy. We demonstrate integration with a simple data science workflow. Data scientists often encounter scaling bottlenecks with single-machine ML tools. Yet the overhead in migrating to a distributed workflow can seem daunting. In this talk, we demonstrate such a migration, taking advantage of Spark and MLlib’s integration with common ML libraries. We begin with a small dataset which runs on a single machine. Increasing the size, we hit bottlenecks in various parts of the workflow: hyperparameter tuning, then ETL, and eventually the core learning algorithm. As we hit each bottleneck, we parallelize that part of the workflow using Spark and MLlib. As we increase the dataset and model size, we can see significant gains in accuracy. We end with results demonstrating the impressive scalability of MLlib algorithms. With accuracy comparable to traditional ML libraries, combined with state-of-the-art distributed scalability, MLlib is a valuable new tool for the modern data scientist.
Scalable Data Science with SparkR: Spark Summit East talk by Felix CheungSpark Summit
R is a very popular platform for Data Science. Apache Spark is a highly scalable data platform. How could we have the best of both worlds? How could a Data Scientist leverage the rich 9000+ packages on CRAN, and integrate Spark into their existing Data Science toolset?
In this talk we will walkthrough many examples how several new features in Apache Spark 2.x will enable this. We will also look at exciting changes in and coming next in Apache Spark 2.x releases.
k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.
From common errors seen in running Spark applications, e.g., OutOfMemory, NoClassFound, disk IO bottlenecks, History Server crash, cluster under-utilization to advanced settings used to resolve large-scale Spark SQL workloads such as HDFS blocksize vs Parquet blocksize, how best to run HDFS Balancer to re-distribute file blocks, etc. you will get all the scoop in this information-packed presentation.
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
Apache Spark has rapidly become a key tool for data scientists to explore, understand and transform massive datasets and to build and train advanced machine learning models. The question then becomes, how do I deploy these model to a production environment? How do I embed what I have learned into customer facing data applications?
In this webinar, we will discuss best practices from Databricks on
how our customers productionize machine learning models
do a deep dive with actual customer case studies,
show live tutorials of a few example architectures and code in Python, Scala, Java and SQL.
Unsupervised learning Algorithms and Assumptionsrefedey275
Topics :
Introduction to unsupervised learning
Unsupervised learning Algorithms and Assumptions
K-Means algorithm – introduction
Implementation of K-means algorithm
Hierarchical Clustering – need and importance of hierarchical clustering
Agglomerative Hierarchical Clustering
Working of dendrogram
Steps for implementation of AHC using Python
Gaussian Mixture Models – Introduction, importance and need of the model
Normal , Gaussian distribution
Implementation of Gaussian mixture model
Understand the different distance metrics used in clustering
Euclidean, Manhattan, Cosine, Mahala Nobis
Features of a Cluster – Labels, Centroids, Inertia, Eigen vectors and Eigen values
Principal component analysis
Supervised learning (classification)
Supervision: The training data (observations, measurements, etc.) are accompanied by labels indicating the class of the observations
New data is classified based on the training set
Unsupervised learning (clustering)
The class labels of training data is unknown
Given a set of measurements, observations, etc. with the aim of establishing the existence of classes or clusters in the data
Types of Hierarchical Clustering
There are mainly two types of hierarchical clustering:
Agglomerative hierarchical clustering
Divisive Hierarchical clustering
A distribution in statistics is a function that shows the possible values for a variable and how often they occur.
In probability theory and statistics, the Normal Distribution, also called the Gaussian Distribution.
is the most significant continuous probability distribution.
Sometimes it is also called a bell curve.
k-Means is a rather simple but well known algorithms for grouping objects, clustering. Again all objects need to be represented as a set of numerical features. In addition the user has to specify the number of groups (referred to as k) he wishes to identify. Each object can be thought of as being represented by some feature vector in an n dimensional space, n being the number of all features used to describe the objects to cluster. The algorithm then randomly chooses k points in that vector space, these point serve as the initial centers of the clusters. Afterwards all objects are each assigned to center they are closest to. Usually the distance measure is chosen by the user and determined by the learning task. After that, for each cluster a new center is computed by averaging the feature vectors of all objects assigned to it. The process of assigning objects and recomputing centers is repeated until the process converges. The algorithm can be proven to converge after a finite number of iterations. Several tweaks concerning distance measure, initial center choice and computation of new average centers have been explored, as well as the estimation of the number of clusters k. Yet the main principle always remains the same. In this project we will discuss about K-means clustering algorithm, implementation and its application to the problem of unsupervised learning
Clustering Using Shared Reference Points Algorithm Based On a Sound Data ModelWaqas Tariq
A novel clustering algorithm CSHARP is presented for the purpose of finding clusters of arbitrary shapes and arbitrary densities in high dimensional feature spaces. It can be considered as a variation of the Shared Nearest Neighbor algorithm (SNN), in which each sample data point votes for the points in its k-nearest neighborhood. Sets of points sharing a common mutual nearest neighbor are considered as dense regions/ blocks. These blocks are the seeds from which clusters may grow up. Therefore, CSHARP is not a point-to-point clustering algorithm. Rather, it is a block-to-block clustering technique. Much of its advantages come from these facts: Noise points and outliers correspond to blocks of small sizes, and homogeneous blocks highly overlap. This technique is not prone to merge clusters of different densities or different homogeneity. The algorithm has been applied to a variety of low and high dimensional data sets with superior results over existing techniques such as DBScan, K-means, Chameleon, Mitosis and Spectral Clustering. The quality of its results as well as its time complexity, rank it at the front of these techniques.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
An improvement in k mean clustering algorithm using better time and accuracyijpla
Cluster
analysis
or
clustering
is the task of grouping a set of objects in such a way that objects in the same
group (called a
cluster
) are more similar (in some sense or another) to each other than to those in other
groups (clusters)
.
K
-
means
is
one of the simplest unsupervised learning algorithms that solve the well
known clustering problem.
The
process of k means algorithm data
is partiti
oned int
o K clusters and the
data are randomly choose
to the clusters resulti
ng in clusters that have
the sa
me number of data
set
.
This
paper is proposed a new K means clustering algorithm we calculate the initial
centroids
systemically
instead of random assigned due to which accuracy and time
improved.
Optimising Data Using K-Means Clustering AlgorithmIJERA Editor
K-means is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed in a cunning way because of different location causes different result. So, the better choice is to place them as much as possible far away from each other.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Ensemble based Distributed K-Modes ClusteringIJERD Editor
Clustering has been recognized as the unsupervised classification of data items into groups. Due to the explosion in the number of autonomous data sources, there is an emergent need for effective approaches in distributed clustering. The distributed clustering algorithm is used to cluster the distributed datasets without gathering all the data in a single site. The K-Means is a popular clustering method owing to its simplicity and speed in clustering large datasets. But it fails to handle directly the datasets with categorical attributes which are generally occurred in real life datasets. Huang proposed the K-Modes clustering algorithm by introducing a new dissimilarity measure to cluster categorical data. This algorithm replaces means of clusters with a frequency based method which updates modes in the clustering process to minimize the cost function. Most of the distributed clustering algorithms found in the literature seek to cluster numerical data. In this paper, a novel Ensemble based Distributed K-Modes clustering algorithm is proposed, which is well suited to handle categorical data sets as well as to perform distributed clustering process in an asynchronous manner. The performance of the proposed algorithm is compared with the existing distributed K-Means clustering algorithms, and K-Modes based Centralized Clustering algorithm. The experiments are carried out for various datasets of UCI machine learning data repository.
Comparison Between Clustering Algorithms for Microarray Data AnalysisIOSR Journals
Currently, there are two techniques used for large-scale gene-expression profiling; microarray and
RNA-Sequence (RNA-Seq).This paper is intended to study and compare different clustering algorithms that used
in microarray data analysis. Microarray is a DNA molecules array which allows multiple hybridization
experiments to be carried out simultaneously and trace expression levels of thousands of genes. It is a highthroughput
technology for gene expression analysis and becomes an effective tool for biomedical research.
Microarray analysis aims to interpret the data produced from experiments on DNA, RNA, and protein
microarrays, which enable researchers to investigate the expression state of a large number of genes. Data
clustering represents the first and main process in microarray data analysis. The k-means, fuzzy c-mean, selforganizing
map, and hierarchical clustering algorithms are under investigation in this paper. These algorithms
are compared based on their clustering model.
Max stable set problem to found the initial centroids in clustering problemnooriasukmaningtyas
In this paper, we propose a new approach to solve the document-clustering using the K-Means algorithm. The latter is sensitive to the random selection of the k cluster centroids in the initialization phase. To evaluate the quality of K-Means clustering we propose to model the text document clustering problem as the max stable set problem (MSSP) and use continuous Hopfield network to solve the MSSP problem to have initial centroids. The idea is inspired by the fact that MSSP and clustering share the same principle, MSSP consists to find the largest set of nodes completely disconnected in a graph, and in clustering, all objects are divided into disjoint clusters. Simulation results demonstrate that the proposed K-Means improved by MSSP (KM_MSSP) is efficient of large data sets, is much optimized in terms of time, and provides better quality of clustering than other methods.
Due to availability of internet and evolution of embedded devices, Internet of things can be useful to contribute in energy domain. The Internet of Things (IoT) will deliver a smarter grid to enable more information and connectivity throughout the infrastructure and to homes. Through the IoT, consumers, manufacturers and utility providers will come across new ways to manage devices and ultimately conserve resources and save money by using smart meters, home gateways, smart plugs and connected appliances. The future smart home, various devices will be able to measure and share their energy consumption, and actively participate in house-wide or building wide energy management systems. This paper discusses the different approaches being taken worldwide to connect the smart grid. Full system solutions can be developed by combining hardware and software to address some of the challenges in building a smarter and more connected smart grid.
A Survey Report on : Security & Challenges in Internet of Thingsijsrd.com
In the era of computing technology, Internet of Things (IoT) devices are now popular in each and every domains like e-governance, e-Health, e-Home, e-Commerce, and e-Trafficking etc. Iot is spreading from small to large applications in all fields like Smart Cities, Smart Grids, Smart Transportation. As on one side IoT provide facilities and services for the society. On the other hand, IoT security is also a crucial issues.IoT security is an area which totally concerned for giving security to connected devices and networks in the IoT .As, IoT is vast area with usability, performance, security, and reliability as a major challenges in it. The growth of the IoT is exponentially increases as driven by market pressures, which proportionally increases the security threats involved in IoT The relationship between the security and billions of devices connecting to the Internet cannot be described with existing mathematical methods. In this paper, we explore the opportunities possible in the IoT with security threats and challenges associated with it.
In today’s emerging world of Internet, each and every thing is supposed to be in connected mode with the help of billions of smart devices. By connecting all the devises used in our day to day life, make our life trouble less and easy. We are incorporated in a world where we are used to have smart phones, smart cars, smart gadgets, smart homes and smart cities. Different institutes and researchers are working for creating a smart world for us but real question which we need to emphasis on is how to make dumb devises talk with uncommon hardware and communication technology. For the same what kind of mechanism to use with various protocols and less human interaction. The purpose is to provide the key area for application of IoT and a platform on which various devices having different mechanism and protocols can communicate with an integrated architecture.
Study on Issues in Managing and Protecting Data of IOTijsrd.com
This paper discusses variety of issues for preserving and managing data produced by IoT. Every second large amount of data are added or updated in the IoT databases across the heterogeneous environment. While managing the data each phase of data processing for IoT data is exigent like storing data, querying, indexing, transaction management and failure handling. We also refer to the problem of data integration and protection as data requires to be fit in single layout and travel securely as they arrive in the pool from diversified sources in different structure. Finally, we confer a standardized pathway to manage and to defend data in consistent manner.
Interactive Technologies for Improving Quality of Education to Build Collabor...ijsrd.com
Today with advancement in Information Communication Technology (ICT) the way the education is being delivered is seeing a paradigm shift from boring classroom lectures to interactive applications such as 2-D and 3-D learning content, animations, live videos, response systems, interactive panels, education games, virtual laboratories and collaborative research (data gathering and analysis) etc. Engineering is emerging with more innovative solutions in the field of education and bringing out their innovative products to improve education delivery. The academic institutes which were once hesitant to use such technology are now looking forward to such innovations. They are adopting the new ways as they are realizing the vast benefits of using such methods and technology. The benefits are better comprehensibility, improved learning efficiency of students, and access to vast knowledge resources, geographical reach, quick feedback, accountability and quality research. This paper focuses on how engineering can leverage the latest technology and build a collaborative learning environment which can then be integrated with the national e-learning grid.
Internet of Things - Paradigm Shift of Future Internet Application for Specia...ijsrd.com
In the world more than 15% people are living with disability that also include children below age of 10 years. Due to lack of independent support services specially abled (handicap) people overly rely on other people for their basic needs, that excludes them from being financially and socially active. The Internet of Things (IoT) can give support system and a better quality of life as well as participation in routine and day to day life. For this purpose, the future solutions for current problems has been introduced in this paper. Daunting challenges have been considered as future research and glimpse of the IoT for specially abled person is given in the paper.
A Study of the Adverse Effects of IoT on Student's Lifeijsrd.com
Internet of things (IoT) is the most powerful invention and if used in the positive direction, internet can prove to be very productive. But, now a days, due to the social networking sites such as Face book, WhatsApp, twitter, hike etc. internet is producing adverse effects on the student life, especially those students studying at college Level. As it is rightly said, something which has some positive effects also has some of the negative effects on the other hand. In this article, we are discussing some adverse effects of IoT on student’s life.
Pedagogy for Effective use of ICT in English Language Learningijsrd.com
The use of information and communications technology (ICT) in education is a relatively new phenomenon and it has been the educational researchers' focus of attention for more than two decades. Educators and researchers examine the challenges of using ICT and think of new ways to integrate ICT into the curriculum. However, there are some barriers for the teachers that prevent them to use ICT in the classroom and develop supporting materials through ICT. The purpose of this study is to examine the high school English teachers’ perceptions of the factors discouraging teachers to use ICT in the classroom.
In recent years usage of private vehicles create urban traffic more and more crowded. As result traffic becomes one of the important problems in big cities in all over the world. Some of the traffic concerns are traffic jam and accidents which have caused a huge waste of time, more fuel consumption and more pollution. Time is very important parameter in routine life. The main problem faced by the people is real time routing. Our solution Virtual Eye will provide the current updates as in the real time scenario of the specific route. This research paper presents smart traffic navigation system, based on Internet of Things, which is featured by low cost, high compatibility, easy to upgrade, to replace traditional traffic management system and the proposed system can improve road traffic tremendously.
Ontological Model of Educational Programs in Computer Science (Bachelor and M...ijsrd.com
In this work there is illustrated an ontological model of educational programs in computer science for bachelor and master degrees in Computer science and for master educational program “Computer science as second competence†by Tempus project PROMIS.
Understanding IoT Management for Smart Refrigeratorijsrd.com
Lately the concept of Internet of Things (IoT) is being more elaborated and devices and databases are proposed thereby to meet the need of an Internet of Things scenario. IoT is being considered to be an integral part of smart house where devices will be connected to each other and also react upon certain environmental input. This will eventually include the home refrigerator, air conditioner, lights, heater and such other home appliances. Therefore, we focus our research on the database part for such an IoT’ fridge which we called as smart Fridge. We describe the potentials achievable through a database for an IoT refrigerator to manage the refrigerator food and also aid the creation of a monthly budget of the house for a family. The paper aims at the data management issue based on a proposed design for an intelligent refrigerator leveraging the sensor technology and the wireless communication technology. The refrigerator which identifies products by reading the barcodes or RFID tags is proposed to order the required products by connecting to the Internet. Thus the goal of this paper is to minimize human interaction to maintain the daily life events.
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...ijsrd.com
Double wishbone designs allow the engineer to carefully control the motion of the wheel throughout suspension travel. 3-D model of the Lower Wishbone Arm is prepared by using CAD software for modal and stress analysis. The forces and moments are used as the boundary conditions for finite element model of the wishbone arm. By using these boundary conditions static analysis is carried out. Then making the load as a function of time; quasi-static analysis of the wishbone arm is carried out. A finite element based optimization is used to optimize the design of lower wishbone arm. Topology optimization and material optimization techniques are used to optimize lower wishbone arm design.
A Review: Microwave Energy for materials processingijsrd.com
Microwave energy is a latest largest growing technique for material processing. This paper presents a review of microwave technologies used for material processing and its use for industrial applications. Advantages in using microwave energy for processing material include rapid heating, high heating efficiency, heating uniformity and clean energy. The microwave heating has various characteristics and due to which it has been become popular for heating low temperature applications to high temperature applications. In recent years this novel technique has been successfully utilized for the processing of metallic materials. Many researchers have reported microwave energy for sintering, joining and cladding of metallic materials. The aim of this paper is to show the use of microwave energy not only for non-metallic materials but also the metallic materials. The ability to process metals with microwave could assist in the manufacturing of high performance metal parts desired in many industries, for example in automotive and aeronautical industries.
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logsijsrd.com
With an expontial growth of World Wide Web, there are so many information overloaded and it became hard to find out data according to need. Web usage mining is a part of web mining, which deal with automatic discovery of user navigation pattern from web log. This paper presents an overview of web mining and also provide navigation pattern from classification and clustering algorithm for web usage mining. Web usage mining contain three important task namely data preprocessing, pattern discovery and pattern analysis based on discovered pattern. And also contain the comparative study of web mining techniques.
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMijsrd.com
Application of FACTS controller called Static Synchronous Compensator STATCOM to improve the performance of power grid with Wind Farms is investigated .The essential feature of the STATCOM is that it has the ability to absorb or inject fastly the reactive power with power grid . Therefore the voltage regulation of the power grid with STATCOM FACTS device is achieved. Moreover restoring the stability of the power system having wind farm after occurring severe disturbance such as faults or wind farm mechanical power variation is obtained with STATCOM controller . The dynamic model of the power system having wind farm controlled by proposed STATCOM is developed . To validate the powerful of the STATCOM FACTS controller, the studied power system is simulated and subjected to different severe disturbances. The results prove the effectiveness of the proposed STATCOM controller in terms of fast damping the power system oscillations and restoring the power system stability.
Making model of dual axis solar tracking with Maximum Power Point Trackingijsrd.com
Now a days solar harvesting is more popular. As the popularity become higher the material quality and solar tracking methods are more improved. There are several factors affecting the solar system. Major influence on solar cell, intensity of source radiation and storage techniques The materials used in solar cell manufacturing limit the efficiency of solar cell. This makes it particularly difficult to make considerable improvements in the performance of the cell, and hence restricts the efficiency of the overall collection process. Therefore, the most attainable maximum power point tracking method of improving the performance of solar power collection is to increase the mean intensity of radiation received from the source used. The purposed of tracking system controls elevation and orientation angles of solar panels such that the panels always maintain perpendicular to the sunlight. The measured variables of our automatic system were compared with those of a fixed angle PV system. As a result of the experiment, the voltage generated by the proposed tracking system has an overall of about 28.11% more than the fixed angle PV system. There are three major approaches for maximizing power extraction in medium and large scale systems. They are sun tracking, maximum power point (MPP) tracking or both.
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...ijsrd.com
In day today's relevance, it is mandatory to device the usage of diesel in an economic way. In present scenario, the very low combustion efficiency of CI engine leads to poor performance of engine and produces emission due to incomplete combustion. Study of research papers is focused on the improvement in efficiency of the engine and reduction in emissions by adding ethanol in a diesel with different blends like 5%, 10%, 15%, 20%, 25% and 30% by volume. The performance and emission characteristics of the engine are tested observed using blended fuels and comparative assessment is done with the performance and emission characteristics of engine using pure diesel.
Study and Review on Various Current Comparatorsijsrd.com
This paper presents study and review on various current comparators. It also describes low voltage current comparator using flipped voltage follower (FVF) to obtain the single supply voltage. This circuit has short propagation delay and occupies a small chip area as compare to other current comparators. The results of this circuit has obtained using PSpice simulator for 0.18 μm CMOS technology and a comparison has been performed with its non FVF counterpart to contrast its effectiveness, simplicity, compactness and low power consumption.
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...ijsrd.com
Power dissipation is a challenging problem for today's system-on-chip design and test. This paper presents a novel architecture which generates the test patterns with reduced switching activities; it has the advantage of low test power and low hardware overhead. The proposed LP-TPG (test pattern generator) structure consists of modified low power linear feedback shift register (LP-LFSR), m-bit counter, gray counter, NOR-gate structure and XOR-array. The seed generated from LP-LFSR is EXCLUSIVE-OR ed with the data generated from gray code generator. The XOR result of the sequence is single input changing (SIC) sequence, in turn reduces the switching activity and so power dissipation will be very less. The proposed architecture is simulated using Modelsim and synthesized using Xilinx ISE9.2.The Xilinx chip scope tool will be used to test the logic running on FPGA.
Defending Reactive Jammers in WSN using a Trigger Identification Service.ijsrd.com
In the last decade, the greatest threat to the wireless sensor network has been Reactive Jamming Attack because it is difficult to be disclosed and defend as well as due to its mass destruction to legitimate sensor communications. As discussed above about the Reactive Jammers Nodes, a new scheme to deactivate them efficiently is by identifying all trigger nodes, where transmissions invoke the jammer nodes, which has been proposed and developed. Due to this identification mechanism, many existing reactive jamming defending schemes can be benefited. This Trigger Identification can also work as an application layer .In this paper, on one side we provide the several optimization problems to provide complete trigger identification service framework for unreliable wireless sensor networks and on the other side we also provide an improved algorithm with regard to two sophisticated jamming models, in order to enhance its robustness for various network scenarios.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx
A survey on Efficient Enhanced K-Means Clustering Algorithm
1. IJSRD - International Journal for Scientific Research & Development| Vol. 1, Issue 9, 2013 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 1756
A survey on Efficient Enhanced K-Means Clustering Algorithm
Bhoomi Bangoria1
Prof. Nirali Mankad2
Prof. Vimal Pambhar3
1
P. G. Student 2, 3
Assistant Professor
1, 2
Computer Engineering Department
1, 2
Noble Engineering College, Junagadh, Gujarat 3
Dr. Subhash Technical Campus, Junagadh, Gujarat
Abstract—Data mining is the process of using technology to
identify patterns and prospects from large amount of
information. In Data Mining, Clustering is an important
research topic and wide range of unverified classification
application. Clustering is technique which divides a data
into meaningful groups. K-means clustering is a method of
cluster analysis which aims to partition n observations into k
clusters in which each observation belongs to the cluster
with the nearest mean. In this paper, we present the
comparison of different K-means clustering algorithms.
I. INTRODUCTION
Fast rescue of the related information from the databases has
always been a major issue. Different techniques have been
developed for this purpose; one of them is Data Clustering.
Clustering is a technique which divides data objects into
groups based on the information establish in data that
illustrates the objects and relationships between them, their
mark values which can be used in many applications, such
as knowledge discovery, vector quantization, pattern
recognition, data mining, data dredging and etc. [2]
A categorization of major clustering methods: [1]
Partitioning methodsA.
In partitioned clustering, the algorithms typically determine
all clusters at once, it divides the set of data objects into
non-overlapping clusters, and each data object is in exactly
one cluster.
Hierarchical methodsB.
It creates a hierarchical decomposition of the given set of
data objects. A hierarchical method can be classified as
being either agglomerative or divisive, based on how the
hierarchical decomposition is formed. The agglomerative
approach, also called the bottom-up approach, starts with
each object forming a separate group. It successively merges
the objects or groups that are close to one another, until all
of the groups are merged into one (the topmost level of the
hierarchy), or until a termination condition holds. The
divisive approach, also called the top-down approach, starts
with all of the objects in the same cluster. In each successive
iteration, a cluster is split up into smaller clusters, until
eventually each object is in one cluster, or until a
termination condition holds.
Density-based methodsC.
Density-based methods can find only spherical-shaped
clusters and encounter difficulty at discovering clusters of
arbitrary shapes. i. e. A Density-Based Clustering Method
Based on Connected Regions with Sufficiently High Density
(DBSCAN.)
Grid-based methodsD.
Grid-based methods quantize the object space into a finite
number of cells that form a grid structure. the main
advantage of this approach is its fast processing time, which
is typically independent of the number of data objects and
dependent only on the number of cells in each dimension in
the quantized space. i.e. STatistical INformation Grid
(STING)
Model-based methodsE.
Model-based methods hypothesize a model for each of the
clusters and find the best fit of the data to the given model.
i.e. Expectation-Maximization (EM)
High-dimensional data clustering methodsF.
It is a particularly important task in cluster analysis because
many applications require the analysis of objects containing
a large number of features or dimensions. i.e. CLustering In
QUEst (CLIQUE): A Dimension-Growth Subspace
Clustering Method
Constraint-based clustering methodsG.
It is a clustering approach that performs clustering by
incorporation of user-specified or application-oriented
constraints.
II. K-MEANS CLUSTERING ALGORITHM
K-means is a data mining algorithm which performs
clustering. As mentioned previously, clustering is dividing a
dataset into a number of groups such that similar items fall
into same groups [1]. Clustering uses unsupervised learning
technique which means that result clusters are not known
before the execution of clustering algorithm unlike the case
in classification. Some clustering algorithms takes the
number of desired clusters as input while some others decide
the number of result clusters themselves.
K-means algorithm uses an iterative process in
order to cluster database. It takes the number of desired
clusters and the initial means as inputs and produces final
means as output. Mentioned first and last means are the
means of clusters. If the algorithm is required to produce K
clusters then there will be K initial means and K final
means. In completion, K-means algorithm produces K final
means which answers why the name of algorithm is K-
means.
After termination of K-means clustering, each
object in dataset becomes a member of one cluster. This
cluster is determined by searching all over the means in
order to find the cluster with nearest mean to the object.
Shortest distanced mean is considered to be the mean of
cluster to which observed object belongs. K-means
algorithm tries to group the items in dataset into desired
number of clusters. To perform this task it makes some
iteration until it converges. After each iteration, calculated
means are updated such that they become closer to final
means. And finally, the algorithm converges and stops
performing iterations.
2. A Survey on Efficient Enhanced K-Means Clustering Algorithm
(IJSRD/Vol. 1/Issue 9/2013/0016)
All rights reserved by www.ijsrd.com 1757
Fig. 1: Example of K-means Algorithm [1]
Fig. 2: Flowchart of K-mean clustering algorithm
Steps of Algorithm:A.
Input:
D = {d1, d2, dn} //set of n data items.
k // Number of desired clusters
Output: A set of k clusters.
Steps:
Arbitrarily choose k data-items from D as initial centroids;
Repeat
Assign each item di to the cluster which has the
closest centroid;
Calculate new mean for each cluster;
Until convergence criteria is met.
III. ADVANTAGES AND DISADVANTAGES OF K
MEAN CLUSTERING
AdvantagesA.
1) K-mean clustering is simple and flexible.
2) K-mean clustering algorithm is easy to understand and
implements.
DisadvantagesB.
1) In K-mean clustering user need to specify the number
of cluster in advance [3].
2) K-mean clustering algorithm performance depends on
an initial centroids that why the algorithm doesn’t have
guarantee for optimal solution [3].
IV. RELATED WORK
1) An Efficient K-Means Clustering Algorithm for
Reducing Time Complexity using Uniform Distribution Data
Points
D. Napoleon & P. Ganga lakshmi [12] proposed a method
for making the K-Means algorithm more valuable and
professional; so as to get better clustering with abridged
complexity. The best algorithm was found out based on their
recital using uniform distribution data points. The accuracy
of the algorithm was investigated during different execution
of the program on the input data points. From the
experimental results, the uniform distribution data points are
used easily realize the values and get a superior result.
2) An efficient enhanced k-means clustering algorithm
Fahim et al. [4] proposed k-means algorithm determines
spherical shaped cluster, whose center is the magnitude
center of points in that cluster, this center moves as new
points are added to or detached from it. This proposition
makes the center closer to some points and far apart from
the other points, the points that become nearer to the center
will stay in that cluster, so there is no need to find its
distances to other cluster centers. The points far apart from
the center may alter the cluster, so only for these points their
distances to other cluster centers will be calculated, and
assigned to the nearest center.
3) MK-means - Modified K-means clustering algorithm
Hesam et al. [5] proposed The MK-Means clustering uses
principal components analysis, to establish a provisional
value of count of classes and grant changeable labels for
objects. This section is paying interest on how to project
objects into space of principal vector and how to discover
labels of objects by using a probability of connectivity
matrix that for every two objects shows the probability of
being in same class. We consider the number of connected
objects to Xi from two nearest classes, and also the sum of
distances of Xi for connected objects with respect to their
class ID.
4) A Novel Density based improved k-means Clustering
Algorithm – Dbkmeans
K. Mumtaz et al. [6] proposed in our imitation of the
algorithm, we only assumed overlapped clusters of circular
or spheroid in nature. So the criteria for splitting or joining a
cluster can be determined based on the number of estimated
points in a cluster or the estimated density of the cluster.
5) A Survey on K-mean Clustering and Particle Swarm
Optimization
P. Vora et al. [7] proposed James MacQueen, the one who
proposed the term "k-means “in 1967. But the standard
algorithm was firstly introduced by Stuart Lloyd in 1957 as
a technique pulse-code modulation. The K-Means clustering
algorithm is a partition-based cluster analysis method.
According to the algorithm we firstly select k objects as
initial cluster centers, then calculate the distance between
each cluster centre and each object and allocate it to the
nearest cluster, revise the averages of all clusters, repeat this
process until the criterion function congregated.