Ethan Bowen analyzed different kernel functions for support vector machines on a multi-class teacher-performance dataset, testing polynomial, radial basis function (RBF), and custom kernels. A custom multihyperkernel called BowenRBF, which combines Laplacian and exponential RBF kernels, achieved the highest classification accuracy of the kernels tested. However, BowenRBF's performance on other datasets requires more research before it can be called universally better than standard kernels such as the Gaussian RBF.
Multihyperkernel Customization and Analysis on OVA SVMs
Ethan Bowen
Neural Networks
12/4/2011
Multihyperkernel Customization and Analysis on One Versus All Support Vector Machines
Abstract
Kernels map data into a higher-dimensional space in order to make non-linearly separable data separable. With the application of multihyperkernels it is possible to increase correct classification for Support Vector Machines while preserving their structure. This is done by combining kernels to form a new kernel.
Introduction
Support Vector Machines perform binary classification by finding the N-dimensional hyperplane that best separates the data into two distinct classes. SVMs use a function called a kernel to perform this mapping. Every kernel function can be written in the form K(x, w) = <ɸ(x), ɸ(w)>, where ɸ maps the points x and w from the input space into the feature space. Two common kernels are the polynomial kernel and the radial basis function (RBF) kernel. I will cover several more kernels, but primarily these two types.
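As a concrete illustration, these two common kernels can be written directly as functions of the input vectors. This is a minimal sketch in NumPy; the degree, offset, and sigma defaults are illustrative, not values from this project:

```python
import numpy as np

def polynomial_kernel(x, w, degree=2, c=1.0):
    """Polynomial kernel: K(x, w) = (<x, w> + c)^degree.
    Setting c = 0 gives the homogeneous form."""
    return (np.dot(x, w) + c) ** degree

def gaussian_rbf_kernel(x, w, sigma=1.0):
    """Gaussian RBF kernel: K(x, w) = exp(-||x - w||^2 / (2 sigma^2))."""
    r = np.linalg.norm(np.asarray(x) - np.asarray(w))
    return np.exp(-(r ** 2) / (2.0 * sigma ** 2))
```

Both take two input-space vectors and return a scalar similarity; the RBF value is 1 for identical inputs and decays toward 0 as the points move apart.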
The dataset I used is a multiclass categorization problem from the UCI Machine Learning Repository that uses 5 features to classify teachers by their performance. The class labels are low, medium, and high, and the features are English-speaking status, course instructor, course, summer or regular semester, and class size.
Instead of using a multiclass SVM, which essentially requires Y binary-classifying SVMs, where Y is the number of classes, I decided to use a One-Versus-All approach
which classifies each class against the rest. This approach let me show how the choice of kernel affects the percentage of correctly classified labels, and how creating custom kernels can improve classification. My research and knowledge of the subject were obtained from several published papers (cited below) and from the course of this Neural Networks class.
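The One-Versus-All scheme itself is independent of the underlying binary learner: one scorer is trained per class with +1/−1 targets, and prediction takes the argmax over the per-class scores. The sketch below is my own illustration, not the project's code; a least-squares linear scorer stands in for a trained SVM purely to keep the example self-contained.

```python
import numpy as np

class OneVersusAll:
    """One-Versus-All: train one binary scorer per class (class k vs. the rest),
    then predict the class whose scorer gives the highest score."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.weights_ = []
        Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
        for k in self.classes_:
            t = np.where(y == k, 1.0, -1.0)        # +1 for class k, -1 for the rest
            # Least-squares fit stands in here for training a binary SVM.
            w, *_ = np.linalg.lstsq(Xb, t, rcond=None)
            self.weights_.append(w)
        return self

    def predict(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        scores = np.column_stack([Xb @ w for w in self.weights_])
        return self.classes_[np.argmax(scores, axis=1)]
```

Swapping the least-squares line for an SVM trained on the same +1/−1 targets recovers the OVA SVM setup used in this project.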
Methods
When considering which kernel method to use, it is best to know the data you are working with in order to choose an optimal kernel function. For instance, a linear kernel function would do more harm than good if you knew your data was non-linearly separable. My approach to choosing a kernel for this dataset was simply to search for the best performer: I tested the kernels mentioned earlier against some new kernels I created. Since no separate testing sample was provided, my testing sample is a subset of the training sample.
For testing the polynomial kernel I created two kernel functions. The first is BowenPoly, a kernel of the form k(xi, x) = K(xi, x)^d, where d is the number of features in the dataset. The second is BowenN1, a kernel of the form k(xi, x) = K(xi, x)^(d+1), with d again the number of features in the dataset.
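A sketch of the two custom polynomial kernels. The base kernel K is an assumption here, since the write-up does not spell it out; an inhomogeneous linear form <xi, x> + c is used for illustration, and the helper names are mine:

```python
import numpy as np

def base_poly(x_i, x, c=1.0):
    """Assumed base kernel K(xi, x) = <xi, x> + c (the paper does not
    specify the underlying K; this linear form is for illustration)."""
    return np.dot(x_i, x) + c

def bowen_poly(x_i, x, d=5):
    """BowenPoly: K(xi, x)^d, with d = number of dataset features (5 here)."""
    return base_poly(x_i, x) ** d

def bowen_n1(x_i, x, d=5):
    """BowenN1: K(xi, x)^(d+1)."""
    return base_poly(x_i, x) ** (d + 1)
```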
For testing the RBF kernel I created BowenRBF, a combination of two RBF kernels. I denote r = ||xi − x||₂ (the Euclidean distance) and ɛ = 1/(2σ), where σ is sigma; α is an N×1 vector of weights, where N is the number of kernels combined in BowenRBF (here N = 2). The weights satisfy the constraint that the sum of αi from 1 to N equals 1. For my testing, α1 = 0.5 and α2 = 0.5, meaning each kernel is weighed at 50% of its original value.
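BowenRBF as described reduces to a convex combination of the two component kernels. A minimal sketch, with σ and α defaults set to the values discussed above (the helper names are mine):

```python
import numpy as np

def laplacian_kernel(x_i, x, sigma=1.0):
    """LAP: exp(-r / sigma), with r the Euclidean distance."""
    r = np.linalg.norm(np.asarray(x_i) - np.asarray(x))
    return np.exp(-r / sigma)

def exponential_kernel(x_i, x, sigma=1.0):
    """EXP: exp(-r / (2 sigma^2))."""
    r = np.linalg.norm(np.asarray(x_i) - np.asarray(x))
    return np.exp(-r / (2.0 * sigma ** 2))

def bowen_rbf(x_i, x, sigma=1.0, alpha=(0.5, 0.5)):
    """BowenRBF: alpha1 * LAP + alpha2 * EXP, with the alphas summing to 1."""
    a1, a2 = alpha
    return a1 * laplacian_kernel(x_i, x, sigma) + a2 * exponential_kernel(x_i, x, sigma)
```

Because the weights sum to 1 and each component kernel lies in (0, 1], BowenRBF also lies in (0, 1] and equals 1 exactly when xi = x.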
Using the Laplacian, Exponential, Multiquadric, and Gaussian RBFs, I swept sigma from 1 to 10 in 0.01 increments to see how each kernel classified. I chose the Laplacian and Exponential kernels because they gave the best results compared to the other kernels, including the Gaussian (4). With the Laplacian kernel (LAP) of the form k(xi, x) = e^(−r/σ) and the Exponential kernel (EXP) of the form k(xi, x) = e^(−r/(2σ²)), I created BowenRBF in the form k(xi, x) = α1·LAP + α2·EXP. I obtained this process from the notion of a multihyperkernel: multiple "kernel on kernel" constructions that implicitly perform kernel optimization inside a family of kernels (such as Gaussian kernels with different sigmas) in the form k(xi, x) = Σ(i=1 to N) αi·Ki(xi, x). In my case sigma is the same for both kernels in BowenRBF.
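The sigma sweep amounts to recomputing the Gram (kernel) matrix for each σ on the grid and scoring a classifier on it. A sketch of the Gram-matrix side, generic over any kernel function (feeding each matrix to a precomputed-kernel SVM, as the experiments would require, is omitted to keep the sketch self-contained):

```python
import numpy as np

def gram_matrix(X, kernel, **kw):
    """Pairwise kernel (Gram) matrix for a dataset X under the given kernel."""
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j], **kw)
    return K

def sigma_sweep(X, kernel, sigmas):
    """One Gram matrix per sigma; the sweep in the text used
    sigmas = np.arange(1.0, 10.01, 0.01). Each matrix would then be
    scored by training and evaluating an SVM on it."""
    return {s: gram_matrix(X, kernel, sigma=s) for s in sigmas}
```

Each returned matrix is symmetric with ones on the diagonal for the RBF-style kernels above, which is a quick sanity check before handing it to a classifier.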
Once I had the classification results of each existing kernel and each new kernel for every class (low, medium, high), I examined the results to see, first, which kernel gave the better average classification accuracy and, second, whether the new kernels were useful compared to the original kernels.
Results
For linear kernels I found that even the most optimal, k(xi, x) = xiᵀx, classified only about 70% correctly, so I quickly switched to non-linear kernels for testing. (1) shows that for this dataset BowenPoly and BowenN1 did not consistently map better than the polynomial kernel, so I cannot accurately state that my kernels would map better for different testing sets. For the non-linear data I tested against the most commonly used RBF, the Gaussian RBF, of the form k(xi, x) = e^(−(ɛr)²); its classification was 80% correct. (2) shows that for each OVA SVM, BowenRBF shows
improvement over the Gaussian RBF for sigmas from 0.7 to 10, and (3) shows that the averages of the OVA SVMs over the class labels (low, medium, and high) put BowenRBF at a much higher percentage of correct classification than the Gaussian RBF for sigmas from 0.7 to 10. Therefore, I cannot say for this dataset that there is evidence that BowenPoly and BowenN1 will regularly classify at a higher percentage than the homogeneous polynomial kernel, but I can say there is evidence (3) that BowenRBF will regularly classify at a higher percentage than the Gaussian RBF kernel, which justifies using BowenRBF to obtain a high percentage of correct classification on further testing samples.
Discussion
A few classification tests on one dataset do not justify calling BowenRBF a better kernel than the Gaussian; much more research into multihyperkernels is needed before a concrete justification can be given. I found this research very interesting, and I was often learning new ways of using kernels and applying them to specific applications. In this project α was simply chosen as 0.5 for each element, but I learned that there are algorithms for learning these weights as well. Overall this was a very fun project, and I enjoyed the process of discovering a custom kernel that worked better than other known kernels.
References
[1] Andrew Oliver Hatch. Kernel Optimization for Support Vector Machines: Application to Speaker Verification. PhD thesis, EECS Department, University of California, Berkeley, Dec 2006.
[2] C. S. Ong and A. J. Smola. Machine learning using hyperkernels. In Proceedings of the International Conference on Machine Learning, pages 568–575, 2003.
[3] Souza, César R. "Kernel Functions for Machine Learning Applications." 17 Mar. 2010. Web. <http://crsouza.blogspot.com/2010/03/kernel-functions-for-machine-learning.html>.