Machine Learning
By: Vatsal J. Gajera (09BCE010)
What is Machine Learning? It is a branch of artificial intelligence: a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviours based on empirical data, such as data from sensors and databases.
Technical definition of machine learning: According to Tom M. Mitchell, a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
A major focus of machine learning research is to automatically learn to recognize complex patterns and make intelligent decisions based on data; the difficulty lies in the fact that the set of all possible behaviours given all possible inputs is too large to be covered by the set of observed examples (training data). Hence the learner must generalize from the given examples, so as to produce a useful output in new cases. Some machine learning systems attempt to eliminate the need for human interaction in data analysis, while others adopt a collaborative approach between human and machine. Human intuition cannot, however, be entirely eliminated, since the system's designer must specify how the data is to be represented and what mechanisms will be used to search for a characterization of the data.

Applications of machine learning:
Search Engines.
Medical Diagnosis.
Stock Market Analysis.
Game Playing.
Software Engineering.
Robot locomotion (movement from one place to another).
Etc.

There are several algorithms for machine learning:
Decision Tree Algorithm.
Bayesian Classification Algorithm.
Shortest Path Calculation Algorithm.
Neural Network Algorithm.
Genetic Algorithm.
1. Decision Tree Algorithm: Used in statistics, data mining, and machine learning, it employs a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.
The goal is to create a model that predicts the value of a target variable based on several input variables.
2. Bayesian Classification: Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class.
This classification is based on Bayes' theorem.

3. Neural Network Algorithm: An artificial neural network is a mathematical or computational model inspired by the structural and functional aspects of biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation.
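The Bayesian idea above can be sketched with a minimal naive Bayes classifier (a common Bayesian classifier that adds a conditional-independence assumption between attributes). The data below is hypothetical, not from the slides, and no smoothing is applied, so an attribute value never seen with a class zeroes out that class:

```python
from collections import Counter, defaultdict

# Naive Bayes sketch: P(class | tuple) is proportional to
# P(class) * product of P(attribute value | class), per Bayes' theorem
# plus the "naive" independence assumption between attributes.

def train(rows, labels):
    prior = Counter(labels)               # class -> count
    cond = defaultdict(Counter)           # (attribute index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            cond[(i, label)][value] += 1
    return prior, cond

def predict(row, prior, cond, total):
    best, best_score = None, 0.0
    for label, count in prior.items():
        score = count / total             # P(class)
        for i, value in enumerate(row):
            score *= cond[(i, label)][value] / count  # P(value | class)
        if score > best_score:
            best, best_score = label, score
    return best

# Hypothetical training tuples: (income, student) -> buys_computer
rows = [["high", "no"], ["high", "no"], ["medium", "yes"], ["low", "yes"]]
labels = ["no", "no", "yes", "yes"]
prior, cond = train(rows, labels)
print(predict(["low", "yes"], prior, cond, len(rows)))  # -> yes
```

Real implementations add Laplace smoothing to the conditional counts so unseen attribute values do not force a probability of zero.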
1. Decision Tree Induction: During the late 1970s, J. Ross Quinlan, a researcher in machine learning, developed a decision tree algorithm known as ID3 (Iterative Dichotomiser). Quinlan later presented C4.5, which became a benchmark to which newer supervised learning algorithms are often compared. In 1984, a group of statisticians published the book Classification and Regression Trees (CART), which describes the generation of binary decision trees. ID3, C4.5, and CART adopt a greedy (i.e., non-backtracking) approach in which decision trees are constructed in a top-down, recursive, divide-and-conquer manner.

Inputs:
Data partition D, which is a set of training tuples and their associated class labels.
Attribute_list, the set of candidate attributes.
Attribute_selection_method, a procedure to determine the splitting criterion that "best" partitions the data tuples into individual classes. This criterion consists of a splitting_attribute and, possibly, either a split point or a splitting subset.

Output: A decision tree.

Method:
1. Create a node N;
2. If tuples in D are all of the same class, C, then return N as a leaf node labeled with the class C;
3. If Attribute_list is empty then return N as a leaf node labeled with the majority class in D;
4. Apply Attribute_selection_method to find the "best" splitting_criterion;
5. Label node N with splitting_criterion;
6. If splitting_attribute is discrete-valued and multiway splits are allowed then attribute_list = attribute_list – splitting_attribute;
7. For each outcome j of splitting_criterion:
8. Let Dj be the set of data tuples in D satisfying outcome j;
9. If Dj is empty then
10. Attach a leaf labeled with the majority class in D to node N;
11. Else attach the node returned by Generate_decision_tree(Dj, attribute_list) to node N;
endfor
12. Return N;
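The method above can be sketched as a short recursive function. This is a simplified reading of the pseudocode, assuming categorical attributes (addressed by column index), multiway splits, and information gain as the Attribute_selection_method; empty partitions never arise here because branches are created only for values present in D:

```python
import math
from collections import Counter

def entropy(labels):
    """Info(labels): -sum of p_i * log2(p_i) over the classes present."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def generate_decision_tree(D, labels, attribute_list):
    # Step 2: all tuples of one class -> leaf labeled with that class.
    if len(set(labels)) == 1:
        return labels[0]
    # Step 3: attribute list empty -> leaf labeled with majority class.
    if not attribute_list:
        return Counter(labels).most_common(1)[0][0]

    # Step 4: pick the attribute with the highest information gain.
    def gain(a):
        info_a = 0.0
        for v in set(row[a] for row in D):
            subset = [lab for row, lab in zip(D, labels) if row[a] == v]
            info_a += len(subset) / len(D) * entropy(subset)
        return entropy(labels) - info_a

    best = max(attribute_list, key=gain)

    # Steps 5-12: one branch per outcome of the split; recurse on each Dj.
    remaining = [a for a in attribute_list if a != best]
    node = {}
    for v in set(row[best] for row in D):
        Dj = [row for row in D if row[best] == v]
        labj = [lab for row, lab in zip(D, labels) if row[best] == v]
        node[(best, v)] = generate_decision_tree(Dj, labj, remaining)
    return node
```

Internal nodes come back as dicts keyed by (attribute, value) pairs and leaves as class labels, which keeps the sketch self-contained without a node class.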
Attribute Selection Measures:
An attribute selection measure is a heuristic (experience-based) technique for selecting the splitting criterion that "best" separates a given data partition D of class-labeled training tuples into individual classes. If we were to split D into smaller partitions according to the outcomes of the splitting criterion, ideally each partition would be pure (i.e., all of the tuples that fall into a given partition would belong to the same class).
There are three main measures for it:
Information Gain.
Gain Ratio.
Gini Index.
Example:

Age     Income   Student  Credit_Rating  Class: Buy_Computer
Young   high     no       fair           no
Young   high     no       excellent      no
Middle  high     no       fair           yes
Senior  medium   yes      fair           yes
Senior  low      yes      excellent      no
Middle  medium   no       fair           yes
Senior  medium   no       excellent      no
Information Gain: ID3 uses information gain as its attribute selection measure. The measure is based on pioneering work by Claude Shannon on information theory, which studied the value or "information content" of messages.

Info(D) = -∑ pi log2(pi)                (where i = 1 to m)
InfoA(D) = ∑ (|Dj| / |D|) × Info(Dj)    (where j = 1 to v)
Gain(A) = Info(D) – InfoA(D)
In the example, the class buy_computer has two distinct values, {yes, no}, so m = 2. Let class C1 correspond to yes and C2 to no. Here the total number of tuples with "yes" is 3 and with "no" is 4; total = 4 + 3 = 7. So:

Info(D) = -(3/7) log2(3/7) – (4/7) log2(4/7) = 0.9852

Here young = 2, middle = 2, senior = 3. Among young, both are from the "no" class; among middle, both are from the "yes" class; among senior, 1 is from "yes" and 2 are from "no". So (taking 0 log2 0 = 0):

Infoage(D) = (2/7) × (-(2/2) log2(2/2) – (0/2) log2(0/2))
           + (2/7) × (-(2/2) log2(2/2) – (0/2) log2(0/2))
           + (3/7) × (-(1/3) log2(1/3) – (2/3) log2(2/3))
           = 0.3936

So Gain(age) = 0.9852 – 0.3936 = 0.5916

As we calculated the gain for age, we have to calculate the gain for every attribute. After calculating the gains, the attribute with the highest gain value becomes our split node.
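The arithmetic above can be checked with a short script (a sketch over just the Age column and class labels of the example table; base-2 logarithms assumed, as is standard for information gain):

```python
import math
from collections import Counter

# (age, class) pairs from the seven example tuples.
rows = [
    ("young", "no"), ("young", "no"), ("middle", "yes"), ("senior", "yes"),
    ("senior", "no"), ("middle", "yes"), ("senior", "no"),
]

def info(labels):
    """Info(D): expected bits needed to classify a tuple in D."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

labels = [c for _, c in rows]
info_d = info(labels)                       # about 0.985

# Info_age(D): weighted entropy of each age partition.
info_age = sum(
    len(part) / len(rows) * info(part)
    for v in {a for a, _ in rows}
    for part in [[c for a, c in rows if a == v]]
)                                           # about 0.394

gain_age = info_d - info_age                # about 0.592
print(round(info_d, 4), round(info_age, 4), round(gain_age, 4))
```

The young and middle partitions are pure, so only the senior partition contributes entropy, which is why Info_age(D) is so much smaller than Info(D).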
