A practical Introduction to Machine Learning in Python
In this presentation, we show how to use Python for machine learning with the Orange framework, an open-source data mining tool developed at the University of Ljubljana. Orange is a scriptable environment for fast prototyping of new algorithms and testing schemes. It is a collection of Python-based modules that sit on top of a C++ core library; the modules implement functionality for which execution time is not crucial and which is easier to write in Python than in C++.

    Presentation Transcript

    • CVC TechParty: A Practical Introduction to Machine Learning in Python, Piero Casale
    • www.ailab.si/orange/
    • Loading Data and Basic Data Exploration
        - Loading data: iris = orange.ExampleTable("iris.tab")
        - Exploring features and examples: iris.domain.attributes, iris.domain.classVar.name
        - Basic dataset characteristics: GetDatasetStatistics()
        - Dataset formats in Orange: csv, txt, xls
        - Dataset as Python lists: indexing, append, extend, native
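To make the loading step more concrete without requiring Orange, here is a rough sketch of what reading a tab-delimited dataset involves, using only the standard library. It assumes a three-row header (attribute names, then types, then flags with the class column marked "class"), which is how Orange .tab files are commonly laid out. The sample rows and the helper name load_tab are illustrative and are not part of Orange.

```python
import csv
import io

# Hypothetical miniature of an Orange-style .tab file: a row of names,
# a row of types ("c" continuous, "d" discrete), a row of flags, then data.
SAMPLE = (
    "sepal length\tpetal length\tiris\n"
    "c\tc\td\n"
    "\t\tclass\n"
    "5.1\t1.4\tIris-setosa\n"
    "7.0\t4.7\tIris-versicolor\n"
    "6.3\t6.0\tIris-virginica\n"
)

def load_tab(handle):
    """Return (attribute_names, class_name, rows) from a tab-delimited table."""
    reader = csv.reader(handle, delimiter="\t")
    names = next(reader)
    types = next(reader)
    flags = next(reader)
    class_index = flags.index("class")
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        # Columns marked continuous become floats; the rest stay as text.
        rows.append([float(v) if t == "c" else v for v, t in zip(record, types)])
    attributes = [n for i, n in enumerate(names) if i != class_index]
    return attributes, names[class_index], rows

attributes, class_name, rows = load_tab(io.StringIO(SAMPLE))
print(attributes, class_name)
print(rows[0])
```

Because rows is an ordinary Python list, indexing, append and extend work on it directly, which mirrors the list-like behaviour the slide attributes to Orange datasets.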
    • Dataset Visualization
        - Multidimensional scaling (MDS): MDS functions in orngMDS
    • My First Classifier in Orange: Bayes
    • My First Classifier in Orange: Bayes
        - Loading data: iris = orange.ExampleTable("iris.tab")
        - Declare the learning function: bayes = orange.BayesLearner()
        - Train the Bayes classifier on data: bayesClassifier = bayes(iris)
        - Classify new data: prediction = bayesClassifier(newExample)
        - Example on the Iris dataset: exCodes.showBayes()
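The learner/classifier pattern on this slide (call a learner on data to get a classifier, then call the classifier on a new example) can be imitated in plain Python. The following is a minimal sketch of a naive Bayes classifier for discrete features with Laplace smoothing. It is not Orange's implementation; the function name naive_bayes_learner and the toy weather data are made up for illustration.

```python
from collections import Counter, defaultdict
import math

def naive_bayes_learner(examples):
    """Train on (features, label) pairs and return a classifier function."""
    class_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for features, label in examples:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    total = len(examples)

    def classifier(features):
        best_label, best_score = None, -math.inf
        for label, count in class_counts.items():
            # Work in log space: log prior plus log likelihood of each feature.
            score = math.log(count / total)
            for i, value in enumerate(features):
                # Laplace smoothing keeps unseen values from zeroing the product.
                numerator = value_counts[(i, label)][value] + 1
                denominator = count + len(values_seen[i])
                score += math.log(numerator / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    return classifier

# Hypothetical toy data: (outlook, temperature) -> play?
weather = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "cool"), "yes"),
    (("overcast", "hot"), "yes"),
    (("overcast", "cool"), "yes"),
]

bayes_classifier = naive_bayes_learner(weather)
print(bayes_classifier(("sunny", "hot")))
print(bayes_classifier(("rainy", "cool")))
```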
    • My (Second) Classifier in Orange: Decision Trees
        - As before:
            import orngTree
            treeLearner = orngTree.TreeLearner()
            treeClassifier = treeLearner(iris)
            prediction = treeClassifier(newExample)
        - Measures for splitting: infoGain, gainRatio, gini
            treeLearner = orngTree.TreeLearner(measure="gini")
        - Print the tree:
            - on screen: orngTree.printTree(treeClassifier)
            - save as an image: orngTree.printDot(treeClassifier, fileName="tree.dot"), then run
              dot -Tpng tree.dot -o tree.png
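As a hedged illustration of what a splitting measure does, the sketch below computes Gini impurity and picks the discrete attribute whose split leaves the least impurity, which is one step of tree induction. It is a simplified stand-in rather than the code behind orngTree; the helper names and the fruit data are invented for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure set, larger for more mixed sets."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())

def gini_after_split(rows, labels, attribute):
    """Weighted Gini impurity of the groups formed by one discrete attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    return sum(len(g) / len(labels) * gini(g) for g in groups.values())

def best_split(rows, labels):
    """Return the index of the attribute whose split leaves the lowest impurity."""
    return min(range(len(rows[0])),
               key=lambda a: gini_after_split(rows, labels, a))

# Hypothetical data: attribute 0 is shape, attribute 1 is colour.
rows = [("round", "red"), ("round", "green"), ("long", "red"), ("long", "green")]
labels = ["apple", "apple", "banana", "banana"]

print(gini(labels))
print(gini_after_split(rows, labels, 0), gini_after_split(rows, labels, 1))
print(best_split(rows, labels))
```

Here shape separates the classes perfectly while colour carries no information, so the first attribute is selected.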
    • Testing and Evaluating a Classifier
        - Testing functions in orngTest:
            import orngTest
            learners = [bayesLearner, treeLearner]
        - Run a 10-fold cross-validation:
            xv = orngTest.crossValidation(learners, data, folds=10)
        - Scoring functions in orngStat:
            import orngStat
            accuracy = orngStat.CA(xv)
            confusionMatrix = orngStat.cm(xv)
        - Example on the Iris dataset using Bayes, decision tree and kNN: exCodes.crossValidate()
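The mechanics behind cross-validation and the two scores can be sketched in a few lines of plain Python. This is an assumption-laden simplification: folds are assigned by index modulo the fold count instead of by random or stratified sampling, and the learners, helper names and data are invented for the example rather than taken from orngTest or orngStat.

```python
from collections import Counter

def majority_learner(examples):
    """Baseline: always predict the most frequent training label."""
    most_common = Counter(label for _, label in examples).most_common(1)[0][0]
    return lambda features: most_common

def nearest_neighbour_learner(examples):
    """1-NN: predict the label of the closest training example."""
    def classifier(features):
        def distance(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], features))
        return min(examples, key=distance)[1]
    return classifier

def cross_validate(learner, examples, folds=10):
    """Return (actual, predicted) pairs collected over all folds."""
    results = []
    for fold in range(folds):
        train = [e for i, e in enumerate(examples) if i % folds != fold]
        test = [e for i, e in enumerate(examples) if i % folds == fold]
        classifier = learner(train)
        results.extend((label, classifier(features)) for features, label in test)
    return results

def accuracy(results):
    """Classification accuracy: fraction of correct predictions."""
    return sum(actual == predicted for actual, predicted in results) / len(results)

def confusion_matrix(results):
    """Counts keyed by (actual, predicted)."""
    return Counter(results)

# Hypothetical data: two well-separated clusters on a line.
data = [((float(i),), "low") for i in range(10)] + \
       [((float(i) + 100.0,), "high") for i in range(10)]

results = cross_validate(nearest_neighbour_learner, data, folds=5)
print(accuracy(results))
print(confusion_matrix(results))
print(accuracy(cross_validate(majority_learner, data, folds=5)))
```

Comparing a real learner against the majority baseline is a quick sanity check that the accuracy figure reflects more than the class distribution.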
    • Ensemble Methods
        - Basic ensemble methods in orngEnsemble: bagging, boosting and random forest
            import orngEnsemble
        - Bagging of decision trees:
            treeLearner = orngTree.TreeLearner()
            baggedTrees = orngEnsemble.BaggedLearner(treeLearner, t=10)
        - Boosting of decision trees:
            treeLearner = orngTree.TreeLearner()
            boostedTrees = orngEnsemble.BoostedLearner(treeLearner, t=10)
        - Random forest:
            forest = orngEnsemble.RandomForestLearner(trees=10)
        - Example on the Iris dataset: exCodes.crossValidateEnsembles()
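Bagging wraps a base learner: it trains t models on bootstrap samples of the data and combines their predictions by majority vote. The sketch below shows that idea in plain Python under stated assumptions. The base learner is a simple nearest-centroid classifier instead of a decision tree, and bagged_learner, centroid_learner and the data are hypothetical names, not the orngEnsemble API.

```python
import random
from collections import Counter

def centroid_learner(examples):
    """Base learner: predict the class whose feature mean is closest."""
    sums, counts = {}, Counter()
    for features, label in examples:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    centroids = {l: [s / counts[l] for s in sums[l]] for l in sums}

    def classifier(features):
        return min(centroids, key=lambda l: sum(
            (a - b) ** 2 for a, b in zip(centroids[l], features)))
    return classifier

def majority_vote(predictions):
    """Most frequent prediction among the ensemble members."""
    return Counter(predictions).most_common(1)[0][0]

def bagged_learner(base_learner, t=10, seed=0):
    """Wrap base_learner so that it trains t models on bootstrap samples."""
    rng = random.Random(seed)

    def learner(examples):
        models = []
        for _ in range(t):
            # Bootstrap: sample with replacement, same size as the original.
            sample = [rng.choice(examples) for _ in examples]
            models.append(base_learner(sample))
        return lambda features: majority_vote([m(features) for m in models])

    return learner

# Hypothetical data: two well-separated clusters on a line.
data = [((float(i),), "low") for i in range(10)] + \
       [((float(i) + 100.0,), "high") for i in range(10)]

bagged = bagged_learner(centroid_learner, t=10)
classifier = bagged(data)
print(classifier((2.0,)), classifier((105.0,)))
```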
    • Feature Selection
        - Functions for feature selection in orngFSS:
            import orngFSS
            vehicle = orange.ExampleTable("vehicle.tab")
        - Measuring the importance of features with information gain:
            measures = orngFSS.attMeasure(vehicle)
            tenBests = orngFSS.bestNAtts(measures, n=10)
        - Measuring the importance of features with gain ratio:
            gainRatio = orange.MeasureAttribute_gainRatio()
            measures = orngFSS.attMeasure(vehicle, gainRatio)
            fiveBests = orngFSS.bestNAtts(measures, n=5)
        - Example on the Vehicle dataset: exCodes.measureAttributes()
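To show what the two measures compute and why they can rank attributes differently, here is a small self-contained sketch for discrete attributes. The function names (information_gain, gain_ratio, best_n_attributes) and the toy vehicle-like table are assumptions for illustration; they are not the orngFSS functions.

```python
from collections import Counter
import math

def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

def information_gain(column, labels):
    """Reduction in class entropy after splitting on the attribute column."""
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def gain_ratio(column, labels):
    """Information gain normalised by the entropy of the attribute itself."""
    split_info = entropy(column)
    return information_gain(column, labels) / split_info if split_info > 0 else 0.0

def best_n_attributes(rows, labels, names, measure, n):
    """Names of the n highest-scoring attributes under the given measure."""
    scores = []
    for i, name in enumerate(names):
        column = [row[i] for row in rows]
        scores.append((name, measure(column, labels)))
    scores.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scores[:n]]

# Hypothetical data: "id" is unique per row, so it splits perfectly but trivially.
names = ["wheels", "colour", "id"]
rows = [("4", "red", "a"), ("4", "blue", "b"),
        ("2", "red", "c"), ("2", "blue", "d")]
labels = ["car", "car", "bike", "bike"]

print(best_n_attributes(rows, labels, names, information_gain, 2))
print(best_n_attributes(rows, labels, names, gain_ratio, 2))
```

In this toy table, information gain scores the row identifier as highly as the genuinely useful attribute, while gain ratio penalises it for having many values.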
    • More...
        - Supervised learning algorithms: orngSVM, orngLR, orngC45
        - Unsupervised learning algorithms: orngClustering
        - Reinforcement learning: orngReinforcement
        - Outlier detection: orngOutlier
        - Discretization functions: orngDisc
    • Enjoy... More at www.ailab.si/orange. Piero Casale