Practical Artificial Intelligence & Machine Learning (Arturo Servin)


Transcript

  • 1. Practical Artificial Intelligence and Machine Learning
    • arturo.servin_at_gmail.com
    • http://arturo.servin.googlepages.com/
  • 2. About this presentation
    • Some theory on AI and ML
    • Some practical ideas and simple how-tos
    • What's out there using AI
    • Resources, Kits and Data
  • 3. Artificial Intelligence
    • Machine Learning
    • Natural Language Processing
    • Knowledge representation
    • Planning
    • Multi-Agent Systems
    • and other topics, depending on the author of the book
  • 4. Machine Learning
    • A program learns from experience E with respect to a task T and a performance measure P if its performance at T, as measured by P, improves with experience E (T. Mitchell, Machine Learning, 1997)
  • 5. Machine Learning Flavours
    • Supervised Learning
      • Programs learn a concept/hypothesis by means of labeled examples
      • Examples: Artificial Neural Networks, Bayesian Methods, Decision Trees
    • Unsupervised Learning
      • Programs learn to categorise unlabelled examples
      • Examples: Non-negative matrix factorization and self-organising maps
  • 6. More flavours
    • Reinforcement Learning
      • Programs learn by interacting with the environment: they execute actions and observe the feedback in the form of positive or negative rewards
      • Examples: SARSA, Q-Learning
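To make the reward loop concrete, here is a minimal sketch of tabular Q-Learning. The corridor environment, the reward values and the parameter defaults are all made up for illustration; they are not from the slides.

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-Learning on a hypothetical 1-D corridor.

    States are 0..n_states-1 and actions are 0 (left) and 1 (right).
    Reaching the last state pays +1 and ends the episode; every other
    step pays -0.01. All of these values are assumptions for the demo.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Epsilon-greedy action selection, with random tie-breaking.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else -0.01
            # Q-Learning update: move towards reward + discounted best next value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy policy for every non-terminal state.
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

SARSA differs only in the update target: it uses the value of the action actually taken in the next state instead of the maximum.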
  • 7. Training Examples
    • Continuous
    • Discrete
    • Inputs are known as Vectors or Features
    • Example in Wine Classification: Alcohol level, Malic acid, Ash, Alcalinity of ash, etc.
  • 8.
    • Linear and Non-linear feature relations
    • source: Oracle Data Mining Concepts
  • 9. More complex feature relations
  • 10. Decision Trees
    • Easy to understand and to interpret
    • Hierarchical structure
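    • Splits are chosen with measures such as Entropy or Gini impurity, so that each group is as pure as possible
    • Disadvantage: it's an off-line method (the tree has to be rebuilt to take new examples into account)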
    • Examples: ID3, C4.5
    Source: http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5
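As a rough illustration of the two impurity measures named above, the sketch below computes entropy, Gini impurity and the information gain of a split. The function names are chosen for this example and are not taken from the C4.5 code.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gini_impurity(labels):
    """Probability that two labels drawn at random (with replacement) differ."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def information_gain(parent, subsets):
    """Entropy of the parent minus the size-weighted entropy of its subsets."""
    total = len(parent)
    remainder = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(parent) - remainder

labels = ["Play", "Play", "Don't Play", "Don't Play"]
print(entropy(labels))        # a 50/50 mix is the worst case for two classes
print(gini_impurity(labels))
print(information_gain(labels, [labels[:2], labels[2:]]))  # a perfect split
```

ID3 picks the attribute with the highest information gain; C4.5 normalises this into a gain ratio, and Gini impurity is the measure used by CART.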
  • 11. Decision Trees, example
    • Create .names and .data files with training data
    • Generate tree and rules (c4.5 -f <file> and c4.5rules -f <file>)
    • Outlook | Temperature | Humidity | Windy | Play or Don't Play
    • Sunny | 80 | 90 | true | Don't Play
    • Overcast | 83 | 78 | false | Play
    • Rain | 70 | 96 | false | Play
    • Categorize new data (consult, consultr). Use GPS/Geocoding, Google Maps and Yahoo Weather APIs to enrich the input data
    • aservin@turin:~/Projects/C45: consult -f golf
    • C4.5 [release 8] decision tree interpreter Sat Jan 17 00:05:16 2009
    • ------------------------------------------
    • outlook: sunny
    • humidity: 80
    • Decision:
    • Don't Play CF = 1.00 [ 0.63 - 1.00 ]
  • 12. Bayesian Classifiers
    • Bayes Theorem: P(h|D) = P(D|h) P(h) / P(D)
    • P(h) = prior probability of hypothesis h
    • P(D) = prior probability of training data D
    • P(h|D) = probability of h given D
    • P(D|h) = probability of D given h
    • Naive Bayes Classifier, Fisher Classifier
    • Commonly used in SPAM filters
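A small worked example may help here. The sketch below applies Bayes' theorem to a two-hypothesis case (spam or not spam), expanding P(D) with the law of total probability. The probabilities are invented purely for illustration.

```python
def posterior(prior_h, p_d_given_h, p_d_given_not_h):
    """P(h|D) for a hypothesis h and its complement.

    P(D) is expanded as P(D|h) P(h) + P(D|not h) P(not h).
    """
    evidence = p_d_given_h * prior_h + p_d_given_not_h * (1 - prior_h)
    return p_d_given_h * prior_h / evidence

# Made-up numbers: 20% of mail is spam, and a given word appears
# in 50% of spam messages but only 5% of legitimate ones.
p_spam_given_word = posterior(prior_h=0.2, p_d_given_h=0.5, p_d_given_not_h=0.05)
print(round(p_spam_given_word, 3))  # 0.1 / 0.14, roughly 0.714
```

Seeing the word raises the probability of spam from the 20% prior to about 71%.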
  • 13. Classifying your RSS feeds
    • Use the unofficial Google Reader API http://blog.gpowered.net/2007/08/google-reader-api-functions.html
    • Some Python Code (Programming Collective Intelligence, Chapter 6)
    • Tag interesting and non-interesting items
    • Train using a Naive Bayes or Fisher classifier
    • >>> cl.train('Google changes favicon','bad')
    • >>> cl.train('SearchWiki: make search your own','good')
    • New items are tagged as interesting or not
    • >>> cl.classify('Ignite Leeds Today')
    • 'good'
    • You can re-train online
    • Add more features, try with e-mail
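The train/classify workflow above can be sketched with a very small Naive Bayes classifier. This is not the code from Programming Collective Intelligence: the class name `TinyNaiveBayes`, the tokeniser and the add-one smoothing are assumptions made for this example.

```python
import math
import re
from collections import defaultdict

class TinyNaiveBayes:
    """Hypothetical minimal Naive Bayes text classifier over word presence."""

    def __init__(self):
        self.word_counts = defaultdict(lambda: defaultdict(int))  # label -> word -> items
        self.doc_counts = defaultdict(int)                        # label -> items seen

    @staticmethod
    def words(text):
        # Features are the distinct lowercase words of the item.
        return set(re.findall(r"[a-z]+", text.lower()))

    def train(self, text, label):
        self.doc_counts[label] += 1
        for word in self.words(text):
            self.word_counts[label][word] += 1

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # log P(label) + sum of log P(word | label), with add-one smoothing.
            score = math.log(n_docs / total_docs)
            for word in self.words(text):
                score += math.log((self.word_counts[label][word] + 1) / (n_docs + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

cl = TinyNaiveBayes()
cl.train("Google changes favicon", "bad")
cl.train("SearchWiki: make search your own", "good")
print(cl.classify("Google favicon redesign"))
print(cl.classify("make your own search engine"))
```

Because `train` only updates counters, the classifier can be re-trained online as new items are tagged.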
  • 14. Finding Similarity
    • Euclidean Distance, Pearson Correlation Score, Manhattan
    • Document Clustering
    • Price Prediction
    • Item similarity
    • k-Nearest Neighbors, k-means, Hierarchical Clustering, Support-Vector Machines, Kernel Methods
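The three measures listed first on this slide are short enough to write out. The following sketch assumes both inputs are equal-length sequences of numbers; the function names are chosen for this example.

```python
from math import sqrt

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each feature."""
    return sum(abs(x - y) for x, y in zip(a, b))

def pearson(a, b):
    """Pearson correlation: +1 for a perfect linear match, -1 for an inverse one."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_a * var_b)

print(euclidean((0, 0), (3, 4)))
print(manhattan((0, 0), (3, 4)))
print(pearson([1, 2, 3], [2, 4, 6]))
```

Euclidean and Manhattan are distances (smaller means more similar), while Pearson is a similarity score that ignores differences in scale between the two vectors.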
  • 15. Similar items
    • Source: http://home.dei.polimi.it/matteucc/Clustering/tutorial_html
  • 16. Artificial Neural Networks
    • Mathematical/Computational model based on biological neural networks
    • Many types. The most common are feedforward networks: the input is propagated forward through the layers to compute the output, and the Backpropagation algorithm is used to train the weights
  • 17. Artificial Neural Networks
    • Input is high-dimensional, discrete or real-valued (e.g. raw sensor input)
    • Output is discrete or real valued
    • Output is a vector of values
    • Perceptron, linear
    • Sigmoid, non-linear and multi-layer
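As a sketch of the simplest unit mentioned above, the code below trains a single perceptron with the classic perceptron learning rule. The logical AND data set, the learning rate and the epoch count are choices made for this example; AND is used because it is linearly separable, which a single perceptron requires.

```python
def predict(weights, bias, inputs):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Perceptron learning rule: nudge the weights by the error on each sample."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Truth table of logical AND as (inputs, target) pairs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, inputs) for inputs, _ in and_gate])
```

A function such as XOR is not linearly separable, which is where multi-layer networks with sigmoid units trained by backpropagation come in.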
  • 18. Example, finding the best price
    • Create training data using Amazon/Ebay API
    • Laptop prices. Use price, screen size as features
    • Use an ANN library, e.g. Fast Artificial Neural Network (FANN)
    • struct fann *ann = fann_create_standard(num_layers, num_input, num_neurons_hidden, num_output); /* C */
    • ann = fann.create(connection_rate, (num_input, num_neurons_hidden, num_output)) # Python
    • $ann = fann_create(array(2, 4, 1),1.0,0.7); // PHP
    • You can also try k-Nearest Neighbours
    • Try it!
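For the k-Nearest Neighbours alternative, a minimal sketch is shown below. The laptop records are invented toy values (not real Amazon or eBay data), and the second feature, RAM in GB, is an assumption added for the example.

```python
from math import sqrt

def knn_estimate(training, query, k=3):
    """Average the prices of the k training items closest to the query."""
    def distance(item):
        features, _price = item
        return sqrt(sum((a - b) ** 2 for a, b in zip(features, query)))
    nearest = sorted(training, key=distance)[:k]
    return sum(price for _, price in nearest) / len(nearest)

# Made-up records: ((screen size in inches, RAM in GB), price)
laptops = [
    ((13.3, 4), 900.0),
    ((15.6, 4), 700.0),
    ((15.6, 8), 850.0),
    ((17.3, 8), 1000.0),
    ((11.6, 2), 400.0),
]
estimate = knn_estimate(laptops, (15.6, 6), k=2)
print(estimate)
```

With real data the features should be rescaled first, otherwise the feature with the largest numeric range dominates the distance.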
  • 19. Resources 1
    • Books
      • Practical Artificial Intelligence Programming in Java, Mark Watson http://www.markwatson.com/opencontent/ (There is a Ruby one as well)
      • Programming Collective Intelligence, Toby Segaran; O'Reilly
      • Artificial Intelligence: A Modern Approach, S. Russell, P. Norvig, J. Canny; Prentice Hall
      • Machine Learning, Tom Mitchell; McGraw-Hill
    • Online Stuff
      • ML course in Stanford http://www.stanford.edu/class/cs229/materials.html
      • Statistical ML http://bengio.abracadoudou.com/lectures/
  • 20. Resources 2
    • Code
      • FANN http://leenissen.dk/fann/index.php
      • NLP http://opennlp.sourceforge.net/
      • C4.5 http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5/tutorial.html
      • ML and Java http://www.developer.com/java/other/article.php/10936_1559871_1
    • Data
      • UC Irvine Machine Learning Repository http://archive.ics.uci.edu/ml/
      • Amazon Public Datasets http://aws.amazon.com/publicdatasets/
  • 21. More info
    • For questions, projects and job offers:
      • arturo.servin _(at)_ gmail.com
      • http://twitter.com/the_real_r2d2
      • http://arturo.servin.googlepages.com/