An introduction to Apache Mahout

An introduction to Apache Mahout: what is it and
how does it work? What is machine intelligence?
How can Mahout be installed and tested on Hadoop?

Published in: Technology, Education

Transcript of "An introduction to Apache Mahout"

  1. Apache Mahout
     ● What is it?
     ● How does it work?
     ● Machine Learning
     ● Algorithms
     ● Install
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Mahout – What is it?
     ● Machine learning
     ● For large data
     ● Based on Hadoop
     ● But can work on a non-Hadoop cluster
     ● Scalable
     ● Licensed under the Apache License
  3. Mahout – How does it work?
     ● Uses Hadoop MapReduce
     ● Has many supplied algorithms
     ● Supports four use cases
       – Recommendation mining
       – Clustering
       – Classification
       – Frequent itemset mining
  4. Mahout – Machine Learning
     Machine learning – what does it mean?
     ● A branch of artificial intelligence
     ● Systems that learn from data
     ● Classify data after learning
     ● Learn on training data sets
     ● Generalisation – the ability to classify unseen data sets after learning
  5. Mahout – Algorithms
     Some of the available algorithms (among many others)
     – Collaborative filtering
       ● Narrow sense – make predictions about user interests by collecting preferences
       ● General sense – multi-agent collaboration for information filtering
     – Mean shift clustering
       ● Mode seeking, used for visual tracking
     – Parallel frequent pattern mining
       ● Find unique features
  6. Mahout – Install
     So how do we install Mahout and test it?
     – Install Maven
       ● sudo apt-get install maven3
     – Install Apache Mahout
       ● You will need Subversion installed
       ● svn co http://svn.apache.org/repos/asf/mahout/trunk
       ● Go to the directory containing the pom.xml file
         – mvn install ## in ./trunk
     Full details available in the Mahout install guide on our web site shop
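
The install steps on this slide can be collected into one script. What follows is a minimal sketch, not the author's install guide: the `run` helper, the `install_mahout` function and the `DRY_RUN` switch are hypothetical names introduced for illustration, and the package name and repository URL are copied from the slide as-is (they may no longer be current). With `DRY_RUN=1`, the default here, the script only prints each command.

```shell
#!/bin/sh
# Sketch of the slide's install steps. DRY_RUN, run and install_mahout are
# illustrative names, not part of Mahout or Maven.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_mahout() {
    # Package name and URL are taken verbatim from the slide.
    run sudo apt-get install maven3 || return 1
    run svn co http://svn.apache.org/repos/asf/mahout/trunk || return 1
    # The checkout creates ./trunk, which contains pom.xml.
    run cd trunk || return 1
    run mvn install
}

install_mahout
```

Running it unchanged lists the four commands without touching the system; setting `DRY_RUN=0` executes them in order and stops at the first failure.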
  7. Mahout – Test Install
     So let us run a test
     ● cd $MAHOUT_HOME/examples/bin
     ● ./build-reuters.sh
     ● Choose option 1, kmeans clustering
     ● Should finish with the output shown on the next slide
  8. Mahout – Test Install
     cd $MAHOUT_HOME/examples/bin ; ./build-reuters.sh
     Please call cluster-reuters.sh directly next time. This file is going away.
     Please select a number to choose the corresponding clustering algorithm
     1. kmeans clustering
     2. fuzzykmeans clustering
     3. lda clustering
     Enter your choice : 1
     ok. You chose 1 and we'll use kmeans Clustering
     .................................
     Inter-Cluster Density: NaN
     Intra-Cluster Density: 0.0
     CDbw Inter-Cluster Density: NaN
     CDbw Intra-Cluster Density: NaN
     CDbw Separation: NaN
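
The output above asks for `cluster-reuters.sh` to be called directly because `build-reuters.sh` is going away. The sketch below picks whichever of the two scripts is present, preferring `cluster-reuters.sh`. The helper name `pick_reuters_script` is hypothetical, and the sketch assumes that both scripts, where they exist, sit in `$MAHOUT_HOME/examples/bin` as on the slide.

```shell
#!/bin/sh
# Print the path of the Reuters example script found in directory $1.
# cluster-reuters.sh is preferred, as the build-reuters.sh output advises.
# pick_reuters_script is an illustrative helper, not part of Mahout.
pick_reuters_script() {
    bin_dir="$1"
    for name in cluster-reuters.sh build-reuters.sh; do
        if [ -f "$bin_dir/$name" ]; then
            echo "$bin_dir/$name"
            return 0
        fi
    done
    echo "no Reuters example script found in $bin_dir" >&2
    return 1
}

# Only attempt a run when MAHOUT_HOME is set.
if [ -n "${MAHOUT_HOME:-}" ]; then
    if script="$(pick_reuters_script "$MAHOUT_HOME/examples/bin")"; then
        cd "$MAHOUT_HOME/examples/bin" && "./$(basename "$script")"
    fi
fi
```

As on the slide, `build-reuters.sh` prompts for the clustering algorithm once started; entering 1 selects kmeans.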
  9. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You can just pay for those hours that you need to solve your problems