Orange Canvas - PyData 2013

Overview of the Orange Canvas visual programming environment for data mining.

  • Orange Canvas – Visual programming environment for data mining
  • Example of a complete program for classification trees.
  • Orange began as a collection of C++ algorithms; a Python layer was added next, and finally a graphical interface.
  • Why use Orange? It offers a wide selection of data visualizations for exploring your data. You can prototype machine learning algorithms in Orange Canvas without investing much time in programming. There are also add-ons for bioinformatics, network analysis, and text mining, plus more contributed by the community.
  • Screen capture from the Orange home page.
  • Orange Canvas is an interactive environment for visual programming. It’s open source and free to use. In this example, you click a widget in the palette on the left and a copy of that widget is placed on the canvas. On this screen we see the File widget, which reads data into the system, and the Data Table widget, which displays the data in a table that can be sorted by column. The widgets are connected by clicking on the right-hand side of the File widget and dragging a line to the left-hand side of the Data Table widget. The convention in Orange is for inputs to be on the left and outputs on the right. Notice that the Data Table’s right-hand side is dotted, meaning its output is not in use. With this simple concept, let’s see how you can explore a data set.
  • Visualization widgets
  • Clustering and unsupervised learning widgets
  • Classification
  • Network and Text Mining add-ons
  • Bioinformatics widgets
  • Demo #1: Simple classification example with classification trees; example scatterplot; VizRank selection of interesting projections.
  • Demo #2: Comparing classifiers; multiple learners; Test Learners evaluation; show evaluation metrics.
  • To get started, first install Orange Canvas. Try the built-in tutorials listed here.

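The File-to-Data Table connection described in the notes can be pictured as a small dataflow graph. The sketch below is a toy model in plain Python and is not Orange's API: the `Widget`, `FileWidget`, and `DataTableWidget` classes, the `output_in_use` property, and the sample data are all hypothetical names chosen for illustration. It only shows the convention of data flowing from one widget's output (right) into another widget's input (left).

```python
import csv
import io


class Widget:
    """Hypothetical stand-in for a canvas widget (not an Orange class)."""

    def __init__(self, name):
        self.name = name
        self.downstream = []  # widgets linked to this widget's output

    def connect(self, other):
        """Link this widget's output (right side) to another's input (left side)."""
        self.downstream.append(other)
        return other

    @property
    def output_in_use(self):
        # Loosely mirrors the dotted right-hand edge: no link means unused output.
        return bool(self.downstream)

    def send(self, data):
        """Push data to every widget connected to this output."""
        for widget in self.downstream:
            widget.receive(data)

    def receive(self, data):
        raise NotImplementedError(f"{self.name} has no input")


class FileWidget(Widget):
    """Reads delimited text and sends the rows downstream."""

    def load(self, text):
        rows = list(csv.DictReader(io.StringIO(text)))
        self.send(rows)


class DataTableWidget(Widget):
    """Stores received rows and can present them sorted by a column."""

    def __init__(self, name):
        super().__init__(name)
        self.rows = []

    def receive(self, data):
        self.rows = list(data)
        self.send(self.rows)  # forwards to later widgets, if any are linked

    def sorted_by(self, column, key=str):
        return sorted(self.rows, key=lambda row: key(row[column]))


# Made-up sample data, used only for this illustration.
SAMPLE = "name,petal_length\nsetosa,1.4\nvirginica,5.5\nversicolor,4.2\n"

file_widget = FileWidget("File")
table_widget = DataTableWidget("Data Table")
file_widget.connect(table_widget)  # drag from File's right to Data Table's left
file_widget.load(SAMPLE)

for row in table_widget.sorted_by("petal_length", key=float):
    print(row["name"], row["petal_length"])
```

In this toy model, `table_widget.output_in_use` stays `False` until something is connected to it, which corresponds to the dotted right-hand side mentioned in the notes.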
    1. Justin Sun, PyData Boston, July 27, 2013
    2. Overview
        • What can you do with Orange?
        • History
        • Architecture
        • Installation
        • Widget Examples
        • Demo
        • Resources
    3. Classification Tree Scheme
    4. History
        • 1996 – University of Ljubljana and Jožef Stefan Institute started development of ML*, a machine learning framework in C++.
        • 1997 – Python integration layer
        • 2003 – GUI based on PyQt
        • 2013 – Orange Canvas 2.7 released – Major GUI redesign.
        • Source: http://en.wikipedia.org/wiki/Orange_%28software%29
    5. High-level Architecture
        • Algorithms written in C++
        • Python integration layer (Python 2.7)
        • Orange Canvas – Visual programming
    6. Why Use Orange?
        • No programming needed – Visual programming
        • Data Visualization
        • Easy to try different Machine Learning Algorithms
        • Add-ons for Bioinformatics, Network Analysis, and Text mining
        • Free and open source software
    7. Installation
        • Download installer from http://orange.biolab.si/
        • Run installer
        • Requires Python 2.6 or 2.7
        • Includes NumPy, SciPy, PyQt, other required libraries
        • To run, double-click on the Orange Canvas icon
    8. Scheme Widgets
    9. Demo
        • Classification example
        • Evaluation
    10. Resources
        • Orange Website: http://orange.biolab.si/
        • Tutorials: http://www.biolab.si/janez/kyoto/
        • Interactive Network Analysis with Orange: http://www.jstatsoft.org/v53/i06
        • Orange Whitepaper with scripting examples: http://www.celta.paris-sorbonne.fr/anasem/papers/miscelanea/InteractiveDataMining.pdf
    11. Thank You!
        • Email: justin@justinsun (dot) com
        • Slides: http://www.slideshare.net/justin_sun/
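
The evaluation step from the Demo slide, where several learners are tested on the same data and their metrics compared, can be sketched in plain Python. This is an illustration under stated assumptions and not Orange's API or the presenter's demo: the two toy learners, the `cross_validated_accuracy` helper, and the synthetic two-cluster data are all hypothetical. It shows the general idea of scoring multiple learners with k-fold cross-validation.

```python
import random
from collections import Counter


def majority_learner(train):
    """Return a classifier that always predicts the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def nearest_neighbor_learner(train):
    """Return a 1-nearest-neighbor classifier (squared Euclidean distance)."""
    def classify(x):
        _, label = min(
            train,
            key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)),
        )
        return label
    return classify


def cross_validated_accuracy(learner, data, folds=5, seed=0):
    """Hypothetical helper: classification accuracy pooled over k folds."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    correct = 0
    for i in range(folds):
        test = shuffled[i::folds]  # every folds-th example, starting at i
        train = [ex for j, ex in enumerate(shuffled) if j % folds != i]
        classifier = learner(train)
        correct += sum(classifier(x) == y for x, y in test)
    return correct / len(shuffled)


# Synthetic data (an assumption for this sketch): two well-separated clusters.
rng = random.Random(42)
data = (
    [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(30)]
    + [((rng.gauss(6, 1), rng.gauss(6, 1)), "b") for _ in range(30)]
)

learners = {"majority": majority_learner, "1-NN": nearest_neighbor_learner}
scores = {
    name: cross_validated_accuracy(learner, data)
    for name, learner in learners.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Every learner is scored on the same folds because the shuffle seed is fixed, so the accuracies are directly comparable.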
