2-Day Professional Development Course on
DATA MINING -
PRACTICAL MACHINE LEARNING TOOLS AND TECHNIQUES
FOR COMMERCE, ENGINEERING AND BIOLOGICAL SCIENCES
Kuala Lumpur: 23 & 24 Sep 2002 (9.00 am to 5.00 pm), Century Novotel Hotel
Singapore: 25 & 26 Sep 2002 (9.00 am to 5.00 pm), Orchard Hotel

By Prof Ian Witten
Professor of Computer Science
University of Waikato, New Zealand

This course is based on the following book co-authored by the course leader, Professor Ian
Witten: Data Mining: Practical Machine Learning Tools and Techniques with Java
Implementations, published by Morgan Kaufmann, Oct 1999, 416 pages.

Participants will be given a copy of this book!

This course takes a practical approach, and course participants need no knowledge of any
programming language or advanced mathematics. A half-day practical computer-based tutorial
session using the WEKA software and example data sets is part of the course. Information on
WEKA is in the “Introduction” below.

Comment on this book:
"This is a milestone in the synthesis of data mining, data analysis, information theory, and
machine learning." - Jim Gray, Microsoft Research

In association with C T NEXUS TECHNOLOGY TRANSFER

Introduction
An exciting and potentially far-reaching development in contemporary computer science is
the invention and application of methods of machine learning (ML). These enable a computer
program to automatically analyze a large body of data and decide what information is most
relevant. This crystallized information can then be used to help people make decisions
faster and more accurately.

The research team at the Department of Computer Science, University of Waikato, NZ has
incorporated several standard ML techniques into a software "workbench" called WEKA, for
Waikato Environment for Knowledge Analysis.

Weka is a collection of ML algorithms for solving real-world data mining problems. It is
written in Java and runs on almost any platform, including Windows. The algorithms can
either be applied directly to a dataset or called from your own Java code. Weka is also
well suited for developing new machine learning schemes. Weka is open source software
issued under the GNU General Public License.

Objectives
On completion of this course, participants will
• have a thorough knowledge of the basic techniques of machine learning
• understand how they can be used to extract information from raw data
• be able to use the Weka workbench to work on their own datasets.

Course Outline
Day 1: Lectures

Topic 1: What is machine learning/data mining?
• Definitions.
• Different kinds of structural description.
• Toy examples of machine learning.
• Actual examples of data mining applications.
• Ethical issues.

Topic 2: Input and output
• Concepts, instances, attributes.
• Attribute types.
• Missing values.
• ARFF format.
• Knowledge representations: decision trees, classification rules, association rules,
  numeric prediction, clustering.

Topic 3: Basic algorithms
• Inferring rules.
• Statistical modeling.
• Constructing decision trees.
• Covering algorithms.
• Mining association rules.
• Linear models.
• Instance-based learning.

Topic 4: Evaluation
• Training and testing.
• Alternatives.
• Loss functions.
• Cost-sensitive learning.
• Minimum description length principle.

Day 2: Lectures and Tutorials
Morning Sessions: Lectures

Topic 5: Advanced algorithms
• Decision trees.
• Classification rules.
• Support vector machines.
• Instance-based learning.
• Numeric prediction.
• Clustering.

Topic 6: Engineering the input and output
• Attribute selection.
• Discretization.
• Data cleansing.
• Combining multiple models.

Afternoon Sessions:
Tutorial Exercises using WEKA and data sets

Who Will Benefit From This Course
Information professionals who need to learn about data mining, including information
systems practitioners, programmers, consultants, developers, information technology
managers, specification writers, patent examiners, as well as students and academics.

The course covers both theory and practice, but takes a practical approach. Participants
should be technically aware and have some basic knowledge of data and databases. No
specific programming ability is needed to benefit from this course. Since only basic
high-school mathematics is needed to understand the lectures, professionals working in
commerce and the biological sciences, as well as engineering, will benefit from this course.

Professor Ian Witten
Ian H. Witten is a professor of computer science at the University of Waikato in New
Zealand. He directs the New Zealand Digital Library research project. His research
interests include information retrieval, machine learning, text compression, and
programming by demonstration.

He received an MA in Mathematics from Cambridge University, England; an MSc in Computer
Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from
Essex University, England. He is a fellow of the ACM and of the Royal Society of New
Zealand.

He has published widely on machine learning, data mining, digital libraries, text
compression, hypertext, speech synthesis and signal processing, and computer typography.
He has authored and co-authored several books, the latest being Managing Gigabytes (1999),
Data Mining (2000) and How to Build a Digital Library (2002), all from Morgan Kaufmann.

He has many years' experience of teaching, and has given numerous tutorials at
international conferences on subjects ranging from machine learning to digital library
technology.

Data Mining: Practical Machine Learning Tools and Techniques with Java
Implementations, published by Morgan Kaufmann, Oct 1999, 416 pages.
This book complements the Weka software. It shows how to use Weka's Java algorithms to discern
meaningful patterns in your data, how to adapt them for your specialized data mining applications, and how
to develop your own machine learning schemes. It offers a thorough grounding in machine learning
concepts as well as practical advice on applying machine learning tools and techniques in real-world data
mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs,
evaluating results, and the algorithmic methods at the heart of successful data mining. If you're involved at
any level in the work of extracting usable knowledge from large collections of data, this book will be a

Weka 3.0 requires a Java 1.1 (or later) compliant Java Virtual Machine. For the stable GUI version you
need Swing, which is included in Java 2 and available separately for Java 1.1. For the development version
you need Java 1.2 (or later).

FEES AND PAYMENT:

Registration Fee    Singapore     Kuala Lumpur
Individual          S$980 each    S$880 each
Group Fee*          S$880 each    S$780 each

*For 3 or more delegates from the same organization.

Fees include daily lunch and refreshments and course reference materials.

Please make payment in Singapore dollars using a crossed cheque in favour of C T Nexus
Technology Transfer.

For all events, the fee must be sent with the registration form to:
C T Nexus Technology Transfer
Hougang Central Post Office
P O Box 107 Singapore 915304

CERTIFICATE OF COMPLETION
A certificate of completion will be awarded upon successful completion of the course.
This serves as evidence of your professional development.

Should you be unable to attend, a substitute attendee is always welcome at any time. We
regret that no refund will be made for any cancellation received less than 7 days before
the event.

In the event of unforeseen circumstances, the organizers reserve the right to substitute
another course leader, amend the program as necessary, or otherwise cancel the course.

4 EASY WAYS TO REGISTER OR ENQUIRE
By Telephone: (65) 6487 6544
By Fax: (65) 6487 6428
By Email: email@example.com
By Mail: C T Nexus Technology Transfer
Hougang Central Post Office
P O Box 107 Singapore 915304

There is no closing date but the class is kept small to foster interaction. Please register early to
ensure a place.
(Please photocopy this form to preserve the brochure and for additional registrations)
(COURSE ON DATA MINING – SEP 2002)
(1) Dr/Mr/Ms/Mrs: _____________________________________________ Designation: __________________________
(2) Dr/Mr/Ms/Mrs: _____________________________________________ Designation: __________________________
(3) Dr/Mr/Ms/Mrs: _____________________________________________ Designation: __________________________
________________________________________Country: __________________Postcode: ______________
Contact Person: Dr/Mr/Ms/Mrs: _______________________________________Designation:______________________
Tel: _____________________ Fax : ____________________________ Email: ________________________________
Please mail the completed form with your payment to: C T NEXUS TECHNOLOGY
TRANSFER, Hougang Central Post Office, P O Box 107 Singapore 915304