2-Day Professional development course on

       DATA MINING -
  1. 2-Day Professional development course on DATA MINING - PRACTICAL MACHINE LEARNING TOOLS AND TECHNIQUES FOR COMMERCE, ENGINEERING AND BIOLOGICAL SCIENCES

     Kuala Lumpur: 23 & 24 Sep 2002 (9.00 am to 5.00 pm), Century Novotel Hotel
     Singapore: 25 & 26 Sep 2002 (9.00 am to 5.00 pm), Orchard Hotel

     By Prof Ian Witten, Professor of Computer Science, University of Waikato, New Zealand

     This course is based on the following book co-authored by the course leader, Professor Ian Witten: Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, published by Morgan Kaufmann, Oct 1999, 416 pages. Each participant will be given a copy of this book!

     This course takes a practical approach and course participants need no knowledge of any programming language or advanced mathematics.

     A half-day practical computer-based tutorial session using the WEKA software and example data sets is part of the course. Information on WEKA is in "Introduction" on the next page.

     Presented by TEKBAC AUSTRALIA PTY LTD in association with C T NEXUS TECHNOLOGY TRANSFER (SINGAPORE)

     Comment on this book: "This is a milestone in the synthesis of data mining, data analysis, information theory, and machine learning." - Jim Gray, Microsoft Research
  2. Introduction

     An exciting and potentially far-reaching development in contemporary computer science is the invention and application of methods of machine learning (ML). These enable a computer program to automatically analyze a large body of data and decide what information is most relevant. This crystallized information can then be used to help people make decisions faster and more accurately.

     The research team at the Department of Computer Science, University of Waikato, NZ has incorporated several standard ML techniques into a software "workbench" called WEKA, for Waikato Environment for Knowledge Analysis.

     Weka is a collection of ML algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform, including Windows. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka is also well suited for developing new machine learning schemes. Weka is open source software issued under the GNU General Public License.

     Objectives

     On completion of this course, participants will
     • have a thorough knowledge of the basic techniques of machine learning
     • understand how they can be used to extract information from raw data
     • be able to use the Weka workbench to work on their own datasets.

     Course Outline

     Day 1: Lectures

     Topic 1: What is machine learning/data mining?
     • Definitions.
     • Different kinds of structural description.
     • Toy examples of machine learning.
     • Actual examples of data mining applications.
     • Ethical issues.

     Topic 2: Input and output
     • Concepts, instances, attributes.
     • Attribute types.
     • Missing values.
     • ARFF format.
     • Knowledge representations: decision trees, classification rules, association rules, numeric prediction, clustering.

     Topic 3: Basic algorithms
     • Inferring rules.
     • Statistical modeling.
     • Constructing decision trees.
     • Covering algorithms.
     • Mining association rules.
     • Linear models.
     • Instance-based learning.

     Topic 4: Evaluation
     • Training and testing.
     • Cross-validation.
     • Alternatives.
     • Loss functions.
     • Cost-sensitive learning.
     • Minimum description length principle.

     Day 2: Lectures and Tutorials

     Morning Sessions: Lectures

     Topic 5: Advanced algorithms
     • Decision trees.
     • Classification rules.
     • Support vector machines.
     • Instance-based learning.
     • Numeric prediction.
     • Clustering.

     Topic 6: Engineering the input and output
     • Attribute selection.
     • Discretization.
     • Data cleansing.
     • Combining multiple models.

     Afternoon Sessions: Tutorial Exercises using WEKA and data sets
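Topic 2 of the outline lists attribute types, missing values and the ARFF format that Weka reads. As a rough illustration only, the sketch below shows what a small ARFF file can look like and how its header and data sections fit together. It is a stand-alone Python sketch, not Weka's own Java loader. The sample data set, the constant `SAMPLE_ARFF` and the function `parse_arff` are hypothetical names introduced for this example, and the parser assumes the simplest dense form of the format (no quoted values, no sparse rows).

```python
# Hypothetical toy data set; attribute names and values are made up.
SAMPLE_ARFF = """\
% Lines starting with % are comments
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
overcast,?,yes
rainy,70,yes
"""

NUMERIC_TYPES = {"numeric", "real", "integer"}


def parse_arff(text):
    """Parse a simple dense ARFF string into (relation, attributes, rows).

    Assumption: no quoting and no sparse rows. A '?' marks a missing
    value and is returned as None; numeric values become floats.
    """
    relation = None
    attributes = []  # (name, type); type is a str or a list of nominal values
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = []
            for (_, kind), value in zip(attributes, values):
                if value == "?":
                    row.append(None)
                elif isinstance(kind, str) and kind in NUMERIC_TYPES:
                    row.append(float(value))
                else:
                    row.append(value)
            rows.append(row)
            continue
        lower = line.lower()
        if lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            if kind.startswith("{"):
                # Nominal attribute: keep the list of allowed values.
                kind = [v.strip() for v in kind.strip("{}").split(",")]
            else:
                kind = kind.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


if __name__ == "__main__":
    relation, attributes, rows = parse_arff(SAMPLE_ARFF)
    missing = sum(value is None for row in rows for value in row)
    print(relation, len(attributes), len(rows), missing)
```

The header declares each attribute once, either as a nominal set in braces or as a numeric type, and every row after `@data` lists one instance in the same attribute order.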
  3. Who Will Benefit From This Course

     Information professionals who need to learn about data mining will benefit from this course. These include information systems practitioners, programmers, consultants, developers, information technology managers, specification writers, patent examiners, as well as students and academics.

     The course covers both theory and practice, but takes a practical approach. Participants should be technically aware and have some basic knowledge of data and databases. No specific programming ability is needed to benefit from this course. Since only basic high-school mathematics is needed to understand the lectures, professionals working in commerce and the biological sciences, as well as engineering, will benefit from this course.

     Course Leader: Professor Ian Witten

     Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration.

     He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand.

     He has published widely on machine learning, data mining, digital libraries, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has authored and co-authored several books, the latest being Managing Gigabytes (1999), Data Mining (2000) and How to build a digital library (2002), all from Morgan Kaufmann.

     He has many years' experience of teaching, and has given numerous tutorials at international conferences on subjects ranging from machine learning to digital library technology.

     Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, published by Morgan Kaufmann, Oct 1999, 416 pages.
     This book complements the Weka software. It shows how to use Weka's Java algorithms to discern meaningful patterns in your data, how to adapt them for your specialized data mining applications, and how to develop your own machine learning schemes. It offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining. If you're involved at any level in the work of extracting usable knowledge from large collections of data, this book will be a valuable resource.

     WEKA Software

     Weka 3.0 requires a Java 1.1 (or later) compliant Java Virtual Machine. For the stable GUI version you need Swing, which is included in Java 2 and available separately for Java 1.1. For the development version you need Java 1.2 (or later).
  4. Administrative Details

     FEES AND PAYMENT:

     Registration Fee    Singapore     Kuala Lumpur
     Individual Fee      S$980 each    S$880 each
     Group Fee*          S$880 each    S$780 each

     *For 3 or more delegates from the same organization.

     Fees include daily lunch and refreshments and course reference materials.

     Please make payment in Singapore dollars using crossed cheque in favour of C T Nexus Technology Transfer.

     For all events, fee must be sent with the registration form to:
     C T Nexus Technology Transfer
     Hougang Central Post Office
     P O Box 107
     Singapore 915304

     CERTIFICATE OF COMPLETION

     A certificate of completion will be awarded upon successful completion of the course. This serves as evidence of your professional development.

     CANCELLATIONS

     Should you be unable to attend, a substitute attendee is always welcome at any time. We regret that no refund will be made for any cancellation received less than 7 days before the event.

     In the event of unforeseen circumstances, the organizers reserve the right to substitute another course leader, amend the program as necessary, or otherwise cancel the course.

     4 EASY WAYS TO REGISTER OR ENQUIRE
     • By Telephone: (65) 6487 6544
     • By Fax: (65) 6487 6428
     • By Email: tekbac@eisa.net.au
     • By Post: C T Nexus Technology Transfer, Hougang Central Post Office, P O Box 107, Singapore 915304

     There is no closing date but the class is kept small to foster interaction. Please register early to ensure a place.
     REGISTRATION FORM
     (Please photocopy this form to preserve the brochure and for additional registrations)

     (COURSE ON DATA MINING - SEP 2002)

     (1) Dr/Mr/Ms/Mrs ______________________________ Designation: ______________________
     (2) Dr/Mr/Ms/Mrs ______________________________ Designation: ______________________
     (3) Dr/Mr/Ms/Mrs ______________________________ Designation: ______________________

     Organisation: ______________________________________________________________
     Address: ___________________________________________________________________
     ______________________________ Country: ______________ Postcode: ______________
     Contact Person: Dr/Mr/Ms/Mrs: ____________________ Designation: ______________________
     Tel: ________________ Fax: ________________ Email: ______________________

     Please mail the completed form with your payment to:
     C T NEXUS TECHNOLOGY TRANSFER, Hougang Central Post Office, P O Box 107, Singapore 915304