
Data Mining



De Montfort University Undergraduate Module Template

Section 1 Basic Module Information

1. Basic module information

  • Module Title: Data Mining
  • Module Code: MGSC3404
  • Credit Value: 15
  • Module Size [0.5, 1, 2 etc.]: 0.5
  • DMU Credit Level [0, 1, 2, 3]: 3
  • Semester [1, 2, x, both or year]: year
  • SAB: COMP
  • Faculty: Computing Sciences & Engineering
  • Module Leader (with address, phone and email): Joanne Bacon, Tel: (0116) 2078485, e-mail:
  • Module Pre-requisites [module code(s) only]: Not applicable

Section 2 Module Definition

The definition of the module characteristics, learning outcomes and assessment.

1. Module Characteristics

Data is collected and stored in all different types of organisations – commercial, governmental, educational. Every day hundreds of megabytes of data are circulated via the Internet. We have enormous difficulty finding the information we need in large amounts of data. Data mining involves extracting meaningful information and knowledge from vast quantities of data, to help us to make informed decisions. Data mining techniques enable us to explore and analyse large quantities of data in order to discover meaningful patterns and rules.

Although data mining is still largely a new, evolving field, it has already found numerous applications. In direct marketing, data mining is used for targeting people who are most likely to buy certain products and services. In trend analysis, it is used to identify trends in the marketplace by, for example, modelling the stock market. In fraud detection, data mining is used to identify insurance claims, cellular phone calls and credit card purchases that are most likely to be fraudulent. Data mining is fast becoming essential to the modern competitive business world.

This module aims to review the methods available for uncovering important information from large data sets; to discuss the techniques and when and how to use them effectively. The module uses SAS and Predict.
SAS is a comprehensive data management software package that combines data entry and manipulation capabilities with report production, graphical display and statistical analysis facilities. Predict is a sophisticated neural network based tool.

2. Learning Outcomes

Outcome no. | On successful completion of this module a student should be able to:
1 | Understand what is meant by data mining and appreciate the purpose and breadth of areas of application of data mining.
2 | Describe and analyse the organisational structure of large data sets to facilitate effective data mining.
3 | Recognise what data mining techniques are most suitable in a particular situation, to apply the appropriate data mining techniques in a practical context, to interpret the results and to produce a report.

3. Learning and Teaching Strategies

Student activity per week: 1 x 1 hour laboratory and 1 x 1 hour lecture. (Odd weeks unstaffed, even weeks staffed).

Lectures will be used to provide an overview of material and to highlight the important points. Laboratories will be used to work on the practical aspects of the module.

3.1 Key Skills

Key Skill[s] Code[s] | Opportunities to learn [y/n] | Opportunities to practice [y/n] | Assessed [y/n] | Learning Outcome[s] and ref. Number

3.2 Key Skills Notes

Subject skills: Business Analysis

Specific skills:
  • Report writing (P,A) – practised and assessed in coursework.
  • Answering exam questions (T,P,A) – taught in revision lectures, practised and assessed in the examination.

Key skills:
  • Application of number – in lectures and labs.
  • Communication – written, via the coursework, which may require a student to submit a report.
  • Problem solving – in lectures and labs.

Cognitive skills: Critical evaluation.

4. Required Prior learning

Not applicable

5. Module Syllabus

  • What is data mining; the basic methodology.
  • Review of basic exploratory data analysis.
  • Data preparation; organising data for analysis.
  • Data mining techniques, including: Principal component analysis, Cluster analysis, Decision trees, Neural networks – both supervised and unsupervised.
  • The application of data mining, e.g. banking, insurance, customer loyalty, credit cards, consumer purchasing trends, marketing.

6. Module Key Words

  • Decision trees
  • Cluster analysis
  • Principal component analysis
  • Neural network
  • SAS
  • Predict

7. Assessment

Relation to outcomes | Component Type[s] | Transcript Text [20 char] | Assessment Descriptor | Duration of assessment | Assessment % Weighting | Essential Threshold (please tick)
2, 3 | Coursework | Other Coursework | | | 30% |
1, 2 | Exam | exam | | 2 hours | 70% |

Assessment Rationale

The exam tests a student's understanding of what is meant by data mining and how to interpret output from the various data mining techniques. The other coursework will typically consist of a phase test and a coursework. The phase test enables students to demonstrate their practical skills in the laboratory. The coursework enables students to demonstrate their ability to conduct a comprehensive data analysis and to present their findings in a professional report.

Re-Assessment

Re-assessed in failed component

8. Module Learning Materials

Bibliography

  • Fernandez G. Data Mining using SAS Applications. CRC Press.
  • Adriaans P and Zantinge D. Data Mining. Addison-Wesley.

9. Resource Information

HESA code for module

i) Staff/Student Hours

Activity (eg lecture, tutorial) | Staff hours per week | Staff hours per module | Student hours per week | Student hours per module

ii) Student Numbers

  • Minimum and maximum student places (DMU)
  • Minimum and maximum student places (each partner)
  • Module designed for
  • Staff/Student ratio
  • Total Cost
  • Cost per Student

iii) Learning Resources

  • Please describe any additional learning resources required to support this module. Please indicate where these impact upon central University resources [Library, IT/AV services]

10. Quality Assurance

Checked by Ian Smith 16/12/03

Approval and Modification

  • Version control – approved version for session: module code and session first approved for this version (eg. EZRA1001-98/99-1)
  • Date approved
  • Review date
  • Modified
  • Withdrawn

Monitoring and Evaluation

Collaborative Provision

Section 3 Module Delivery Variations

This section must be completed only when the module is offered in a distinct way to different cohorts of students. For example, a module on taxation delivered in the UK and in South Africa or a statistics module used by both business and engineering students may need to be adapted to suit local need or subject context. When completing this section each field should show how the generic module information in Sections one and two is adapted.

1. Offering Definition
2. Offering Learning Strategy
3. Specific Module Content
4. Specific Assessment Issues relating to this offering

Session 2001-2002 QA Handbook