CC282 Introduction to Machine Learning - Autumn 2008

Dr. R. Palaniappan
Department of Computing and Electronic Systems
University of Essex

Solution to exercise set 1

This set covers material from Lecture 1
________________________________________________________________________________________

Question 1

Make the necessary design choices for E and P for the following tasks T:


Remember that:

Task T: The problem to be solved

Performance P: How well the problem is solved

Experience E: Data presented to the ML system

Also, notice that there can be several possible answers for the given tasks.

a) Face recognition using a digital camera.

As T is to recognize faces seen through a digital camera, the algorithm must learn to
associate many images with the corresponding person's name or identification number.

P: An obvious overall performance measure is the percentage of correct face recognition.
We can also extend this to include a measure of false positives, e.g., when the algorithm
thinks it recognizes a person's face according to the training data, but in fact the face
shown belongs to nobody included in the training data set.
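The two measures above can be sketched as follows. This is only an illustration: the label `UNKNOWN`, the function name `recognition_scores` and the example labels are assumptions made here, not part of the lecture material.

```python
UNKNOWN = "nobody I know"  # hypothetical label for faces outside the training set


def recognition_scores(true_ids, predicted_ids):
    """Return (accuracy, false_positive_rate) for paired lists of labels."""
    if len(true_ids) != len(predicted_ids):
        raise ValueError("label lists must have the same length")
    if not true_ids:
        raise ValueError("need at least one test image")
    pairs = list(zip(true_ids, predicted_ids))
    # Overall P: fraction of test images given the correct label.
    correct = sum(t == p for t, p in pairs)
    # False positive: an unknown face is matched to some known person.
    unknown_total = sum(t == UNKNOWN for t in true_ids)
    false_pos = sum(t == UNKNOWN and p != UNKNOWN for t, p in pairs)
    fp_rate = false_pos / unknown_total if unknown_total else 0.0
    return correct / len(true_ids), fp_rate


# Made-up test labels, for illustration only.
true_ids = ["alice", "bob", UNKNOWN, UNKNOWN, "alice"]
predicted = ["alice", "alice", UNKNOWN, "bob", "alice"]
accuracy, fp_rate = recognition_scores(true_ids, predicted)
print(accuracy, fp_rate)
```

Reporting the two numbers separately matters for the building-entrance case: a system can have high overall accuracy and still admit too many strangers.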

E: What kind of data will be shown and how they will be presented to the algorithm is not
always trivial. To make the algorithm robust and reliable, we want to include the following
data in the training set:

    1.   As many face images as possible (as inputs) along with the corresponding face
         IDs. This may include images obtained with different camera angles and lighting
         variations, people with and without glasses, etc.

    2.   Many face images that will be associated with a `nobody I know' output or
         similar. This is very important if, for example, the algorithm will be used to allow
         entrance to a building. We do not want the algorithm to wrongly think an
         unknown face belongs to someone who might be allowed into the building.

    3.   Images of objects that resemble real faces but that are not faces at all. These
         may include random objects, dummy faces, cartoon drawings, etc. These too
         should be associated with a `nobody I know' output.

b) Deciding whether to buy or to sell IBM shares in the stock market.

P: A possible performance measure would quantify how often the algorithm made the
`right decision'. The `right decision' usually means: i) selling right before share prices
begin to drop OR ii) buying right before share prices increase. Buying when prices are
stable may also indicate good performance in some cases (e.g., when prices are not
predicted to drop too soon AND there is a money surplus to buy more shares). However,
selling when prices are stable usually indicates poor performance UNLESS the investor is
in urgent need of money. Another alternative is to have P as a measure of how much
money has been lost or won in previous transactions.
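Both candidate measures for P can be sketched in a few lines. The simplifications here are assumptions, not part of the original solution: each decision concerns a single share, it is judged only against the next day's price, and `wait' decisions are not scored. The function name and the prices are made up.

```python
def profit_and_hit_rate(prices, decisions):
    """prices[d] is the share price on day d; decisions[d] is 'buy', 'sell'
    or 'wait', made on day d and judged against the price on day d + 1."""
    profit, hits, judged = 0.0, 0, 0
    for day, decision in enumerate(decisions):
        if decision == "wait" or day + 1 >= len(prices):
            continue  # nothing to judge: no trade, or no next-day price
        if decision not in ("buy", "sell"):
            raise ValueError(f"unknown decision: {decision!r}")
        change = prices[day + 1] - prices[day]
        judged += 1
        if decision == "buy":
            profit += change   # a share bought today gains `change` by tomorrow
            hits += change > 0
        else:
            profit -= change   # a share sold today avoids (or misses) `change`
            hits += change < 0
    hit_rate = hits / judged if judged else 0.0
    return profit, hit_rate


# Made-up prices and decisions, for illustration only.
prices = [100.0, 102.0, 101.0, 105.0]
decisions = ["buy", "sell", "wait"]
profit, hit_rate = profit_and_hit_rate(prices, decisions)
print(profit, hit_rate)
```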

E: Choosing the right training data for this case is a critical problem and a major topic
of research for people in the field. However, one possibility is to include the following in
the training set:

    1.   As much data as possible as inputs, including IBM's past share prices, IBM
         productivity parameters, indicators of general financial activity in the IT sector,
         plus any other variables one may suspect of playing a role in how well IBM
         shares will fare in the future, e.g., political knowledge, foreign currency
         exchange rates, Microsoft share prices, Intel share prices, petrol prices, etc.
         Data concerning the above should be gathered for as long a period as possible
         (e.g., years) and presented to the algorithm as inputs. The desired outputs
         during training would be the correct decisions the algorithm should make based
         on a given input vector.

    2.   Because this situation depends on many dynamic variables (i.e., variables that
         change with time AND whose present value is determined by past values), the
         training set should include dynamic data, e.g., an input vector may include both
         present and past data for a particular point in time. This can be done even if the
         learning algorithm itself will include some form of dynamics in it (e.g., by using
         learning with recurrence, to be seen later in the course).

    3.   Training data should include vectors whose corresponding outputs will lead to
         one of the following decisions: i) `sell', ii) `buy', iii) `wait longer before deciding',
         and possibly iv) `do not know what to do; get a human analyst to help'.
         Interestingly, option iv) may be useful if we want the algorithm to make a
         decision for us only when it has a high level of confidence in its decision. If the
         level of confidence the algorithm has in its decision is low, we would usually
         prefer to be consulted rather than let the algorithm give the wrong investment
         advice!
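Option iv) amounts to a confidence threshold on the algorithm's output. A minimal sketch, assuming the learning system produces a confidence value between 0 and 1 for each action; the function name `decide`, the threshold of 0.6 and the example scores are hypothetical.

```python
def decide(scores, min_confidence=0.6):
    """scores maps each action to a confidence in [0, 1] (assumed model output).
    Return the most confident action, or defer to a human if it is not
    confident enough."""
    action, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence < min_confidence:
        return "ask a human analyst"
    return action


print(decide({"sell": 0.80, "buy": 0.10, "wait": 0.10}))
print(decide({"sell": 0.40, "buy": 0.35, "wait": 0.25}))
```

Raising `min_confidence` makes the system defer more often, trading fewer automatic decisions for fewer wrong ones.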

c) Predicting the value of IBM stock shares in the future.

P: As opposed to the case in b) above, this time we do not want to make a decision about
anything. We merely want to be able to predict future IBM share prices given a number of
present and past variables. Although this `Predictor' could be used as part of task b) above
to help make a decision, it is important to see that b) and c) are two entirely different tasks:
b) deals with pattern recognition and decision making, c) deals with modeling/regression.
An obvious performance parameter would be the prediction error for a known input-output set.
A more detailed P would also allow us to see how the error changes as the period between
the training `present' and `future' increases. For example, we may want to know whether
using training data for October 2007 allows us to make good predictions for November 2007 AND
for December 2007 as well.
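The more detailed P can be sketched by grouping prediction errors by how far ahead each prediction was made. Mean absolute error is used here as one possible choice of error measure; the function names and all numbers are made up for illustration.

```python
def mean_absolute_error(pairs):
    """pairs is a non-empty list of (predicted, actual) share prices."""
    return sum(abs(predicted - actual) for predicted, actual in pairs) / len(pairs)


def error_by_horizon(results):
    """results maps a horizon (e.g. months ahead) to (predicted, actual) pairs."""
    return {h: mean_absolute_error(pairs) for h, pairs in sorted(results.items())}


# Made-up predictions: horizon 1 = one month ahead, horizon 2 = two months ahead.
results = {
    1: [(101.0, 100.0), (99.0, 100.0)],
    2: [(104.0, 100.0), (98.0, 100.0)],
}
errors = error_by_horizon(results)
print(errors)
```

If the error grows quickly with the horizon, the model may be usable for next-month predictions only.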

E: Training data in this case would include:

    1.   Past data including all the factors that are suspected to affect the price of IBM
         shares (the inputs to the algorithm) along with known share prices.

    2.   The structure of each input-output pair should allow us to make predictions based on
         various time distances between input and output, or, if we so desire, the model
         should only be required to predict share prices for the day following that of each
         input vector (i.e., we could use today's data to predict tomorrow's share prices,
         but not the prices for the day after tomorrow or later). Which option we choose
         will depend on how complete we want our model to be.
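One way to build such input-output pairs from a single price series is a sliding window, sketched below. The function name `make_pairs` and the parameters `window` (how many past values form an input vector) and `horizon` (the time distance between input and output) are names chosen here for illustration. A real input vector would also carry the other factors listed above.

```python
def make_pairs(series, window, horizon=1):
    """Return (inputs, targets). Each input holds `window` consecutive values;
    its target is the value `horizon` steps after the last input value."""
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be at least 1")
    inputs, targets = [], []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        end = start + window
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


prices = [10, 11, 12, 13, 14]  # made-up daily prices
next_day = make_pairs(prices, window=2, horizon=1)
two_days = make_pairs(prices, window=2, horizon=2)
print(next_day)
print(two_days)
```

With `horizon=1` the model is only asked for tomorrow's price; building separate pair sets for larger horizons gives the more complete model at the cost of fewer training pairs per horizon.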

Question 2

Describe the full process (i.e., the requirements) of designing a learning system.

First, choices need to be made for T, P and E as shown in Question 1 above. Then, the
following must be determined:

The target function to learn, i.e., what kind of input variables the ML system will use and
exactly what kind of output variables it will give.

A suitable representation for the target function, i.e., how the ML system will yield an
output given a specific input vector.

A mechanism for learning this function, i.e., a way to make sure the above target
function will actually lead to learning (as determined by P) as experience is given to
the ML system.

The type of learning experience, i.e., whether the ML system will learn on its own or will be
provided with a set of good moves, etc.

Also, refer to slides 20-27 from Lecture 1.
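As a rough illustration of how these design choices fit together, the sketch below picks a linear function as the representation and the LMS (least mean squares) weight-update rule as the learning mechanism, with mean squared error as P. These particular choices, the function names and the toy data are assumptions made here, not necessarily those used in the lecture.

```python
def predict(weights, features):
    """Representation: a bias weights[0] plus a weighted sum of the inputs."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_train(examples, n_features, rate=0.05, epochs=200):
    """Learning mechanism: the LMS rule moves each weight a small step in the
    direction that reduces the error on the current training example."""
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        for features, target in examples:
            error = target - predict(weights, features)
            weights[0] += rate * error
            for i, x in enumerate(features):
                weights[i + 1] += rate * error * x
    return weights


def mean_squared_error(weights, examples):
    """Performance P: average squared prediction error over the examples."""
    return sum((t - predict(weights, f)) ** 2 for f, t in examples) / len(examples)


# Experience E: made-up examples of the target function y = 2x + 1.
examples = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0), ([3.0], 7.0)]
before = mean_squared_error([0.0, 0.0], examples)
weights = lms_train(examples, n_features=1)
after = mean_squared_error(weights, examples)
print(before, after, weights)
```

The error after training is far lower than before, which is the sense in which P improves as E is presented to the system.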

Question 3

List four reasons why one may need to use machine learning for a particular task.

As given in the lecture, we would consider using ML when:

     •   Some problems are hard and complex to solve, e.g. pattern recognition problems (like classifying
         handwritten characters)

     •   To mine information (hidden data) in large data sets (e.g. analyse customer shopping patterns in
         supermarkets)

     •   ML systems can be faster or more accurate (sometimes)

     •   Ability to mimic human learning and take over certain monotonous tasks that still require
         some intelligence (e.g. driving on highways for 24 hours)

Also, refer to slide 9 in Lecture 1.

Question 4

a)   Given the abilities of machine learning (ML) algorithms, it is always wise to try ML
     before trying any other alternative.

         Answer: False

         Reason: Machine learning can be computationally costly in some cases, and it can be
         unreliable as well. We should thus never assume ML should be used before we explore
         simpler approaches.

b)   A good ML algorithm will have the same performance whether or not training data are
     pre-processed and/or pre-selected before being presented to the learning system.

         Answer: False

         Reason: As briefly discussed in the lecture, the quality of the training data, with regard to
         noise and to how closely it represents the situations the ML algorithm will face after
         learning has stopped, is a critical factor in the design of a learning task. Bad or incomplete
         training data will lead to poor ML performance regardless of how good the algorithm is.

c)   Learning implies not only being able to solve a given problem, but, more specifically,
     solving the problem better and better as experience is gained.

         Answer: True

         See the definitions in the lecture slides. In particular, Lecture 1 – slide 17:

         "A computer program is said to learn from experience E with respect to some class
         of tasks T and performance measure P, if its performance at tasks in T, as measured
         by P, improves with experience E." Mitchell (1997)

         i.e., Learning = task performance improves with experience.

Question 5

Name the four main approaches in machine learning and briefly explain each one of them.

         1.   Supervised: given an input, the desired output is known during training, e.g., when
              a target function is to be learned.

         2.   Unsupervised: no desired output; let the algorithm find new representations and
              features from the input data.

         3.   Reinforcement: give the algorithm punishment or reward according to a system's
              behaviour. Note: this can be considered a form of supervised learning.

         4.   Rule-learning: find logical structures in the data (could be supervised or
              unsupervised, and even reinforced).

Also, refer to slide 24 in Lecture 1.

Question 6

For the credit scoring classification problem as in slide 11 (Lecture 1), the LOW risk rule is
            – IF income > θ1 AND savings > θ2 THEN low risk
Obtain similar rules for classifying HIGH risk.

[Figure: scatter plot of savings (vertical axis) against income (horizontal axis), with
threshold θ1 on income and θ2 on savings. The `+' points (low risk) lie in the region
income > θ1 and savings > θ2; the `-' points (high risk) lie outside it.]

              –    IF income < θ1 AND savings < θ2 THEN high risk
              –    IF income > θ1 AND savings < θ2 THEN high risk
              –    IF income < θ1 AND savings > θ2 THEN high risk


NOTE: The solution

              –    IF income < θ1 THEN high risk
              –    IF savings < θ2 THEN high risk

is not a very good solution. Although it achieves the correct classification, the two rules
overlap for some points: a point with both income < θ1 and savings < θ2 satisfies both rules.
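The overlap mentioned in the NOTE can be checked by counting how many high-risk rules fire for a given point. This is a small sketch: the function names and the threshold values `THETA1` and `THETA2` are made up, since the original gives no numerical values.

```python
THETA1 = 30000  # hypothetical income threshold
THETA2 = 5000   # hypothetical savings threshold


def three_rules_fired(income, savings):
    """Number of rules firing in the three-rule (non-overlapping) solution."""
    return sum([
        income < THETA1 and savings < THETA2,
        income > THETA1 and savings < THETA2,
        income < THETA1 and savings > THETA2,
    ])


def two_rules_fired(income, savings):
    """Number of rules firing in the two-rule solution from the NOTE."""
    return sum([income < THETA1, savings < THETA2])


for income, savings in [(20000, 1000), (40000, 1000), (40000, 8000)]:
    print(income, savings,
          three_rules_fired(income, savings),
          two_rules_fired(income, savings))
```

For the point with low income and low savings, exactly one of the three rules fires, whereas both of the two shorter rules fire. Both rule sets still classify the point as high risk.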

More Related Content

What's hot

Stock market analysis using supervised machine learning
Stock market analysis using supervised machine learningStock market analysis using supervised machine learning
Stock market analysis using supervised machine learningPriyanshu Gandhi
 
Models of Operations Research is addressed
Models of Operations Research is addressedModels of Operations Research is addressed
Models of Operations Research is addressedSundar B N
 
Machine Learning Basics
Machine Learning BasicsMachine Learning Basics
Machine Learning BasicsSuresh Arora
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom Hong Bui Van
 
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...ijmvsc
 
Financial forecastings using neural networks ppt
Financial forecastings using neural networks pptFinancial forecastings using neural networks ppt
Financial forecastings using neural networks pptPuneet Gupta
 
Modeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesModeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesFellowBuddy.com
 
Software for Stock Market Prediction
Software for Stock Market PredictionSoftware for Stock Market Prediction
Software for Stock Market PredictionSSA KPI
 
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...IRJET Journal
 
How ml can improve purchase conversions
How ml can improve purchase conversionsHow ml can improve purchase conversions
How ml can improve purchase conversionsSudeep Shukla
 

What's hot (16)

Lecture 1.pptx
Lecture 1.pptxLecture 1.pptx
Lecture 1.pptx
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Stock market analysis using supervised machine learning
Stock market analysis using supervised machine learningStock market analysis using supervised machine learning
Stock market analysis using supervised machine learning
 
Models of Operations Research is addressed
Models of Operations Research is addressedModels of Operations Research is addressed
Models of Operations Research is addressed
 
Machine Learning Basics
Machine Learning BasicsMachine Learning Basics
Machine Learning Basics
 
Stock market with nn
Stock market with nnStock market with nn
Stock market with nn
 
Machine learning
Machine learningMachine learning
Machine learning
 
Fuzzy Presentation
Fuzzy PresentationFuzzy Presentation
Fuzzy Presentation
 
churn prediction in telecom
churn prediction in telecom churn prediction in telecom
churn prediction in telecom
 
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...
A LINEAR REGRESSION APPROACH TO PREDICTION OF STOCK MARKET TRADING VOLUME: A ...
 
En36855867
En36855867En36855867
En36855867
 
Financial forecastings using neural networks ppt
Financial forecastings using neural networks pptFinancial forecastings using neural networks ppt
Financial forecastings using neural networks ppt
 
Modeling & Simulation Lecture Notes
Modeling & Simulation Lecture NotesModeling & Simulation Lecture Notes
Modeling & Simulation Lecture Notes
 
Software for Stock Market Prediction
Software for Stock Market PredictionSoftware for Stock Market Prediction
Software for Stock Market Prediction
 
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...
IRJET- Stock Price Prediction using combination of LSTM Neural Networks, ARIM...
 
How ml can improve purchase conversions
How ml can improve purchase conversionsHow ml can improve purchase conversions
How ml can improve purchase conversions
 

Viewers also liked

Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Workbutest
 
What s an Event ? How Ontologies and Linguistic Semantics ...
What s an Event ? How Ontologies and Linguistic Semantics ...What s an Event ? How Ontologies and Linguistic Semantics ...
What s an Event ? How Ontologies and Linguistic Semantics ...butest
 
RFP document template
RFP document templateRFP document template
RFP document templatebutest
 
T2L3.doc
T2L3.docT2L3.doc
T2L3.docbutest
 
Nabila__proposal4.doc
Nabila__proposal4.docNabila__proposal4.doc
Nabila__proposal4.docbutest
 
Background Report (DOC)
Background Report (DOC)Background Report (DOC)
Background Report (DOC)butest
 
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 104.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10vezzoliDSS
 
ICDMWorkshopProposal.doc
ICDMWorkshopProposal.docICDMWorkshopProposal.doc
ICDMWorkshopProposal.docbutest
 
Serge P Nekoval Grails
Serge P  Nekoval  GrailsSerge P  Nekoval  Grails
Serge P Nekoval Grailsguest092df8
 
Applying Support Vector Learning to Stem Cells Classification
Applying Support Vector Learning to Stem Cells ClassificationApplying Support Vector Learning to Stem Cells Classification
Applying Support Vector Learning to Stem Cells Classificationbutest
 
MS word document.doc
MS word document.docMS word document.doc
MS word document.docbutest
 
Catégorisation automatisée de contenus documentaires : la ...
Catégorisation automatisée de contenus documentaires : la ...Catégorisation automatisée de contenus documentaires : la ...
Catégorisation automatisée de contenus documentaires : la ...butest
 
Machine Learning (CS 567)
Machine Learning (CS 567)Machine Learning (CS 567)
Machine Learning (CS 567)butest
 
Curriculum Vitae
Curriculum VitaeCurriculum Vitae
Curriculum Vitaebutest
 
User Perceptions of Machine Learning
User Perceptions of Machine LearningUser Perceptions of Machine Learning
User Perceptions of Machine Learningbutest
 

Viewers also liked (20)

INFLUENZA INVERNALE E IL COLOSTRO
INFLUENZA INVERNALE E IL COLOSTROINFLUENZA INVERNALE E IL COLOSTRO
INFLUENZA INVERNALE E IL COLOSTRO
 
Representative Previous Work
Representative Previous WorkRepresentative Previous Work
Representative Previous Work
 
What s an Event ? How Ontologies and Linguistic Semantics ...
What s an Event ? How Ontologies and Linguistic Semantics ...What s an Event ? How Ontologies and Linguistic Semantics ...
What s an Event ? How Ontologies and Linguistic Semantics ...
 
RFP document template
RFP document templateRFP document template
RFP document template
 
VersalFinans - all products
VersalFinans - all productsVersalFinans - all products
VersalFinans - all products
 
T2L3.doc
T2L3.docT2L3.doc
T2L3.doc
 
Nabila__proposal4.doc
Nabila__proposal4.docNabila__proposal4.doc
Nabila__proposal4.doc
 
ppt
pptppt
ppt
 
4.doc
4.doc4.doc
4.doc
 
Background Report (DOC)
Background Report (DOC)Background Report (DOC)
Background Report (DOC)
 
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 104.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10
4.1 Verso L'equitá E La Coesione Sociale Vezzoli 09 10
 
ICDMWorkshopProposal.doc
ICDMWorkshopProposal.docICDMWorkshopProposal.doc
ICDMWorkshopProposal.doc
 
Serge P Nekoval Grails
Serge P  Nekoval  GrailsSerge P  Nekoval  Grails
Serge P Nekoval Grails
 
Applying Support Vector Learning to Stem Cells Classification
Applying Support Vector Learning to Stem Cells ClassificationApplying Support Vector Learning to Stem Cells Classification
Applying Support Vector Learning to Stem Cells Classification
 
PPT
PPTPPT
PPT
 
MS word document.doc
MS word document.docMS word document.doc
MS word document.doc
 
Catégorisation automatisée de contenus documentaires : la ...
Catégorisation automatisée de contenus documentaires : la ...Catégorisation automatisée de contenus documentaires : la ...
Catégorisation automatisée de contenus documentaires : la ...
 
Machine Learning (CS 567)
Machine Learning (CS 567)Machine Learning (CS 567)
Machine Learning (CS 567)
 
Curriculum Vitae
Curriculum VitaeCurriculum Vitae
Curriculum Vitae
 
User Perceptions of Machine Learning
User Perceptions of Machine LearningUser Perceptions of Machine Learning
User Perceptions of Machine Learning
 

Similar to Predicting IBM stock prices using machine learning

Rachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsArpana Awasthi
 
Machine Learning: The First Salvo of the AI Business Revolution
Machine Learning: The First Salvo of the AI Business RevolutionMachine Learning: The First Salvo of the AI Business Revolution
Machine Learning: The First Salvo of the AI Business RevolutionCognizant
 
Chapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptxChapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptxssuser957b41
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahulKirtoniya
 
Machine learning Chapter 1
Machine learning Chapter 1Machine learning Chapter 1
Machine learning Chapter 1JagadishPogu
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptxchadhar227
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxiaeronlineexm
 
Machine learning introduction
Machine learning introductionMachine learning introduction
Machine learning introductionAnas Jamil
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptxNaveenkushwaha18
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.pptARVIND SARDAR
 
Machine Learning vs Decision Optimization comparison
Machine Learning vs Decision Optimization comparisonMachine Learning vs Decision Optimization comparison
Machine Learning vs Decision Optimization comparisonAlain Chabrier
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Machine learning with Big Data power point presentation
Machine learning with Big Data power point presentationMachine learning with Big Data power point presentation
Machine learning with Big Data power point presentationDavid Raj Kanthi
 
Types of Machine Learning- Tanvir Siddike Moin
Types of Machine Learning- Tanvir Siddike MoinTypes of Machine Learning- Tanvir Siddike Moin
Types of Machine Learning- Tanvir Siddike MoinTanvir Moin
 
Explainable AI
Explainable AIExplainable AI
Explainable AIDinesh V
 
machine learning basic-1.pptx
machine learning basic-1.pptxmachine learning basic-1.pptx
machine learning basic-1.pptxDrLola1
 
Selecting the Right Type of Algorithm for Various Applications - Phdassistance
Selecting the Right Type of Algorithm for Various Applications - PhdassistanceSelecting the Right Type of Algorithm for Various Applications - Phdassistance
Selecting the Right Type of Algorithm for Various Applications - PhdassistancePhD Assistance
 

Similar to Predicting IBM stock prices using machine learning (20)

ML_Lecture_1.ppt
ML_Lecture_1.pptML_Lecture_1.ppt
ML_Lecture_1.ppt
 
Rachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_reportRachit Mishra_stock prediction_report
Rachit Mishra_stock prediction_report
 
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its ApplicationsMachine Learning: Need of Machine Learning, Its Challenges and its Applications
Machine Learning: Need of Machine Learning, Its Challenges and its Applications
 
Machine Learning: The First Salvo of the AI Business Revolution
Machine Learning: The First Salvo of the AI Business RevolutionMachine Learning: The First Salvo of the AI Business Revolution
Machine Learning: The First Salvo of the AI Business Revolution
 
Chapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptxChapter 05 Machine Learning.pptx
Chapter 05 Machine Learning.pptx
 
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptxRahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
Rahul_Kirtoniya_11800121032_CSE_Machine_Learning.pptx
 
Machine learning Chapter 1
Machine learning Chapter 1Machine learning Chapter 1
Machine learning Chapter 1
 
Machine Learning.pptx
Machine Learning.pptxMachine Learning.pptx
Machine Learning.pptx
 
Machine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptxMachine Learning with Python- Methods for Machine Learning.pptx
Machine Learning with Python- Methods for Machine Learning.pptx
 
Machine Learning by Rj
Machine Learning by RjMachine Learning by Rj
Machine Learning by Rj
 
Machine learning introduction
Machine learning introductionMachine learning introduction
Machine learning introduction
 
Machine Learning Contents.pptx
Machine Learning Contents.pptxMachine Learning Contents.pptx
Machine Learning Contents.pptx
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
 
Machine Learning vs Decision Optimization comparison
Machine Learning vs Decision Optimization comparisonMachine Learning vs Decision Optimization comparison
Machine Learning vs Decision Optimization comparison
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Machine learning with Big Data power point presentation
Machine learning with Big Data power point presentationMachine learning with Big Data power point presentation
Machine learning with Big Data power point presentation
 
Types of Machine Learning- Tanvir Siddike Moin
Types of Machine Learning- Tanvir Siddike MoinTypes of Machine Learning- Tanvir Siddike Moin
Types of Machine Learning- Tanvir Siddike Moin
 
Explainable AI
Explainable AIExplainable AI
Explainable AI
 
machine learning basic-1.pptx
machine learning basic-1.pptxmachine learning basic-1.pptx
machine learning basic-1.pptx
 
Selecting the Right Type of Algorithm for Various Applications - Phdassistance
Selecting the Right Type of Algorithm for Various Applications - PhdassistanceSelecting the Right Type of Algorithm for Various Applications - Phdassistance
Selecting the Right Type of Algorithm for Various Applications - Phdassistance
 

More from butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Predicting IBM stock prices using machine learning

Question 1

Make the necessary design choices for E and P for the following tasks T:

Remember that:

Task T: The problem to be solved

Performance P: How well the problem is solved

Experience E: Data presented to the ML system

Also, notice that there can be several possible answers for the given tasks.

a) Face recognition using a digital camera.

As T is to recognize faces seen through a digital camera, the algorithm must learn to
associate many images with the corresponding person's name or identification number.

P: An obvious overall performance measure is the percentage of correct face recognition.
We can also extend this to include a measure of false positives, e.g., when the algorithm
thinks it recognizes a person's face according to the training data, but in fact the face
shown belongs to nobody included in the training data set.

E: What kind of data will be shown and how they will be presented to the algorithm is not
always trivial. To make the algorithm robust and reliable, we want to include the following
data in the training set:

    1.   As many face images as possible (as inputs) along with the corresponding face
         IDs. This may include images obtained with different camera angles and lighting
         variations, people with and without glasses, etc.

    2.   Many face images that will be associated with a `nobody I know' output or
         similar. This is very important if, for example, the algorithm will be used to allow
         entrance to a building. We do not want the algorithm to wrongly think an
         unknown face belongs to someone who might be allowed into the building.

    3.   Images of objects that resemble real faces but that are not faces at all.
         These may include random objects, dummy faces, cartoon drawings, etc. These too
         should be associated with a `nobody I know' output.

b) Deciding whether to buy or to sell IBM shares in the stock market.

P: A possible performance measure would quantify how often the algorithm made the `right
decision'. The `right decision' usually means:

    i)   selling right before share prices begin to drop, OR
    ii)  buying right before share prices increase.

Buying when prices are stable may also indicate good performance in some cases (e.g., when
prices are not predicted to drop too soon AND there is a money surplus to buy more shares).
However, selling when prices are stable usually indicates poor performance UNLESS the
investor is
in urgent need of money. Another alternative is to have P as a measure of how much money
has been lost or won in previous transactions.

E: Choosing the right training data for this case is a critical problem and is a major
source of research for people in the field. However, one possibility is to include the
following in the training set:

    1.   As much data as possible as inputs, including IBM's past share prices, an IBM
         productivity parameter, an indicator of general financial activity in the IT
         sector, plus any other variables one may suspect of playing a role in how well IBM
         shares will fare in the future, e.g., political knowledge, foreign currency
         exchange rates, Microsoft share prices, Intel share prices, petrol prices, etc.
         Data concerning the above should be gathered for as long a period as possible
         (e.g., years) and presented to the algorithm as inputs. The desired outputs during
         training would be the correct decisions the algorithm should make based on a given
         input vector.

    2.   Because this situation depends on many dynamic variables (i.e., variables that
         change with time AND whose present value is determined by past values), the
         training set should include dynamic data, e.g., an input vector may include both
         present and past data for a particular point in time. This can be done even if the
         learning algorithm itself will include some form of dynamics in it (e.g., by using
         learning with recurrence, to be seen later in the course).

    3.   Training data should include vectors whose corresponding outputs will lead to one
         of the following decisions: i) `sell', ii) `buy', iii) `wait longer before
         deciding', and possibly iv) `do not know what to do; get a human analyst to help'.
         Interestingly, option iv) may be useful if we want the algorithm to make a decision
         for us only when it has a high level of confidence in its decision.
         If the level of confidence the algorithm has in its decision is low, we would
         usually prefer to be consulted rather than let the algorithm give the wrong
         investment advice!

c) Predicting the value of IBM stock shares in the future.

P: As opposed to the case in b) above, this time we do not want to make a decision about
anything. We merely want to be able to predict future IBM share prices given a number of
present and past variables. Although this `Predictor' could be used as part of task b)
above to help make a decision, it is important to see that b) and c) are two entirely
different tasks: b) deals with pattern recognition and decision making, whereas c) deals
with modeling/regression. An obvious performance parameter would be the prediction error
for a known input-output set. A more detailed P would also allow us to see how the error
changes as the period between the training `present' and `future' increases. For example,
we may want to know whether using training data for October 2007 allows us to make good
predictions for November 2007 AND for December 2007 as well.

E: Training data in this case would include:

    1.   Past data including all the factors that are suspected to affect the price of IBM
         shares (the inputs to the algorithm) along with known share prices.

    2.   The structure of each input-output pair should allow us to make predictions based
         on various time distances between input and output, or, if we so desire, the model
         should only be required to predict share prices for the day following that of each
         input vector (i.e., we could use today's data to predict tomorrow's share prices,
         but not the prices for the day after tomorrow or later). Which option we choose
         will depend on how complete we want our model to be.

Question 2

Describe the full process of designing (i.e. requirements) of a learning system.
First, choices need to be made for T, P and E as shown in Question 1 above. Then, the
following must be determined:

    •    The target function to learn, i.e., what kind of input variables the ML system
         will use and exactly what kind of output variables it will give.

    •    A suitable representation for the target function, i.e., how the ML system will
         yield an output given a specific input vector.

    •    A mechanism for learning this function, i.e., a way to make sure the above target
         function will actually lead to learning (as determined by P) as experience is
         given to the ML system.

    •    The type of learning experience, i.e., whether the ML system will learn on its own
         or be provided with a set of good moves, etc.

Also, refer to slides 20-27 from Lecture 1.

Question 3

List four reasons why one may need to use machine learning for a particular task.

As given in the lecture, we would consider using ML when:

    •    Some problems are hard and complex to solve, e.g. pattern recognition problems
         (like classifying handwritten characters)

    •    To mine information (hidden data) in large data sets (e.g. analyse customer
         shopping patterns in supermarkets)

    •    ML systems can be faster or more accurate (sometimes)

    •    Ability to mimic human learning and replace certain monotonous tasks that still
         require some intelligence (e.g. driving on highways for 24 hours)

Also, refer to slide 9 in Lecture 1.

Question 4

a) Given the abilities of machine learning (ML) algorithms, it is always wise to try ML
before trying any other alternative.

Answer: False

Reason: Machine learning can be computationally costly in some cases, and it can be
unreliable as well. We should thus never assume ML should be used before we explore
simpler approaches.

b) A good ML algorithm will have the same performance whether or not training data are
pre-processed and/or pre-selected before being presented to the learning system.

Answer: False

Reason: As briefly discussed in the lecture, the quality of the training data, with regard
both to noise and to how closely the data represent the situations the ML algorithm will
face after learning has stopped, is a critical factor in the design of a learning task. Bad
or incomplete training data will lead to poor ML performance regardless of how good the
algorithm is.

c) Learning implies not only being able to solve a given problem, but, more specifically,
solving the problem better and better as experience is gained.

Answer: True

See the definitions in the lecture slides. In particular, Lecture 1, slide 17:

"A computer program is said to learn from experience E with respect to some class of tasks
T and performance measure P, if its performance at tasks in T, as measured by P, improves
with experience E." Mitchell (1997)

i.e., Learning = task performance improves with experience.

Question 5

Name the four main approaches in machine learning, briefly explain each one of them.

    1.   Supervised: given an input, the desired output is known during training, e.g.,
         when a target function is to be learned.

    2.   Unsupervised: no desired output; let the algorithm find new representations and
         features from the input data.

    3.   Reinforcement: give the algorithm punishment or reward according to a system's
         behaviour. Note: this can be considered a form of supervised learning.

    4.   Rule-learning: find logical structures in the data (could be supervised or
         unsupervised, and even reinforced).

Also, refer to slide 24 in Lecture 1.

Question 6

For the credit scoring classification problem as in slide 11 (Lecture 1), the LOW risk
rule is

    – IF income > Ѳ1 AND savings > Ѳ2 THEN low risk

Obtain similar rules for classifying HIGH risk.

[Figure: plot of savings against income, with thresholds Ѳ1 (income) and Ѳ2 (savings)
separating the low-risk points (+) from the high-risk points (-).]

    – IF income < Ѳ1 AND savings < Ѳ2 THEN high risk
    – IF income > Ѳ1 AND savings < Ѳ2 THEN high risk
    – IF income < Ѳ1 AND savings > Ѳ2 THEN high risk

NOTE: The solution

    – IF income < Ѳ1 THEN high risk
    – IF savings < Ѳ2 THEN high risk

is not a very good solution. Although it achieves the correct classification, the two rules
overlap for some points: an applicant with both income < Ѳ1 and savings < Ѳ2 is matched by
both rules.
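
The difference between the two rule sets in Question 6 can be checked with a small sketch.
This is only an illustration, not part of the lecture material: the threshold values, the
applicant data and the function names are all hypothetical, chosen to place one applicant
in each quadrant of the savings/income plot.

```python
def low_risk(income, savings, theta1, theta2):
    """LOW risk rule from the exercise: both thresholds are exceeded."""
    return income > theta1 and savings > theta2


def high_risk_rules_fired(income, savings, theta1, theta2):
    """Names of the three non-overlapping HIGH risk rules that match."""
    rules = {
        "income < theta1 AND savings < theta2": income < theta1 and savings < theta2,
        "income > theta1 AND savings < theta2": income > theta1 and savings < theta2,
        "income < theta1 AND savings > theta2": income < theta1 and savings > theta2,
    }
    return [name for name, fired in rules.items() if fired]


def overlapping_rules_fired(income, savings, theta1, theta2):
    """Names of the two simpler, overlapping HIGH risk rules that match."""
    rules = {
        "income < theta1": income < theta1,
        "savings < theta2": savings < theta2,
    }
    return [name for name, fired in rules.items() if fired]


# Hypothetical thresholds and applicants, for illustration only.
THETA1, THETA2 = 30_000, 5_000
applicants = {
    "A": (40_000, 8_000),  # income above theta1, savings above theta2
    "B": (20_000, 2_000),  # income below theta1, savings below theta2
    "C": (40_000, 2_000),  # income above theta1, savings below theta2
    "D": (20_000, 8_000),  # income below theta1, savings above theta2
}

for name, (income, savings) in applicants.items():
    label = "low risk" if low_risk(income, savings, THETA1, THETA2) else "high risk"
    n_disjoint = len(high_risk_rules_fired(income, savings, THETA1, THETA2))
    n_overlap = len(overlapping_rules_fired(income, savings, THETA1, THETA2))
    print(name, label, n_disjoint, n_overlap)
```

With these made-up values, every high-risk applicant matches exactly one of the three
rules, whereas applicant B matches both of the simpler rules. Note also that, because all
rules use strict inequalities, a point lying exactly on a threshold matches none of the
three-rule set; how such boundary cases should be treated is not specified in the exercise.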