Using Linear & Logistic Regression along with Collaborative
         Filtering Technique for Effective Training Program Deployment
                                      Part I
                                                              Deepak Manjarekar

ABSTRACT

“If a man does not keep pace with his companions, perhaps it is because he
hears a different drummer. Let him step to the music which he hears, however
measured or far away.”
                                   - Henry David Thoreau

All over the world, organizations are spending enormous amounts of resources
to train their employees. The concept of “Learning Organizations” is
beginning to emerge as a competitive necessity [1]. The surge in internal
training programs is driven largely by the hope that employees will be able
to cope with fast-changing technologies and be productive in their jobs right
from day one. The proliferation of internal training programs is also due to
the fact that external training programs are usually very expensive, are not
in the close vicinity of the organization and may have schedules that do not
fit the needs of the organization. So in the current era, where companies
like to outsource everything that is not their core business, we see a
reverse trend of in-sourcing the training programs for their employees.
Learning organizations within the companies are thus cost centers whose
primary responsibility is to deliver effective, custom-made training programs
that prepare the trainees in the latest technologies or processes. Yet
despite all the customization, organizations are still grappling with the
problem of little or no ROI on their training programs. What is happening,
then? Maybe the delivery of the training was not right? Perhaps the selection
process of the trainees is flawed? I tend to think that the ineffectiveness
of the training programs is largely due to the wrong selection of trainees by
the training program coordinators. This paper will illustrate the use of
linear regression, logistic regression and collaborative filtering techniques
to correctly identify employees who may enjoy the training, benefit from it
and continue to use the learned skills long after the training is over.

   _________________________________________________________
   Deepak Manjarekar is working as a Program Manager at KPIT INFOSYSTEMS LTD,
Hinjewadi, Pune, India. Currently he is managing the company’s second largest
star customer account in the offshore delivery. He has seventeen years of
experience in the IT industry and has worked with many Fortune 100 clients
delivering solutions in the Data Warehousing and Business Intelligence space.
Mr. Manjarekar is an Electronics Engineer from Bombay University and received
his MBA from the Anderson School of Management at the University of
California at Los Angeles (UCLA). He is also a PMI certified PMP. He
currently resides in Pune, India. You can reach him at
Deepak.manjarkar@kpitcummins.com.

©Deepak Manjarekar, KPIT INFOSYSTEMS LTD

INTRODUCTION

Countless times we hear of a bright student failing a specific test or being
completely indifferent towards learning a specific subject. Parents and
teachers are equally perplexed about why a natural genius would fail or
perform poorly in what someone might feel to be an easy subject. The problem
does not lie in an incorrect classification of the student as a genius;
rather, it lies in the erroneous selection of the training program for that
student. To make matters worse, we observe the same phenomenon in many
organizations, where high performing employees or high ranking students
freshly minted from top notch universities go inside the four walls of the
training room only to find themselves to be a fly on the wall. Each year
organizations spend millions of dollars and countless hours in training their
new recruits and star performers, only to find dismal results. At best,
employees come out of the training class with minimal familiarity with the
subject. Thus the learning division within an organization usually suffers
from low ROI on the aggregate spent on learning activities. It is clear that
many corporate training programs are unable to deliver the results companies
expect [2].

  Main reasons why the corporate training programs fail?
  We can attribute the marginal success of the training programs mainly to
the following three reasons:
  1. Poorly organized training programs
  2. Ineffective training delivery
  3. Improper selection of trainees for the training program

  Now let’s look in detail at what goes wrong in all three cases.

1. Poorly organized training programs
   "Forward-thinking companies have reinvented their training organizations
around the concept of running training like a business, and have tangible
successes to show for it. These corporations now know what they are spending
on training and what the investment yields," say David van Adelsberg and
Edward A. Trolley, co-authors of “Running Training Like a Business” [3].

   But let’s be very honest with ourselves and ask the question, “How many
companies are really forward thinking?” Let’s ask, “How many organizations
treat
their learning organizations as profit centers versus cost centers?” The
truth is that many organizations still treat their own learning organizations
as cost centers. So every time the company tries to cut costs, it is usually
the learning organization that takes the hit first.

   Since organizations have started in-sourcing the training programs, they
are burdened with managing all the training programs using internal
resources. These resources may or may not come from an education delivery
background. Typically, most trainers are high performing folks in their own
technical forte who are then given the task of training others in the same
technology. The trainers may or may not have any teaching background. The
same folks are then given the task of creating the training materials for the
program. Such training materials lack the simplicity as well as the
appropriate depth required to go with the delivery.

   Many times the training rooms lack the infrastructure required for
effective learning. In today’s world, where real estate is such a prime
commodity, we often find organizations taking the liberty of creating cramped
training rooms to make some room for the corner offices. These cramped rooms
are not conducive to any learning activities at all. Sharing one computer
among more than one trainee, no set time for lab work, and missing charts,
workbooks and other training materials further add to the poor organization
of the training programs.

2. Ineffective training delivery
    “Traditional training methods such as classroom and workshop training are
by far the most conventional and popular methods of delivering training to
employees. Their effectiveness will depend upon the content delivered and how
interesting the presenter can make the material in order to engage and
involve employees.” - Timothy F. Bednarz, Ph.D. [4]

   As I have mentioned earlier, most trainers are high performing individual
contributors who have been given the task of training others in their own
area of expertise. They have little or no experience in conducting formal
training. Some trainers even have communication problems and find it
difficult to conduct classes in languages that are not native to them. Many
show little or no energy while teaching, thereby droning the class to sleep.
Many lack passion for teaching. Often their commitment to the class and to
the learning of its participants is questionable. It is frequent feedback
from the participants that the trainer lacked any practical experience or
that he/she did not have anything to share from work outside of the four
classroom walls.

3. Improper selection of trainees for the training program
   In KPIT Cummins, it is a frequent gripe from the learning organization
that Project Managers or Program Managers don’t send their high ranking
personnel (or resources) to the various training programs. “We always get the
not so bright or low performing employees to train” is what they say [5]. Now
one can argue about what the charter of the learning organization within a
company should be. One might say that the learning organization’s prime
responsibility should be to train employees who are not really the star
performers, or employees who need some kind of technical training to progress
in their jobs. But does this mean that the high flyers should be deprived of
quality training? I could argue about the learning organization’s charter to
the end of this article, but that is not the focus of this article.

   The following points highlight why the trainee selection is usually full
of flaws. These are industry observations and do not necessarily reflect the
trends at KPIT Cummins.

   a. Favoritism
   b. Crying baby gets the milk
   c. Reluctance to release bright and deserving candidates for higher and
      more appropriate training, for fear that they may surpass their
      supervisors.
   d. Many deserving candidates are so busy in their day-to-day work that
      they don’t find time for any outside work activities like training.
   e. Supervisors simply avoid sending their high performers for training to
      maintain business continuity. Keeping their best people within the
      training walls may disrupt the business continuity.
   f. Peer pressure on employees. Many times employees enroll themselves in a
      training program because their colleagues are enrolled in the same
      program, only to find themselves in the wrong class.
   g. Sending substitutes for the training when the actual enrollee can’t
      attend.
   h. Supply and demand for the training on a specific technology. Companies
      sometimes arrange a mass training on technologies such as Oracle Apps
      or SAP and force all the people on the bench to attend the training so
      that, in case there is a requirement, these people can fill it.
   i. Due to unforeseen circumstances, people sometimes find themselves
      inside the four walls of the classroom undergoing training that seems
      to suffocate them.

   Whatever may be the case, we can find countless
examples where the trainee provides feedback to the organization stating that
he/she has gained only marginally from the training. Many don’t use the
learning from training due to lack of applicability, a job change or a role
change.

  In summary, there is a lot of subjectivity built into the candidate
selection for the training. Due to the reasons listed above, and many more
that are not listed, we find many misfits in the classroom scratching their
heads about why they ended up in that class in the first place.

  The Scope
  The scope of this paper is limited to the appropriate candidate selection
process via statistical and mathematical modeling. Therefore I will
conveniently ignore the first two points I made earlier about why corporate
training programs fail. Let us just assume, for the sake of simplicity, that
corporations will take care of the first two problems by setting up state of
the art training facilities and by hiring the best trainers money can buy.
Let us just assume that, in the ideal scenario, we have to deal with just the
final point of improper selection of trainees for the training program. I am
purposely limiting the scope of this paper to this last point, as you will
soon realize that dealing with just this last point is a daunting task in
itself. Taking the subjectivity and bias out of the candidate selection
process may sound very simple but is very difficult to implement. Let’s see
how we can formulate the objective candidate selection model for the training
programs.

Objective / subjective candidate selection model
   In order to design the Objective / Subjective Candidate Selection Model
(hereafter referred to as the OSCS Model) I will suggest the use of three
mathematical/statistical techniques. But before moving on to the techniques,
I suppose I owe an explanation of what the OSCS model is. At the beginning of
this article I spent much time articulating how subjectivity from the
managers’ or trainers’ point of view is detrimental to the success of the
training program. Let’s call this the bad subjectivity factor. With the bad
subjectivity factor we generally choose inappropriate candidates for the
training. So we must devise a way to make the trainee selection process
objective. But if we think about subjectivity from the trainee’s point of
view, it may not necessarily be as bad. Confused? Maybe. Let me explain it a
little bit further. If I have a couple of colleagues who are my best buddies
and who also incidentally share the same professional profile as I do, then
their opinion about a specific training program will definitely matter to me,
provided they have attended that specific training program. It’s like going
to a movie because your best friend recommends it to you after watching it
himself/herself. It works because you share many common traits in your
individual profiles. So in that respect a trainee should definitely value the
opinions of former trainees of the same program if their profiles are more or
less similar. But where should a trainee find such folks? Should the person
simply poll the people within his network inside the organization? Maybe not,
because these people may be in his network for various reasons and not just
because they share the same profile. So any arbitrary opinions may further
pollute the perception of the trainee. To circumvent this problem, I will use
the collaborative filtering technique to add the subjectivity back into the
selection process. But now this subjectivity will be from the trainee’s point
of view and should further help the trainee to decide whether s/he should
attend the training. Let’s call this the good subjectivity factor.

The design of the OSCS Model
  To design the OSCS model, I came up with three steps.
  1.   The first step will take the bad subjectivity factor out of the
       trainee selection process.
  2.   The second step will make an objective candidate selection by using
       the results of the first step.
  3.   The third step will add the good subjectivity factor back into the
       selection process.

   During the OSCS Model building we will have to perform these steps in a
specific order. I am essentially making two hypotheses here, and to test them
I will have to use the linear multi-dimensional regression technique to
measure the statistical significance of my alternate hypothesis (Ha) in each
case (step 1 above). After testing the two hypotheses, I will use the
logistic regression technique to create a predictive model (step 2). This
model will predict whether a certain employee, if selected, will be
successful in a specific training program or not. Finally, we will add the
good subjectivity factor back into the equation (step 3). Counting the sample
size determination needed for the survey, the four techniques I will use are:
sample size determination for a finite population, linear multi-dimensional
regression, a logistic regression predictive model and collaborative
filtering.

1.   Taking out the bad subjectivity factor from trainee selection process.

     The first null hypothesis I am making here is,
H0: Every employee who attends a specific training always benefits from the
training.
Therefore my alternate hypothesis will be,
Ha: Not all employees benefit from the training they receive.

   To test the first hypothesis I will need to conduct a survey of a
statistically significant sample of employees who have undergone some kind of
training in the past. This sample can be chosen from a finite population. We
call it a finite population because at any given time the organization will
be able to identify all the employees who have attended any training
[footnote 1]. The sample size will depend on the confidence level we want for
the results of our survey. (Please see appendix 1 on how to calculate the
sample size for a finite population [6].)
  Upon identifying the random sample of trainees, we can survey them and find
out the effectiveness of the past training for each trainee. We can reject
the null hypothesis if the share of positive responses falls below a certain
cut-off level. This cut-off level may or may not already exist for a learning
organization.

      The second null hypothesis I am making here is,
H0: Every single personal & professional attribute of the candidate is
equally important in the selection process.
    Therefore my alternate hypothesis will be,
Ha: There are certain personal & professional attributes that matter more
than others.

   To test this second hypothesis I will need to conduct a linear regression
with multiple parameters. My approach will be to start with all the data
attributes that the learning organization has captured so far about each
candidate and then weed out the ones that don’t have any statistical
significance for how effectively the candidate receives the training. We can
start with personal parameters like age, gender, native language, 2nd, 3rd &
4th languages if spoken, domicile state, primary language of education,
educational degrees, other training, etc. We can also start with professional
parameters like number of years of experience, number of years in the last
assignment, technical skill set, title, prior experience in the training
subjects, etc.

  The attribute elimination process will be based on the covariance of the
attributes with the expected outcome, i.e. the success of the trainee in the
program. During this process, we will eliminate many non-significant
attributes (low covariance) that do not provide us any predictive power. We
will also eliminate the attributes that do not affect (zero covariance) the
desired outcome, i.e. success in training.
  Thus at the end of this process we will come out with a set of attributes
that are significant in unbiased candidate selection.

   ________________________________________________________
   Footnote 1: You can say it’s a big assumption. Maybe so. If an
organization is very large and does not have data for all the past trainees,
we can still formulate an experiment and derive a sample size based on an
infinite population.

2.   Make an objective candidate selection using the results of the first
     step.

  After identifying the set of attributes that are necessary and significant
for the candidate selection, we will need to use logistic regression to come
out with an objective decision on whether a certain candidate should be
selected for a specific training.
   In order to achieve this we will need to build a predictive model using
the logistic regression technique. To build such a model we will start with
the list of significant attributes from step one. Let’s just say we have
identified a set of 10 attributes, namely $x_1, x_2, x_3, \dots, x_{10}$,
from step one. To build the model we will once more have to use the past
training data.

    In this case we will have to create the dataset (called the training
dataset) with balanced outcomes. By this I mean that our training dataset
must contain equal proportions of favorable and unfavorable outcomes
(passed/failed, successful/unsuccessful, satisfied/unsatisfied, etc.) along
with proper distributions of the significant attributes for each candidate.
Again, the training dataset size really matters here. To achieve a high
degree of predictability, the training dataset must contain a reasonable
amount of data covering most of the possible domain values for all the
attributes involved. When the training dataset is not sufficiently large,
predictive models tend to overfit the data [7]. Overfitting causes a definite
problem: the model works very, very well on the training dataset but fails
spectacularly on the test dataset (unseen data). We will need two other
important datasets, called the holdout dataset and the test dataset [8]. The
holdout dataset can be created from the same training dataset by randomly
selecting 10% to 15% of the training data. These records are kept aside and
are not used during the model training. Instead, once the final model is
built, the holdout dataset is used to create the confusion matrix and assess
the predictability of the model. The test dataset is created from sample data
that is gathered after the model has been built and whose outcome was not
known when the model was under construction. How to create these datasets is
beyond the scope of this first part of the paper. I will cover it briefly in
the second paper.
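
As a rough illustration of the sample size calculation mentioned for the survey in step 1, the sketch below uses Cochran's formula with the finite population correction. This is an assumption made for illustration and may differ from the formula given in appendix 1. The function name, the 95% confidence level, the 5% margin of error, the conservative proportion p = 0.5 and the population of 2,000 past trainees are all hypothetical.

```python
import math
from statistics import NormalDist


def finite_population_sample_size(population, confidence=0.95, margin=0.05, p=0.5):
    """Survey sample size for a finite population (Cochran's formula with
    the finite population correction). All defaults are illustrative."""
    # z-score for a two-sided confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Sample size that would be needed for an infinitely large population
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Shrink it for a finite population of known size
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Hypothetical organization with 2,000 past trainees on record
print(finite_population_sample_size(2000))
```

For a very large population the correction term vanishes and the result approaches the infinite-population sample size, which is the fallback the footnote on very large organizations describes.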

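
The attribute elimination in step 1 can be pictured with a small covariance screen. This is only a sketch under stated assumptions: the attribute names, the data and the threshold are made up, and a plain covariance cut-off is a simplification, since covariance depends on the units of each attribute. In practice the significance tests from the regression itself, or a scale-free correlation, would likely be used instead.

```python
def covariance(xs, ys):
    """Sample covariance of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def screen_attributes(columns, outcome, threshold):
    """Keep the attribute names whose absolute covariance with the outcome
    reaches the (hypothetical) threshold."""
    return [name for name, values in columns.items()
            if abs(covariance(values, outcome)) >= threshold]


# Made-up data: 1 = trainee succeeded, 0 = trainee did not
outcome = [0, 0, 1, 1]
columns = {
    "years_experience": [1, 3, 5, 7],  # moves together with the outcome
    "desk_number": [8, 9, 8, 9],       # unrelated to the outcome
}
print(screen_attributes(columns, outcome, threshold=0.1))
```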

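
The dataset preparation described for step 2 (balanced outcomes, a 10% to 15% holdout, and a confusion matrix on the holdout records) could look roughly like the sketch below. The record layout with an "outcome" key, the downsampling approach to balancing and the 15% fraction are assumptions chosen for illustration, not the procedure the second paper will describe.

```python
import random


def balance_outcomes(records, seed=0):
    """Downsample so successes (outcome 1) and failures (outcome 0)
    appear in equal numbers."""
    rng = random.Random(seed)
    successes = [r for r in records if r["outcome"] == 1]
    failures = [r for r in records if r["outcome"] == 0]
    k = min(len(successes), len(failures))
    return rng.sample(successes, k) + rng.sample(failures, k)


def split_holdout(records, holdout_fraction=0.15, seed=0):
    """Randomly set aside a fraction of the records as the holdout set."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == 1:
            counts["tp" if a == 1 else "fp"] += 1
        else:
            counts["tn" if a == 0 else "fn"] += 1
    return counts


# Made-up training history: 30 successful and 70 unsuccessful trainees
history = [{"id": i, "outcome": 1 if i < 30 else 0} for i in range(100)]
balanced = balance_outcomes(history)
training, holdout = split_holdout(balanced)
print(len(balanced), len(training), len(holdout))
```

Once a model exists, its predictions for the holdout records would be passed to `confusion_matrix` together with the known outcomes.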

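
Step 2 relies on logistic regression because it turns a weighted sum of attributes into a probability of success. A minimal sketch of that mapping follows, using the standard logistic function p = e^y / (1 + e^y). The weights, intercept and attribute values are invented for illustration and are not fitted to any data.

```python
import math


def linear_score(weights, intercept, attributes):
    """Weighted sum of attribute values, as a linear regression would give."""
    return intercept + sum(w * x for w, x in zip(weights, attributes))


def success_probability(score):
    """Logistic function e^y / (1 + e^y), written to avoid overflow."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical weights for two attributes and one hypothetical trainee
weights = [0.8, -0.5]
intercept = 0.2
trainee = [3.0, 1.0]

score = linear_score(weights, intercept, trainee)
print(score)                       # the raw linear score is above 1 here
print(success_probability(score))  # the logistic output stays within 0..1
```

The raw score can fall outside the range 0 to 1, while the logistic output can always be read as a probability.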
One might say we could very well have used the linear multi-dimensional
regression model from the first step here, rather than building a new model
using the same attributes. The following graph shows the output of a simple
linear regression with just one predictor variable.

[Figure: output of a simple linear regression with one predictor variable]

The reason we can’t use the equation [footnote 2] we get from the first step
is that the predicted values of y may cross the bounds of 0 & 1, depending on
the values of the ten weights and the respective attribute values for a
trainee. The best reason for using logistic regression is that the model
always returns a probability between 0 and 1 for a Bernoulli outcome, where
the outcome 1 can be mapped to success and 0 to failure, or vice versa.

  If $y = \alpha_0 \pm \alpha_1 x_1 \pm \alpha_2 x_2 \pm \dots \pm \alpha_{10} x_{10} \pm \varepsilon$
from the linear multi-dimensional regression, then
$p(Y = 1 \mid y) = e^{y} / (1 + e^{y})$. Thus we will basically receive a
probability value for Y = 1 (or success) between zero and one, based on the
given values of the attributes and their respective weights. The following
figure depicts the output values of a logistic regression given one predictor
input.

[Figure: output values of a logistic regression with one predictor input]

   Also, while creating the predictive model in the second step, we may or
may not end up using all ten attributes. Using more variables may cause the
model to overfit the data if the training dataset is not large enough to
cover all the domain values. Thus, by not using the equation from the first
step, we may end up using fewer than 10 attributes & reduce the need for a
huge training dataset.

  Thus at the end of the second stage we will get a predictive model that
will tell us, based on its output, whether or not to send a particular
trainee to the training.

   ________________________________________________________
   Footnote 2: At the end of the linear multi-dimensional regression we will
get an equation of the form
$y = \alpha_0 \pm \alpha_1 x_1 \pm \alpha_2 x_2 \pm \dots \pm \alpha_{10} x_{10} \pm \varepsilon$,
where $\alpha_1, \alpha_2, \dots$ are weights, $x_1, x_2, \dots$ are the
attributes & $\varepsilon$ is the error term.

3.   Add the good subjectivity factor back into the selection process.

  After identifying the trainee in the second step, we may invite the trainee
to the actual training program. But we will have to give the trainee the
chance to assess whether s/he would like to self select for this training.
Here we will be adding the subjectivity back into the decision criteria, but
from the trainee’s point of view.

  The first thing we can expect here from the trainee is “self selection”.
Self selection happens when a person applies his/her own decision criteria to
a specific problem. This process is very complex and hard to quantify, but
most of the time it delivers a correct assessment. We also call this “gut
feel”. Everyone has it, and its accuracy increases as one gets older and
experiences various decision making situations.

   Once the candidate uses the “self selection” process, half of the good
subjectivity is added to the model by the trainee himself. But to add further
value to the model, we can use another technique, called “collaborative
filtering”, to gauge the appropriateness of the training for the trainee as
experienced by other trainees who have undergone the same training in the
past. As mentioned above, this is like getting a recommendation from
colleagues whom you may or may not know but who share nearly the same profile
as yours. Thus it’s useful when you want to make predictions on preferences
by considering all of a long training history.

   Now let me walk you through an example of how collaborative filtering
works. Let’s assume a trainee named Akshay has already gone through our first
two steps and we arrived at the objective decision that Akshay should take
the Hyperion training. We gave Akshay a chance to self select for the
program. After much contemplation, Akshay thinks he should go for the
training. At this time he may or may not have any reservations. But in any
case, we tell Akshay: let’s see how many people who are like Akshay, and who
have undergone similar training programs, will recommend the training to
Akshay.

The vote of Akshay for the Hyperion training will be equal to the average of
the other trainees’ votes for the Hyperion training.

But we need to take into account the following:
  • Different people have different “standards”.
  • People more similar to Akshay should be given higher weighting in predicting Akshay’s preferences.

If a is the “active user” and c the “candidate training”, then $\hat{v}_c^a$, the predicted vote of user a for candidate training c, can be given as9,

$$\hat{v}_c^a - v_*^a = \sum_{u \neq a} w(a,u)\,(v_c^u - v_*^u)$$

where $v_*^a$ is the mean vote of the active user a and $v_*^u$ is the mean vote of user u.

We choose an appropriate weight3 $w(a,u) = \frac{c(a,u)}{k_a}$, where $k_a = \sum_{u \neq a} |c(a,u)|$, c(a,u) is the vote correlation between users a and u, and $k_a$ is the normalizing factor so that the absolute weights sum to one10.

  Putting all the various formula elements together, we can further simplify the formula and write it as,

$$\hat{v}_c^a = v_*^a + \frac{\sum_{u \neq a} c(a,u)\,(v_c^u - v_*^u)}{\sum_{u \neq a} |c(a,u)|}$$

   Thus if the predicted vote of the active user a for the candidate training c is greater than the average vote of all other users who have taken the training c and have rated that training above average (meaning they liked it and it proved useful to them), then we will give our “thumbs up” for the training c to the active user a (i.e. in our example, Akshay).

  This may sound confusing without an actual example, but the idea is very simple. You recommend the specific training to the trainee if you find that most other candidates with a similar profile have given their “thumbs up” to the training after taking it earlier. Another example I could give you is from the Amazon.com site. When you search for a specific book and are ready to buy it, the site recommends a couple of other books, saying, “customers who bought this book also bought books a, b & c”, etc.

  In this Part I of the paper, I have postulated how we can use past training data in the organization, together with appropriate statistical analysis, to correctly identify the deserving candidates for a specific training using objective criteria.

  In Part II of this paper, I will either prove or disprove my postulation. The task will not be easy. I will have to uncover as much past training data as possible and deal with any data quality issues in the data I find. It is going to be an interesting endeavor and may last for more than a year: building predictive models is easy, but testing them with actual test datasets takes time, as the test data needs to come after the model is built.

  So stay tuned…

________________________________________________________
3 Weights can be defined in many ways. I am going to use a simple correlation method here. The other two techniques I contemplate using are cosine similarity and Pearson correlation. Both formulae are given in Appendix 2.

APPENDIX 1

1. Statistical Theory for Sampling of Finite Population

Suppose
• The proxy for the total population is called the Sampling Frame (SF). The SF has N customers, where N is not large.
• The mean and variance of the quantity of interest (QI) across the SF are $m$ and $s^2$ respectively.
• We draw a simple random sample of $n^*$ customers.

The sample mean $\bar{x}$ has a probability distribution, which has mean $m$ and variance $s^2\left(\frac{1}{n^*} - \frac{1}{N}\right)$.

The square root of the variance of the sampling distribution is called the standard error of the mean and is given as,

$$S = s\sqrt{\frac{1}{n^*} - \frac{1}{N}}$$

Key insight: A sample of $n^*$ out of N has the same error as a sample of n out of ∞ if:

$$s\sqrt{\frac{1}{n^*} - \frac{1}{N}} = s\sqrt{\frac{1}{n}}$$

$$\frac{1}{n^*} - \frac{1}{N} = \frac{1}{n}$$

$$\frac{1}{n^*} = \frac{1}{N} + \frac{1}{n}$$



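$$n^* = \frac{nN}{n+N} = \frac{n}{(n+N)/N} = \frac{n}{1 + \frac{n}{N}}$$

The last form, obtained by dividing the numerator and denominator by N, is the relevant one here, as N is very small and n is very large.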
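As a quick numeric check of the relation $n^* = nN/(n+N)$, the following is a minimal Python sketch, not part of the original analysis. The function names and the example figures (n = 400, N = 100, s = 2.0) are assumptions chosen only for illustration.

```python
import math


def finite_population_sample_size(n, N):
    """Sample size n* drawn from a frame of N that gives the same standard
    error as a sample of n drawn from an infinite population."""
    if n <= 0 or N <= 0:
        raise ValueError("n and N must be positive")
    return n * N / (n + N)


def standard_error(s, sample_size, N=None):
    """Standard error of the mean; N=None means an infinite population."""
    if N is None:
        return s * math.sqrt(1.0 / sample_size)
    return s * math.sqrt(1.0 / sample_size - 1.0 / N)


# Illustrative numbers only: 400 draws from an infinite population
# versus a sampling frame of 100 people.
n_star = finite_population_sample_size(400, 100)
se_finite = standard_error(2.0, n_star, N=100)
se_infinite = standard_error(2.0, 400)
```

With these numbers n_star comes out as 80, and the two standard errors agree, which is exactly the equality the derivation starts from.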

APPENDIX 2
1. Cosine Similarity11
The similarity measure can be based on the cosine of the
angle between two feature vectors. This technique was
primarily used in information retrieval for calculating
similarity between two documents, where documents were
usually represented as vectors of word frequencies. In this
context, weights can be defined as:
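$$w(u_1,u_2) = \sum_{i \in \text{items}} \frac{v_{u_1,i}}{\sqrt{\sum_{k \in I_1} v_{u_1,k}^2}} \cdot \frac{v_{u_2,i}}{\sqrt{\sum_{k \in I_2} v_{u_2,k}^2}}$$

where $I_1$ and $I_2$ are the sets of items rated by users $u_1$ and $u_2$.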
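The cosine weight can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: each user's votes are held in a dictionary mapping a training name to a numeric vote, an unrated training counts as a zero vote, and the function name and sample data are hypothetical.

```python
import math


def cosine_weight(votes_u1, votes_u2):
    """Cosine similarity between two users' vote vectors.

    Each argument maps an item (training) name to that user's vote.
    Items a user has not rated contribute nothing to the sum.
    """
    norm1 = math.sqrt(sum(v * v for v in votes_u1.values()))
    norm2 = math.sqrt(sum(v * v for v in votes_u2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # assumption: a user with no non-zero votes is similar to no one
    common = set(votes_u1) & set(votes_u2)
    return sum(votes_u1[i] * votes_u2[i] for i in common) / (norm1 * norm2)


# Hypothetical votes on a 1-5 scale.
w = cosine_weight({"SQL": 4, "Java": 3}, {"SQL": 4, "Hyperion": 3})
```

Here only "SQL" is rated by both users, so the weight is 16 / (5 * 5) = 0.64.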



2. Pearson Correlation12
Weights can be defined in terms of the Pearson correlation
coefficient [5]. Pearson correlation is also used in statistics
to evaluate the degree of linear relationship between two
variables. It ranges from –1 (a perfect negative relationship)
to +1 (a perfect positive relationship), with 0 stating that
there is no relationship whatsoever. The formula is as
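follows:

$$w(u_1,u_2) = \frac{\sum_{j \in \text{items}} (v_{u_1,j} - \bar{v}_{u_1})(v_{u_2,j} - \bar{v}_{u_2})}{\sqrt{\sum_{j \in \text{items}} (v_{u_1,j} - \bar{v}_{u_1})^2 \sum_{j \in \text{items}} (v_{u_2,j} - \bar{v}_{u_2})^2}}$$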
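To show how the Pearson weight feeds the predicted-vote formula from the main text, here is a minimal Python sketch. It rests on several assumptions that are not in the paper: votes are stored as dictionaries, the correlation is computed only over trainings that both users have rated, users with fewer than two trainings in common get a weight of zero, and all names and votes are made up.

```python
import math


def mean_vote(votes):
    """Mean of all votes a user has cast."""
    return sum(votes.values()) / len(votes)


def pearson_weight(votes_a, votes_u):
    """Pearson correlation of two users' votes over co-rated items."""
    common = sorted(set(votes_a) & set(votes_u))
    if len(common) < 2:
        return 0.0  # assumption: too little overlap to judge similarity
    mean_a = sum(votes_a[j] for j in common) / len(common)
    mean_u = sum(votes_u[j] for j in common) / len(common)
    num = sum((votes_a[j] - mean_a) * (votes_u[j] - mean_u) for j in common)
    den = math.sqrt(
        sum((votes_a[j] - mean_a) ** 2 for j in common)
        * sum((votes_u[j] - mean_u) ** 2 for j in common)
    )
    return num / den if den else 0.0


def predict_vote(active, others, candidate):
    """Active user's mean vote plus the correlation-weighted average of the
    other users' deviations from their own means on the candidate item."""
    num = 0.0
    norm = 0.0
    for votes_u in others:
        if candidate not in votes_u:
            continue  # only users who took the candidate training can vote
        c = pearson_weight(active, votes_u)
        num += c * (votes_u[candidate] - mean_vote(votes_u))
        norm += abs(c)
    base = mean_vote(active)
    return base if norm == 0 else base + num / norm


# Hypothetical votes on a 1-5 scale.
akshay = {"SQL": 4, "Java": 2, "ETL": 5}
peers = [
    {"SQL": 5, "Java": 3, "ETL": 5, "Hyperion": 5},  # tastes like Akshay's
    {"SQL": 2, "Java": 5, "ETL": 1, "Hyperion": 2},  # tastes opposite to his
]
predicted = predict_vote(akshay, peers, "Hyperion")
```

In this toy data the second peer is negatively correlated with Akshay, so that peer's below-average vote for Hyperion pushes the prediction up rather than down. The predicted vote lands half a point above Akshay's own mean.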




REFERENCES




1 Peter Senge, The Fifth Discipline.
2 Timothy F. Bednarz, Ph.D., in his e-book Maximizing training investment, The executive key to achieving results. Page 7.
3 David van Adelsberg & Edward A. Trolley, co-authors of the book Running Training Like a Business.
4 Timothy F. Bednarz, Ph.D., in his e-book Maximizing training investment, The executive key to achieving results. Page 6.
5 Based on the comments made by the participants in the Learning Organization Forum at KPIT INFOSYSTEMS LTD.
6 Russell V. Lenth, Department of Statistics, University of Iowa. Some Practical Guidelines for Effective Sample-Size Determination. Published on March 1, 2001.
7 Michael J. A. Berry & Gordon S. Linoff, Data Mining Techniques, Second Edition, Wiley Publications. Page 234.
8 Michael J. A. Berry & Gordon S. Linoff, Data Mining Techniques, Second Edition, Wiley Publications. Page 52.
9 P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering for Netnews. Proceedings of CSCW ’94. 1994.
10 Prof. Anand Bodapati, Anderson School of Management, UCLA, CA, USA. Extracted from class notes of MGMT 267 One-on-one Marketing. Collaborative filtering: Weight calculation. Spring 2006.
11 Miha Grčar, User Profiling: Collaborative Filtering. Department of Knowledge Technologies, Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia.
12 P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering for Netnews. Proceedings of CSCW ’94. 1994.

These are industry infrastructure that is required for effective learning. In observations and not necessarily reflect the trends at today’s world where real estate is such a prime KPIT Cummins. commodity, we often find organizations taking the liberty in creating crammed training rooms to make a. Favoritism some room for the corner offices. These crammed b. Crying baby gets the milk rooms are not conducive for any learning activities at c. Reluctance to release bright and deserving all. Sharing one computer among more than one candidates for higher and more appropriate trainee, no set time for lab work, missing charts, training for the fear that they may surpass their workbooks and other training materials further adds to supervisors the poor organization of the training programs. d. Many deserving candidates are so busy in their day-to-day work that they don’t find time for any 2. Ineffective training delivery outside work activities like training. “Traditional training methods such as classroom e. Supervisors would simply avoid sending their and workshop training are by far the most conventional high performers for training to maintain business and popular methods of delivering training to continuity. Keeping their best people within the employees. Their effectiveness will depend upon the training walls may disrupt the business content delivered and how interesting the presenter continuity. can make the material in order to engage and involve f. Peer pressure on employees. Many times employees.”- Writes TimothyF.Bednarz, Ph.D.4 employees enroll themselves in a training program because their colleagues are enrolled in As I have mentioned earlier, most trainers are high the same program, only to find themselves in a performing individual contributors who have been wrong class. delegated the task of training others in the same area g. Sending substitutes for the training when the of their expertise. They have little or no experience in actual enrollee can’t attend. 
conducting formal training. Some trainers even have h. Supply and demand for the training on a specific communication problems and find it difficult to conduct technology. Companies sometimes arrange a classes in the languages that are not native to them. mass training on technologies such as Oracle Many suffer from exhibiting no or low energy during the Apps or SAP and force all the people on the teaching, there by droning the class to sleep. Many bench to attend the training so in case there is a lack passion for teaching. Often their commitment to requirement these people can fill it up. class and the learning of its participants is i. Due to unforeseen circumstances sometimes questionable. It’s a frequent feedback from the people find themselves inside the four walls of participants that the trainer lacked any practical the classroom undergoing the training that experience or that he/she did not have anything to seems to suffocate them. share from the work outside of the four classroom walls. Whatever may be the case, we can find countless ©Deepak Manjarekar, KPIT INFOSYSTEMS LTD 2
examples where the trainee provides feedback to the organization stating that he/she has gained only marginally from the training. Many don't use the learning from training due to lack of applicability or a job change or role change.

In summary there is a lot of subjectivity built into the candidate selection for the training. Due to the reasons listed above and many more that are not listed, we find many misfits in the classroom scratching their heads about why they ended up in that class in the first place.

The Scope

The scope of this paper is limited to the appropriate candidate selection process via statistical and mathematical modeling. Therefore I will conveniently ignore the first two points I made earlier about why corporate training programs fail. Let us just assume for the sake of simplicity that corporations will take care of the first two problems by setting up state of the art training facilities and by hiring the best trainers money can buy. Let us just assume that in the ideal scenario, we have to deal with just the final point of improper selection of trainees for the training program. I am purposely limiting the scope of this paper to this last point, as you will soon realize that dealing with just this last point is a formidable task. Taking out the subjectivity and bias from the candidate selection process may sound very simple but will be very difficult to implement, to say the least. Let's see how we can formulate the objective candidate selection model for the training programs.

Objective / subjective candidate selection model

In order to design the Objective / Subjective Candidate Selection Model (hereafter referred to as the OSCS Model) I will suggest the use of three mathematical/statistical techniques. But before moving on to the techniques, I suppose I owe an explanation of what the OSCS model is. In the beginning of this article I spent much time articulating how subjectivity from the managers' or trainers' point of view is detrimental to the success of the training program. Let's call this the bad subjectivity factor. With the bad subjectivity factor we generally choose inappropriate candidates for the training. So we must devise a way to make the trainee selection process objective. But if we think about subjectivity from the trainee's point of view, it may not necessarily be as bad. Confused? Maybe. Let me explain it a little bit further. If I have a couple of colleagues who are my best buddies and who also incidentally share the same professional profile as I do, then their opinion about a specific training program will definitely matter to me, provided they have attended that specific training program. It's like going to a movie if your best friend recommends it to you after watching it himself/herself. It works because you share many common traits in your individual profiles. So in that respect a trainee should definitely value the opinions of former trainees of the same program if their profiles are more or less similar. But where should a trainee find such folks? Should the person simply poll the people within his network inside the organization? Maybe not. The reason is that these people may be in his network for various reasons and not just because they share the same profile. So any arbitrary opinions may further pollute the perception of the trainee. To circumvent this problem, I will use the collaborative filtering technique to add the subjectivity back into the selection process. But now this subjectivity will be from the trainee's point of view and should further help the trainee to decide whether s/he should attend the training. Let's call this the good subjectivity factor.

The design of the OSCS Model

To design the OSCS model, I came up with three steps.
1. The first step will take out the bad subjectivity factor from the trainee selection process.
2. The second step will make an objective candidate selection by using the results of the first step.
3. The third step will add the good subjectivity factor back into the selection process.

During the OSCS Model building we will have to perform these steps in a specific order. I am essentially making two hypotheses here, and to test them I will have to use the linear multi dimensional regression technique to measure the statistical significance of my alternate hypothesis (Ha) in each case (step 1 above). After testing the first two hypotheses, I will use the logistic regression technique to create a predictive model (step 2). This model will predict whether a certain employee, if selected, will be successful in a specific training program or not. Finally, we will add the good subjectivity factor back into the equation (step 3). The four techniques I will use are, namely, sample size determination for a finite population for the survey, linear multi dimensional regression, a logistic regression predictive model & collaborative filtering.

1. Taking out the bad subjectivity factor from trainee selection process.

The first null hypothesis I am making here is,

$H_0$: Every employee who attends a specific training always benefits from the training.
Therefore my alternate hypothesis will be,

$H_a$: Not all employees benefit from the training they receive.

To test the first hypothesis I will need to conduct a survey of a statistically significant sample of employees who have undergone some kind of training in the past. This sample can be chosen from the finite population. We call it a finite population because at any given time the organization will be able to identify all the employees who have attended any training (footnote 1). The sample size will depend on the confidence level we want in the results of our survey. (Please see appendix 1 on how to calculate the sample size for a finite population6)

Footnote 1: You can say it's a big assumption. Maybe so. In case an organization is very large and does not have data for all the past trainees, we can still formulate an experiment and derive a sample size based on an infinite population.

Upon identifying the random sample of trainees, we can survey them and find out the effectiveness of the past training for each trainee. We can invalidate the null hypothesis if our positive responses are below a certain cut off level. This cut off level may or may not exist for a learning organization.

The second null hypothesis I am making here is,

$H_0$: Every single personal & professional attribute of the candidate is equally important in the selection process.

Therefore my alternate hypothesis will be,

$H_a$: There are certain personal & professional attributes that matter more than others.

To test this second hypothesis I will need to conduct a linear regression with multiple parameters. My approach will be to start with all the data attributes that the learning organization has captured so far about each candidate and then weed out the ones that don't have any statistical significance for how effectively the candidate receives the training. We can start with personal parameters like age, gender, native language, 2nd, 3rd & 4th languages if spoken, domicile state, primary language of education, educational degrees, other training, etc. We can also start with professional parameters like number of years of experience, number of years in the last assignment, tech skill set, title, prior experience in the training subjects, etc.

The attribute elimination process will be based on the covariance of the attributes with the expected outcome, i.e. the success of the trainee in the program. During this process, we will eliminate many non significant attributes (low covariance) that do not provide us any predictive power. We will also eliminate the attributes that do not affect (zero covariance) the desired outcome, i.e. success in training.

Thus at the end of this process we will come out with a set of attributes that are significant in unbiased candidate selection.

2. Make an objective candidate selection using the results of the first step.

After identifying the set of attributes that are necessary and significant for the candidate selection, we will need to use logistic regression to come out with an objective decision on whether a certain candidate should be selected for a specific training.

In order to achieve this we will need to build a predictive model using the logistic regression technique. To build such a model we will start with the list of significant attributes from step one. Let's just say we have identified a set of 10 attributes, namely x1, x2, x3, ..., x10, from step one. To build the model we will have to use the past training data once more. In this case we will have to create the dataset (called the training dataset) with balanced outcomes. By this I mean that our training dataset must contain equal proportions of favorable and unfavorable outcomes (passed/failed, successful/unsuccessful, satisfied/unsatisfied, etc.) along with the proper distributions of significant attributes for each candidate. Again the training dataset size really matters here. To achieve a high degree of predictability, the training dataset must contain a reasonable amount of data covering most of the possible domain values for all the attributes involved. When the training dataset is not sufficiently large, predictive models tend to overfit the data7. Overfitting causes a definite problem where the model works very, very well on the training dataset but fails spectacularly on the test dataset (unseen data).

We will need two other important datasets called the holdout dataset and the test dataset8. The holdout dataset can be created from the same training dataset by randomly selecting 10% to 15% of the training data. These records are kept aside and are not used during the model training. Instead, once the final model is built, the holdout dataset is used to create the confusion matrix and assess the predictability of the model. The test dataset is created from the sample data that is gathered after the model has been built and whose outcome was not known when the model was under construction. How to create these datasets is beyond the scope of this first part of the paper. I will cover it briefly in the second paper.
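Although the details are deferred to Part II, the idea of a balanced training dataset with a 10% to 15% holdout can be illustrated with a minimal sketch. Everything specific here is an assumption made for the example: the record layout, the function name `balance_and_split`, the 15% default, the made-up records, and the choice to balance by downsampling the larger outcome group (other balancing schemes are possible).

```python
import random


def balance_and_split(records, holdout_percent=15, seed=0):
    """Balance outcomes, then set aside a random holdout.

    records: list of (attributes, outcome) pairs, outcome 1 = success,
    0 = failure (a hypothetical layout chosen for this sketch).
    Returns (training, holdout).
    """
    rng = random.Random(seed)
    successes = [r for r in records if r[1] == 1]
    failures = [r for r in records if r[1] == 0]
    # Assumption: balance by downsampling the larger outcome group.
    n = min(len(successes), len(failures))
    balanced = rng.sample(successes, n) + rng.sample(failures, n)
    rng.shuffle(balanced)
    # Holdout records are kept aside and never used to fit the model.
    k = len(balanced) * holdout_percent // 100
    return balanced[k:], balanced[:k]


# Made-up past-training records: 40 successes and 20 failures.
records = [((years, years % 2), 1 if i % 3 else 0)
           for i, years in enumerate(range(1, 61))]
training, holdout = balance_and_split(records)
print(len(training), len(holdout))  # 34 6
```

Because the holdout is drawn at random after balancing, the remaining training records can end up slightly uneven between outcomes; a stratified split would avoid that at the cost of a little more code.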
One might say we could have very well used the linear multi dimensional regression model from the first step here, rather than building a new model using the same attributes. The following graph shows the output of a simple linear regression with just one predictor variable.

[Figure: output of a simple linear regression with one predictor variable]

The reason we can't use the equation (footnote 2) we get from the first step is that the predicted values of y may cross the bounds of 0 & 1, depending on the values of the ten weights and the respective attribute values for a trainee. The best reason for using logistic regression is that the model always returns a value between 0 and 1, which we can read as the probability of a Bernoulli outcome, where 1 can be mapped to success and 0 to failure or vice versa.

Footnote 2: At the end of the linear multi dimensional regression we will get an equation of the form $y = \alpha_0 \pm \alpha_1 x_1 \pm \alpha_2 x_2 \pm \dots \pm \alpha_{10} x_{10} \pm \varepsilon$, where $\alpha_1, \alpha_2, \dots$ are the weights, $x_1, x_2, \dots$ are the attributes & $\varepsilon$ is the error term.

If

$$y = \alpha_0 \pm \alpha_1 x_1 \pm \alpha_2 x_2 \pm \dots \pm \alpha_{10} x_{10} \pm \varepsilon$$

from the linear multi dimensional regression, then

$$p(Y = 1 \mid y) = \frac{e^{y}}{1 + e^{y}}$$

Thus we will basically receive a probability value for Y = 1 (or success) between zero and one based on the given values of the attributes and their respective weights. The following figure depicts the output values of a logistic regression given one predictor input.

[Figure: output of a logistic regression with one predictor input]

Also, while creating the predictive model in the second step, we may or may not end up using all ten attributes. Using more variables may cause the model to overfit the data if the training dataset is not large enough to cover all the domain values. Thus by not using the equation from the first step, we may end up using fewer than 10 attributes & reduce the need for a huge training dataset.

Thus at the end of the second stage we will get a predictive model that will tell us whether or not to send a particular trainee to the training, based on the output of the model.

3. Add the good subjectivity factor back into the selection process.

After identifying the trainee in the second step, we may invite the trainee for the actual training program. But we will have to give the trainee the chance to assess whether s/he would like to self select for this training. Here we will be adding the subjectivity back into the decision criteria, but from the trainee's point of view.

The first thing we can expect here from the trainee is "self selection". Self selection happens when a person applies his/her own decision criteria to a specific problem. This process is very complex and hard to quantify but most of the time delivers a correct assessment. We also call this "gut feel". Everyone has it and its accuracy increases as one gets older and experiences various decision making situations.

Once the candidate uses the "self selection" process, half of the good subjectivity is added to the model by the trainee himself. But to add further value to the model, we can use another technique called "collaborative filtering" to gauge the appropriateness of the training for the trainee as experienced by other trainees who have undergone the same training in the past. As mentioned above, this is like getting a recommendation from colleagues whom you may or may not know but who share nearly the same profile as yours. Thus it's useful when you want to make predictions on preferences by considering all of a long training history.

Now let me walk you through an example of how collaborative filtering works. Let's assume a trainee named Akshay has already gone through our first two steps and we arrived at the objective decision that Akshay should take the Hyperion training. We gave Akshay a chance to self select for the program. After much contemplation, Akshay thinks he should go for the training. At this time he may or may not have any reservations. But in any case, we tell Akshay, let's see how many people who are like Akshay and who have undergone similar training programs will recommend the training to Akshay.

The vote of Akshay for the Hyperion training will be equal to the average of the other trainees' votes for the Hyperion training.

But we need to take into account the following:
• Different people have different "standards"
• People more similar to Akshay should be given a higher weighting in predicting Akshay's preferences.

If a is the "active user" and c the "candidate training", then $\hat{v}_c^{a}$, the predicted vote of user a for candidate training c, can be given as9,

$$\hat{v}_c^{a} - v_*^{a} = \sum_{u \neq a} w(a,u)\,\left(v_c^{u} - v_*^{u}\right)$$

where $v_*^{a}$ is the mean vote of active user a, $v_*^{u}$ is the mean vote of user u, and $v_c^{u}$ is the vote of user u for training c.

We choose an appropriate weight (footnote 3) w(a, u), where

$$w(a,u) = \frac{c(a,u)}{k_a}, \qquad k_a = \sum_{u \neq a} |c(a,u)|$$

and where c(a, u) is the vote correlation between users a and u, and $k_a$ is the normalizing factor so that the absolute weights sum to one10.

Footnote 3: Weights can be defined in many ways. I am going to use a simple correlation method here. The other two techniques I contemplate using would be cosine similarity and Pearson correlation. Both formulae are given in appendix 2.

Putting all the various formula elements together we can further simplify the formula and write it as,

$$\hat{v}_c^{a} = v_*^{a} + \frac{\sum_{u \neq a} c(a,u)\,\left(v_c^{u} - v_*^{u}\right)}{\sum_{u \neq a} |c(a,u)|}$$

Thus if the predicted vote of the active user a for the candidate training c is greater than the average vote of all other users who have taken the training c and have rated that training above average (meaning they liked it and it proved useful to them), then we will give our "thumbs up" for the training c to the active user a (i.e. in our example Akshay).

This may sound confusing without an actual example. But the idea is very simple. You recommend the specific training to the trainee if you find that most other similarly profiled candidates have given their "thumbs up" to the training after taking it earlier. Another example I could give you is from the Amazon.com site. When you search for a specific book name and are ready to buy the book, the site actually recommends a couple of other books to you, saying "customers who bought this book also bought books a, b & c", etc.

In this part I of the paper, I am postulating my ideas of how we can use past training data in the organization and use appropriate statistical analysis to correctly identify the deserving candidates for a specific training using objective criteria.

In part II of this paper, I will actually either prove or disprove my postulation. The task will not be easy. I will have to uncover as much past training data as possible and deal with the data quality issues of the found data, if any. It is going to be an interesting endeavor and may last for more than a year, because, as you might know, building predictive models is very easy but testing them with actual test datasets takes time, as the test data needs to come after the model is built.

So stay tuned…

APPENDIX 1

1. Statistical Theory for Sampling of Finite Population

Suppose
• The proxy for the total population is called the Sampling Frame (SF). The SF has N customers, where N is not large
• The mean and variance of the quantity of interest (QI) across the SF are $m$ and $s^2$ respectively
• We draw a simple random sample of $n^*$ customers

The sample mean $\bar{x}$ has a probability distribution, which has mean $m$ and variance

$$s^2 \left(\frac{1}{n^*} - \frac{1}{N}\right)$$

The square root of the variance of the sampling distribution is called the standard error of the mean and is given as,

$$S = s\,\sqrt{\frac{1}{n^*} - \frac{1}{N}}$$

Key insight: A sample of $n^*$ out of N has the same error as a sample of n out of ∞ if:

$$s\,\sqrt{\frac{1}{n^*} - \frac{1}{N}} = s\,\sqrt{\frac{1}{n}}$$

$$\frac{1}{n^*} - \frac{1}{N} = \frac{1}{n}$$

$$\frac{1}{n^*} = \frac{1}{N} + \frac{1}{n}$$
$$n^* = \frac{nN}{n + N}$$

This correction matters when N is very small and n is very large. Dividing the numerator and denominator by N gives

$$n^* = \frac{n}{1 + \frac{n}{N}}$$

APPENDIX 2

1. Cosine Similarity11

The similarity measure can be based on the cosine of the angle between two feature vectors. This technique was primarily used in information retrieval for calculating the similarity between two documents, where documents were usually represented as vectors of word frequencies. In this context, weights can be defined as:

$$w(u_1, u_2) = \sum_{i \in \text{items}} \frac{v_{u_1,i}}{\sqrt{\sum_{k \in I_1} v_{u_1,k}^{2}}} \cdot \frac{v_{u_2,i}}{\sqrt{\sum_{k \in I_2} v_{u_2,k}^{2}}}$$

where $I_1$ and $I_2$ are the sets of items rated by users $u_1$ and $u_2$.

2. Pearson Correlation12

Weights can be defined in terms of the Pearson correlation coefficient. Pearson correlation is also used in statistics to evaluate the degree of linear relationship between two variables. It ranges from -1 (a perfect negative relationship) to +1 (a perfect positive relationship), with 0 stating that there is no relationship whatsoever. The formula is as follows:

$$w(u_1, u_2) = \frac{\sum_{j \in \text{items}} \left(v_{u_1,j} - \bar{v}_{u_1}\right)\left(v_{u_2,j} - \bar{v}_{u_2}\right)}{\sqrt{\sum_{j \in \text{items}} \left(v_{u_1,j} - \bar{v}_{u_1}\right)^{2} \sum_{j \in \text{items}} \left(v_{u_2,j} - \bar{v}_{u_2}\right)^{2}}}$$

REFERENCES
1 Peter Senge, The Fifth Discipline.
2 Timothy F. Bednarz, Ph.D., in his e-book Maximizing training investment, The executive key to achieving results. Page 7.
3 David van Adelsberg & Edward A. Trolley, co-authors of the book Running Training Like a Business.
4 Timothy F. Bednarz, Ph.D., in his e-book Maximizing training investment, The executive key to achieving results. Page 6.
5 Based on the comments made by the participants in the Learning Organization Forum at KPIT INFOSYSTEMS LTD.
6 Russell V. Lenth, Department of Statistics, University of Iowa. Some Practical Guidelines for Effective Sample-Size Determination. Published on March 1, 2001.
7 Michael J. A. Berry & Gordon S. Linoff, Data Mining Techniques, Second Edition, Wiley Publications. Page 234.
8 Michael J. A. Berry & Gordon S. Linoff, Data Mining Techniques, Second Edition, Wiley Publications. Page 52.
9 P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering for Netnews. Proceedings of CSCW '94. 1994.
10 Prof. Anand Bodapati, Anderson School of Management, UCLA, CA, USA. Extracted from class notes of MGMT 267 One-on-one Marketing. Collaborative filtering: Weight calculation. Spring 2006.
11 Miha Grčar, USER PROFILING: COLLABORATIVE FILTERING. Department of Knowledge Technologies, Jozef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia.
12 P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering for Netnews. Proceedings of CSCW '94. 1994.
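SUPPLEMENTARY SKETCH

To make the collaborative filtering prediction from step 3 and the Pearson weight of Appendix 2 more concrete, here is a minimal sketch of how the predicted vote could be computed. It is an illustration under stated assumptions, not the implementation planned for Part II: the function names, the 1 to 5 vote scale, the trainees Priya and Rahul, the trainings other than Hyperion, and all the votes are invented for the example. It also assumes the correlation c(a, u) is the Pearson correlation computed over the trainings both users have rated.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(votes_a, votes_u):
    """Pearson correlation over the trainings both users have rated."""
    common = [t for t in votes_a if t in votes_u]
    if len(common) < 2:
        return 0.0
    mean_a = mean([votes_a[t] for t in common])
    mean_u = mean([votes_u[t] for t in common])
    num = sum((votes_a[t] - mean_a) * (votes_u[t] - mean_u) for t in common)
    den = sqrt(sum((votes_a[t] - mean_a) ** 2 for t in common)
               * sum((votes_u[t] - mean_u) ** 2 for t in common))
    return num / den if den else 0.0


def predict_vote(active, candidate, votes):
    """Active user's mean vote plus the normalized, correlation-weighted
    deviations of the other users' votes for the candidate training."""
    mean_active = mean(list(votes[active].values()))
    weighted_sum = 0.0
    k = 0.0  # normalizing factor: sum of absolute correlations
    for user, user_votes in votes.items():
        if user == active or candidate not in user_votes:
            continue
        c = pearson(votes[active], user_votes)
        weighted_sum += c * (user_votes[candidate] - mean(list(user_votes.values())))
        k += abs(c)
    return mean_active if k == 0 else mean_active + weighted_sum / k


# Invented votes on an assumed 1-5 scale.
votes = {
    "Akshay": {"SQL": 4, "Java": 2, "SAP": 3},
    "Priya": {"SQL": 5, "Java": 3, "SAP": 4, "Hyperion": 5},
    "Rahul": {"SQL": 2, "Java": 4, "SAP": 3, "Hyperion": 1},
}
print(predict_vote("Akshay", "Hyperion", votes))  # 4.125
```

In this toy data Priya votes like Akshay (correlation +1) and liked Hyperion, while Rahul votes the opposite way (correlation -1) and disliked it, so both push the predicted vote above Akshay's mean vote of 3.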