Effectiveness of Macro/VBA
(Visual Basic for Applications) of MS-Excel
in Data Analysis.
Executive Summary

Due primarily to its widespread availability, Microsoft Excel is often the de facto choice
of engineers and scientists in need of software for measurement data analysis and
manipulation. Microsoft Excel lends itself well to extremely simple test and
measurement applications and the financial uses for which it was designed; however, in
an era when companies are forced to do more with less, choosing the appropriate tools
to maximize efficiency (thereby reducing costs) is imperative.

One thing virtually every reader has in common is the need to automate some aspect of
Excel. That is what VBA is all about.

Microsoft Excel, used primarily for the management, inspection, analysis, and reporting
of acquired or simulated engineering and scientific data, offers efficiency gains and
scalability when combined with dedicated data post-processing features.

Visual Basic for Applications (VBA) can be used in conjunction with Microsoft Excel to
automate data analysis tasks. In particular, VBA can be used to locate and parse
business data, then automatically engage Excel's data processing tools to perform the
analysis tasks you want. When the analysis is completed, VBA can then be used to
automatically create reports on Excel worksheets or in other applications such as Word
and PowerPoint. VBA can also be used to create your own custom data analysis tools.
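
As a minimal sketch of that workflow (the sheet names, data range, and statistics chosen
are illustrative assumptions), a short macro might apply Excel's built-in worksheet
functions to a block of data and write a small summary report:

   Sub SummarizeData()
       ' A minimal sketch: summarize one data column and report the results.
       ' Sheet names and the data range are hypothetical.
       Dim src As Range, rpt As Worksheet
       Set src = Worksheets("Data").Range("B2:B100")
       Set rpt = Worksheets("Report")
       rpt.Range("A1") = "Count"
       rpt.Range("B1") = WorksheetFunction.Count(src)
       rpt.Range("A2") = "Average"
       rpt.Range("B2") = WorksheetFunction.Average(src)
       rpt.Range("A3") = "Std. deviation"
       rpt.Range("B3") = WorksheetFunction.StDev(src)
   End Sub

The same pattern extends to reporting in Word or PowerPoint by automating those
applications through their own object models.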

You're probably aware that people use Excel for thousands of different tasks. Here are
just a few examples:

       Keeping lists of things, such as customer names, students' grades, or holiday gift
       ideas
       Budgeting and forecasting
       Analyzing scientific data
       Creating invoices and other forms
       Developing charts from data

The list could go on and on, but you get the idea. Excel is used for a wide variety of
things, and everyone reading this article has different needs and expectations regarding
Excel.

VBA is a fantastic tool for empowering analysts to build their own solutions to problems.
It gives analysts the power to create innovative new bits of kit without learning the sort
of heavyweight programming that is the preserve of full-time coders with computer
science degrees. What analysts produce in VBA - and I speak from personal experience
here - is quite often horrifying to their IT departments. Even very good code by analyst
standards is a world away from the way that a good programmer might choose to solve a
problem. For one thing, no programmer worth the name would have started their build
in VBA.

The thing is, even with that coding deficiency, VBA works. It makes an awful lot of
businesses run. And along the way, it's completely hamstrung Microsoft with Excel
upgrades, because it's too embedded in too many places to change it now.

VBA survived and it prospered. It did that because it met a need, and it is a need that
large companies in particular go out of their way to prevent happening with other bits of
software. Excel was the Trojan horse that put IT capability into the hands of people who
aren't supposed to have it. As VBA starts to show its age, we're in danger of drifting
towards a world where centralized IT departments control access to data and access to
the tools that can work with it.

For example, you might create a VBA program to format and print your month-end sales
report. After developing and testing the program, you can execute the macro with a
single command, causing Excel to automatically perform many time-consuming
procedures. Rather than struggle through a tedious sequence of commands, you can
grab a cup of coffee and let your computer do the work — which is how it's supposed to
be, right?
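
A minimal sketch of such a month-end macro (the sheet name, ranges, and print area are
assumptions for illustration) might look like this:

   Sub PrintMonthEndReport()
       ' A minimal sketch: format and print a month-end sales report.
       With Worksheets("Sales")
           .Range("A1:E1").Font.Bold = True     ' emphasize the header row
           .Columns("A:E").AutoFit              ' tidy the column widths
           .Range("A1:E50").PrintOut Copies:=1  ' print the report area
       End With
   End Sub

Once the macro is assigned to a button or a shortcut key, the whole procedure runs with
a single command.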



Objective Of Study

      A detailed study of Macros/VBA (Visual Basic for Applications) and their
      effectiveness in Data Analysis.
      The project would enable the reader to gain important insights into MS Excel and
      VBA with regard to their evolution and future growth.
      The study would also reflect upon the various methods and functions of Excel and
      VBA that help users stay ahead of the curve.
      The study would also delve into the technological aspects of data analysis with
      regard to network spread, communication technologies, emerging technologies, etc.
      The project report would seek to shed light on the futuristic scope of VBA as a new
      development language for businesses, its impact, etc.




Broad Action Plan

      Detailed exploration of the helpfulness of VBA/Macros in data analysis, and of
      the life cycle of a macro and the inventory process in ABC Company.
      To gain a historical perspective on MS Excel.
      To study VBA from the perspective of its effectiveness in analyzing data.
      To identify the challenges faced while analyzing data.
      To analyze future trends and thereby cite business opportunities for effective
      usage of MS Excel and VBA.




TABLE OF CONTENTS

Sr. No    Topic

1.        Background
2.        Introduction to MS Excel
3.        History
4.        Introduction to Visual Basic (VB)
    4.1   What is Visual Basic?
5.        Introduction to VBA
    5.1   Why VBA?
    5.2   Calculations without VBA
    5.3   Advantages of Using VBA
6.        Miscellany
7.        Simulation Example
    7.1   What is the algorithm?
    7.2   VBA code for this example
    7.3   A trick to speed up the calculations
8.        Reporting in Excel
9.        Attributes of Good VBA Models
    9.1   Documenting VBA Models
10.       Caveats
    10.1  General Issues
    10.2  Results of Analyses
    10.3  Additional Analyses
    10.4  Requesting Many Analyses
    10.5  Working with Many Columns
11.       Beyond VBA
12.       Conclusion
13.       Recommendations
14.       Scope for Future Study
15.       References

1. Background

   I have worked with measurement, statistics, and data analysis courses over the past
five years. By and large, my colleagues have come from non-technical backgrounds,
particularly education and finance. I generally find that people in education often have a
negative attitude toward automation, while people in finance seek out automated
methods, a fact that has at times added more than a small bit of extra challenge.

   My work has always had a practical bent. I make an effort to automate data research
and analysis, and I fully integrate technology into my work life. I cannot always assume
that people are as technology-literate as I would like; at times it is necessary to set aside
instructional time in order to deal with specific technology topics, as I will mention
below.




Main Areas of Work

[Figure: main areas of work, by number of employees]

Main area of work in which software is used

[Figure: percentage of software usage in each main area of work]

Today every sector uses software to a greater or lesser extent; the graph above shows the
percentage of software usage in each of the main areas of work.

Percentage of respondents using each package

[Figure: percentage of respondents using each software package]

2. Introduction to MS Excel

   Excel is the backbone of any custom-built financial model, and building one requires
good technical Excel skills.

   By connecting to databases of almost any type (Oracle, IBM, SQL Server, OLAP),
Excel can retrieve data from your corporate databases and files, so you don't have to
retype the data that you want to analyze in Excel. You can also refresh your financial
spreadsheets and summaries automatically from the original source database whenever
the database is updated with new information.
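
As a minimal sketch of such a connection (the provider, server, catalog, query, and sheet
name are all placeholder assumptions, and a reference to Microsoft ActiveX Data Objects
must be enabled), VBA can pull a query result straight into a worksheet:

   Sub PullFromDatabase()
       ' A minimal sketch: query a corporate database into a worksheet.
       ' Connection string and SQL are hypothetical placeholders.
       Dim cn As New ADODB.Connection
       Dim rs As New ADODB.Recordset
       cn.Open "Provider=SQLOLEDB;Data Source=MyServer;" & _
               "Initial Catalog=Sales;Integrated Security=SSPI;"
       rs.Open "SELECT Region, Revenue FROM MonthlySales", cn
       Worksheets("Data").Range("A2").CopyFromRecordset rs
       rs.Close
       cn.Close
   End Sub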

   A powerful and easy to use operational or financial model in Excel provides decision
makers with analytical capabilities to assess the outcomes of a range of scenarios.

   Good financial management and financial governance are at the core of good
management. They help to drive performance by supporting effective decision making,
aiding the efficient running of organizations and maximizing the effective use of
resources. Good financial management is also essential to maintain the stewardship and
accountability of public funds. The way government bodies collect, analyze and utilize
financial management information directly impacts on the performance of their
organizations and the delivery of their objectives.

 Financial modeling with Excel

   A financial model is a complex spreadsheet: structured, dynamic, and flexible. It
contains a set of variable assumptions, inputs, outputs, calculations, and scenarios. The
objective is, by changing input data, to explore the relationships between several
variables and to test the effects of these changes on the outputs of an ad hoc scenario. It
allows you to simulate a wide range of scenarios and sensitivity analyses in a short
period of time. And this can be done faster and more easily in an Excel spreadsheet than
in any analytics application (SAP, SAS, Siebel, etc.).



Concretely, a Financial Model can be used to match different needs or objectives. It
can be a Business Case, a Profitability Analysis, a Budget, a Reporting or a Forecasting
study.

   All of them are efficient tools that help executives and managers monitor, manage,
and run their business. Because they are present at all levels and in all departments of a
company, Excel spreadsheets are the natural solution. The challenge is to design low-
maintenance, user-friendly reporting tools that automatically consolidate, analyze,
transform, update, and present the information needed.

 Business Cases

   A business case has to translate business ideas from vague concepts into a concrete
set of numbers, and to score high in credibility, accuracy, and practical value in order to
be the financial backbone to the project’s concepts.

   A business case is based on what-if analysis and scenario management, which are
essential to answering typical business case analysis questions, and has a strong focus on
cash-flow evaluation of strategic business decisions in order to assess the financial
feasibility of the project.

   A business case is referred to frequently during the project to determine whether it
is currently on track. And at the end of the project, success is measured against the
ability to meet the objectives defined in the business case. So the completion of a
business case is critical to the success of the project.

 Cost and Profitability Analysis

   A cost and profitability analysis helps to determine where resources should be
allocated to maximize profit. This type of analysis not only serves as a tool to make more
informed decisions, but can also identify ways to improve business processes. For
example, it helps to identify the most profitable customers in order to focus on them.
The well-known 80/20 rule states that 80% of your profit usually comes from 20% of
your customers - but which 20%?

   A profitability analysis may be based not only on customers, but also on products or
activities.

 Budgeting

   Creating, monitoring and managing a budget is key to business success. It should
help you allocate resources where they are needed. It is extremely important to know
how much money you have to spend, and where you are spending it. It is the most
effective way to control your cash-flow, to keep your business - and its finances - on
track - allowing you to invest in new opportunities at the appropriate time.

   If your business is growing, you may not always be able to be hands-on with every
part of it. You may have to split your budget up between different areas or departments
such as sales, production, marketing, administration, etc.

   You'll find that money starts to move in many different directions through your
organization - budgets are a vital tool in ensuring that you stay in control of expenditure.
A budget is a plan to:

       control your finances
       ensure you can continue to fund your current commitments
       enable you to make confident financial decisions and meet your objectives
       ensure you have enough money for your future projects


   You should stick to your budget as far as possible, but review and revise it as needed.
Successful businesses often have a rolling budget.




 Management Reporting

   Management Reporting keeps track of all the changes and compares historical data
against actuals, and original projections against reality. We assist in setting up and
monitoring effective and timely management reports, using financial and non-financial
data:

      Measure the business - weekly, monthly, and annually
      Effectively handle market changes and manage associated costs
      Set up internal business practices and structures for reporting internally
      Pinpoint problem areas.


 Forecasting

   Once you have created a budget and related this to actual numbers, you can create a
dynamic rolling forecast which can be updated on a regular basis. This can be done on a
weekly or even daily basis to give you a more accurate and up to date picture of cash
flow and profit and loss.

   Before you start forecasting, remember that revenue projections are only as
meaningful as your baseline data.

   Make sure the data is complete, correct, and ordered. There needs to be enough
historical sales data to perform an accurate analysis, typically seven to ten time periods;
the longer the forecast timeline, the more accurate the forecast. The data must be ordered
from oldest to newest. If data is missing for a time period, estimate the number as
accurately as possible. The time periods need to be uniform; for example, compare
months to months or years to years.




3. History




4. Introduction to Visual Basic (VB)

   Visual Basic is one of the most popular programming languages in the market today.
Microsoft has positioned it to fit multiple purposes in development. The language ranges
from lightweight VBScript programming to application-specific programming with VB
for Applications.

4.1 What is Visual Basic?

   The "visual" part refers to the method used to create the graphical user interface
(GUI). Rather than writing numerous lines of code to describe the appearance and
location of interface elements, we simply add prebuilt objects into place on the screen.

   VB is a high-level programming language that evolved from an earlier DOS-era
language called BASIC. VB is event-driven: VB programs are made up of many
subprograms, each with its own program code, and each can be executed independently
while at the same time being linked to the others in one way or another.

   VB is designed to deploy applications across the enterprise and to scale to any size
needed. The ability to develop object models, database integration, server components,
and Internet/intranet applications provides an extensive range of capabilities and tools
for the developer. In particular, VB lets us add menus, textboxes, command buttons,
option buttons, check boxes, scroll bars, and file and directory boxes to blank windows.
We can communicate with other Windows applications, and, perhaps most importantly,
we have an easy method to let users control and access databases.




5. Introduction to VBA

   Visual Basic for Applications, Excel's powerful built-in programming language,
permits you to easily incorporate user-written functions into a spreadsheet. Users can
easily calculate Black-Scholes and binomial option prices, for example.

   In case you think VBA is something esoteric which you will never otherwise need to
know, VBA is now the core macro language for all Microsoft Office products, including
Word. It has also been incorporated into software from other vendors. You need not
write complicated programs using VBA in order for it to be useful to you. At the very
least, knowing VBA will make it easier for you to analyze relatively complex problems
for yourself.

   This document presumes that you have a basic knowledge of Excel, including the use
of built-in functions and named ranges. I do not presume that you know anything about
writing macros or programming. The examples here are mostly related to option pricing,
but the principles apply generally to any situation where you use Excel as a tool for
numerical analysis.

   The Windows version of Excel supports programming through Microsoft's Visual
Basic for Applications (VBA), which is a dialect of Visual Basic. Programming with VBA
allows spreadsheet manipulation that is awkward or impossible with standard
spreadsheet techniques. Programmers may write code directly using the Visual Basic
Editor (VBE), which includes a window for writing code, a debugger, and a code module
organization environment. The user can implement numerical methods, as well as
automate tasks such as formatting or data organization, in VBA and guide the
calculation using any desired intermediate results reported back to the spreadsheet.

   VBA was removed from Mac Excel 2008, as the developers did not believe that a
timely release would allow porting the VBA engine natively to Mac OS X. VBA was
restored in the next version, Mac Excel 2011.


A common and easy way to generate VBA code is by using the Macro Recorder. The
Macro Recorder records actions of the user and generates VBA code in the form of a
macro. These actions can then be repeated automatically by running the macro. The
macros can also be linked to different trigger types like keyboard shortcuts, a command
button or a graphic. The actions in the macro can be executed from these trigger types or
from the generic toolbar options. The VBA code of the macro can also be edited in the
VBE. Certain features, such as loop functions, screen prompts with their own properties,
and some graphical display items, cannot be recorded, but must be entered into the VBA
module directly by the programmer. Advanced users can employ user
prompts to create an interactive program, or react to events such as sheets being loaded
or changed.

   Users should be aware that macro-recorded code may not be compatible from one
version of Excel to another. Some code that is used in Excel 2010 cannot be used in
Excel 2003. A macro that changes cell colors, or makes changes to other aspects of cells,
may not be backward compatible.

   VBA code interacts with the spreadsheet through the Excel Object Model, a
vocabulary identifying spreadsheet objects, and a set of supplied functions or
methods that enable reading from and writing to the spreadsheet and interaction with its
users (for example, through custom toolbars or command bars and message boxes).

   User-created VBA subroutines execute these actions and operate like macros
generated using the macro recorder, but are more flexible and efficient.

5.1 Why VBA?

   Macros have been used as a development tool since the early days of the Microsoft
Office product line. Microsoft Access macros incorporate generalized database functions
using existing Microsoft Access capabilities. Errors in a macro can be easily resolved by
using the Microsoft-supplied Help function. The ease with which you can generate
macros makes macro development seem easy to accomplish.

   You can generate macros by selecting database operations and commands in the
Macro window. These macros can then be converted to Microsoft Access VBA. In most
cases, you need only make minor edits to the saved code in order to have a functional
program. All syntax, spacing and functionality are included in the saved file, which
contains VBA code specific to the particular application being recorded. Unskilled
programmers are able to interpret the code and learn how to generate code to
accomplish specific tasks. In the process, the novice programmer may gain a useful
introduction to VBA code. Building macros can be easier and faster than writing VBA
code for simple applications and for making global key assignments; however, more
advanced and complex applications are not so easily accomplished using macros.

   People tend to favor macros because VBA code is perceived to be more programmatic,
offering a variety of options that appear confusing and time-consuming
to understand. These options, however, provide developers with tools to extend
Microsoft Access capabilities beyond those packaged with the Microsoft Access
software. If building or generating macros comes easily and does not consume great
amounts of your time, you may want to consider their use, particularly if you want to
accomplish rather simple tasks. If, however, you find macros to be time consuming and
tedious, as many have attested to, you may want to consider building VBA code. By
learning and building upon VBA skills, you acquire a programming skill set that is
applicable and portable to various other applications. Macros, on the other hand, are
used in many applications, but they are specific to a particular application. Macros, in
most cases, are not portable to other applications.

   VBA is one of the more easy-to-learn programming languages. It does not require the
complex programming techniques that are necessary to program in C++ or other high-
level languages. VBA provides a user-friendly, forms-based interface to assign variables
and simplify code development. VBA is so widely used that help is available from a
variety of sources; a second party, by contrast, would have to know and understand
your particular application in order to assist you with building a macro.

   VBA can be used to perform any operation that a macro can perform. VBA also
allows you to perform a multitude of more advanced operations to include the
following:

       Incorporate error-handling modules to assist in the running of your applications.
       Integrate Word and Excel features in your database
       Present users with professional forms-based layouts to interface with your
       database
       Process data in the background
       Create multi-purpose forms
       Perform conditional looping
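
As a minimal sketch of the first item in this list (the sheet, cells, and message text are
illustrative), an error-handling block traps a run-time error and reports it instead of
crashing:

   Sub RunReport()
       ' A minimal sketch of VBA error handling.
       On Error GoTo ErrHandler
       ' This division fails at run time if cell B1 contains zero.
       Worksheets("Report").Range("A1") = 1 / Worksheets("Report").Range("B1").Value
       Exit Sub
   ErrHandler:
       MsgBox "Error " & Err.Number & ": " & Err.Description, vbExclamation
   End Sub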


5.2 Calculations without VBA

   Suppose you wish to compute the Black-Scholes formula in a spreadsheet.

   Suppose also that you have named cells for the stock price (s), strike price (k),
interest rate (r), time to expiration (t), volatility (v), and dividend yield (d). You could
enter the following into a cell:

=s*EXP(-d*t)*NORMSDIST((LN(s/k)+(r-d+v^2/2)*t)/(v*t^0.5))
 - k*EXP(-r*t)*NORMSDIST((LN(s/k)+(r-d-v^2/2)*t)/(v*t^0.5))

   Typing this formula is cumbersome, though of course you can copy the formula
wherever you would like it to appear. It is possible to use Excel's data table feature to
create a table of Black-Scholes prices, but this is cumbersome and inflexible. If you want
to calculate option Greeks (e.g. delta, gamma, etc.) you must again enter or copy the
formulas into each cell where you want a calculation to appear. And if you decide to
change some aspect of your formula, you have to hunt down all occurrences and make
the changes. When the same formula is copied throughout a worksheet, that worksheet
potentially becomes harder to modify in a safe and reliable fashion. When the worksheet
is to be used by others, maintainability becomes even more of a concern.

   Spreadsheet construction becomes even harder if you want to, for example, compute
a price for a finite-lived American option. There is no way to do this in one cell, so you
must compute the binomial tree in a range of cells, and copy the appropriate formulas
for the stock price and the option price.

   It is not so bad with a 3-step binomial calculation, but for 100 steps you will spend
quite a while setting up the spreadsheet. You must do this separately for each time you
want a binomial price to appear in the spreadsheet. And if you decide you want to set
up a put pricing tree, there is no easy way to edit your call tree to price puts. Of course
you can make the formulas quite flexible and general by using lots of "if" statements.
But things would become much easier if you could create your own formulas within
Excel.

   You can — with Visual Basic for Applications.
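
For instance, here is a minimal sketch of such a user-written function (the name and
parameter order are choices made for illustration, not a standard). Once it is placed in a
VBA module, =BSCall(s, k, v, r, t, d) can be entered in any cell just like a built-in
function:

   Function BSCall(s As Double, k As Double, v As Double, _
                   r As Double, t As Double, d As Double) As Double
       ' A minimal sketch of a user-defined Black-Scholes call price.
       Dim d1 As Double, d2 As Double
       d1 = (Log(s / k) + (r - d + 0.5 * v ^ 2) * t) / (v * Sqr(t))
       d2 = d1 - v * Sqr(t)
       BSCall = s * Exp(-d * t) * WorksheetFunction.NormSDist(d1) _
              - k * Exp(-r * t) * WorksheetFunction.NormSDist(d2)
   End Function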

5.3 Advantages of Using VBA

   VBA, or Visual Basic for Applications, is the simple programming language that can
be used within Excel 2007 (and earlier versions, though there are a few changes that have
been implemented with the Office 2007 release) to develop macros and complex
programs. Its advantages include:

         The ability to do what you normally do in Excel, but a thousand times faster
         The ease with which you can work with enormous sets of data
          To develop analysis and reporting programs downstream from large central
          databases such as Sybase and SQL Server, and from accounting, financial, and
          production programs such as Oracle, SAP, and others.
Macros save keystrokes by automating frequently used sequences of commands, and
developers use macros to integrate Office with enterprise applications - for example, to
extract customer data automatically from Outlook e-mails or to look up related
information in CRM systems or to generate Excel spreadsheets from data extracted from
enterprise resource planning (ERP) systems.

   To create an Excel spreadsheet with functionality beyond the standard defaults, you
write code. Microsoft Visual Basic is a programming environment that uses a computer
language to do just that. Although VBA is a language of its own, it is in reality derived
from the larger Visual Basic language developed by Microsoft, which is now the core
macro language for all Microsoft applications.

   To take full advantage of the functionality of the Microsoft Visual Basic environment,
there are many suggestions you can follow. Below we will take a look at a few hints and
tips for VBA security and protection in Excel; a more in-depth understanding can be
gained by attending a VBA Excel 2007 course delivered by a Microsoft certified trainer.

 Password protecting the code

   As a VBA Excel user you may want to protect your code so that nobody may modify
it and to protect against the loss of intellectual property if people access source code
without permission. This is easily achieved in the VBE editor by going to "Tools/VBA
Project Properties/Protection". Check the box and enter a password.

 Hiding worksheets

   In any or all of your Excel workbooks you might want to hide a worksheet that
contains sensitive or confidential information from the view of other users of the
workbook. If you just hide the worksheet in the standard way, the next user will be able
to simply un-hide it; but by using a VBA method to hide and password-protect a
worksheet, without protecting the entire workbook, you will be able to allow other users
access without affecting the confidentiality of the data.
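
A minimal sketch of the technique (the sheet name and password are placeholders):
setting a sheet's Visible property to xlSheetVeryHidden removes it even from the
Unhide list, so only code can bring it back.

   Sub HideSensitiveSheet()
       ' xlSheetVeryHidden: the sheet no longer appears in the Unhide list.
       Worksheets("Confidential").Visible = xlSheetVeryHidden
   End Sub

   Sub ShowSensitiveSheet()
       ' Re-display the sheet, e.g. after asking for a password.
       If InputBox("Enter password:") = "secret" Then
           Worksheets("Confidential").Visible = xlSheetVisible
       End If
   End Sub

Combine this with password protection of the VBA project itself (as above), or the
password will be visible to anyone who opens the code.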

 Protecting workbooks

   There are different levels of protection for workbooks, from not allowing anyone
access to the workbook to not allowing any changes to be made to it, i.e. setting the
security to 'read only' so that no changes can be made to the templates you have created.




6. Miscellany

 Getting Excel to generate your macros for you

   Suppose you want to perform a task and you don't have a clue how to program it in
VBA. For example, suppose you want to create a subroutine to set up a graph. You can
set up a graph manually, and tell Excel to record the VBA commands which accomplish
the same thing. You then examine the result and see how it works. To do this, select
Tools | Record Macro | Record New Macro. Excel will record all your actions in a new
module located at the end of your workbook, i.e. following Sheet16. You stop the
recording by clicking the Stop button which should have appeared on your spreadsheet
when you started recording. Macro recording is an extremely useful tool for
understanding how Excel and VBA work and interact; this is in fact how the Excel
experts learn how to write macros which control Excel's actions.

   For example, here is the macro code Excel generates if you use the chart wizard to set
up a chart using data in the range A2:C4. You can see, among other things, that the
selected graph style was the fourth line graph in the graph gallery, and that the chart
was titled "Here is the Title". Also, each data series is in a column, and the first
column was used as the x-axis ("CategoryLabels:=1").

' Macro1 Macro
' Macro recorded <Date> by <UserName>

Sub Macro1()
    Range("A2:C4").Select
    ActiveSheet.ChartObjects.Add(196.5, 39, 252.75, 162).Select
    ActiveChart.ChartWizard Source:=Range("A2:C4"), Gallery:=xlLine, _
        Format:=4, PlotBy:=xlColumns, CategoryLabels:=1, SeriesLabels:=0, _
        HasLegend:=1, Title:="Here is the Title", CategoryTitle:="X-Axis", _
        ValueTitle:="Y-Axis", ExtraTitle:=""
End Sub


 Using multiple modules

   You can split up your functions and subroutines among as many modules as you like
—functions from one module can call another, for example. Using multiple modules is
often convenient for clarity. If you put everything in one module you will spend a lot of
time scrolling around.

 Re-calculation speed

   One unfortunate drawback of VBA—and of most macro code in most applications
—is that it is slow. When you are using built-in functions, Excel performs clever internal
checking to know whether something requires recalculation (you should be aware that
on occasion it appears that this clever checking goes awry and something which should
be recalculated, isn't).

   When you write a custom function, however, Excel is not able to perform its checking
on your functions, and it therefore tends to recalculate everything. This means that if
you have a complicated spreadsheet, you may find very slow recalculation times. This is
a problem with custom functions and not one you can do anything about.

   There are tricks for speeding things up. Here are two:

• If you are looping a great deal, be sure to declare your looping index variables as
   integers. This will speed up Excel's handling of these variables. For example, if you
   use i as the index in a for loop, use the statement

   Dim i As Integer

• While a lengthy subroutine is executing, you may wish to turn off Excel's screen
   updating. You do this with the statement

   Application.ScreenUpdating = False




   This will only work in subroutines, not in functions. If you want to check the
progress of your calculations, you can turn screen updating off at the beginning of your
subroutine. Whenever you would like to see your calculation's progress (for example
every 100th iteration), you can turn it on and then immediately turn it off again. This
will update the display.

• Finally, here is a good thing to know: Ctrl-Break will (usually) stop a recalculation!
   Remember this. Someday you will thank me for it. Ctrl-Break is more reliable if your
   macro writes output to the screen or spreadsheet.

 Debugging

   We will not go into details here, but VBA has very sophisticated debugging
capabilities. For example, you can set breakpoints (i.e. lines in your routine where Excel
will stop calculating to give you a chance to see what is happening) and watches (which
means that you can look at the values of variables at different points in the routine).
Look up "debugging" in the online help.

 Creating an Add-in

   Suppose you have written a useful set of option functions and wish to make them
broadly available in your spreadsheets. You can make the functions automatically
available in any spreadsheet you write by creating an add-in. To do this, you simply
switch to a macro module and then Tools | Make Add-In. Excel will create a file with the
XLA extension which contains your functions. You can then make these functions
automatically available by Tools | Add-Ins, browsing to locate your own add-in
module if it does not appear on the list.

   Any functions available through an add-in will automatically show up in the
function list under the set of "user-defined" functions.




7. Simulation Example

   Suppose you have a large amount of money to invest. Suppose that at the end of the
next five years you wish to have it fully invested in stocks. It is often asserted in the
popular press that it is preferable to invest it in the market gradually, rather than all at
once. In particular, consider the strategy of each quarter taking a pro rata share of what
is left and investing it in stocks: the first quarter invest 1/20th in stocks, the second
quarter invest 1/19th of the money remaining, and so on. It is obvious that the strategy
in which we invest in stocks over time should have a smaller average return and a lower
standard deviation than a strategy in which we plunge into stocks, but how much lower
and smaller? Monte Carlo simulation is a natural tool to address a question like this. We will
first see how to structure the problem and then analyze it in Excel. You may not
understand the details of how the random stock price is generated. That does not matter
for purposes of this example; rather, the important thing is to understand how the
problem is structured and how that structure is translated into VBA.

7.1 What is the algorithm?
   To begin, we describe the investment strategy and the evolution over time of the
portfolio. Suppose we initially have $100 which is invested in bonds and nothing
invested in stock. Let the variables BONDS and STOCK denote the amount invested in
each. Let h be the fraction of a year between investments in stock (so, for example, if h =
0.25, there are 4 transfers per year from bonds to stock), and let r, µ, and σ denote the
risk-free rate, the expected return on the stock, and the volatility of the stock.

   Suppose we switch from bonds to stock 20 times, once a quarter for 5 years. Let n =
the number of times we will switch. We need to know the stock price each time we
switch. Denote these prices by PRICE(0), PRICE(1), PRICE(2), ..., PRICE(20). Now each
period, at the beginning of the period, we first switch some funds from bonds to stock;
at the end of the period, we figure out how much we earned over the period.




   This example is considerably more complicated than those that precede it. It is
designed to illustrate many of the basic concepts in a non-trivial fashion. You may wish
to skip it initially, and return to it once you have had some experience with VBA.

   If you are thinking about option pricing, you might expect this example to be
computed using the risk-neutral distribution. Instead, we will compare the actual payoff
distributions of the two strategies in order to compare the means and standard
deviations. If we wished to value the two strategies, we would substitute the risk-neutral
distribution by replacing the 15% expected rate of return with the 10% risk-free rate.
After making this substitution, both strategies would have the same expected payoff of
$161 (100 × 1.1^5). Since both strategies entail buying assets at a fair price, there is no
need to perform a valuation! Both will be worth the initial investment.

   If we wish to switch a roughly constant proportion each period, we could switch
1/20 the first period, 1/19 with 19 periods to go, and so forth. This suggests that at the
beginning of period j,

bonds(j) = bonds(j-1) * (1 - 1/(n+1-j))
stock(j) = stock(j-1) + bonds(j-1)/(n+1-j)

   At the end of the period we have

stock(j) = stock(j) * price(j)/price(j-1)
bonds(j) = bonds(j) * exp(r*h)

   In words, during period j, we earn interest on the bonds and capital gains on the
stock. We can think of the STOCK(j) and BONDS(j) on the right-hand side as denoting
beginning-of-period values, after we have allocated some dollars from bonds to stock,
and the values on the left-hand side as the end-of-period values, after we have earned
interest and capital gains. We compute capital gains on the stock by

price(j) = price(j-1) * Exp((mu - 0.5 * v ^ 2) * h + v * h ^ 0.5 *
WorksheetFunction.NormSInv(Rnd()))


   As mentioned above, it is not important if you do not understand this expression. It
is the standard way to create a random lognormally distributed stock price, where the
expected return on the stock is mu, the volatility is v, and the length of a period is h. At
the end, when j = n, we will invest all remaining bonds in stock, and earn returns for one
final period.

   This describes the stock and bond calculations for a single set of randomly drawn
lognormal prices. Now we want to repeat this process many times. Each time, we will
save the results of the trial and use them to compute the distribution.

7.2 VBA code for this example.
   We will set this up as a subroutine. The first several lines in the routine simply
activate the worksheet where we will write the data, and then clear the area. We need
two columns: one to store the terminal portfolio value if we invest fully in stock at the
outset, the other to store the terminal value if we invest slowly. Note that we have set it
up to run 2000 trials, and we also clear 2000 rows. We tell VBA that the variables bonds,
stock, and price are going to be arrays of type Double, but we do not yet know what
size to make the arrays. The Worksheets("Invest Output").Activate statement makes the
"Invest Output" worksheet the default worksheet, so that all reading and writing will be
done to it unless another worksheet is specified.

   Sub Monte_invest()
   Dim bonds() As Double
   Dim stock() As Double
   Dim price() As Double
   Worksheets("Invest Output").Activate
   Range("A1:B2000").Select
   Selection.Clear
   ' number of Monte Carlo trials
   iter = 2000


   Now we set the parameters. The risk-free rate, the mean return on the stock, and the
volatility are all annual numbers. We invest each quarter, so h = 0.25.

   There are 20 periods to keep track of, since we invest each quarter for 5 years. Note
that once we specify 20 periods, we can dimension the bonds, stock, and price variables
to run from 0 to 20. We do this using the ReDim command.

   ' number of reinvestment periods
   n = 20
   ' reset the dimensions of the bonds, stock, and price variables
   ReDim bonds(0 To n), stock(0 To n), price(0 To n)
   ' length of each period
   h = 0.25
   ' expected return on stock
   mu = 0.15
   ' risk-free interest rate
   r = 0.1
   ' volatility
   v = 0.3

   Now we have an outer loop. Each time through this outer loop, we have one trial, i.e.
we draw a series of 20 random stock prices and we see what the terminal payoff is from
our two strategies.

   Note that before we run through a single trial we have to initialize our variables: we
have $100 of bonds and no stock, and price(0), which is the initial stock price, is set
to 100.

   ' each time through this loop is one complete iteration
   For i = 1 To iter
       price(0) = 100
       bonds(0) = 100
       stock(0) = 0

   This is the heart of the program. Each period for 20 periods we perform our
allocation as above. Note that we draw a new random stock price using our standard
lognormal expression.


       For j = 1 To n
           ' allocate a pro rata share of the remaining bonds to stock
           stock(j) = stock(j - 1) + bonds(j - 1) / (n + 1 - j)
           bonds(j) = bonds(j - 1) * (1 - 1 / (n + 1 - j))

           ' draw a new lognormal stock price
           price(j) = price(j - 1) * Exp((mu - 0.5 * v ^ 2) * h + _
               v * h ^ 0.5 * WorksheetFunction.NormSInv(Rnd()))

           ' earn returns on bonds and stock
           bonds(j) = bonds(j) * Exp(r * h)
           stock(j) = stock(j) * (price(j) / price(j - 1))
       Next j

   Once through this loop, all that remains is to write the results to the "Invest Output"
worksheet. The following two statements do that, by writing the terminal price to
column 1, row i, and the value of the terminal stock position to column 2, row i.

       ActiveSheet.Cells(i, 1) = price(n)
       ActiveSheet.Cells(i, 2) = stock(n)
   Next i
   End Sub

   Note that you could also write the data across in columns; you would do this by
writing

   ActiveSheet.Cells(1, i) = price(n)

   This would write the terminal price across the first row.

7.3 A trick to speed up the calculations
   Modify the outer loop by adding the two lines referring to ScreenUpdating:

   ' each time through this loop is one complete iteration
   For i = 1 To iter
       Application.ScreenUpdating = False
       ...
       If (i Mod 100 = 0) Then Application.ScreenUpdating = True
       ActiveSheet.Cells(i, 1) = price(n)
       ActiveSheet.Cells(i, 2) = stock(n)
   Next i
   The first line prevents Excel from updating the display as the subroutine is run. It
turns out that it takes Excel quite a lot of time to redraw the spreadsheet and graphs
when numbers are added.

   The second line redraws the spreadsheet every 100 iterations. The Mod operator
returns the remainder from dividing the first number by the second. Thus, i Mod 100
will equal 0 whenever i is evenly divisible by 100. So on iteration numbers 100, 200, and
so on, the spreadsheet will be redrawn. This cuts the calculation time approximately in
half.

   Note that Application.ScreenUpdating is an example of a command which only
works within a subroutine. It will not work within a function.




8. Reporting in Excel

   Excel was designed to provide powerful, easy-to-use tools for transforming
quantitative analysis into visual representations, and it remains an excellent and
extremely efficient way for business analysts to share the results of their work. Familiar,
accessible, and widely available, Excel makes it relatively easy to generate attractive,
flexible presentations that can be widely distributed. Reports created in Excel also make
it easy for others to access the underlying data, cut and paste it into their own
spreadsheets, and make full use of the data in subsequent work.

   Integrating analysis into Excel is fast, easy, and reliable and can be done in any of
three different ways, depending on the way the results will be used.

    Scheduled Reports: For static reports that are updated on a regular basis (daily,
      weekly, or monthly, for example), a file-based solution is ideal. The process begins
      by integrating data sources into Excel. Once the computation is complete, VBA
      scripts automatically generate tables and graphics. They are delivered to Excel in
      comma-separated files and Windows metafiles, using a simple VB script to embed
      the results directly into a preformatted report.

    Interactive Desktop Applications: Where more interactivity is required, an Excel
      add-in can be created that includes menus and dialogs for controlling the
      parameters of the report and the data to be analyzed. A VB script is created to
      run the analysis based on the chosen parameters. A call from Excel to VB is then
      made using a COM API that initiates the script and inserts the results into the
      report. This option is best suited to dynamic reports that are distributed to
      relatively small numbers of end users.

    Client-Server Applications: The server-based option is ideal in situations where
      interactive reports are created or accessed by larger numbers of users and where
      the ability to change the underlying analytics quickly, and distribute them widely,
      is desired. Similar to the client-based solution, an Excel add-in is created that
      includes menus and dialogs for controlling the parameters of the analysis. Excel
      then uses an HTTP API to call a remote server where the VB script is run. Results
      are then inserted into Excel. This server-based approach enables organizations to
      take advantage of the power of server-based distributed technology to generate
      and disseminate analytics.
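
As a minimal sketch of the file-based (scheduled report) case, where the file path and
sheet name are assumptions, a VBA routine might pull the precomputed comma-separated
results into a preformatted report sheet:

   Sub ImportScheduledResults()
       ' A minimal sketch: import a CSV of precomputed results into a
       ' preformatted report sheet. Path and sheet name are placeholders.
       Dim ws As Worksheet
       Set ws = Worksheets("Report")
       With ws.QueryTables.Add(Connection:="TEXT;C:\reports\results.csv", _
                               Destination:=ws.Range("A5"))
           .TextFileParseType = xlDelimited
           .TextFileCommaDelimiter = True
           .RefreshStyle = xlOverwriteCells
           .Refresh BackgroundQuery:=False
       End With
   End Sub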




9. Attributes of Good VBA Models

   While VBA models can be widely different from one another, all good ones need to
have certain common attributes. In this section I briefly describe the attributes that you
should try to build into your models. Some of these apply to Excel models as well. I am
including them both here and under Excel so that you can have comprehensive lists of
the attributes at both places.

    Realistic

   Most models you develop will be directly or indirectly used to make some decisions.
The output of the model must therefore be realistic. This means that the assumptions,
mathematical relationships, and inputs you use in the model must be realistic. For most
"real-world" models, making sure of this takes a lot of time and effort, but this is not a
place where you should cut corners. If a model does not produce realistic outputs, it
does not matter how good its outputs look or how well it works otherwise.

    Error-Free

   It is equally important that a model be error-free. You must test a model extensively
to make sure of this. It is generally much easier to fix a problem when a model just does
not work or produces obviously wrong answers. It is much harder to find errors that are
more subtle and occur for only certain combinations of input values. See the chapter on
debugging for help on making your models error-free.

    Flexible

   The more different types of questions a model can answer, the more useful it is. In the
planning stage, you should try to anticipate the different types of questions the model is
likely to be used to answer. You then do not have to make major changes every time
someone tries to use it for something slightly different.




 Easy to Provide Inputs

   Most VBA models need inputs from the user, and the easier it is for the user to
provide the inputs, the better. Generally a VBA model can get inputs either through
input dialog boxes (that is, through the InputBox function) or by reading them in from a
spreadsheet (or database).

   Using input dialog boxes to get input data works well when there are only a few
inputs—probably five or fewer. If the model needs more inputs, it is better to set up an
input area in a spreadsheet (or, for large models, even a separate input spreadsheet)
where the user can enter the input data before running the model.

   This approach is particularly helpful if the user is likely to change only one or two
inputs from one run to the next. If a model uses a large number of input dialog boxes,
the user will have to enter data in each of them every time he runs the model—even if he
wants to change only one or two inputs. However, if the user has to provide some input
(based on some intermediate outputs) while a procedure is running, then using input
dialog boxes is the only option.
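
A minimal sketch of the two input styles (the names, cells, and prompts are illustrative):

   Sub GetInputs()
       Dim rate As Double, years As Double
       ' 1. Dialog-box input: fine for a handful of values.
       rate = CDbl(InputBox("Enter the annual interest rate (e.g. 0.05):"))
       ' 2. Spreadsheet input: better when there are many values, or when
       '    only one or two change between runs.
       years = Worksheets("Inputs").Range("B2").Value
       MsgBox "Rate = " & rate & ", Years = " & years
   End Sub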

   If the model uses input dialog boxes, the prompt should provide enough specific
information to help the user enter the right data in the right format. Similarly, if the
input data is to be provided in certain cells in a spreadsheet, then there should be
enough information in the next cell (or nearby) to help the user enter the right data in the
right format.

    Good Output Production

   A model that does not produce good outputs that get its major results across
persuasively is not as useful as it could be. Producing reports with VBA models is
generally a two-step process: the model produces outputs on spreadsheets and then parts or
all of the spreadsheets have to be printed out. For printed outputs good models should
include built-in reports (in Excel) that any user can produce easily. The spreadsheet
outputs produced by a VBA model should be such that they do not require too much
manipulation before creating printed reports. These reports should be attractive, easy to
read, and uncluttered. Avoid trying to squeeze in too much information on one page. If a
report has to include a lot of data, organize it in layers so that people can start by looking
at summary results and then dig into additional details as necessary. One of the
advantages of VBA compared to other programming languages is that it can produce
excellent graphical outputs using Excel's charting features. VBA models should include
graphical outputs wherever they will enhance the usefulness of the models.

   Another thing to keep in mind is that, unlike an Excel model, a VBA model does not
show intermediate results (except through message boxes, spreadsheet outputs, or
charts). The modeler should therefore anticipate what output—intermediate and
final—the user may want to see and provide for it in the model.

    Data Validations

   It is generally more important to provide thorough data validation in VBA models
than it is in Excel models. If the user accidentally enters invalid data, most of the time the
model simply will not run; but it will not provide any useful information on what the
problem is, leaving the user in a helpless situation.

   You can, of course, have the VBA code check input data for various possible errors
before using them. A simple alternate approach is to have the input data read in from
spreadsheets and provide data validation to the input cells on the spreadsheet using
Excel's Data Validation feature. (To keep the code short and to avoid repeating the
same lines of code, I have generally omitted data validation in the models shown here.
Instead of writing data validation code repeatedly, you can create and keep a few Sub
procedures for the types of data validation you need for the models you work with
most often, and call them as needed.)
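
A minimal sketch of the first approach (the variable, bounds, and message are
illustrative), written as a reusable Sub procedure of the kind described above:

   Sub CheckVolatility(v As Double)
       ' Validate an input before the model runs with it.
       If v <= 0 Or v > 2 Then
           MsgBox "Volatility must be between 0 and 2. Got: " & v, vbCritical
           End   ' halt execution rather than continue with bad data
       End If
   End Sub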




 Judicious Formatting

   The formatting here refers to formatting of the model’s output. Poor, haphazard
formatting reduces a model’s usefulness because it is distracting. Use formatting (fonts,
borders, patterns, colors, etc.) judiciously to make your model’s outputs easier to
understand and use. (As much as possible, create the formatting parts of the code by
recording them with the Macro Recorder.)

    Appropriate Numbers Formatting

   In the model's outputs, you should format numbers with the minimum number of
decimal points necessary. Using too many decimal points makes numbers difficult to
read and often gives a false sense of precision as well. Make the formatting of similar
numbers uniform throughout the output. (Remember that displaying numbers with
fewer decimal points does not reduce the accuracy of the model in any way, because
internally Excel and VBA continue using the same number of significant digits.)

   Wherever appropriate, make numbers more readable by using special formatting to
show them in thousands or millions.

    Well Organized and Easy to Follow

   The better organized a model is, the easier it is to follow and update. The key to
making your code organized is to break it down into segments, each of which carries out
one distinct activity or set of computations. One way to accomplish this is to use separate
Sub procedures and Function procedures for many such segments, especially the ones
that will be repeated many times. In the extreme, the main Sub procedure may simply
consist of calls to other Sub procedures and Function procedures. An additional
advantage of this approach is that you can develop a number of Sub procedures and
Function procedures to do things that you often need to do and incorporate them in
other codes as needed.
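
A minimal sketch of this structure (the procedure names are illustrative):

   Sub Main()
       ' The main procedure is just one call per distinct activity.
       ReadInputs
       RunCalculations
       WriteReport
   End Sub

   Sub ReadInputs()
       ' ... read assumptions from the input sheet
   End Sub

   Sub RunCalculations()
       ' ... do the number-crunching
   End Sub

   Sub WriteReport()
       ' ... produce the formatted output
   End Sub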




Using structured programming also makes a code easier to follow. In a structured
program, the procedure is segmented into a number of stand-alone units, each of which
has only one entry and one exit point. Control does not jump into or exit from the
middle of these units.

   The proper visual design of a code can also make it easier to follow. For example,
statements should be properly indented to show clearly how they fit into the various If,
For, and other structures. Similarly, each major and minor segment of the code should be
separated by blank lines and other means and informatively labeled. (The easiest way to
learn these techniques is by imitating well-written codes.)

    Statements Are Easy to Read and Understand

   Experienced programmers try to make their codes as concise as possible, often using
obscure features of the programming language. Such codes may be admired by other
experienced programmers, but they often baffle beginners.

   With the high speed of modern PCs, code does not usually have to be concise or
highly efficient. It is best to aim for code that is easy to understand, even if that means
it has more lines than is absolutely necessary.

   Avoid writing long equations whenever you can. Break them up by doing long
calculations in easily understandable steps. Make all variable names short but
descriptive, not cryptic. If in a large model you decide to use a naming scheme, try to
make it intuitive and provide an explanation of the scheme in the documentation.

    Robust

   "Robust" here refers to code that is resistant to "crashing." It often takes significant
extra work to make code "bulletproof," and that time and effort may not be justified for
much of the code you will write. Nonetheless, your code should guard against obvious
problems. For example, unless specified otherwise, VBA always works with the
currently active worksheet, so throughout a code you should make sure that the right
worksheet is active at the right time, or else precede cell addresses and the like with the
appropriate worksheet reference. Using effective data validation for the input data is
another way of making your code robust.
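
   A short sketch of the difference (the sheet name "Data" is assumed):

    Sub RobustReference()
        Dim total As Double
        ' Fragile: reads from whichever sheet happens to be active
        ' total = Range("B10").Value
        ' Robust: qualify the range with an explicit worksheet reference
        total = ThisWorkbook.Worksheets("Data").Range("B10").Value
    End Sub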

    Minimum Hard Coding

   Hard-coded values are difficult to change, especially in large models, because there
is always the danger of missing them in a few places. It is best to set up any value that
may have to be changed later as a variable and to use the variable in all equations.

   Even for values that are not going to change, it is better to define constants and use
them in the equations. This makes the equations easier to read and guards against the
possibility of typing in a wrong number.
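
   For instance (names and values hypothetical), constants are defined once and then
used wherever the values are needed:

    ' Defined once near the top of the module
    Const TAX_RATE As Double = 0.35
    Const MONTHS_PER_YEAR As Long = 12

    Sub NetIncomeExample()
        Dim grossIncome As Double, netIncome As Double, monthlyNet As Double
        grossIncome = 120000                        ' sample input
        netIncome = grossIncome * (1 - TAX_RATE)    ' clearer than * 0.65
        monthlyNet = netIncome / MONTHS_PER_YEAR
    End Sub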

    Good Documentation

   Good documentation is key to understanding VBA models and is a must for all but
trivial ones. For hints on producing good documentation, see the next section.

9.1 Documenting VBA Models

   Documenting a model means recording in writing, diagrams, flowcharts, and so on,
the information someone else (or you in the future) will need to figure out what the
model does, how it is structured, and what assumptions are built into it, so that a user
can make changes to (update) it if necessary. The documentation should also include,
for example, notes on any shortcuts you have taken for now that should be fixed later,
and any assumptions or data you have used that may need to be updated later.

   There is no standard format or structure for documenting a model. You have to be
guided by the objectives mentioned above. Here are some common approaches to
documenting your VBA models. Every model needs to be documented differently and
everyone does documentation differently. Over time you will develop your own style.




    Including Comments in the Code

   The most useful documenting tool in VBA is comments. Comments are notes and
reminders you include at various places in the VBA code. You indicate a comment with
an apostrophe: except when it occurs in text within quotation marks, VBA interprets an
apostrophe as the beginning of a comment and ignores the rest of the line. You can use
an entire line or a block of lines for comments, or you can put a comment after a
statement in a line (for example, to explain something about the statement).
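
   Both forms in a short sketch:

    Sub CommentForms()
        ' This entire line is a comment describing the statements below
        Dim unitCost As Double, quantity As Long, totalCost As Double
        unitCost = 2.5: quantity = 4
        totalCost = unitCost * quantity    ' an inline comment after a statement
    End Sub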

   You should include in your code all the comments that may be helpful, but do not go
overboard and comment on things that are obvious. A lot of superfluous comments can
make code harder rather than easier to read. Here are some ideas on the types of
comments you may want to include in your code:


      At the beginning of a procedure include a brief description of what the code does.
      At times it may also be useful to list the key inputs and outputs and some other
      information as well.
       Every time significant changes are made to the code, insert comments near the
       beginning of the code, below the code description, to keep track of the change
       date, the important changes made at that time, and who made them. Sometimes
       it also helps to insert additional comments above or next to the statement(s) that
       have been changed, explaining what was changed, why, by whom, and when.
      If the procedure uses a particular variable naming scheme, then use comments to
      explain it.
      Use distinctive comment lines (for example, '*********) to break down long
      procedures into sections, and at the beginning of each section include a short
      name or description of the section.
      Use comments next to a variable to explain what it stands for, where its value
      came from, and anything else that may be helpful.


       You can get more ideas about what kinds of comments to include in your code
       from the examples in this and other books. Over time you will develop your own
       style of providing comments in code.


   Make sure you insert comments as you code. If you put it off until later, your
comments may not be as useful, inserting them may take longer because you may have
to spend time trying to remember things, and worst of all, you may never get around to
it. If you do not include good comments in your code, modifying it a few months later
may take much longer.
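
   Putting several of these ideas together, a procedure header might look something like
this (all contents hypothetical):

    Sub UpdateForecast()
    '**************************************************************
    ' Purpose : Recalculates the sales forecast worksheet.
    ' Inputs  : Assumptions on the "Inputs" worksheet.
    ' Outputs : Updated tables on the "Forecast" worksheet.
    ' Changes : 2010-03-12  R.S.  Added the seasonality adjustment.
    '**************************************************************
        ' (body of the procedure)
    End Sub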

    Documenting Larger Models

   If you are developing a large model and saving different versions of the workbook
as I have suggested, then the workbook should include a worksheet titled "Version
Description." In this worksheet, list against each version number the major changes you
made to the code in that version. Every time you save your work under a new version
name, start a new row of description under that version number in the Version
Description worksheet and keep adding to it as you make major changes. The key is to
do this as you go along and not wait until later, when you may forget some of the
changes you made. This is essentially the history (log) of the model’s development. If
you ever want to go back to an earlier stage and go in a different direction from there,
the log will save you a lot of time. Also, you may want to have several different versions
of a model; you can document here how they differ from each other.

   For large models, you may also need to create a book of formal documentation
(which will include information on why and how certain modeling decisions were made,
flow charts for the model, etc.) and a user’s manual. For most of your work, however,
documentation of the type I discussed should be adequate.




10. Caveats

   We used Excel to do some basic data analysis tasks to see whether it is a reasonable
alternative to using a statistical package for the same tasks. We concluded that Excel is a
poor choice for statistical analysis beyond textbook examples, the simplest descriptive
statistics, or for more than a very few columns. The problems we encountered that led to
this conclusion fall into four general areas:


       Missing values are handled inconsistently, and sometimes incorrectly.
       Data organization differs according to analysis, forcing you to reorganize your
       data in many ways if you want to do many different analyses.
       Many analyses can only be done on one column at a time, making it inconvenient
       to do the same analysis on many columns.
       Output is poorly organized, sometimes inadequately labeled, and there is no
       record of how an analysis was accomplished.


   Excel is convenient for data entry, and for quickly manipulating rows and columns
prior to statistical analysis. However, when you are ready to do the statistical analysis,
we recommend the use of a statistical package such as SAS, SPSS, Stata, Systat or
Minitab.

   Excel is probably the most commonly used spreadsheet for PCs. Newly purchased
computers often arrive with Excel already loaded. It is easily used to do a variety of
calculations, and it includes a collection of statistical functions and a Data Analysis ToolPak.
As a result, if you suddenly find you need to do some statistical analysis, you may turn
to it as the obvious choice. We decided to do some testing to see how well Excel would
serve as a Data Analysis application.

   To present the results, we will use a small example. The data for this example is
fictitious. It was chosen to have two categorical and two continuous variables, so that we
could test a variety of basic statistical techniques. Since almost all real data sets have at
least a few missing data points, and since the ability to deal with missing data correctly
is one of the features that we take for granted in a statistical analysis package, we
introduced two empty cells in the data:

                    Treatment     Outcome      X          Y
                              1            1       10.2        9.9
                              1            1        9.7
                              2            1       10.4       10.2
                              1            2        9.8        9.7
                              2            1       10.3       10.1
                              1            2        9.6        9.4
                              2            1       10.6       10.3
                              1            2        9.9        9.5
                              2            2       10.1         10
                              2            2                  10.2



   Each row of the spreadsheet represents a subject. The first subject received Treatment
1, and had Outcome 1. X and Y are the values of two measurements on each subject. We
were unable to get a measurement for Y on the second subject, or on X for the last
subject, so these cells are blank. The subjects are entered in the order that the data
became available, so the data is not ordered in any particular way.

   We used this data to do some simple analyses and compared the results with a
standard statistical package. The comparison considered the accuracy of the results as
well as the ease with which the interface could be used for bigger data sets - i.e., more
columns. We used SPSS as the standard, though any of the statistical packages
mentioned would do equally well for this purpose. In this article, when we say "a
statistical package," we mean SPSS, SAS, STATA, SYSTAT, or Minitab.

   Most of Excel's statistical procedures are part of the Data Analysis ToolPak, which is
found on the Tools menu. It includes a variety of choices: simple descriptive statistics,
t-tests, correlations, 1- or 2-way analysis of variance, regression, etc. If you do not have a
Data Analysis item on the Tools menu, you need to install the Data Analysis
ToolPak. Search in Help for "Data Analysis Tools" for instructions on loading the
ToolPak.

   Two other Excel features are useful for certain analyses, but the Data Analysis
ToolPak is the only one that provides reasonably complete tests of statistical significance.
Pivot Table in the Data menu can be used to generate summary tables of means,
standard deviations, counts, etc. Also, you could use functions to generate some
statistical measures, such as a correlation coefficient. Functions generate a single
number, so you will likely have to combine bits and pieces to get what you want. Even
so, you may not be able to generate all the parts you need for a complete analysis.

   Unless otherwise stated, all statistical tests using Excel were done with the Data
Analysis ToolPak. In order to check a variety of statistical tests, we chose the following
tasks:


         Get means and standard deviations of X and Y for the entire group, and for each
         treatment group.
         Get the correlation between X and Y.
         Do a two sample t-test to test whether the two treatment groups differ on X and
         Y.
         Do a paired t-test to test whether X and Y are statistically different from each
         other.
         Compare the number of subjects with each outcome by treatment group, using a
         chi-squared test.


   All of these tasks are routine for a data set of this nature, and all of them could easily
be done using any of the statistical packages listed above.




10.1 General Issues

Enable the Analysis ToolPak

   The Data Analysis ToolPak is not installed with the standard Excel setup. Look in
the Tools menu. If you do not have a Data Analysis item, you will need to install the
Data Analysis tools. Search Help for "Data Analysis Tools" for instructions.


Missing Values

   A blank cell is the only way for Excel to deal with missing data. If you have any
other missing value codes, you will need to change them to blanks.

Data Arrangement

   Different analyses require the data to be arranged in various ways. If you plan on a
variety of different tests, there may not be a single arrangement that will work. You will
probably need to rearrange the data several ways to get everything you need.


Dialog Boxes

   Choose Tools/Data Analysis, and select the kind of analysis you want to do. The
typical dialog box will have the following items:

   Input Range: Type the upper left and lower right corner cells. e.g. A1:B100. You can
only choose adjacent rows and columns. Unless there is a checkbox for grouping data by
rows or columns (and there usually is not), all the data is considered as one glop.

   Labels - There is sometimes a box you can check off to indicate that the first row of
your sheet contains labels. If you have labels in the first row, check this box, and your
output MAY be labeled with your label. Then again, it may not.

   Output location - New Sheet is the default. Alternatively, type in the cell address of
the upper left corner of where you want to place the output in the current sheet. New
Workbook is another option, which I have not tried. Ramifications of this choice are
discussed below. Other items appear depending on the analysis.


Output location

   The output from each analysis can go to a new sheet within your current Excel file
(this is the default), or you can place it within the current sheet by specifying the upper
left corner cell where you want it placed. Either way is a bit of a nuisance. If each
output is in a new sheet, you end up with lots of sheets, each with a small bit of output.
If you place them in the current sheet, you need to position them appropriately and
leave room for adding comments and labels; changes you make to format one output
properly may affect another output adversely. Example: output from Descriptives has a
column of labels such as Standard Deviation, Standard Error, etc. You will want to make
this column wide in order to be able to read the labels. But if a simple Frequency output
is right underneath, then the column displaying the values being counted, which may
just contain small integers, will also be wide.


10.2 Results of Analyses

Descriptive Statistics

   The quickest way to get means and standard deviations for an entire group is to use
Descriptives in the Data Analysis tools. You can choose several adjacent columns for the
Input Range (in this case the X and Y columns), and each column is analyzed separately.
The labels in the first row are used to label the output, and the empty cells are ignored. If
you have other, non-adjacent columns to analyze, you will have to repeat the process for
each group of contiguous columns. The procedure is straightforward, can manage many
columns reasonably efficiently, and empty cells are treated properly.

   To get the means and standard deviations of X and Y for each treatment group
requires the use of Pivot Tables (unless you want to rearrange the data sheet to separate
the two groups). After selecting the (contiguous) data range, in the Pivot Table Wizard's
Layout option, drag Treatment to the Row variable area, and X to the Data area. Double-
click on "Count of X" in the Data area, and change it to Average. Drag X into the Data
box again, and this time change Count to StdDev. Finally, drag X in one more time,
leaving it as Count of X. This will give us the average, standard deviation, and number
of observations in each treatment group for X. Do the same for Y to get the average,
standard deviation, and number of observations for Y as well. This will put a total of six
items in the Data box (three for X and three for Y). As you can see, if you want to get a
variety of descriptive statistics for several variables, the process gets tedious.
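
   The same layout can also be built programmatically; a sketch, assuming the data
(with header labels) occupy A1:D11 of a sheet named "Data" and a reasonably recent
Excel object model:

    Sub GroupStats()
        Dim pc As PivotCache, pt As PivotTable
        ' Build a pivot cache over the data, then a pivot table beside it
        Set pc = ActiveWorkbook.PivotCaches.Create( _
            SourceType:=xlDatabase, _
            SourceData:=Worksheets("Data").Range("A1:D11"))
        Set pt = pc.CreatePivotTable( _
            TableDestination:=Worksheets("Data").Range("F1"))
        With pt
            .PivotFields("Treatment").Orientation = xlRowField
            ' Add X to the data area three times, once per statistic
            .AddDataField .PivotFields("X"), "Average of X", xlAverage
            .AddDataField .PivotFields("X"), "StdDev of X", xlStDev
            .AddDataField .PivotFields("X"), "N of X", xlCount
        End With
    End Sub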

   A statistical package lets you choose as many variables as you wish for descriptive
statistics, whether or not they are contiguous. You can get the descriptive statistics for all
the subjects together, or broken down by a categorical variable such as treatment. You
can select the statistics you want to see once, and it will apply to all variables chosen.


Correlations

   Using the Data Analysis tools, the dialog for correlations is much like the one for
descriptives - you can choose several contiguous columns, and get an output matrix of
all pairs of correlations. Empty cells are ignored appropriately. The output does NOT
include the number of pairs of data points used to compute each correlation (which can
vary, depending on where you have missing data), and does not indicate whether any of
the correlations are statistically significant. If you want correlations on non-contiguous
columns, you would either have to include the intervening columns, or copy the desired
columns to a contiguous location.

   A statistical package would permit you to choose non-contiguous columns for your
correlations. The output would tell you how many pairs of data points were used to
compute each correlation, and which correlations are statistically significant.




Two-Sample T-test

   This test can be used to check whether the two treatment groups differ on the values
of either X or Y. In order to do the test, you need to enter a cell range for each group.
Since the data were not entered by treatment group, we first need to sort the rows by
treatment. Be sure to take all the other columns along with treatment, so that the data for
each subject remain intact. After the data are sorted, you can enter the range of cells
containing the X measurements for each treatment. Do not include the row with the
labels, because the second group does not have a label row; as a result, your output will
not be labeled to indicate that it is for X. If you want the output labeled, you have to
copy the cells corresponding to the second group to a separate column, and enter a row
with a label for the second group. If you also want to do the t-test for the Y
measurements, you'll need to repeat the process. The empty cells are ignored, and other
than the problems with labeling the output, the results are correct.

   A statistical package would do this task without any need to sort the data or copy it
to another column, and the output would always be properly labeled to the extent that
you provide labels for your variables and treatment groups. It would also allow you to
choose more than one variable at a time for the t-test (e.g. X and Y).


Paired t-test

   The paired t-test is a method for testing whether the difference between two
measurements on the same subject is significantly different from 0. In this example, we
wish to test the difference between X and Y measured on the same subject. The
important feature of this test is that it compares the measurements within each subject. If
you scan the X and Y columns separately, they do not look obviously different. But if
you look at each X-Y pair, you will notice that in every case, X is greater than Y. The
paired t-test should be sensitive to this difference. In the two cases where either X or Y is
missing, it is not possible to compare the two measures on a subject. Hence, only 8 rows
are usable for the paired t-test.

   When you run the paired t-test on this data, you get a t-statistic of 0.09, with a 2-tail
probability of 0.93. The test does not find any significant difference between X and Y.
Looking at the output more carefully, we notice that it says there are 9 observations. As
noted above, there should only be 8. It appears that Excel has failed to exclude the
observations that did not have both X and Y measurements. To get the correct results,
copy X and Y to two new columns and delete the values whose counterpart
measurement is missing. Now re-run the paired t-test. This time the t-statistic is
6.14817 with a 2-tail probability of 0.000468. The conclusion is completely different!

   Of course, this is an extreme example. But the point is that Excel does not calculate
the paired t-test correctly when some observations have one of the measurements but
not the other. Although it is possible to get the correct result, you would have no reason
to suspect the results you get unless you are sufficiently alert to notice that the number
of observations is wrong. There is nothing in online help that would warn you about this
issue.

   Interestingly, there is also a TTEST function, which gives the correct results for this
example. Apparently the functions and the Data Analysis tools are not consistent in how
they deal with missing cells. Nevertheless, I cannot recommend the use of functions in
preference to the Data Analysis tools, because the result of using a function is a single
number - in this case, the 2-tail probability of the t-statistic. The function does not give
you the t-statistic itself, the degrees of freedom, or any of the other items you would
want to see if you were doing a statistical test.
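
   For instance, with the X values in C2:C11 and the Y values in D2:D11 (addresses
assumed for illustration), the paired test's 2-tail probability comes from a single cell
formula; the last two arguments select 2 tails and the paired type:

    =TTEST(C2:C11, D2:D11, 2, 1)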

   A statistical package will correctly exclude the cases with one of the measurements
missing, and will provide all the supporting statistics you need to interpret the output.


Cross tabulation and Chi-Squared Test of Independence

   Our final task is to count the two outcomes in each treatment group, and use a chi-
square test of independence to test for a relationship between treatment and outcome. In
order to count the outcomes by treatment group, you need to use Pivot Tables. In the
Pivot Table Wizard's Layout option, drag Treatment to Row, Outcome to Column and
also to Data. The Data area should say "Count of Outcome" – if not, double-click on it
and select "Count". If you want percents, double-click "Count of Outcome", and click
Options; in the "Show Data As" box which appears, select "% of row". If you want both
counts and percents, you can drag the same variable into the Data area twice, and use it
once for counts and once for percents.

   Getting the chi-square test is not so simple, however. It is only available as a function,
and the input needed for the function is the observed counts in each combination of
treatment and outcome (which you have in your pivot table), and the expected counts in
each combination. Expected counts? What are they? How do you get them? If you have
sufficient statistical background to know how to calculate the expected counts, and can
do Excel calculations using relative and absolute cell addresses, you should be able to
navigate through this. If not, you’re out of luck.

   Assuming that you surmounted the problem of expected counts, you can use the
CHITEST function to get the probability of observing a chi-square value bigger than the
one for this table. Again, since we are using functions, you do not get many other
necessary pieces of the calculation, notably the value of the chi-square statistic or its
degrees of freedom.
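
   As a sketch (all cell addresses assumed): with the 2x2 observed counts in B2:C3, row
totals in D2:D3, column totals in B4:C4, and the grand total in D4, each expected count is
(row total x column total) / grand total. Entering the first formula below in B7 and
copying it across B7:C8 produces the expected counts; CHITEST then compares the two
ranges:

    =$D2*B$4/$D$4
    =CHITEST(B2:C3,B7:C8)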

   No statistical package would require you to provide the expected values before
computing a chi-square test of independence. Further, the results would always include
the chi-square statistic and its degrees of freedom, as well as its probability. Often you
will get some additional statistics as well.




10.3 Additional Analyses

   The remaining analyses were not done on this data set, but some comments about
them are included for completeness.

   Simple Frequencies

   You can use Pivot Tables to get simple frequencies. (See Crosstabulations for more
about how to get Pivot Tables.) Using Pivot Tables, each column is considered a
separate variable, and labels in row 1 will appear on the output. You can only do one
variable at a time.

   Another possibility is to use the FREQUENCY function. The main advantage of this
method is that once you have defined the frequencies function for one column, you can
use Copy/Paste to get it for other columns. First, you will need to enter a column with
the values you want counted (bins). If you intend to do the frequencies for many
columns, be sure to enter values for the column with the most categories. e.g., if 3
columns have values of 1 or 2, and the fourth has values of 1,2,3,4, you will need to enter
the bin values as 1,2,3,4. Now select enough empty cells in one column to store the
results - 4 in this example, even if the current column only has 2 values. Next choose
Insert/Function/Statistical/FREQUENCY on the menu. Fill in the input range for the first
column you want to count using relative addresses (e.g. A1:A100). Fill in the Bin Range
using the absolute addresses of the locations where you entered the values to be counted
(e.g. $M$1:$M$4). Click Finish. Note the formula bar above the column headings of the
sheet, where the formula is displayed. It starts with "=FREQUENCY(". Place the cursor
to the left of the = sign in the formula, and press Ctrl-Shift-Enter. The frequency counts
now appear in the cells you selected.
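
   The more direct route (addresses assumed as above) is to enter FREQUENCY as an
array formula yourself: select the four result cells, type the formula below, and press
Ctrl-Shift-Enter:

    =FREQUENCY(A1:A100,$M$1:$M$4)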

   To get the frequency counts of other columns, select the cells with the frequencies in
them, and choose Edit/Copy on the menu. If the next column you want to count is one
column to the right of the previous one, select the cell to the right of the first frequency
cell, and choose Edit/Paste (ctrl-V). Continue moving to the right and pasting for each
column you want to count. Each time you move one column to the right of the original
frequency cells, the column to be counted is shifted right from the first column you
counted.

   If you want percents as well, you’ll have to use the Sum function to compute the sum
of the frequencies, and define the formula to get the percent for one cell. Select the cell to
store the first percent, and type the formula into the formula box at the top of the sheet -
e.g. = N1*100/N$5 - where N1 is the cell with the frequency for the first category, and
N5 is the cell with the sum of the frequencies. Use Copy/Paste to get the formula for the
remaining cells of the first column. Once you have the percents for one column, you can
Copy/Paste them to the other columns. You’ll need to be careful about the use of
relative and absolute addresses! In the example above, we used N$5 for the
denominator, so when we copy the formula down to the next frequency on the same
column, it will still look for the sum in row 5; but when we copy the formula right to
another column, it will shift to the frequencies in the next column.

   Finally, you can use Histogram on the Data Analysis menu. You can only do one
variable at a time. As with the Frequencies function, you must enter a column with "bin"
boundaries. To count the number of occurrences of 1 and 2, you need to enter 0, 1, and 2
in three adjacent cells, and give the range of these three cells as the Bins on the dialog
box. The output is not labeled with any labels you may have in row 1, nor even with
the column letter. If you do frequencies on lots of variables, you will have difficulty
knowing which frequency belongs to which column of data.


Linear Regression

   Since regression is one of the more frequently used statistical analyses, we tried it out
even though we did not do a regression analysis for this example. The Regression
procedure in the Data Analysis tools lets you choose one column as the dependent
variable, and a set of contiguous columns for the independents. However, it does not
tolerate any empty cells anywhere in the input ranges, and you are limited to 16
independent variables. Therefore, if you have any empty cells, you will need to copy all
the columns involved in the regression to new columns, and delete any rows that
contain any empty cells. Large models, with more than 16 predictors, cannot be done at
all.


Analysis of Variance

   In general, Excel's ANOVA features are limited to a few special cases rarely found
outside textbooks, and they require lots of data rearrangement.


One-way ANOVA

       Data must be arranged in separate and adjacent columns (or rows) for each group.
Clearly, this is not conducive to doing 1-ways on more than one grouping. If you have
labels in row 1, the output will use the labels.


Two-Factor ANOVA without Replication

       This only does the case with one observation per cell (i.e. no Within Cell error term).
The input range is a rectangular arrangement of cells, with rows representing levels of
one factor, columns the levels of the other factor, and the cell contents the one value in
that cell.

Two-Factor ANOVA with Replicates

       This does a two-way ANOVA with equal cell sizes. Input must be a rectangular
region with columns representing the levels of one factor, and rows representing
replicates within levels of the other factor. The input range MUST also include an
additional row at the top, and column on the left, with labels indicating the factors.
However, these labels are not used to label the resulting ANOVA table. Click Help on
the ANOVA dialog for a picture of what the input range must look like.



10.4 Requesting Many Analyses

   If you had a variety of different statistical procedures that you wanted to perform on
your data, you would almost certainly find yourself doing a lot of sorting, rearranging,
copying and pasting of your data. This is because each procedure requires that the data
be arranged in a particular way, often different from the way another procedure wants
the data arranged. In our small test, we had to sort the rows in order to do the t-test, and
copy some cells in order to get labels for the output. We had to clear the contents of some
cells in order to get the correct paired t-test, but did not want those cells cleared for some
other test. And we were only doing five tasks. It does not get better when you try to do
more. There is no single arrangement of the data that would allow you to do many
different analyses without making many different copies of the data. The need to
manipulate the data in many ways greatly increases the chance of introducing errors.

   Using a statistical program, the data would normally be arranged with the rows
representing the subjects, and the columns representing variables (as they are in our
sample data). With this arrangement you can do any of the analyses discussed here, and
many others as well, without having to sort or rearrange your data in any way. Only
much more complex analyses, beyond the capabilities of Excel and the scope of this
article, would require data rearrangement.


10.5 Working with Many Columns

   What if your data had not 4, but 40 columns, with a mix of categorical and
continuous measures? How easily do the above procedures scale to a larger problem?

   At best, some of the statistical procedures can accept multiple contiguous columns
for input, and interpret each column as a different measure. The descriptives and
correlations procedures are of this type, so you can request descriptive statistics or
correlations for a large number of continuous variables, as long as they are entered in
adjacent columns. If they are not adjacent, you need to rearrange columns or use copy
and paste to make them adjacent.

   Many procedures, however, can only be applied to one column at a time. T-tests
(either independent or paired), simple frequency counts, the chi-square test of
independence, and many other procedures are in this class. This would become a serious
drawback if you had more than a handful of columns, even if you use cut and paste or
macros to reduce the work. In addition to having to repeat the request many times, you
have to decide where to store the results of each, and make sure it is properly labeled so
you can easily locate and identify each output.

   Finally, Excel does not give you a log or other record to track what you have done.
This can be a serious drawback if you want to be able to repeat the same (or similar)
analysis in the future, or even if you’ve simply forgotten what you’ve already done.

   Using a statistical package, you can request a test for as many variables as you need
at once. Each one will be properly labeled and arranged in the output, so there is no
confusion as to what’s what. You can also expect to get a log, and often a set of
commands as well, which can be used to document your work or to repeat an analysis
without having to go through all the steps again.




11. Beyond VBA

   When you say "Visual Basic," most developers - particularly those reading this
magazine - will think of the Visual Basic development environment that has been the
topic of these columns for several years now. So what do I mean by "beyond" Visual
Basic? I am interested in exploring the capabilities of Visual Basic wherever it leads me,
and that sometimes means going outside the traditional Visual Basic development
environment. You'll be surprised at the programming power you'll find.

   I am, of course, talking about Visual Basic for Applications, or VBA - the "macro"
language supported by many Microsoft application programs. I put "macro" in quotes
because, while VBA may have its roots in the keyboard macro tools of the past, which
permitted recording and playback of keystroke sequences, it has evolved into something
entirely different. In fact, VBA is essentially the regular Visual Basic language modified
for use in controlling existing applications rather than creating stand-alone applications.
You have the same rich set of language constructs, data types, control statements, and so
on available to you; from the perspective of the language itself, a programmer would
have trouble telling Visual Basic and VBA apart. Even so, VBA programs are still
referred to as macros.

   VBA is embedded in many Microsoft applications, most notably those that are part of
Microsoft Office: Word, Excel, Access, Outlook, PowerPoint, and FrontPage. VBA has
also been licensed by Microsoft to some other publishers of Windows software. You can
use VBA in a keyboard macro mode in which you start recording, perform some actions
in the program, and then save the recorded macro to be played back later as needed.
While recording macros only scratches the surface of VBA’s capabilities, it is nonetheless
an extremely useful technique that I use on a daily basis. It is important to note that a
recorded macro is not saved as a sequence of keystrokes, as was the case in some older
programs. Rather it is saved as a Visual Basic subroutine, and the statements that carry
out the recorded actions consist primarily of manipulation of properties and methods of
the application’s objects.
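
   For example, recording a few formatting actions might produce a subroutine along
these lines (a hypothetical illustration of the recorder's output, not an exact transcript):

    Sub Macro1()
    ' Macro1 Macro, created by the Macro Recorder
        Range("A1:D1").Select
        With Selection.Font
            .Bold = True
            .Size = 12
        End With
        Selection.Interior.ColorIndex = 15    ' light gray fill
    End Sub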
12. Conclusion

   Although Excel is a fine spreadsheet, it is not a statistical data analysis package. In all
fairness, it was never intended to be one. Keep in mind that the Data Analysis ToolPak is
an "add-in" - an extra feature that enables you to do a few quick calculations. So it should
not be surprising that that is just what it is good for - a few quick calculations. If you
attempt to use it for more extensive analyses, you will encounter difficulties due to any
or all of the following limitations:


       Potential problems with analyses involving missing data. These can be insidious,
       in that the unwary user is unlikely to realize that anything is wrong.
       Lack of flexibility in the analyses that can be done, due to its expectations
       regarding the arrangement of data. This results in the need to cut, paste, sort, and
       otherwise rearrange the data sheet in various ways, increasing the likelihood of
       errors.
       Output scattered in many different worksheets, or all over one worksheet, which
       you must take responsibility for arranging in a sensible way.
       Output may be incomplete or may not be properly labeled, increasing the
       possibility of misidentifying output.
       Need to repeat requests for the same analysis multiple times in order to run it for
       multiple variables, or to request multiple options.
       Need to do some things by defining your own functions/formulae, with the
       attendant risk of errors.
       No record of what you did to generate your results, making it difficult to
       document your analysis, or to repeat it at a later time, should that be necessary.


   If you have more than about 10 or 12 columns, and/or want to do anything beyond
descriptive statistics and perhaps correlations, you should be using a statistical package.
There are several suitable ones available by site license through OIT, or you can use
them in any of the OIT PC labs. If you have Excel on your own PC, and don’t want to

pay for a statistical program, by all means use Excel to enter the data (with rows
representing the subjects, and columns for the variables). All the mentioned statistical
packages can read Excel files, so you can do the (time-consuming) data entry at home,
and go to the labs to do the analysis.


   I have found Excel to be eminently suitable for use in my measurement and data
analysis classes. It’s not only suitable, but a very effective and readily-available tool for
introducing students to contemporary data analysis methods. Excel’s fundamental data
table design, coupled with useful chart capabilities, easily leads students down paths
which will pave the way for their later application of such systems as SPSS and SAS.

13. Recommendations




14. Scope for Future Study




15. References





 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

MS Excel Macros/VBA Project Report

After developing and testing a macro, you can execute it with a single command, causing Excel to perform many time-consuming procedures automatically. Rather than struggle through a tedious sequence of commands, you can grab a cup of coffee and let your computer do the work, which is how it's supposed to be, right?
Objective of Study

A detailed study of Macros/VBA (Visual Basic for Applications) in MS Excel and their effectiveness in data analysis. The project should enable the reader to gain important insights into MS Excel and VBA with regard to their evolution and future growth. The study also reflects upon the various methods and functions of Excel and VBA that help users stay ahead of the curve, and delves into the technological aspects of data analysis with regard to network spread, communication technologies, emerging technologies, and so on. Finally, the report seeks to shed light on the future scope of VBA as a development language for businesses, and on its likely impact.

Broad Action Plan

A detailed exploration of the helpfulness of VBA/macros in data analysis, including the life cycle of a macro and the inventory process in ABC Company.
To gain a historical perspective on MS Excel.
To study VBA from the perspective of its effectiveness in analyzing data.
To identify the challenges faced while analyzing data.
To analyze future trends and thereby identify business opportunities for effective use of MS Excel and VBA.
TABLE OF CONTENTS

1. Background
2. Introduction to MS Excel
3. History
4. Introduction to Visual Basic (VB)
   4.1 What is Visual Basic?
5. Introduction to VBA
   5.1 Why VBA?
   5.2 Calculations without VBA
   5.3 Advantages of Using VBA
6. Miscellany
7. Simulation Example
   7.1 What is the algorithm?
   7.2 VBA code for this example
   7.3 A trick to speed up the calculations
8. Reporting in Excel
9. Attributes of Good VBA Models
   9.1 Documenting VBA Models
10. Caveats
    10.1 General Issues
    10.2 Results of Analyses
    10.3 Additional Analyses
    10.4 Requesting Many Analyses
    10.5 Working with Many Columns
11. Beyond VBA
12. Conclusion
13. Recommendations
14. Scope for Future Study
15. References
1. Background

I have taught measurement, statistics, and data analysis courses for the past five years. By and large, my colleagues have come from non-technical fields, particularly education and finance. I generally find that people in education often have a negative attitude toward automation, while people in finance actively seek out automated methods, a fact that has at times added more than a small bit of extra challenge. My work has always had a practical bent: I make an effort to automate data research and analysis, and I fully integrate technology into my working life. I cannot always assume that people are as technology-literate as I would like; at times it is necessary to set aside instructional time to deal with specific technology topics, as I will mention below.
Main Areas of Work

[Graph: main areas of work, by number of employees]

The graph above shows the main areas of work, broken down by employees.

Main areas of work in which software is used

[Graph: percentage of software usage in each main area of work]

Today every sector uses software to a greater or lesser extent; the graph above shows the percentage of software usage in the main areas of work.
Percentage of respondents using each package

[Graph: percentage of respondents using each software package]

Organizations use a range of software packages; the graph above shows the percentage of respondents using each one.
2. Introduction to MS Excel

Excel is the backbone of any custom-built financial model, and building one requires good technical Excel skills. By connecting to almost any type of database (Oracle, IBM, SQL Server, OLAP), Excel can retrieve data from your corporate databases and files, so you don't have to retype the data that you want to analyze in Excel. You can also refresh your financial spreadsheets and summaries automatically from the original source database whenever the database is updated with new information.

A powerful and easy-to-use operational or financial model in Excel provides decision makers with the analytical capability to assess the outcomes of a range of scenarios. Good financial management and financial governance are at the core of good management. They help to drive performance by supporting effective decision making, aiding the efficient running of organizations, and maximizing the effective use of resources. Good financial management is also essential to maintaining the stewardship and accountability of public funds. The way government bodies collect, analyze, and utilize financial management information directly affects the performance of their organizations and the delivery of their objectives.

Financial modeling with Excel

A financial model is a complex spreadsheet: structured, dynamic, and flexible. It contains a set of variable assumptions, inputs, outputs, calculations, and scenarios. The objective is, by changing the input data, to explore the relationships between several variables and to test the effect of these changes on the output of an ad hoc scenario. It allows you to simulate a wide range of scenarios and to run sensitivity analyses in a short period of time. And this can be done faster and more easily in an Excel spreadsheet than in most analytics applications (SAP, SAS, Siebel, etc.).
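To make the scenario-testing idea concrete, here is a minimal VBA sketch of a one-variable sensitivity run; the sheet names and the named ranges GrowthRate and Profit are hypothetical, standing in for whatever inputs and outputs a particular model uses.

    ' Vary one input of a model and record the resulting output
    ' (sheet names and named ranges are illustrative).
    Sub SensitivityTable()
        Dim g As Double, r As Long
        r = 2
        For g = 0.01 To 0.1 Step 0.01
            Worksheets("Model").Range("GrowthRate").Value = g
            ' Excel recalculates the model; capture the output
            Worksheets("Sensitivity").Cells(r, 1).Value = g
            Worksheets("Sensitivity").Cells(r, 2).Value = _
                Worksheets("Model").Range("Profit").Value
            r = r + 1
        Next g
    End Sub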
Concretely, a financial model can be used to meet different needs or objectives. It can be a business case, a profitability analysis, a budget, a management report, or a forecasting study. All of these are efficient tools that help executives and managers monitor, manage, and run their business. Because spreadsheets are present at all levels and in all departments of a company, Excel is the natural solution; the challenge is to design low-maintenance, user-friendly reporting tools that automatically consolidate, analyze, transform, update, and present the information needed.

Business Cases

A business case has to translate business ideas from vague concepts into a concrete set of numbers, and to score high in credibility, accuracy, and practical value in order to be the financial backbone of the project's concepts. A business case is based on "what-if" analysis and scenario management, which are essential to answering typical business case questions, and has a strong focus on cash-flow evaluation of strategic business decisions in order to assess the financial feasibility of the project. A business case is referred to frequently during the project to determine whether it is currently on track, and at the end of the project success is measured against the ability to meet the objectives defined in the business case. The completion of a business case is therefore critical to the success of the project.

Cost and Profitability Analysis

A cost and profitability analysis helps to determine where resources should be allocated to maximize profit. This type of analysis not only serves as a tool for making more informed decisions but can also identify ways to improve business processes. For example, it helps to identify the most profitable customers in order to focus on
them. The well-known 80/20 rule states that 80% of your profit usually comes from 20% of your customers, but which 20%? A profitability analysis need not be based only on customers; it can also be based on products or activities.

Budgeting

Creating, monitoring, and managing a budget is key to business success. It should help you allocate resources where they are needed. It is extremely important to know how much money you have to spend and where you are spending it. A budget is the most effective way to control your cash flow and to keep your business, and its finances, on track, allowing you to invest in new opportunities at the appropriate time.

If your business is growing, you may not always be able to be hands-on with every part of it. You may have to split your budget up between different areas or departments such as sales, production, marketing, and administration. You'll find that money starts to move in many different directions through your organization; budgets are a vital tool for ensuring that you stay in control of expenditure. A budget is a plan to:

control your finances
ensure you can continue to fund your current commitments
enable you to make confident financial decisions and meet your objectives
ensure you have enough money for your future projects

You should stick to your budget as far as possible, but review and revise it as needed. Successful businesses often have a rolling budget.
Management Reporting

Management reporting keeps track of change, comparing historical data against actuals and original projections against reality. We assist in setting up and monitoring effective and timely management reports, using financial and non-financial data, to:

Measure the business weekly, monthly, and annually
Handle market changes effectively and manage the associated costs
Set up internal business practices and structures for internal reporting
Pinpoint problem areas

Forecasting

Once you have created a budget and related it to actual numbers, you can create a dynamic rolling forecast that is updated on a regular basis. This can be done weekly or even daily to give you a more accurate and up-to-date picture of cash flow and profit and loss.

Before you start forecasting, remember that revenue projections are only as meaningful as your baseline data. Make sure the data is complete, correct, and ordered. There needs to be enough historical sales data to perform an accurate analysis, typically seven to ten time periods; the longer the forecast timeline, the more accurate the forecast. The data must be ordered from oldest to newest, and if any data is missing for a time period, estimate it as accurately as possible. The time periods need to be uniform; for example, compare months to months or years to years.
4. Introduction to Visual Basic (VB)

Visual Basic is one of the most popular programming languages in the market today. Microsoft has positioned it to fit multiple purposes in development: the language ranges from lightweight VBScript programming to application-specific programming with VB for Applications.

4.1 What is Visual Basic?

The "visual" part refers to the method used to create the GUI (graphical user interface). Rather than writing numerous lines of code to describe the appearance and location of interface elements, we simply add prebuilt objects into place on screen. VB is a high-level programming language that evolved from an earlier DOS-era language called BASIC. VB is event-driven: VB programs are made up of many subprograms, each with its own program code; each can be executed independently, and at the same time each can be linked to the others in one way or another.

VB is designed to deploy applications across the enterprise and to scale to any size needed. The ability to develop object models, database integration, server components, and Internet/intranet applications provides an extensive range of capabilities and tools for the developer. In particular, VB lets us add menus, textboxes, command buttons, option buttons, check boxes, scroll bars, and file and directory boxes to blank windows. We can communicate with other Windows applications, and, perhaps most importantly, we have an easy method for letting users control and access databases.
5. Introduction to VBA

Visual Basic for Applications, Excel's powerful built-in programming language, permits you to easily incorporate user-written functions into a spreadsheet. For example, a user can easily calculate Black-Scholes and binomial option prices. In case you think VBA is something esoteric that you will never otherwise need to know: VBA is now the core macro language for all of Microsoft's Office products, including Word, and it has also been incorporated into software from other vendors. You need not write complicated programs using VBA for it to be useful to you. At the very least, knowing VBA will make it easier for you to analyze relatively complex problems for yourself.

This document presumes that you have a basic knowledge of Excel, including the use of built-in functions and named ranges. I do not presume that you know anything about writing macros or programming. The examples here are mostly related to option pricing, but the principles apply generally to any situation where you use Excel as a tool for numerical analysis.

The Windows version of Excel supports programming through Microsoft's Visual Basic for Applications (VBA), which is a dialect of Visual Basic. Programming with VBA allows spreadsheet manipulation that is awkward or impossible with standard spreadsheet techniques. Programmers may write code directly using the Visual Basic Editor (VBE), which includes windows for writing code, debugging code, and organizing code modules. The user can implement numerical methods, as well as automate tasks such as formatting or data organization, in VBA, and can guide the calculation using any desired intermediate results reported back to the spreadsheet. VBA was removed from Mac Excel 2008, as the developers did not believe that a timely release would allow porting the VBA engine natively to Mac OS X; it was restored in the next version, Mac Excel 2011.
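To give a first taste of what a user-written function looks like, here is a minimal sketch; the function name and formula are illustrative examples of my own, not taken from the report.

    ' A minimal user-defined function. Once placed in a VBA module it can
    ' be called from a cell like any built-in function, e.g.
    ' =SimpleInterest(1000, 0.05, 3)
    Function SimpleInterest(principal As Double, rate As Double, years As Double) As Double
        ' interest earned at a flat annual rate, without compounding
        SimpleInterest = principal * rate * years
    End Function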
A common and easy way to generate VBA code is by using the Macro Recorder. The Macro Recorder records the actions of the user and generates VBA code in the form of a macro. These actions can then be repeated automatically by running the macro. Macros can also be linked to different trigger types, such as a keyboard shortcut, a command button, or a graphic; the actions in the macro can be executed from these triggers or from the generic toolbar options. The VBA code of the macro can also be edited in the VBE. Certain features, such as loop functions, screen prompts with their own properties, and some graphical display items, cannot be recorded but must be entered into the VBA module directly by the programmer. Advanced users can employ user prompts to create an interactive program, or react to events such as sheets being loaded or changed.

Users should be aware that recorded macro code may not be compatible from one version of Excel to another. Some code that is used in Excel 2010 cannot be used in Excel 2003; for example, a macro that changes cell colors or other aspects of cells may not be backward compatible.

VBA code interacts with the spreadsheet through the Excel Object Model, a vocabulary identifying spreadsheet objects, and a set of supplied functions or methods that enable reading from and writing to the spreadsheet and interaction with its users (for example, through custom toolbars, command bars, and message boxes). User-created VBA subroutines execute these actions and operate like macros generated using the Macro Recorder, but are more flexible and efficient.

5.1 Why VBA?

Macros have been used as a development tool since the early days of the Microsoft Office product line. Microsoft Access macros incorporate generalized database functions using existing Microsoft Access capabilities. Errors in a macro can be easily resolved by
using the Microsoft-supplied Help function. The ease with which you can generate macros makes macro development seem easy to accomplish. You can generate macros by selecting database operations and commands in the Macro window. These macros can then be converted to Microsoft Access VBA; in most cases, you need only make minor edits to the saved code in order to have a functional program. All syntax, spacing, and functionality are included in the saved file, which contains VBA code specific to the particular application being recorded. Unskilled programmers are able to interpret the code and learn how to generate code to accomplish specific tasks. In the process, the novice programmer may gain a useful introduction to VBA code.

Building macros can be easier and faster than writing VBA code for simple applications and for making global key assignments; however, more advanced and complex applications are not so easily accomplished using macros. People tend to favor macros because VBA code is perceived to be more programmatic, offering a variety of options that can appear confusing and time-consuming to understand. These options, however, provide developers with tools to extend Microsoft Access capabilities beyond those packaged with the Microsoft Access software. If building or generating macros comes easily and does not consume great amounts of your time, you may want to consider their use, particularly if you want to accomplish rather simple tasks. If, however, you find macros time-consuming and tedious, as many have attested, you may want to consider writing VBA code.

By learning and building upon VBA skills, you acquire a programming skill set that is applicable and portable to various other applications. Macros, on the other hand, are used in many applications but are specific to a particular application; in most cases they are not portable to other applications. VBA is one of the easier programming languages to learn. It does not require the complex programming techniques that are necessary in C++ or other high-level languages. VBA provides a user-friendly, forms-based interface to assign variables and
simplify code development. VBA is so widely used that help is available from a variety of sources, whereas a second party would have to know and understand your particular application in order to assist you with building a macro. VBA can be used to perform any operation that a macro can perform. VBA also allows you to perform a multitude of more advanced operations, including the following:

Incorporate error-handling modules to assist in the running of your applications
Integrate Word and Excel features into your database
Present users with professional forms-based layouts to interface with your database
Process data in the background
Create multi-purpose forms
Perform conditional looping

5.2 Calculations without VBA

Suppose you wish to compute the Black-Scholes formula in a spreadsheet. Suppose also that you have named cells for the stock price (s), strike price (k), interest rate (r), time to expiration (t), volatility (v), and dividend yield (d). You could enter the following into a cell:

    =s*EXP(-d*t)*NORMSDIST((LN(s/k)+(r-d+v^2/2)*t)/(v*t^0.5))
     -k*EXP(-r*t)*NORMSDIST((LN(s/k)+(r-d-v^2/2)*t)/(v*t^0.5))

Typing this formula is cumbersome, though of course you can copy it wherever you would like it to appear. It is possible to use Excel's data table feature to create a table of Black-Scholes prices, but this is cumbersome and inflexible. If you want to calculate option Greeks (e.g. delta, gamma, etc.) you must again enter or copy the formulas into each cell where you want a calculation to appear. And if you decide to change some aspect of your formula, you have to hunt down all occurrences and make the changes. When the same formula is copied throughout a worksheet, that worksheet potentially becomes harder to modify in a safe and reliable fashion; when the worksheet is to be used by others, maintainability becomes even more of a concern.

Spreadsheet construction becomes even harder if you want to, for example, compute a price for a finite-lived American option. There is no way to do this in one cell, so you must compute the binomial tree in a range of cells and copy the appropriate formulas for the stock price and the option price. It is not so bad with a 3-step binomial calculation, but for 100 steps you will spend quite a while setting up the spreadsheet. You must do this separately for each place you want a binomial price to appear in the spreadsheet, and if you decide you want to set up a put pricing tree, there is no easy way to edit your call tree to price puts. Of course you can make the formulas quite flexible and general by using lots of "if" statements. But things would become much easier if you could create your own formulas within Excel. You can, with Visual Basic for Applications.
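For instance, the cell formula above can be written once as a user-defined function and then reused everywhere. The following is a minimal sketch; the function and argument names are my own, not from the report.

    ' Black-Scholes price of a European call as a reusable worksheet
    ' function; call it from a cell as, e.g., =BSCall(41, 40, 0.08, 1, 0.3, 0)
    Function BSCall(s As Double, k As Double, r As Double, _
                    t As Double, v As Double, d As Double) As Double
        Dim d1 As Double, d2 As Double
        d1 = (Log(s / k) + (r - d + v ^ 2 / 2) * t) / (v * Sqr(t))
        d2 = d1 - v * Sqr(t)
        ' WorksheetFunction.NormSDist is the cumulative standard normal
        BSCall = s * Exp(-d * t) * WorksheetFunction.NormSDist(d1) _
               - k * Exp(-r * t) * WorksheetFunction.NormSDist(d2)
    End Function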
5.3 Advantages of Using VBA

VBA, or Visual Basic for Applications, is the simple programming language that can be used within Excel 2007 (and earlier versions, though a few changes were implemented with the Office 2007 release) to develop macros and complex programs. Its advantages include:

The ability to do what you normally do in Excel, but a thousand times faster
The ease with which you can work with enormous sets of data
The ability to develop analysis and reporting programs downstream from large central databases such as Sybase and SQL Server, and from accounting, financial, and production systems such as Oracle, SAP, and others
Macros save keystrokes by automating frequently used sequences of commands, and developers use macros to integrate Office with enterprise applications, for example to extract customer data automatically from Outlook e-mails, to look up related information in CRM systems, or to generate Excel spreadsheets from data extracted from enterprise resource planning (ERP) systems.

To create an Excel spreadsheet with functionality beyond the standard defaults, you write code. Microsoft Visual Basic is a programming environment that uses a computer language to do just that. Although VBA is a language of its own, it is in reality derived from the larger Visual Basic language developed by Microsoft, which is now the core macro language for all Microsoft applications. To take advantage of the functionality of the Microsoft Visual Basic environment, there are many suggestions you can follow. Below are a few hints and tips for VBA security and protection in Excel; a more in-depth understanding can be gained by attending a VBA Excel 2007 course delivered by a Microsoft certified trainer.

Password protecting the code

As a VBA Excel user you may want to protect your code so that nobody can modify it, and to protect against the loss of intellectual property should people access the source code without permission. This is easily achieved in the VBE by going to Tools | VBAProject Properties | Protection, checking the box, and entering a password.

Hiding worksheets

In any of your Excel workbooks you might want to hide a worksheet that contains sensitive or confidential information from the view of other users of the workbook. If you just hide the worksheet in the standard way, the next user will be able to simply unhide it; but by using a VBA method to hide and password-protect a
worksheet, without protecting the entire workbook, you will be able to allow other users access without compromising the confidentiality of the data.

Protecting workbooks

There are different levels of protection for workbooks, from not allowing anyone access to the workbook to not allowing any changes to be made to it, i.e. setting the security to 'read only' so that no changes can be made to the templates you have created.
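As a rough illustration of the VBA approach to hiding and protecting a sheet, consider the following sketch; the sheet name and password are placeholders.

    ' Hide a sheet so it does not appear in Excel's Unhide list, and
    ' password-protect its contents (sheet name and password are examples).
    Sub HideConfidentialSheet()
        With Worksheets("Confidential")
            .Protect Password:="example123"   ' block edits to the sheet
            .Visible = xlSheetVeryHidden      ' only VBA can unhide it
        End With
    End Sub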
6. Miscellany

Getting Excel to generate your macros for you

Suppose you want to perform a task and you don't have a clue how to program it in VBA. For example, suppose you want to create a subroutine to set up a graph. You can set up a graph manually and tell Excel to record the VBA commands which accomplish the same thing; you then examine the result and see how it works. To do this, select Tools | Record Macro | Record New Macro. Excel will record all your actions in a new module located at the end of your workbook, i.e. following Sheet16. You stop the recording by clicking the Stop button which should have appeared on your spreadsheet when you started recording.

Macro recording is an extremely useful tool for understanding how Excel and VBA work and interact; this is in fact how the Excel experts learn how to write macros which control Excel's actions. For example, here is the macro code Excel generates if you use the chart wizard to set up a chart using data in the range A2:C4. You can see, among other things, that the selected graph style was the fourth line graph in the graph gallery and that the chart was titled "Here is the Title". Also, each data series is in a column, and the first column was used as the x-axis (CategoryLabels:=1).

    ' Macro1 Macro
    ' Macro recorded <Date> by <UserName>
    Sub Macro1()
        Range("A2:C4").Select
        ActiveSheet.ChartObjects.Add(196.5, 39, 252.75, 162).Select
        ActiveChart.ChartWizard Source:=Range("A2:C4"), Gallery:=xlLine, _
            Format:=4, PlotBy:=xlColumns, CategoryLabels:=1, SeriesLabels:=0, _
            HasLegend:=1, Title:="Here is the Title", CategoryTitle:="X-Axis", _
            ValueTitle:="Y-Axis", ExtraTitle:=""
    End Sub
Using multiple modules

You can split up your functions and subroutines among as many modules as you like; functions from one module can call those in another, for example. Using multiple modules is often convenient for clarity: if you put everything in one module you will spend a lot of time scrolling around.

Re-calculation speed

One unfortunate drawback of VBA, and of most macro code in most applications, is that it is slow. When you are using built-in functions, Excel performs clever internal checking to know whether something requires recalculation (you should be aware that on occasion this clever checking appears to go awry, and something which should be recalculated isn't). When you write a custom function, however, Excel is not able to perform its checking on your functions, and it therefore tends to recalculate everything. This means that if you have a complicated spreadsheet, you may find very slow recalculation times. This is a problem with custom functions and not one you can do much about. There are, however, tricks for speeding things up. Here are two:

If you are looping a great deal, be sure to declare your looping index variables as integers; this will speed up Excel's handling of these variables. For example, if you use i as the index in a For loop, use the statement

    Dim i As Integer

While a lengthy subroutine is executing, you may wish to turn off Excel's screen updating. You do this with

    Application.ScreenUpdating = False
This will only work in subroutines, not in functions. If you want to check the progress of your calculations, you can turn screen updating off at the beginning of your subroutine; whenever you would like to see your calculation's progress (for example every 100th iteration) you can turn it on and then immediately turn it off again. This will update the display.

Finally, here is a good thing to know: Ctrl-Break will (usually) stop a recalculation! Remember this; someday you will thank me for it. Ctrl-Break is more reliable if your macro writes output to the screen or spreadsheet.

Debugging

We will not go into details here, but VBA has very sophisticated debugging capabilities. For example, you can set breakpoints (i.e. lines in your routine where Excel will stop calculating to give you a chance to see what is happening) and watches (which means that you can look at the values of variables at different points in the routine). Look up "debugging" in the online help.

Creating an Add-in

Suppose you have written a useful set of option functions and wish to make them broadly available in your spreadsheets. You can make the functions automatically available in any spreadsheet you write by creating an add-in. To do this, you simply switch to a macro module and then select Tools | Make Add-in. Excel will create a file with the XLA extension which contains your functions. You can then make these functions automatically available via Tools | Add-ins, browsing to locate your own add-in module if it does not appear on the list. Any functions available through an add-in will automatically show up in the function list under the set of "user defined" functions.
7. Simulation Example

Suppose you have a large amount of money to invest, and suppose that at the end of the next five years you wish to have it fully invested in stocks. It is often asserted in the popular press that it is preferable to invest in the market gradually rather than all at once. In particular, consider the strategy of each quarter taking a pro rata share of what is left and investing it in stocks: the first quarter invest 1/20th in stocks, the second invest 1/19th of the money remaining, and so on. It is obvious that the strategy in which we invest in stocks over time should have a smaller average return and a lower standard deviation than a strategy in which we plunge into stocks, but how much lower and smaller? Monte Carlo simulation is a natural tool to address a question like this. We will first see how to structure the problem and then analyze it in Excel.

The following example is considerably more complicated than those that precede it. It is designed to illustrate many of the basic concepts in a non-trivial fashion; you may wish to skip it initially and return to it once you have had some experience with VBA. You may not understand the details of how the random stock price is generated. That does not matter for purposes of this example; rather, the important thing is to understand how the problem is structured and how that structure is translated into VBA.

If you are thinking about option pricing, you might expect this example to be computed using the risk-neutral distribution. Instead, we will compare the actual payoff distributions of the two strategies in order to compare their means and standard deviations. If we wished to value the two strategies, we would substitute the risk-neutral distribution by replacing the 15% expected rate of return with the 10% risk-free rate. After making this substitution, both strategies would have the same expected payoff of $161 (100 × 1.1^5). Since both strategies entail buying assets at a fair price, there is no need to perform a valuation: both will be worth the initial investment.

7.1 What is the algorithm?

To begin, we describe the investment strategy and the evolution of the portfolio over time. Suppose we initially have $100, all invested in bonds and nothing invested in stock. Let the variables BONDS and STOCK denote the amount invested in each. Let h be the fraction of a year between investments in stock (so, for example, if h = 0.25 there are 4 transfers per year from bonds to stock), and let r, µ, and σ denote the risk-free rate, the expected return on the stock, and the volatility of the stock. Suppose we switch from bonds to stock 20 times, once a quarter for 5 years, and let n be the number of times we will switch. We need to know the stock price each time we switch; denote these prices by PRICE(0), PRICE(1), PRICE(2), ..., PRICE(20).

Now, each period: at the beginning of the period we first switch some funds from bonds to stock, and at the end of the period we figure out how much we earned over the period. If we wish to switch a roughly constant proportion each period, we could switch 1/20 the first period, 1/19 with 19 periods to go, and so forth. This suggests that at the beginning of period j,

    bonds(j) = bonds(j-1) * (1 - 1/(n+1-j))
    stock(j) = stock(j-1) + bonds(j-1)/(n+1-j)

At the end of the period we have

    stock(j) = stock(j) * price(j)/price(j-1)
    bonds(j) = bonds(j) * exp(r*h)

In words, during period j we earn interest on the bonds and capital gains on the stock. We can think of the STOCK(j) and BONDS(j) on the right-hand side as the beginning-of-period values after we have allocated some dollars from bonds to stock, and the values on the left-hand side as the end-of-period values after we have earned interest and capital gains. We compute capital gains on the stock by
    price(j) = price(j-1) * Exp((mu - 0.5 * v ^ 2) * h + v * h ^ 0.5 * WorksheetFunction.NormSInv(Rnd()))

As mentioned above, it is not important if you do not understand this expression. It is the standard way to create a random lognormally-distributed stock price, where the expected return on the stock is mu, the volatility is v, and the length of a period is h. At the end, when j = n, we invest all remaining bonds in stock and earn returns for one final period. This describes the stock and bond calculations for a single set of randomly-drawn lognormal prices. Now we want to repeat this process many times; each time, we will save the results of the trial and use them to compute the distribution.

7.2 VBA code for this example

We will set this up as a subroutine. The first several lines in the routine simply activate the worksheet where we will write the data and then clear the area. We need two columns: one to store the terminal portfolio value if we invest fully in stock at the outset, the other to store the terminal value if we invest slowly. Note that we have set it up to run 2000 trials, and we also clear 2000 rows. We tell VBA that the variables bonds, stock, and price are going to be arrays of type Double, but we do not yet know what size to make the arrays. The statement Worksheets("Invest Output").Activate makes the "Invest Output" worksheet the default, so that all reading and writing will be done to it unless another worksheet is specified.

    Sub Monte_invest()
        Dim bonds() As Double
        Dim stock() As Double
        Dim price() As Double
        Worksheets("Invest Output").Activate
        Range("A1:B2000").Select
        Selection.Clear
        ' number of Monte Carlo trials
        iter = 2000
Now we set the parameters. The risk-free rate, mean return on the stock, and volatility are all annual numbers. We invest each quarter, so h = 0.25, and there are 20 periods to keep track of since we invest each quarter for 5 years. Note that once we specify 20 periods, we can dimension the bonds, stock, and price variables to run from 0 to 20; we do this using the ReDim command.

        ' number of reinvestment periods
        n = 20
        ' reset the dimensions of the bonds, stock, and price variables
        ReDim bonds(0 To n), stock(0 To n), price(0 To n)
        ' length of each period
        h = 0.25
        ' expected return on stock
        mu = 0.15
        ' risk-free interest rate
        r = 0.1
        ' volatility
        v = 0.3

Now we have an outer loop. Each time through this outer loop we have one trial, i.e. we draw a series of 20 random stock prices and see what the terminal payoff is from our two strategies. Note that before we run through a single trial we have to initialize our variables: we have $100 of bonds and no stock, and price(0), the initial stock price, is set to 100.

        ' each time through this loop is one complete iteration
        For i = 1 To iter
            price(0) = 100
            bonds(0) = 100
            stock(0) = 0
This is the heart of the program. Each period, for 20 periods, we perform our allocation as above. Note that we draw a new random stock price using our standard lognormal expression.

            For j = 1 To n
                ' allocate 1/(n+1-j) of the remaining bonds to stock
                stock(j) = stock(j - 1) + bonds(j - 1) / (n + 1 - j)
                bonds(j) = bonds(j - 1) * (1 - 1 / (n + 1 - j))

                ' draw a new lognormal stock price
                price(j) = price(j - 1) * Exp((mu - 0.5 * v ^ 2) * h + _
                    v * h ^ 0.5 * WorksheetFunction.NormSInv(Rnd()))

                ' earn returns on bonds and stock
                bonds(j) = bonds(j) * Exp(r * h)
                stock(j) = stock(j) * (price(j) / price(j - 1))
            Next j

Once through this loop, all that remains is to write the results to the output sheet. The following two statements do that, writing the terminal stock price to column 1, row i, and the value of the terminal stock position to column 2, row i.

            ActiveSheet.Cells(i, 1) = price(n)
            ActiveSheet.Cells(i, 2) = stock(n)
        Next i
    End Sub

Note that you could also write the data across in columns; you would do this by writing

    ActiveSheet.Cells(1, i) = price(n)

This would write the terminal price across the first row.
7.3 A trick to speed up the calculations

Modify the outer loop by adding the two lines referring to ScreenUpdating:

    ' each time through this loop is one complete iteration
    For i = 1 To iter
        Application.ScreenUpdating = False
        ...
        If (i Mod 100 = 0) Then Application.ScreenUpdating = True
        ActiveSheet.Cells(i, 1) = price(n)
        ActiveSheet.Cells(i, 2) = stock(n)
    Next i

The first line prevents Excel from updating the display as the subroutine runs; it turns out that it takes Excel quite a lot of time to redraw the spreadsheet and graphs when numbers are added. The second line redraws the spreadsheet every 100 iterations. The Mod function returns the remainder from dividing the first number by the second, so i Mod 100 will equal 0 whenever i is evenly divisible by 100; thus on iteration numbers 100, 200, and so on, the spreadsheet will be redrawn. This cuts the calculation time approximately in half. Note that Application.ScreenUpdating is an example of a command that only works within a subroutine; it will not work within a function.
8. Reporting in Excel

Excel was designed specifically to provide powerful, easy-to-use tools for transforming quantitative analysis into visual representations, and it remains an excellent and extremely efficient way for business analysts to share the results of their work. Familiar, accessible, and widely available, Excel makes it relatively easy to generate attractive, flexible presentations that can be widely distributed. Reports created in Excel also make it easy for others to access the underlying data, cut and paste it into their own spreadsheets, and make full use of the data in subsequent work. Integrating analysis into Excel is fast, easy, and reliable, and can be done in any of three ways, depending on how the results will be used.

Scheduled Reports: For static reports that are updated on a regular basis (daily, weekly, or monthly, for example), a file-based solution is ideal. The process begins by integrating data sources into Excel. Once the computation is complete, VBA scripts automatically generate tables and graphics. They are delivered to Excel in comma-separated files and Windows metafiles, using a simple VB script to embed the results directly into a preformatted report.

Interactive Desktop Applications: Where more interactivity is required, an Excel add-in can be created that includes menus and dialogs for controlling the parameters of the report and the data to be analyzed. A VB script is created to run the analysis based on the chosen parameters. A call from Excel to VB is then made using a COM API that initiates the script and inserts the results into the report. This option is best suited to dynamic reports that are distributed to relatively small numbers of end users.

Client-Server Applications: The server-based option is ideal in situations where interactive reports are created or accessed by larger numbers of users, and where
the ability to change the underlying analytics quickly and distribute them widely is desired. Similar to the client-based solution, an Excel add-in is created that includes menus and dialogs for controlling the parameters of the analysis. Excel then uses an HTTP API to call a remote server where the VB script is run, and the results are inserted into Excel. This server-based approach enables organizations to take advantage of the power of server-based distributed technology to generate and disseminate analytics.
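As a rough sketch of the file-based "scheduled report" pattern described above, the following routine pulls a comma-separated results file into a preformatted report sheet; the file path and sheet name are placeholders, not taken from the report.

    ' Embed the latest results from a CSV file into a preformatted
    ' report sheet (file path and sheet name are illustrative).
    Sub RefreshScheduledReport()
        Dim src As Workbook
        Set src = Workbooks.Open(Filename:="C:\reports\results.csv")
        ' copy the results into the report, starting at cell B5
        src.Worksheets(1).UsedRange.Copy _
            Destination:=ThisWorkbook.Worksheets("Monthly Report").Range("B5")
        src.Close SaveChanges:=False
    End Sub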
9. Attributes of Good VBA Models

While VBA models can differ widely from one another, all good ones need to have certain common attributes. In this section I briefly describe the attributes that you should try to build into your models. Some of these apply to Excel models as well; I am including them both here and under Excel so that you have a comprehensive list of the attributes in both places.

Realistic

Most models you develop will be directly or indirectly used to make decisions, so the output of the model must be realistic. This means that the assumptions, mathematical relationships, and inputs you use in the model must be realistic. For most real-world models, ensuring this takes a lot of time and effort, but it is not a place where you should cut corners. If a model does not produce realistic outputs, it does not matter how good its outputs look or how well it works otherwise.

Error-Free

It is equally important that a model be error-free, and you must test a model extensively to make sure of this. It is generally much easier to fix a problem when a model simply does not work or produces obviously wrong answers; it is much harder to find errors that are more subtle and occur only for certain combinations of input values. See the chapter on debugging for help on making your models error-free.

Flexible

The more different types of questions a model can answer, the more useful it is. In the planning stage, you should try to anticipate the different types of questions the model is likely to be asked to answer; you then do not have to make major changes every time someone tries to use it for something slightly different.
Easy to Provide Inputs

Most VBA models need inputs from the user, and the easier it is for the user to provide them, the better. Generally a VBA model can get inputs either through input dialog boxes (that is, through the InputBox function) or by reading them in from a spreadsheet (or database). Using input dialog boxes works well when there are only a few inputs, probably five or fewer. If the model needs more, it is better to set up an input area in a spreadsheet (or, for large models, even a separate input spreadsheet) where the user can enter the input data before running the model. This approach is particularly helpful if the user is likely to change only one or two inputs from one run to the next; if a model uses a large number of input dialog boxes, the user has to enter data in each of them every time he runs the model, even if he wants to change only one or two inputs. However, if the user has to provide some input (based on some intermediate outputs) while a procedure is running, then input dialog boxes are the only option.

If the model uses input dialog boxes, the prompt should provide enough specific information to help the user enter the right data in the right format. Similarly, if the input data is to be provided in certain cells in a spreadsheet, then there should be enough information in the next cell (or nearby) to help the user enter the right data in the right format.
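For the dialog-box route, a minimal sketch might look like the following; the prompt text, sheet name, and cell address are illustrative.

    ' Ask the user for one numeric parameter at run time
    ' (Type:=1 restricts the entry to numbers; Cancel returns False).
    Sub GetVolatility()
        Dim vol As Variant
        vol = Application.InputBox(Prompt:="Enter the annual volatility (e.g. 0.3)", Type:=1)
        If VarType(vol) = vbBoolean Then Exit Sub   ' user pressed Cancel
        Worksheets("Inputs").Range("B2").Value = vol
    End Sub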
Good Output Production

A model that does not produce good outputs, getting its major results across persuasively, is not as useful as it could be. Producing reports with VBA models is generally a two-step process: the model produces outputs on spreadsheets, and then parts or all of the spreadsheets are printed out. For printed outputs, good models should include built-in reports (in Excel) that any user can produce easily. The spreadsheet outputs produced by a VBA model should not require too much manipulation before printed reports are created. These reports should be attractive, easy to read, and uncluttered. Avoid trying to squeeze too much information onto one page: if a report has to include a lot of data, organize it in layers so that people can start with summary results and then dig into additional details as necessary. One of the advantages of VBA compared to other programming languages is that it can produce excellent graphical outputs using Excel's charting features, and VBA models should include graphical outputs wherever these will enhance the usefulness of the model. Another thing to keep in mind is that, unlike an Excel model, a VBA model does not show intermediate results (except through message boxes, spreadsheet outputs, or charts). The modeler should therefore anticipate what output, intermediate and final, the user may want to see, and provide for it in the model.

Data Validations

It is generally more important to provide thorough data validation in VBA models than in Excel models. If the user accidentally enters invalid data, most of the time the model simply will not run, but it will not provide any useful information about what the problem is, leaving the user in a helpless situation. You can, of course, have the VBA code check the input data for various possible errors before using it. A simple alternative approach is to have the input data read in from spreadsheets and to provide data validation for the input cells using Excel's Data Validation feature. (To keep the code short and to avoid repeating the same lines, I have generally omitted data validation in the models in this book. Instead of writing data validation code repeatedly, you can create and keep a few Sub procedures for the types of data validation you need most often and call them as needed.)
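A minimal sketch of the first approach, checking an input in code before the model runs, might look like this; the sheet name, cell address, and acceptable range are illustrative. (Note that VBA's And/Or do not short-circuit, so the numeric test is nested inside the IsNumeric check.)

    ' Validate a numeric input cell before running the model
    ' (cell address and acceptable range are examples).
    Function VolatilityIsValid() As Boolean
        Dim x As Variant
        x = Worksheets("Inputs").Range("B2").Value
        VolatilityIsValid = False
        If IsNumeric(x) Then
            If x > 0 And x <= 2 Then VolatilityIsValid = True
        End If
        If Not VolatilityIsValid Then
            MsgBox "Volatility in cell B2 must be a number between 0 and 2."
        End If
    End Function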
Judicious Formatting

The formatting here refers to the formatting of the model's output. Poor, haphazard formatting reduces a model's usefulness because it is distracting. Use formatting (fonts, borders, patterns, colors, etc.) judiciously to make your model's outputs easier to understand and use. (As much as possible, create the formatting parts of the code by recording them with the Macro Recorder.)

Appropriate Number Formatting

In the model's outputs, you should format numbers with the minimum number of decimal places necessary. Using too many decimal places makes numbers difficult to read and often gives a false sense of precision as well. Make the formatting of similar numbers uniform throughout the output. (Remember that displaying numbers with fewer decimal places does not reduce the accuracy of the model in any way, because internally Excel and VBA continue using the same number of significant digits.) Wherever appropriate, make numbers more readable by using special formatting to show them in thousands or millions.

Well Organized and Easy to Follow

The better organized a model is, the easier it is to follow and update. The key to keeping your code organized is to break it down into segments, each of which carries out one distinct activity or set of computations. One way to accomplish this is to use separate Sub procedures and Function procedures for many of these segments, especially the ones that will be repeated many times; in the extreme, the main Sub procedure may consist simply of calls to other Sub procedures and Function procedures. An additional advantage of this approach is that you can develop a number of Sub procedures and Function procedures for things you often need to do and incorporate them into other code as needed.
Using structured programming also makes code easier to follow. In a structured program, the procedure is segmented into a number of stand-alone units, each of which has only one entry and one exit point; control does not jump into or exit from the middle of these units. The proper visual design of a code can also make it easier to follow. For example, statements should be properly indented to show clearly how they fit into the various If, For, and other structures. Similarly, each major and minor segment of the code should be separated by blank lines or other means, and informatively labeled. (The easiest way to learn these techniques is by imitating well-written code.)

Statements Are Easy to Read and Understand

Experienced programmers try to make their code as concise as possible, often using obscure features of the programming language. Such code may be admired by other experienced programmers, but it often baffles beginners. With the high speed of modern PCs, code does not usually have to be concise or highly efficient. It is best to aim for code that is easy to understand, even if that means it has more lines than are absolutely necessary. Avoid writing long equations whenever you can; break them up by doing long calculations in easily understandable steps. Make all variable names short but descriptive rather than cryptic. If in a large model you decide to use a naming scheme, try to make it intuitive and provide an explanation of the scheme in the documentation.

Robust

"Robust" here refers to code that is resistant to crashing. It often takes significant extra work to make code bulletproof, and that time and effort may not be justified for much of the code you will write. Nonetheless, the code should guard against obvious problems. For example, unless specified otherwise, VBA code always works with the currently active worksheet, so throughout a code you should make sure that the right worksheet is active at the right time, or else precede cell addresses and the like with the appropriate worksheet reference. Using effective data validation for the input data is another way of making your code robust.
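As a small sketch of the worksheet-qualification point, compare an unqualified reference with an explicitly qualified one; the sheet name is an example.

    Sub QualifiedVsUnqualified()
        ' fragile: writes to whichever sheet happens to be active
        Range("A1").Value = 100
        ' robust: names the target sheet explicitly, so the statement
        ' behaves the same no matter which sheet is active
        Worksheets("Results").Range("A1").Value = 100
    End Sub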
For example, unless specified otherwise, VBA always works with the currently active worksheet, so throughout your code you should make sure that the right worksheet is active at the right time, or else precede cell addresses and the like with the appropriate worksheet reference. Using effective data validation for the input data is another way of making your code robust.

Minimum Hard Coding

Hard-coded values are difficult to change, especially in large models, because there is always the danger of missing them in a few places. It is best to set up any value that may have to be changed later as a variable and use the variable in all equations. Even for values that are not going to change, it is better to define constants and use them in the equations. This makes equations easier to read and guards against the mistake of typing in a wrong number. (A sketch combining this advice with the explicit worksheet references mentioned above appears after this list.)

Good Documentation

Good documentation is key to understanding VBA models and is a must for all but trivial ones. For hints on producing good documentation, see the next section.
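To make the last two points concrete, here is a minimal sketch with hypothetical sheet names and values: a named constant replaces a hard-coded rate, and every Range call is qualified with an explicit worksheet object so that nothing depends on which sheet happens to be active.

    Sub ComputeTax()
        ' A named constant instead of a hard-coded value: change it in
        ' one place and every equation that uses it stays correct.
        Const TAX_RATE As Double = 0.35     ' hypothetical rate

        Dim wsIn As Worksheet
        Dim wsOut As Worksheet
        Set wsIn = Worksheets("Inputs")     ' assumed sheet names
        Set wsOut = Worksheets("Results")

        ' Qualified references: correct no matter which sheet is active.
        wsOut.Range("B2").Value = wsIn.Range("B2").Value * TAX_RATE
    End Sub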
9.1 Documenting VBA Models

Documenting a model means putting in writing, diagrams, flowcharts, and so on, the information someone else (or you in the future) will need to figure out what the model does, how it is structured, what assumptions are built into it, and so forth. A user can then make changes to (update) it if necessary. The documentation should also include, for example, notes on any shortcuts you have taken for now that should be fixed later, and any assumptions or data you have used that may need to be updated later. There is no standard format or structure for documenting a model; you have to be guided by the objectives just mentioned. Here are some common approaches to documenting your VBA models. Every model needs to be documented differently, and everyone does documentation differently. Over time you will develop your own style.

Including Comments in the Code

The most useful documenting tool in VBA is comments. Comments are notes and reminders you include at various places in the VBA code. You indicate a comment with an apostrophe: except when it occurs in text within quotation marks, VBA interprets an apostrophe as the beginning of a comment and ignores the rest of the line. You can use an entire line or a block of lines for comments, or you can put a comment after a statement on a line (for example, to explain something about that statement). Include in your code all the comments that may be helpful, but do not go overboard and comment on things that are obvious: a lot of superfluous comments makes code harder rather than easier to read. Here are some ideas on the types of comments you may want to include in your code (a short example follows the list):

At the beginning of a procedure, include a brief description of what the code does. At times it may also be useful to list the key inputs and outputs and some other information as well.

Every time significant changes are made to the code, insert comments near the beginning of the code, below the code description, to record the date of the change, the important changes made at that time, and who made them. Sometimes it also helps to insert additional comments above or next to the statement(s) that changed, to explain what was changed and why, again recording who made the change and when.

If the procedure uses a particular variable naming scheme, use comments to explain it.

Use distinctive comment lines (for example, '*********) to break long procedures into sections, and at the beginning of each section include a short name or description of the section.

Use comments next to a variable to explain what it stands for, where its value came from, and anything else that may be helpful.
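A hedged illustration of these conventions, using the hypothetical ComputeTax procedure from the earlier sketch; the dates and initials are placeholders:

    '*******************************************************************
    ' ComputeTax: computes the tax due from the income figure on the
    ' Inputs sheet and writes it to the Results sheet.
    ' Key input:  Inputs!B2  (pre-tax income)
    ' Key output: Results!B2 (tax due)
    '
    ' Change log:
    '   2011-03-01  ABC  created
    '   2011-04-15  ABC  replaced hard-coded rate with TAX_RATE constant
    '*******************************************************************

    '*******************************************************************
    ' Section 1: Read and validate inputs
    '*******************************************************************
    Dim grossIncome As Double   ' pre-tax income, read from Inputs!B2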
You can get more ideas about what kinds of comments to include from the examples in this and other books. Over time you will develop your own style of commenting code. Make sure you insert comments as you code. If you put it off until later, your comments may not be as useful, inserting them may take longer because you will have to spend time trying to remember things, and, worst of all, you may never get around to it. If you do not include good comments in your code, modifying it a few months later may take much longer.

Documenting Larger Models

If you are developing a large model and saving different versions of the workbook as I have suggested, then the workbook should include a worksheet titled "Version Description." In this worksheet, list against each version number the major changes you made to the code in that version. Every time you save your work under a new version name, start a new row of description under that version number in the Version Description worksheet, and keep adding to it as you make major changes. The key is to do this as you go along, not later, when you may forget some of the changes you made. This worksheet is essentially the history (log) of the model's development. If you ever want to go back to an earlier stage and take a different direction from there, the log will save you a lot of time. Also, you may want to keep several different versions of a model; you can document here how they differ from one another.

For large models you may also need to create a book of formal documentation (which will include information on why and how certain modeling decisions were made, flowcharts for the model, etc.) and a user's manual. For most of your work, however, documentation of the type discussed above should be adequate.
10. Caveats

We used Excel to do some basic data analysis tasks to see whether it is a reasonable alternative to using a statistical package for the same tasks. We concluded that Excel is a poor choice for statistical analysis beyond textbook examples, the simplest descriptive statistics, or more than a very few columns. The problems we encountered that led to this conclusion fall into four general areas:

Missing values are handled inconsistently, and sometimes incorrectly.

Data organization differs according to analysis, forcing you to reorganize your data in many ways if you want to do many different analyses.

Many analyses can only be done on one column at a time, making it inconvenient to do the same analysis on many columns.

Output is poorly organized, sometimes inadequately labeled, and there is no record of how an analysis was accomplished.

Excel is convenient for data entry, and for quickly manipulating rows and columns prior to statistical analysis. When you are ready to do the statistical analysis, however, we recommend the use of a statistical package such as SAS, SPSS, Stata, Systat, or Minitab.

Excel is probably the most commonly used spreadsheet for PCs. Newly purchased computers often arrive with Excel already loaded. It is easily used to do a variety of calculations, and it includes a collection of statistical functions and a Data Analysis ToolPak. As a result, if you suddenly find you need to do some statistical analysis, you may turn to it as the obvious choice. We decided to do some testing to see how well Excel would serve as a data analysis application.

To present the results, we will use a small example. The data for this example is fictitious. It was chosen to have two categorical and two continuous variables, so that we could test a variety of basic statistical techniques. Since almost all real data sets have at
least a few missing data points, and since the ability to deal with missing data correctly is one of the features we take for granted in a statistical analysis package, we introduced two empty cells in the data:

    Treatment   Outcome      X       Y
        1          1       10.2     9.9
        1          1        9.7
        2          1       10.4    10.2
        1          2        9.8     9.7
        2          1       10.3    10.1
        1          2        9.6     9.4
        2          1       10.6    10.3
        1          2        9.9     9.5
        2          2       10.1    10.0
        2          2               10.2

Each row of the spreadsheet represents a subject. The first subject received Treatment 1 and had Outcome 1. X and Y are the values of two measurements on each subject. We were unable to get a measurement for Y on the second subject, or for X on the last subject, so those cells are blank. The subjects are entered in the order that the data became available, so the data is not ordered in any particular way.

We used this data to do some simple analyses and compared the results with a standard statistical package. The comparison considered the accuracy of the results as well as the ease with which the interface could be used for bigger data sets, i.e. more columns. We used SPSS as the standard, though any of the packages listed would do equally well for this purpose. In this article, when we say "a statistical package," we mean SPSS, SAS, Stata, Systat, or Minitab.

Most of Excel's statistical procedures are part of the Data Analysis ToolPak, which is in the Tools menu. It includes a variety of choices, including simple descriptive statistics, t-tests, correlations, one- and two-way analysis of variance, regression, etc. If you do not have a
Data Analysis item on the Tools menu, you need to install the Data Analysis ToolPak. Search in Help for "Data Analysis Tools" for instructions on loading the ToolPak.

Two other Excel features are useful for certain analyses, but the Data Analysis ToolPak is the only one that provides reasonably complete tests of statistical significance. Pivot Tables, in the Data menu, can be used to generate summary tables of means, standard deviations, counts, etc. You can also use functions to generate some statistical measures, such as a correlation coefficient. A function generates a single number, so you will likely have to combine bits and pieces to get what you want; even then, you may not be able to generate all the parts you need for a complete analysis. Unless otherwise stated, all statistical tests using Excel were done with the Data Analysis ToolPak.

In order to check a variety of statistical tests, we chose the following tasks:

Get means and standard deviations of X and Y for the entire group, and for each treatment group.

Get the correlation between X and Y.

Do a two-sample t-test to test whether the two treatment groups differ on X and Y.

Do a paired t-test to test whether X and Y are statistically different from each other.

Compare the number of subjects with each outcome by treatment group, using a chi-squared test.

All of these tasks are routine for a data set of this nature, and all of them could easily be done using any of the above-listed statistical packages.
10.1 General Issues

Enable the Analysis ToolPak

The Data Analysis ToolPak is not installed with the standard Excel setup. Look in the Tools menu. If you do not have a Data Analysis item, you will need to install the Data Analysis tools. Search Help for "Data Analysis Tools" for instructions.

Missing Values

A blank cell is the only way for Excel to represent missing data. If you have any other missing value codes, you will need to change them to blanks.

Data Arrangement

Different analyses require the data to be arranged in different ways. If you plan on a variety of tests, there may not be a single arrangement that works for all of them; you will probably need to rearrange the data several ways to get everything you need.

Dialog Boxes

Choose Tools/Data Analysis, and select the kind of analysis you want to do. The typical dialog box has the following items:

Input Range: Type the upper left and lower right corner cells, e.g. A1:B100. You can only choose adjacent rows and columns. Unless there is a checkbox for grouping data by rows or columns (and there usually is not), all the data is considered as one glop.

Labels: There is sometimes a box you can check to indicate that the first row of your sheet contains labels. If you have labels in the first row, check this box, and your output MAY be labeled with your labels. Then again, it may not.

Output location: New Sheet is the default. Or, type the cell address of the upper left corner of where you want to place the output in the current sheet. New Workbook is
another option, which I have not tried. The ramifications of this choice are discussed below. Other items appear depending on the analysis.

Output Location

The output from each analysis can go to a new sheet within your current Excel file (the default), or you can place it within the current sheet by specifying the upper left corner cell where you want it placed. Either way is a bit of a nuisance. If each output goes to a new sheet, you end up with lots of sheets, each holding a small bit of output. If you place outputs in the current sheet, you need to position them appropriately, leave room for adding comments and labels, and accept that changes you make to format one output properly may affect another output adversely. For example, the output from Descriptives has a column of labels such as Standard Deviation, Standard Error, etc. You will want to make this column wide in order to read the labels. But if a simple frequency output sits right underneath, then the column displaying the values being counted, which may contain only small integers, will also be wide.

10.2 Results of Analyses

Descriptive Statistics

The quickest way to get means and standard deviations for an entire group is Descriptives in the Data Analysis tools. You can choose several adjacent columns for the Input Range (in this case the X and Y columns), and each column is analyzed separately. The labels in the first row are used to label the output, and the empty cells are ignored. If you have more, non-adjacent columns to analyze, you have to repeat the process for each group of contiguous columns. The procedure is straightforward, manages many columns reasonably efficiently, and treats empty cells properly.

To get the means and standard deviations of X and Y for each treatment group requires the use of Pivot Tables (unless you want to rearrange the data sheet to separate the two groups). After selecting the (contiguous) data range, in the Pivot Table Wizard's
Layout option, drag Treatment to the Row variable area and X to the Data area. Double-click "Count of X" in the Data area and change it to Average. Drag X into the Data box again, and this time change Count to StdDev. Finally, drag X in one more time, leaving it as Count of X. This gives the average, standard deviation, and number of observations for X in each treatment group. Do the same for Y, for a total of six items in the Data box (three for X and three for Y). As you can see, getting a variety of descriptive statistics for several variables quickly becomes tedious.

A statistical package lets you choose as many variables as you wish for descriptive statistics, whether or not they are contiguous. You can get the descriptive statistics for all the subjects together, or broken down by a categorical variable such as treatment. You select the statistics you want to see once, and the selection applies to all variables chosen.

Correlations

Using the Data Analysis tools, the dialog for correlations is much like the one for descriptives: you can choose several contiguous columns and get an output matrix of all pairs of correlations. Empty cells are ignored appropriately. The output does NOT include the number of pairs of data points used to compute each correlation (which can vary, depending on where you have missing data), nor does it indicate whether any of the correlations are statistically significant. If you want correlations on non-contiguous columns, you either have to include the intervening columns or copy the desired columns to a contiguous location.

A statistical package would let you choose non-contiguous columns for your correlations. The output would tell you how many pairs of data points were used to compute each correlation, and which correlations are statistically significant.
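The function route mentioned earlier illustrates the single-number limitation. Assuming the sample data sits in columns A through D with labels in row 1 (an assumed layout, matching the table above), the correlation between X and Y could be computed with the CORREL worksheet function:

    =CORREL(C2:C11, D2:D11)

This returns the correlation coefficient and nothing else: no count of the pairs used and no significance test, which is exactly the kind of missing supporting detail just described.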
Two-Sample T-test

This test can be used to check whether the two treatment groups differ on the values of either X or Y. In order to do the test, you need to enter a cell range for each group. Since the data were not entered by treatment group, we first need to sort the rows by treatment. Be sure to take all the other columns along with Treatment, so that the data for each subject remains intact. After the data is sorted, you can enter the range of cells containing the X measurements for each treatment. Do not include the row with the labels, because the second group does not have a label row; your output will therefore not be labeled to indicate that it is for X. If you want the output labeled, you have to copy the cells corresponding to the second group to a separate column and enter a row with a label for the second group. If you also want to do the t-test for the Y measurements, you will need to repeat the process. The empty cells are ignored, and other than the problems with labeling the output, the results are correct.

A statistical package would do this task without any need to sort the data or copy it to another column, and the output would always be properly labeled, to the extent that you provide labels for your variables and treatment groups. It would also allow you to choose more than one variable at a time for the t-test (e.g. X and Y).

Paired t-test

The paired t-test is a method for testing whether the difference between two measurements on the same subject is significantly different from 0. In this example, we wish to test the difference between X and Y measured on the same subject. The important feature of this test is that it compares the measurements within each subject. If you scan the X and Y columns separately, they do not look obviously different. But if you look at each X-Y pair, you will notice that in every case X is greater than Y. The paired t-test should be sensitive to this difference. In the two cases where either X or Y is missing, it is not possible to compare the two measures on a subject; hence, only 8 rows are usable for the paired t-test.
When you run the paired t-test on this data, you get a t-statistic of 0.09, with a two-tail probability of 0.93. The test does not find any significant difference between X and Y. Looking at the output more carefully, we notice that it says there are 9 observations. As noted above, there should be only 8. It appears that Excel has failed to exclude the observations that did not have both X and Y measurements. To get the correct results, copy X and Y to two new columns and remove the data in the cells that have no value for the other measure. Now re-run the paired t-test. This time the t-statistic is 6.14817, with a two-tail probability of 0.000468. The conclusion is completely different!

Of course, this is an extreme example. But the point is that Excel does not calculate the paired t-test correctly when some observations have one of the measurements but not the other. Although it is possible to get the correct result, you would have no reason to suspect the results you get unless you are sufficiently alert to notice that the number of observations is wrong. There is nothing in the online help that would warn you about this issue.

Interestingly, there is also a TTEST function, which gives the correct results for this example. Apparently the functions and the Data Analysis tools are not consistent in how they deal with missing cells. Nevertheless, I cannot recommend the use of functions in preference to the Data Analysis tools, because the result of using a function is a single number - in this case, the two-tail probability of the t-statistic. The function does not give you the t-statistic itself, the degrees of freedom, or any number of other items that you would want to see if you were doing a statistical test. A statistical package will correctly exclude the cases with one of the measurements missing, and will provide all the supporting statistics you need to interpret the output.
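For reference, TTEST takes the two ranges, the number of tails, and a type code, where type 1 denotes a paired test. With the X and Y columns of the sample data in C2:C11 and D2:D11 (an assumed layout), the call would be:

    =TTEST(C2:C11, D2:D11, 2, 1)

This returns only the two-tail probability (about 0.0005 here, in line with the corrected result above); the t-statistic and the degrees of freedom are not reported, which is exactly the limitation just described.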
Cross Tabulation and Chi-Squared Test of Independence

Our final task is to count the two outcomes in each treatment group, and use a chi-square test of independence to test for a relationship between treatment and outcome. In order to count the outcomes by treatment group, you need to use Pivot Tables. In the Pivot Table Wizard's Layout option, drag Treatment to Row, and Outcome to Column and also to Data. The Data area should say "Count of Outcome"; if not, double-click it and select "Count". If you want percents, double-click "Count of Outcome" and click Options; in the "Show Data As" box that appears, select "% of row". If you want both counts and percents, you can drag the same variable into the Data area twice, and use it once for counts and once for percents.

Getting the chi-square test is not so simple, however. It is only available as a function, and the input the function needs is the observed counts in each combination of treatment and outcome (which you have in your pivot table) and the expected counts in each combination. Expected counts? What are they? How do you get them? If you have sufficient statistical background to know how to calculate the expected counts, and can do Excel calculations using relative and absolute cell addresses, you should be able to navigate through this. If not, you are out of luck.

Assuming you surmounted the problem of expected counts, you can use the CHITEST function to get the probability of observing a chi-square value bigger than the one for this table. Again, since we are using a function, you do not get many other necessary pieces of the calculation, notably the value of the chi-square statistic itself or its degrees of freedom.

No statistical package would require you to provide the expected values before computing a chi-square test of independence. Further, the results would always include the chi-square statistic and its degrees of freedom, as well as its probability. Often you will get some additional statistics as well.
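For this small table the arithmetic can be done by hand, which may help make the "expected counts" idea concrete. From the sample data, the observed counts are 2 and 3 for Treatment 1 (Outcomes 1 and 2) and 3 and 2 for Treatment 2. Each expected count is (row total x column total) / grand total = (5 x 5) / 10 = 2.5, and the chi-square statistic is the sum of (observed - expected)^2 / expected = 4 x (0.5^2 / 2.5) = 0.4 on 1 degree of freedom, far from significant (p of roughly 0.53). With the observed counts in A1:B2 and the expected counts in D1:E2 (a purely hypothetical layout), the function call would be:

    =CHITEST(A1:B2, D1:E2)

and it would return only the probability, not the 0.4 statistic or the degrees of freedom.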
10.3 Additional Analyses

The remaining analyses were not done on this data set, but some comments about them are included for completeness.

Simple Frequencies

You can use Pivot Tables to get simple frequencies. (See Cross Tabulation above for more about how to get Pivot Tables.) With Pivot Tables, each column is treated as a separate variable, and labels in row 1 will appear on the output; however, you can only do one variable at a time.

Another possibility is the FREQUENCY function. The main advantage of this method is that once you have defined the function for one column, you can use Copy/Paste to get it for other columns. First, you will need to enter a column with the values you want counted (the bins). If you intend to do the frequencies for many columns, be sure to enter the values for the column with the most categories; e.g., if three columns have values of 1 or 2, and the fourth has values of 1, 2, 3, 4, you will need to enter the bin values as 1, 2, 3, 4. Now select enough empty cells in one column to store the results - four in this example, even if the current column has only two values. Next choose Insert/Function/Statistical/FREQUENCY on the menu. Fill in the input range for the first column you want to count using relative addresses (e.g. A1:A100). Fill in the Bin Range using the absolute addresses of the locations where you entered the values to be counted (e.g. $M$1:$M$4). Click Finish. Note the box above the column headings of the sheet, where the formula is displayed; it starts with "=FREQUENCY(". Place the cursor to the left of the = sign in the formula, and press Ctrl-Shift-Enter. The frequency counts now appear in the cells you selected.

To get the frequency counts of other columns, select the cells with the frequencies in them, and choose Edit/Copy on the menu. If the next column you want to count is one column to the right of the previous one, select the cell to the right of the first frequency
cell, and choose Edit/Paste (Ctrl-V). Continue moving to the right and pasting for each column you want to count. Each time you move one column to the right of the original frequency cells, the column to be counted shifts right from the first column you counted.

If you want percents as well, you have to use the SUM function to compute the sum of the frequencies, and define a formula to get the percent for one cell. Select the cell to store the first percent, and type the formula into the formula box at the top of the sheet - e.g. =N1*100/N$5 - where N1 is the cell with the frequency for the first category, and N5 is the cell with the sum of the frequencies. Use Copy/Paste to get the formula into the remaining cells of the first column. Once you have the percents for one column, you can Copy/Paste them to the other columns. You need to be careful about the use of relative and absolute addresses! In the example above, we used N$5 for the denominator, so when we copy the formula down to the next frequency in the same column, it will still look for the sum in row 5; but when we copy the formula right to another column, it will shift to the frequencies in the next column.

Finally, you can use Histogram on the Data Analysis menu. You can only do one variable at a time. As with the FREQUENCY function, you must enter a column with the bin boundaries. To count the number of occurrences of 1 and 2, you need to enter 0, 1, and 2 in three adjacent cells, and give the range of these three cells as the Bins on the dialog box. The output is not labeled with any labels you may have in row 1, nor even with the column letter. If you do frequencies on lots of variables, you will have difficulty knowing which frequency belongs to which column of data.
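Pulling the FREQUENCY steps together, one possible layout (using the same cell addresses as the examples above) looks like this:

    Column M:  the bin values 1, 2, 3, 4, entered by hand in M1:M4.
    Column N:  select N1:N4, enter  =FREQUENCY(A1:A100,$M$1:$M$4)
               and confirm with Ctrl-Shift-Enter; the four counts
               fill the selected cells.
    Cell N5:   =SUM(N1:N4), the total used as the percent denominator.
    Column O:  =N1*100/N$5 in O1, copied down to O4, gives the percents.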
Linear Regression

Since regression is one of the more frequently used statistical analyses, we tried it out even though we did not do a regression analysis for this example. The Regression procedure in the Data Analysis tools lets you choose one column as the dependent variable and a set of contiguous columns for the independents. However, it does not tolerate any empty cells anywhere in the input ranges, and you are limited to 16 independent variables. Therefore, if you have any empty cells, you will need to copy all the columns involved in the regression to new columns and delete any rows that contain empty cells. Larger models, with more than 16 predictors, cannot be done at all.

Analysis of Variance

In general, Excel's ANOVA features are limited to a few special cases rarely found outside textbooks, and they require a lot of data rearrangement.

One-Way ANOVA

Data must be arranged in separate and adjacent columns (or rows) for each group. Clearly, this is not conducive to doing one-way ANOVAs on more than one grouping. If you have labels in row 1, the output uses them.

Two-Factor ANOVA Without Replication

This handles only the case with one observation per cell (i.e. no within-cell error term). The input range is a rectangular arrangement of cells, with rows representing the levels of one factor, columns the levels of the other factor, and the cell contents the single value in each cell.

Two-Factor ANOVA With Replicates

This does a two-way ANOVA with equal cell sizes. Input must be a rectangular region with columns representing the levels of one factor and rows representing replicates within levels of the other factor. The input range MUST also include an additional row at the top, and a column on the left, with labels indicating the factors; however, these labels are not used to label the resulting ANOVA table. Click Help on the ANOVA dialog for a picture of what the input range must look like.
10.4 Requesting Many Analyses

If you had a variety of different statistical procedures to perform on your data, you would almost certainly find yourself doing a lot of sorting, rearranging, copying, and pasting. This is because each procedure requires the data to be arranged in a particular way, often different from the way another procedure wants it. In our small test, we had to sort the rows in order to do the t-test, and copy some cells in order to get labels for the output. We had to clear the contents of some cells in order to get the correct paired t-test, but did not want those cells cleared for other tests. And we were only doing five tasks. It does not get better when you try to do more. There is no single arrangement of the data that would allow you to do many different analyses without making many different copies of the data, and the need to manipulate the data in many ways greatly increases the chance of introducing errors.

Using a statistical program, the data would normally be arranged with rows representing subjects and columns representing variables (as they are in our sample data). With this arrangement you can do any of the analyses discussed here, and many others as well, without having to sort or rearrange your data in any way. Only much more complex analyses, beyond the capabilities of Excel and the scope of this article, would require data rearrangement.

10.5 Working with Many Columns

What if your data had not 4 but 40 columns, with a mix of categorical and continuous measures? How well do the above procedures scale to a larger problem?

At best, some of the statistical procedures can accept multiple contiguous columns for input and interpret each column as a different measure. The descriptives and correlations procedures are of this type, so you can request descriptive statistics or correlations for a large number of continuous variables, as long as they are entered in
adjacent columns. If they are not adjacent, you need to rearrange columns or use copy and paste to make them adjacent.

Many procedures, however, can only be applied to one column at a time. T-tests (either independent or paired), simple frequency counts, the chi-square test of independence, and many other procedures are in this class. This becomes a serious drawback if you have more than a handful of columns, even if you use cut and paste or macros to reduce the work. In addition to having to repeat the request many times, you have to decide where to store the results of each analysis and make sure each one is properly labeled, so that you can easily locate and identify every output.

Finally, Excel does not give you a log or any other record of what you have done. This can be a serious drawback if you want to be able to repeat the same (or a similar) analysis in the future, or even if you have simply forgotten what you have already done.

Using a statistical package, you can request a test for as many variables as you need at once. Each one will be properly labeled and arranged in the output, so there is no confusion as to what's what. You can also expect to get a log, and often a set of commands as well, which can be used to document your work or to repeat an analysis without having to go through all the steps again.
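Short of moving to a statistical package, a macro of the kind discussed in the earlier sections can at least automate the one-column-at-a-time repetition. A minimal sketch, assuming (purely for illustration) a sheet named "Data" with labels in row 1, 40 variables in columns 1 to 40, data in rows 2 to 101, and an existing blank sheet named "Summary":

    Sub SummarizeColumns()
        Dim wsData As Worksheet
        Dim wsOut As Worksheet
        Dim col As Long
        Dim rng As Range

        Set wsData = Worksheets("Data")      ' assumed sheet names
        Set wsOut = Worksheets("Summary")

        wsOut.Range("A1:C1").Value = Array("Variable", "Mean", "StdDev")

        For col = 1 To 40
            Set rng = wsData.Range(wsData.Cells(2, col), wsData.Cells(101, col))
            ' Carry the row-1 label across so every result is identified.
            wsOut.Cells(col + 1, 1).Value = wsData.Cells(1, col).Value
            ' Average and StDev skip empty cells, much as the
            ' Descriptives tool does.
            wsOut.Cells(col + 1, 2).Value = Application.WorksheetFunction.Average(rng)
            wsOut.Cells(col + 1, 3).Value = Application.WorksheetFunction.StDev(rng)
        Next col
    End Sub

Each variable lands on one labeled row of the summary sheet, which addresses the labeling and output-scatter complaints above, though it does nothing for the missing significance tests.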
11. Beyond VBA

When you say "Visual Basic," most developers, particularly those reading this magazine, will think of the Visual Basic development environment that has been the topic of these columns for several years now. So what do I mean by going "beyond" Visual Basic? I am interested in exploring the capabilities of Visual Basic wherever they lead me, and that sometimes means going outside the traditional Visual Basic development environment. You may be surprised at the programming power you find there. I am, of course, talking about Visual Basic for Applications, or VBA, the "macro" language supported by many Microsoft application programs. I put "macro" in quotes because, while VBA may have its roots in the keyboard macro tools of the past, which permitted recording and playback of keystroke sequences, it has evolved into something entirely different. In fact, VBA is essentially the regular Visual Basic language modified for use in controlling existing applications rather than creating stand-alone applications. You have the same rich set of language constructs, data types, control statements, and so on available to you; from the perspective of the language itself, a programmer would have trouble telling Visual Basic and VBA apart. Even so, VBA programs are still referred to as macros.

VBA is embedded in many Microsoft applications, most notably those that are part of Microsoft Office: Word, Excel, Access, Outlook, PowerPoint, and FrontPage. VBA has also been licensed by Microsoft to some other publishers of Windows software.

You can use VBA in a keyboard-macro mode, in which you start recording, perform some actions in the program, and then save the recorded macro to be played back later as needed. While recording macros only scratches the surface of VBA's capabilities, it is nonetheless an extremely useful technique, and one I use on a daily basis. It is important to note that a recorded macro is not saved as a sequence of keystrokes, as was the case in some older programs. Rather, it is saved as a Visual Basic subroutine, and the statements that carry out the recorded actions consist primarily of manipulations of the properties and methods of the application's objects.
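As a hedged illustration of that last point: recording the act of making cell A1 bold with a yellow fill produces, in Excel, a subroutine along these lines (the exact statements vary by Excel version):

    Sub Macro1()
    ' Macro1 Macro
    ' Produced by the macro recorder; nothing here is a keystroke.
        Range("A1").Select
        Selection.Font.Bold = True
        With Selection.Interior
            .ColorIndex = 6        ' yellow
            .Pattern = xlSolid
        End With
    End Sub

Every statement manipulates properties of the application's objects (Font.Bold, Interior.ColorIndex), exactly as described above.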
12. Conclusion

Although Excel is a fine spreadsheet, it is not a statistical data analysis package. In all fairness, it was never intended to be one. Keep in mind that the Data Analysis ToolPak is an "add-in" - an extra feature that enables you to do a few quick calculations. So it should not be surprising that that is just what it is good for: a few quick calculations. If you attempt to use it for more extensive analyses, you will encounter difficulties due to any or all of the following limitations:

Potential problems with analyses involving missing data. These can be insidious, in that the unwary user is unlikely to realize that anything is wrong.

Lack of flexibility in the analyses that can be done, due to its expectations regarding the arrangement of data. This results in the need to cut, paste, sort, and otherwise rearrange the data sheet in various ways, increasing the likelihood of errors.

Output scattered across many different worksheets, or all over one worksheet, which you must take responsibility for arranging in a sensible way.

Output that may be incomplete or improperly labeled, increasing the possibility of misidentifying output.

The need to repeat requests for the same analysis multiple times in order to run it for multiple variables, or to request multiple options.

The need to do some things by defining your own functions or formulae, with the attendant risk of errors.

No record of what you did to generate your results, making it difficult to document your analysis, or to repeat it at a later time, should that be necessary.

If you have more than about 10 or 12 columns, and/or want to do anything beyond descriptive statistics and perhaps correlations, you should be using a statistical package. There are several suitable ones available by site license through OIT, or you can use them in any of the OIT PC labs. If you have Excel on your own PC and don't want to
pay for a statistical program, by all means use Excel to enter the data (with rows representing the subjects and columns for the variables). All the statistical packages mentioned can read Excel files, so you can do the (time-consuming) data entry at home and go to the labs to do the analysis.

That said, I have found Excel to be eminently suitable for use in my measurement and data analysis classes. It is not only suitable, but a very effective and readily available tool for introducing students to contemporary data analysis methods. Excel's fundamental data table design, coupled with its useful chart capabilities, easily leads students down paths that pave the way for their later application of systems such as SPSS and SAS.
14. Scope for Future Study