This document provides an overview of regression analysis and examples of its applications. It discusses the basics of regression, including distinguishing between continuous and discrete variables and summarizing each. It also explains that regression can be used to determine relationships between variables, predict dependent variable values, and control for other independent variables. Examples provided include predicting box office revenues and analyzing the impact of Southwest Airlines on airfares.
Marketing Research Approaches to Demand Estimation
Consumer Surveys: data from survey questions
Observational Research: data from observed behavior
Consumer Clinics: data from laboratory experiments
Market Experiments: data from real market tests
Regression Analysis
Scatter Diagram
Regression Analysis
Regression Line: Line of Best Fit
Regression Line: Minimizes the sum of the squared vertical deviations (e_t) of each point from the regression line.
Ordinary Least Squares (OLS) Method
Ordinary Least Squares (OLS)
Model:
Ordinary Least Squares (OLS)
Objective: Determine the slope and intercept that minimize the sum of the squared errors.
Ordinary Least Squares (OLS)
Estimation Procedure
Ordinary Least Squares (OLS)
Estimation Example
Ordinary Least Squares (OLS)
Estimation Example
Tests of Significance
Standard Error of the Slope Estimate
Tests of Significance
Example Calculation
Tests of Significance
Example Calculation
Tests of Significance
Calculation of the t Statistic
Degrees of Freedom = (n-k) = (10-2) = 8
Critical Value at 5% level =2.306
Tests of Significance
Decomposition of Sum of Squares
Total Variation = Explained Variation + Unexplained Variation
Tests of Significance
Coefficient of Determination
Tests of Significance
Coefficient of Correlation
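The estimation and significance-test steps outlined above can be sketched for a one-predictor model. The ten (X, Y) pairs below are illustrative numbers assumed for this sketch, since the example table itself did not survive in the transcript; the function name `simple_ols` is likewise hypothetical.

```python
from math import sqrt

def simple_ols(x, y):
    """Least-squares fit of y = a + b*x plus the usual significance summaries."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    fitted = [a + b * xi for xi in x]
    tss = sum((yi - my) ** 2 for yi in y)                    # total variation
    ess = sum((fi - my) ** 2 for fi in fitted)               # explained variation
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    se_b = sqrt(rss / (n - 2) / sxx)   # standard error of the slope, n - k = n - 2
    r2 = ess / tss                     # coefficient of determination
    r = sqrt(r2) if b >= 0 else -sqrt(r2)  # coefficient of correlation
    return {"a": a, "b": b, "se_b": se_b, "t": b / se_b,
            "tss": tss, "ess": ess, "rss": rss, "r2": r2, "r": r}

# Illustrative data (assumed): n = 10 observations, so 8 degrees of freedom.
x = [10, 9, 11, 12, 11, 12, 13, 13, 14, 15]
y = [44, 40, 42, 46, 48, 52, 54, 58, 56, 60]
res = simple_ols(x, y)
print("slope:", round(res["b"], 3), "intercept:", round(res["a"], 3))
print("t statistic:", round(res["t"], 3), "vs. critical value 2.306")
print("R-squared:", round(res["r2"], 4))
```

With 8 degrees of freedom, the slope is judged significant at the 5% level when the absolute t statistic exceeds the critical value 2.306 quoted above.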
Multiple Regression Analysis
Model:
Multiple Regression Analysis
Adjusted Coefficient of Determination
Multiple Regression Analysis
Analysis of Variance and F Statistic
Problems in Regression Analysis
Multicollinearity: Two or more explanatory variables are highly correlated.
Heteroskedasticity: Variance of error term is not independent of the Y variable.
Autocorrelation: Consecutive error terms are correlated.
Durbin-Watson Statistic
Test for Autocorrelation
If d = 2, autocorrelation is absent.
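As a rough sketch of the test statistic, the Durbin-Watson d is the sum of squared changes in consecutive residuals divided by the sum of squared residuals. The residual series below are made up for illustration, and `durbin_watson` is a hypothetical helper name.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Made-up residual series for illustration only.
trending = [1.0, 1.2, 0.9, 1.1, -0.8, -1.0, -1.1, -0.9]     # neighbors look alike
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # neighbors flip sign

print("trending residuals, d =", round(durbin_watson(trending), 3))
print("alternating residuals, d =", round(durbin_watson(alternating), 3))
```

Values well below 2 point toward positive autocorrelation and values well above 2 toward negative autocorrelation; d lies between 0 and 4.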
Steps in Demand Estimation
Model Specification: Identify Variables
Collect Data
Specify Functional Form
Estimate Function
Test the Results
Functional Form Specifications
Linear Function:
Power Function:
Estimation Format:
Chapter 5 Appendix
Getting Started
Install the Analysis ToolPak add-in from the Excel installation media if it has not already been installed
Attach the Analysis ToolPak add-in
From the menu, select Tools and then Add-Ins...
When the Add-Ins dialog appears, select Analysis ToolPak and then click OK.
Entering Data
Data on each variable must be entered in a separate column
Label the top of each column with a symbol or brief description to identify the variable
Multiple regression analysis requires that all data on independent variables be in adjacent columns
Example Data
Running the Regression
Select the Regression tool from the Analysis ToolPak dialog
From the menu, select Tools and then Data Analysis...
On the Data Anal.
2. What we should know
Basics of the software
Know the difference between a continuous & discrete
(nominal) variable
Know how to summarize a continuous (e.g. mean income) and nominal
(e.g. % Female)
Relationship between 2 variables
Both continuous (correlation)
Both Nominal (cross-tab, mosaic plot)
One Continuous & one Nominal (e.g. take Mean of continuous
variable by Nominal)
Understand the p-value: the only time we are interested in a
‘statistical’ test is when doing controlled experiments
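The summaries listed above can be sketched in a few lines. The records below (income, age, gender) are hypothetical and only illustrate the mechanics; `pearson_r` and `mean_by_group` are helper names introduced for this sketch.

```python
from collections import defaultdict
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_by_group(values, groups):
    """Mean of a continuous variable within each level of a nominal variable."""
    buckets = defaultdict(list)
    for v, g in zip(values, groups):
        buckets[g].append(v)
    return {g: sum(vs) / len(vs) for g, vs in buckets.items()}

# Hypothetical records: income in $1000s, age in years, gender as a nominal label.
income = [42.0, 55.5, 61.0, 38.5, 70.0, 48.0]
age = [25, 34, 41, 23, 52, 30]
gender = ["F", "M", "F", "F", "M", "M"]

print("mean income:", sum(income) / len(income))            # continuous summary
print("% female:", 100 * gender.count("F") / len(gender))   # nominal summary
print("corr(age, income):", round(pearson_r(age, income), 3))
print("mean income by gender:", mean_by_group(income, gender))
```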
3. Why do we need models?
o Graphs are useful for understanding but don’t
scale (when we have too many potential
predictors).
o We want to automate the analysis
o Which Ad to display?
o How to provide an insurance quote based on the
information provided by a new customer
o Conduct ‘what-if’ analysis for planning Black Friday.
4. Example
Predicting auto insurance
Traditional measures
Usage based
GPS device
Forecasting sales for Sony digital camera at Best
Buy
Build a demand model based on historical data from
1000 stores
5. Regression: Key Points
Regression: widely used research tool
Determine whether the independent variables explain a significant
variation in the dependent variable: whether a relationship exists.
Determine how much of the variation in the dependent variable can
be explained by the independent variables: strength of the
relationship.
Control for other independent variables when evaluating the
contributions of a specific variable or set of variables. Marginal effect
Forecast/Predict the values of the dependent variable.
Use regression results as inputs to additional computations:
Optimal pricing, promotion, time to launch a product….
8. Box Office Prediction
Suppose you are helping Warner Bros. in
developing a model for forecasting Box Office
revenues for their new movie The Watchman. In
the file “BoxOffice.csv” you are provided the
opening week revenues (in millions of $) for
various past movies along with several predictor
variables:
Variable               Description of the Variable
Opening_Week_Revenue   Opening week revenue in millions of $
# of Theaters          Number of movie theaters in which each movie was initially released
Overall Rating         Critic ratings for each movie (a higher number implies more favorable ratings)
Genre                  1 for Action, 2 for Comedy, 3 for Kids, and 4 for Other
9. Data
Movie                                               Opening_Week_Revenue  Num_Theaters  Overall_Rating  Genre
The Dark Knight                                     158.4                 4366          82              1
Iron Man                                            98.6                  4105          79              1
Indiana Jones and the Kingdom of the Crystal Skull  100.1                 4260          65              1
Hancock                                             62.6                  3965          49              1
Quantum of Solace                                   67.5                  3451          58              1
The Incredible Hulk                                 55.4                  3505          61              1
Wanted                                              50.9                  3175          64              1
Get Smart                                           38.7                  3911          54              1
The Mummy: Tomb of the Dragon Emperor               40.5                  3760          31              1
Journey to the Center of the Earth                  21                    2811          57              1
Eagle Eye                                           29.2                  3510          43              1
10,000 B.C.                                         35.9                  3410          34              1
Valkyrie                                            21                    2711          56              1
Jumper                                              27.4                  3428          35              1
Cloverfield                                         40.1                  3411          64              1
The Day the Earth Stood Still (2008)                30.5                  3560          40              1
Hellboy II: The Golden Army                         34.5                  3204          78              1
Spider-Man 3                                        151.1                 4252          59              1
Transformers                                        70.5                  4011          61              1
Pirates of the Caribbean: At World's End            114.7                 4362          50              1
10. Objective
Develop a regression model for “Opening
week Revenues” and all other variables as
predictors. Interpret your parameters.
Prediction: The attributes for the movie
“Watchman” are as follows:
– Theaters= 3611, Rating= 57, Action= 1
– Given this information, what are the predicted
first week revenues for the new movie
Watchman?
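A minimal sketch of this exercise, assuming only the 20 Action movies listed on the Data slide are available (the full “BoxOffice.csv” is not reproduced here, so the numbers will not match the course output). The helpers `solve` and `ols` are hypothetical names; theaters are rescaled to thousands purely for numerical convenience.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """OLS coefficients via the normal equations; an intercept column is added."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# (revenue in $mn, theaters, rating) for the 20 Action movies on the Data slide.
MOVIES = [
    (158.4, 4366, 82), (98.6, 4105, 79), (100.1, 4260, 65), (62.6, 3965, 49),
    (67.5, 3451, 58), (55.4, 3505, 61), (50.9, 3175, 64), (38.7, 3911, 54),
    (40.5, 3760, 31), (21.0, 2811, 57), (29.2, 3510, 43), (35.9, 3410, 34),
    (21.0, 2711, 56), (27.4, 3428, 35), (40.1, 3411, 64), (30.5, 3560, 40),
    (34.5, 3204, 78), (151.1, 4252, 59), (70.5, 4011, 61), (114.7, 4362, 50),
]
y = [rev for rev, _, _ in MOVIES]
rows = [[theaters / 1000.0, rating] for _, theaters, rating in MOVIES]

beta = ols(rows, y)  # [intercept, per 1000 theaters, per rating point]
residuals = [yi - (beta[0] + beta[1] * r[0] + beta[2] * r[1])
             for yi, r in zip(y, rows)]

# Watchman attributes from the slide: 3611 theaters, rating 57, Action.
watchman = beta[0] + beta[1] * 3.611 + beta[2] * 57
print("coefficients:", [round(b, 3) for b in beta])
print("predicted opening week revenue ($mn):", round(watchman, 1))
```

Because every listed row is an Action movie, no Genre dummy variables are needed in this subset; with the full file, the Genre dummies would enter as extra columns in `rows`.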
14. Regression: Forecasting Box-office Revenues
You need to convert the “Genre” variable into a series of dummy variables. This
is a nominal variable (i.e. categories such as 1=Action, 2=Comedy, ...). Adding this
variable directly into the regression does not teach us anything, because the
numeric codes are arbitrary. For example, our coding could just as well have been
1=Comedy, 2=Action, ....
In addition, note that the total number of dummy variables we include/need is 1
less than the number of categories. The left-out category is absorbed in the
intercept.
It does not matter what you leave out: all included dummy variables will be
interpreted with respect to what you leave out.
For example, suppose we leave out “Action” and include dummy variables for
“comedy”, “kids” and “other”. The output of this regression:
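The coding step described above can be sketched as follows. The genre codes come from the variable description; the function name `dummy_code` and the sample input are assumptions made for illustration.

```python
# Genre codes as described for the BoxOffice data.
GENRES = {1: "Action", 2: "Comedy", 3: "Kids", 4: "Other"}

def dummy_code(genre_codes, base="Action"):
    """Return (column names, rows of 0/1 indicators), omitting the base category."""
    kept = [name for name in GENRES.values() if name != base]
    rows = [[1 if GENRES[code] == name else 0 for name in kept]
            for code in genre_codes]
    return kept, rows

# Five hypothetical movies with genre codes 1, 2, 3, 4, 2.
names, rows = dummy_code([1, 2, 3, 4, 2])
print(names)   # ['Comedy', 'Kids', 'Other']
for row in rows:
    print(row)  # the Action movie is the all-zero row [0, 0, 0]
```

Four categories yield three indicator columns; a movie in the base category has zeros everywhere, which is why its level is carried by the intercept.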
15. Regression with Genre Dummy Variables Only
We left out “Action” as the base. Compare the Intercept with the
Average for Action: they are the same.
Just looking at the means, we see that “Kids” movies generate
(56.66 - 45.10 = 11.56) less than Action. This difference is the
coefficient for ‘kids’ in the regression (-11.56).
16. Output from JMP
Note: In JMP output, go to red triangle and then select Estimates- Indicator Function
Parameterization to get “dummy” variable output
JMP Output
What is the interpretation of
Action here?
17. Leave out Comedy this time
We left out “Comedy” this time, so Comedy is the intercept now.
See that Action is 24.68 more than Comedy. Compare this to
the -24.68 coefficient on Comedy in the previous regression.
None of the model fit statistics change. The coefficients get
adjusted based on the left-out category (Comedy in this case).
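The effect of switching the base category can be checked with a small sketch. It relies on a shortcut that holds only for a regression containing nothing but the dummy variables of one nominal predictor: the fitted value for each group is that group's mean. The revenue figures and the helper name `dummy_only_fit` are hypothetical.

```python
def dummy_only_fit(y, groups, base):
    """Coefficients and SSE of a dummy-only regression with `base` left out.

    Shortcut valid only for this special case: fitted values are group means,
    so the intercept is the base mean and each coefficient is a mean difference.
    """
    sums, counts = {}, {}
    for yi, g in zip(y, groups):
        sums[g] = sums.get(g, 0.0) + yi
        counts[g] = counts.get(g, 0) + 1
    means = {g: sums[g] / counts[g] for g in sums}
    coefs = {"Intercept": means[base]}
    for g in means:
        if g != base:
            coefs[g] = means[g] - means[base]
    sse = sum((yi - means[g]) ** 2 for yi, g in zip(y, groups))
    return coefs, sse

# Hypothetical opening week revenues ($mn) by genre.
revenue = [60.0, 52.0, 58.0, 30.0, 34.0, 44.0, 46.0, 20.0, 26.0]
genre = ["Action", "Action", "Action", "Comedy", "Comedy",
         "Kids", "Kids", "Other", "Other"]

fit_a, sse_a = dummy_only_fit(revenue, genre, base="Action")
fit_c, sse_c = dummy_only_fit(revenue, genre, base="Comedy")
print("Action as base:", fit_a)
print("Comedy as base:", fit_c)
print("same fit (SSE):", sse_a, sse_c)
```

The Comedy coefficient with Action as base and the Action coefficient with Comedy as base have equal size and opposite sign, while the sum of squared errors is identical in both runs.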
18. Add All Predictors
• Regression is OWR (dependent variable) with # of Theaters,
Ratings, and Genre as predictors
# of Theaters: Each additional theater changes OWR by the
# of Theaters coefficient (in $mn), holding Rating and Genre fixed.
Overall_Rating: Each additional point in overall
rating increases OWR by $0.278mn.
Genre (Kids): Compared to “Other”, kids
movies generate $17.53mn less in OWR after
controlling for the effect of # of Theaters and
Ratings.
19. Objective
Develop a regression model for “Opening
week Revenues” and all other variables as
predictors. Interpret your parameters.
Prediction: The attributes for the movie
“Watchman” are as follows:
– Theaters= 3611, Rating= 57, Action= 1
– Given this information, what are the predicted
first week revenues for the new movie
Watchman?
21. Context
Southwest & the Wright Amendment
Click on article or
google “Southwest
Wright Amendment”
to get context
22. Impact of Southwest Airlines on Price
Suppose you are representing Southwest and want to claim that the
presence of SW in a market is good for consumers because it lowers
the fares.
For analysis, you are provided data on fares from approximately 600
“city-pairs” with the following variables:
Objective: Analyze the impact of Southwest
presence on the average fares
25. Compare Mean Fare by SW
NOTE: If you square the t-ratio 6.71:
(6.71* 6.71) you get 45.03 (F-ratio)
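The link between the two statistics can be verified numerically: for two groups, the squared pooled-variance t statistic equals the one-way ANOVA F statistic. (On the slide, 6.71 × 6.71 = 45.02, which matches the reported 45.03 up to rounding of the t-ratio.) The fares below are made up, and both helper functions are written for this sketch.

```python
from math import sqrt

def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance estimate."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    sp2 = (ssa + ssb) / (na + nb - 2)
    return (ma - mb) / sqrt(sp2 * (1 / na + 1 / nb))

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (between / (k - 1)) / (within / (n - k))

# Made-up average fares ($) for city-pairs without and with Southwest.
no_sw = [210.0, 245.0, 198.0, 260.0, 230.0, 275.0]
with_sw = [150.0, 172.0, 141.0, 190.0, 165.0]

t = pooled_t(no_sw, with_sw)
f = anova_f([no_sw, with_sw])
print("t =", round(t, 4), " t squared =", round(t * t, 4), " F =", round(f, 4))
```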
26. Basic intuition of Regression Based Models
o Conceptually, fares do not just depend on presence of
Southwest
o Other factors
o In our example: Competition, Distance
o Analyze relationship b/w these variables & Fares
o In analyzing output with single predictors, note the
correspondence between regression output vs. ANOVA (t-
test)
o We get the same output from regression as a t-test or ANOVA
o More important point is to understand the workings of a
“dummy” variable in regression
32. Understand Output
Rsquare: Of the
total variation in
Fares, 41.6% is
explained by our
model
Distance is the most
important predictor
& Southwest is least
important
33. Interpretation Of Coefficients
Southwest: After controlling for Distance and Competition (# of airlines),
absence of Southwest in the market increases fares by approximately $49.
Distance: Increasing distance by 100 miles increases the fare by $21.5.
# of Airlines: Increasing the number of airlines serving the market by 1 reduces
the fare by approximately $41.
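These marginal effects can be combined for a simple ‘what-if’ calculation. The sketch below uses only the approximate effects quoted above; the intercept is not given in the transcript, so the function returns a change in fare rather than a fare level. The function and constant names are hypothetical.

```python
# Approximate marginal effects quoted on the slide, in $ per unit change.
EFFECT_SW_PRESENT = -49.0      # Southwest present rather than absent
EFFECT_PER_MILE = 21.5 / 100   # $21.5 per additional 100 miles
EFFECT_PER_AIRLINE = -41.0     # one more airline serving the market

def fare_change(sw_enters=False, extra_miles=0.0, extra_airlines=0):
    """Predicted change in average fare ($), holding everything else fixed."""
    change = EFFECT_SW_PRESENT if sw_enters else 0.0
    change += EFFECT_PER_MILE * extra_miles
    change += EFFECT_PER_AIRLINE * extra_airlines
    return change

# What if Southwest enters a market, which also adds one airline to the count?
print(fare_change(sw_enters=True, extra_airlines=1))   # -90.0
# What if we compare two otherwise identical routes 250 miles apart in length?
print(fare_change(extra_miles=250))                    # 53.75
```

Whether Southwest's entry should also be counted as one more airline depends on how the competition variable was constructed in the data, which the transcript does not state.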
34. How does the software compute the parameters?
• Least Squares Principle: Choose β’s so that the sum of the
squared prediction errors,
$$SSQ = \sum_{m=1}^{M} \left( Fare_m - \beta_0 - \beta_1 SW_m - \beta_2 Dist_m - \beta_3 Comp_m \right)^2$$
is as small as possible.
Ok, but what does that mean? Open the file SSQ_Intuition.xls
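A small sketch of the least squares idea: compute the sum of squared prediction errors for candidate coefficients and observe that moving away from the least-squares solution only makes it larger. To keep the example short it uses a single predictor (distance) with a closed-form solution rather than the three predictors of the fare model; the data and the helper name `ssq` are hypothetical.

```python
def ssq(betas, X, y):
    """Sum of squared prediction errors for coefficients betas = [b0, b1, ...]."""
    total = 0.0
    for row, yi in zip(X, y):
        pred = betas[0] + sum(b * x for b, x in zip(betas[1:], row))
        total += (yi - pred) ** 2
    return total

# Hypothetical city-pairs: distance in 100s of miles and average fare in $.
dist = [3.0, 5.0, 8.0, 10.0, 15.0, 20.0]
fare = [120.0, 150.0, 210.0, 260.0, 330.0, 450.0]
X = [[d] for d in dist]

# Closed-form least-squares solution for one predictor.
n = len(dist)
md, mf = sum(dist) / n, sum(fare) / n
b1 = (sum((d - md) * (f - mf) for d, f in zip(dist, fare))
      / sum((d - md) ** 2 for d in dist))
b0 = mf - b1 * md
best = ssq([b0, b1], X, fare)
print("least-squares SSQ:", round(best, 2))

# Nudging either coefficient away from the solution increases SSQ.
nudges = [(5.0, 0.0), (-5.0, 0.0), (0.0, 1.0), (0.0, -1.0), (5.0, -1.0)]
for d0, d1 in nudges:
    trial = ssq([b0 + d0, b1 + d1], X, fare)
    print("nudge", (d0, d1), "-> SSQ", round(trial, 2), "larger:", trial > best)
```

Regression software finds the minimizing coefficients directly instead of by trial and error, but the quantity being minimized is this same sum.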
35. Average Fare by # of Airlines
Split by Presence of Southwest (Interactions—for later)
36. Conclusion
• T-test and ANOVA are both used to compare means across different groups
• T-test for 2 groups and ANOVA for many groups
• We can always convert the question to a regression problem using dummy variables
• Advantage of regression is that it is straightforward to control for any number
of other variables that might impact the outcome
• From now on, we will focus on regression analysis
37. Regression: Key Points
Regression: widely used research tool
• Determine whether the independent variables explain a significant
variation in the dependent variable: whether a relationship exists.
• Determine how much of the variation in the dependent variable can
be explained by the independent variables: strength of the
relationship.
• Control for other independent variables when evaluating the
contributions of a specific variable or set of variables. Marginal effect
• Forecast/Predict the values of the dependent variable.
• Use regression results as inputs to additional computations:
Optimal pricing, promotion, time to launch a product….