Mapping Inequality: How does the historical practice of redlining relate to contemporary mortgage rates and lending practices in Nassau & Suffolk County, New York?
USC: Plus 668 Big Data for Planning & Development
Professor: Dr. Kevin Kane
Rahsaan L. Browne Sr.-Summer 2023
I. Introduction
II. Univariate Analysis
III. Bivariate Analysis
IV. Additional Univariate/Bivariate Analysis
V. Spatial Analysis
VI. Spatial Weights Matrix
VII. Moran's I and LISA Analysis
VIII. Conclusion
IX. Recommendations and Future Research
X. References
("Mapping Inequality: Redlining in New Deal America")
I. Introduction
The history of redlining in Nassau/Suffolk counties:
The above photo is taken from the Home Owners' Loan Corporation (HOLC) maps and shows the
outer borough closest to the Long Island suburbs. HOLC was a U.S. government agency
established in 1933 to help stabilize real estate that had depreciated during the Depression and
to refinance urban mortgage debt. Created as part of a New Deal program to assist struggling
homeowners, HOLC
tasked local officials with creating neighborhood maps that captured credit default risk across
neighborhoods. In this process, officials assigned neighborhoods to one of four color-coded letter
grades: D = “hazardous,” C = “definitely declining,” B = “still desirable,” and A = “best.” “A”
ratings were often assigned to affluent White neighborhoods, while “D” ratings were often
assigned to neighborhoods that had a greater share of Black, lower-class, or immigrant residents.
The post-World War II era witnessed a remarkable surge in suburban growth, not only in Long
Island but also in other major metropolitan areas, fueled by returning veterans eager to establish
families and acquire homes. This period saw the transformation of vast farmland in the suburbs
into sprawling housing developments, supported by significant federal infrastructure
investments, particularly in the construction of highways, which made commuting to work more
accessible and attractive.
Amid this rapid expansion, between 1960 and 1964, approximately half a million white
individuals relocated from New York City to predominantly white suburban enclaves such as
Long Island. However, this seemingly organic growth was underpinned by a dark and enduring
legacy of housing discrimination known as "redlining." This discriminatory practice involved
banks and real estate agencies designating specific areas for racial groups, laying the groundwork
for community and school segregation patterns that continue to persist across the nation today.
During the late 1940s and early 1950s, Levitt and Sons, a prominent housing developer,
constructed 17,447 low-cost two- and three-bedroom homes on Long Island, with 90% of these
units being purchased by families of World War II veterans. Tragically, the company initially
included clauses in mortgage and rental agreements that explicitly restricted occupancy to
"Caucasians," except for "domestic servants." Even after the removal of these clauses, the company
continued to systematically refuse sales or rentals to African Americans, contributing to the
staggering statistic that, by 1960, only 57 out of 65,276 residents in Levittown were Black,
constituting less than 0.1% of the population.
The legacy of redlining and housing discrimination plays a significant role in perpetuating the
black-white wealth gap in the United States. A home is the primary asset for most American
families, and Levittown homes, originally sold for less than $7,000 in 1950, now sell for
between $500,000 and $700,000, creating substantial inheritable family wealth for the original
white homebuyers.
However, the roots of racial segregation in suburban areas predate the post-War expansion.
Federal housing policies dating back to the 1930s were overtly racist, contributing to white flight
from cities to predominantly white suburbs surrounding metropolitan areas. African Americans
were largely excluded from benefiting from New Deal legislation, such as Social Security,
unemployment compensation, minimum wage protection, and labor union rights.
In 1934, the National Housing Act established the Federal Housing Administration (FHA) to
regulate and stabilize the home mortgage market to prevent foreclosures. However,
discriminatory practices were deeply ingrained in these policies, exemplified by the FHA's 1938
Underwriting Manual, which aimed to minimize "adverse influences" from lower-class and
racially diverse neighborhoods. The FHA assigned neighborhoods rankings and labels based on
racial characteristics, further entrenching segregation.
This historical segregation continues to reverberate, leading to the existence of racially and
ethnically segregated mini-school districts on Long Island, where neighboring districts can have
vastly different racial compositions.
Additionally, racial guidelines established in the 1930s continue to impact the health of residents
in different communities. Studies have shown that tree canopy density plays a vital role in
moderating temperatures and maintaining air quality. However, poor neighborhoods tend to have
25% less tree canopy than affluent communities, resulting in higher rates of extreme heat-related
health issues. Through this investigation, we hope to contribute to a deeper understanding of the
ongoing disparities in the nation's housing landscape and advocate for policies that promote
housing equity and financial inclusion for all. ("Underwriting Manual" 1938)
Background and Significance:
The historical practice of redlining in the United States involved discriminatory assessments of
areas based on race, designating certain neighborhoods as "dangerous" and undesirable for
investment. Some of these same formerly blighted areas are now prime real estate. These
redlining policies had lasting consequences, contributing to housing and lending disparities that
persist to this day. The discriminatory practices in redlining shaped the spatial distribution of
neighborhoods, impacting access to credit, wealth accumulation, and socioeconomic
opportunities for different communities.
Research Question:
How does the historical practice of redlining relate to contemporary mortgage rates and lending
practices in Nassau & Suffolk County, New York?
Hypothesis:
Areas that were historically redlined are likely to experience higher mortgage rates and
disparities in lending practices compared to non-redlined areas in Nassau & Suffolk County.
For this research, we focus on studying the impact of historical redlining in Nassau & Suffolk
Counties, New York. This is an essential case study, representing a diverse region with a
complex history of housing policies and urban development. By examining the historical context
of redlining and its connection to contemporary mortgage rates and lending practices, we aim to
shed light on the enduring effects of housing discrimination and its implications for housing
equity and access to credit.
Purpose and Implications:
The purpose of this research is to examine the correlation between historical redlining and
present-day mortgage rates and lending practices in Nassau & Suffolk County. By understanding
the lingering impacts of redlining, this study seeks to contribute to the ongoing dialogue on
housing equity and access to credit.
The implications of this research extend to various stakeholders, including policymakers,
community organizations, and researchers. By identifying disparities in mortgage rates and
lending practices, we can advocate for policies that promote fair housing opportunities and
combat discriminatory lending practices. Moreover, this study may inspire further research on
historical redlining's lasting effects and its intersection with other socio-economic factors in
shaping the current housing landscape.
Dataset:
We will focus on census and mortgage tract data in Nassau & Suffolk County, New York, as our
study area. The HMDA (Home Mortgage Disclosure Act) dataset will be used to obtain relevant
mortgage rate and lending practice information for the study area.
Data Source Information:
HMDA Data: The Home Mortgage Disclosure Act data provides comprehensive information on
mortgage applications, approvals, denials, and rates for financial institutions. The dataset is
available at the census tract level, offering insights into the spatial distribution of mortgage-
related variables.
Variables of Interest:
Totpop-total population,
Medage-median age,
Medhovl-median home value,
is_cnv-Conventional,
is_fh_s-FHA insured,
is_fsr-FSA/RHS Guaranteed,
is_v_sm-VA Guaranteed,
is_wnh-white non-Hispanic,
is_hsp_ -Hispanic,
is_bnh-black non-Hispanic,
(The Home Mortgage Disclosure Act (HMDA) Historic Data Dictionary - LAR Record Codes is
provided by the Consumer Financial Protection Bureau (n.d.). It can be accessed at
https://files.consumerfinance.gov/hmda-historic-data-dictionaries/lar_record_codes.pdf.)
All analyses for the research project on redlining effects and housing disparities in Nassau and
Suffolk Counties were conducted using GeoDa. GeoDa provides a wide array of spatial analysis
techniques for exploring spatial patterns, measuring spatial autocorrelation, and conducting
spatial regression analyses. Leveraging GeoDa's capabilities, the research aims to
gain deeper insights into the relationships between historical redlining, contemporary mortgage
rates, and lending practices, ultimately shedding light on the persistent effects of redlining on
housing equity and disparities in the study area.
II. Univariate Analysis
A. Histograms
1. Totpop (Total Population)
2. Medage (Median Age)
3. Medhovl (Median Home Value)
4. IS_CNV (loan type-conventional)
B. Interpretation of Histograms
Histogram 1 - "Totpop" (Total Population): This histogram displays the distribution of total
population across census tracts in Nassau and Suffolk Counties. The x-axis represents population
ranges, and the y-axis shows the number of tracts within each range. Taller bars mark the
population ranges that occur most often among tracts; they do not show where those tracts are
located. The histogram itself provides no geographical information such as town names or
elevation, so relating population to the locations of major towns requires maps or additional
geographical data. On Long Island, the majority of the population resides along the shorelines.
Histogram 2 - "Medhomvl" (Median Home Value): This histogram shows the distribution of median
home values in the two counties. The x-axis shows price ranges for median home values, while
the y-axis represents the number of tracts falling within each range. The tallest bar covers the
range from $400,000 to $533,333, indicating that median home values in this range are the most
common in the study area. The midpoint of this range is $466,666.50. As a realtor, I am aware
that the median home values locally range from the mid-$500,000 mark upward.
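As a rough illustration of how an equal-width histogram produces a bin such as $400,000 to
$533,333, the sketch below bins a handful of made-up tract values and reports the tallest bin.
The seven-bin layout starting at zero and the sample values are assumptions chosen only so that
one bin matches the range quoted above; they are not the project's data, and GeoDa's own binning
may differ.

```python
def equal_width_bins(values, lo, hi, n_bins):
    """Count values per equal-width bin over [lo, hi]; return (edges, counts)."""
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == hi falls in the last bin rather than past it.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return edges, counts

def modal_bin(edges, counts):
    """Return (lower, upper, midpoint) of the bin with the highest count."""
    i = counts.index(max(counts))
    lower, upper = edges[i], edges[i + 1]
    return lower, upper, (lower + upper) / 2

# Hypothetical tract median home values in dollars (illustrative only).
sample_values = [250_000, 410_000, 455_000, 480_000, 520_000, 610_000, 900_000]
# Assumed layout: 7 bins spanning 0 to 2,800,000/3 dollars (about 933,333).
edges, counts = equal_width_bins(sample_values, 0, 2_800_000 / 3, 7)
lower, upper, mid = modal_bin(edges, counts)
print(f"modal bin: {lower:,.0f} to {upper:,.0f}, midpoint {mid:,.1f}")
```

The midpoint only summarizes the bin; it says nothing about how values are spread inside it.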
Histogram 3 - "Medage" (Median Age): The "Medage" histogram depicts the distribution of
median ages across tracts on Long Island. The x-axis represents age ranges, while the y-axis
shows the number of tracts within each range. The value "medage:41" indicates that a median age
of about 41 is the most common among tracts, not that most residents are 41 years old. The
tallest bar corresponds to the age range with the highest frequency. A deeper understanding of
this histogram would require additional analysis of demographic data and trends over time.
Histogram 4 - "Conventional" (Conventional Mortgages): This histogram displays the
distribution of the conventional mortgage variable (is_cnv) in Nassau and Suffolk Counties. The
x-axis represents the range of is_cnv values, and the y-axis shows the number of observations
falling within each range (bin). The tallest bar, between 0 and 7.3, indicates that most
observations have low values of this variable. This distribution can be influenced by various
factors, such as local housing prices, median income, interest rates, and other economic factors.
It may indicate that a considerable portion of people in these counties have opted for
lower-priced conventional mortgages or that the real estate market offers more affordable
properties in this range.
Overall, these histograms can offer valuable insights into population distribution, housing market
trends, and mortgage-related factors in Nassau and Suffolk Counties. To examine potential
redlining effects, researchers would need to analyze these histograms alongside historical data,
demographic information, and lending practices to assess if there are disparities in mortgage
rates and lending practices between historically redlined and non-redlined areas.
Histograms and Redlining Effects:
Histogram "totpop" (Total Population):
The histogram showing population distribution can help identify areas with higher population
concentrations. It could provide insights into how redlining practices may have affected
population distribution in the counties. If certain areas that were historically redlined exhibit
lower population concentrations or disparities in population growth compared to non-redlined
areas, it might suggest the lingering effects of redlining on demographic patterns.
Histogram "medage" (Median Age):
Analyzing the median age distribution across different regions can help researchers understand
potential demographic changes resulting from historical redlining effects. If redlined areas show
a lack of younger populations or a disproportionately older population, it could indicate how past
discriminatory practices have influenced age demographics and socioeconomic disparities.
Histogram "medhomvl" (Median Home Value):
The histogram of median home values could reveal whether historically redlined areas continue
to experience disparities in property values. If areas that were previously redlined have lower
median home values compared to non-redlined areas, it could indicate the long-term impacts of
redlining on property markets and housing wealth.
Histogram "conventional loans":
Analyzing the distribution of conventional mortgage values in the counties can help uncover
potential disparities in lending practices. If historically redlined areas experience limited access
to conventional loans or face higher mortgage rates compared to non-redlined areas, it could
indicate the continuation of redlining effects on housing finance and access to credit.
In conclusion, the histograms can be utilized as valuable tools to investigate how historical
redlining practices have influenced contemporary mortgage rates, lending practices, and
socioeconomic disparities in Nassau & Suffolk Counties. By comparing the distribution patterns
in redlined and non-redlined areas, researchers can gain insights into the potential lasting impacts
of redlining on the housing market and access to financial resources for different communities.
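One way to make the redlined versus non-redlined comparison concrete is to group tracts by a
redlining flag and compare a summary statistic for each group. The sketch below does this with
the median. The redlined flag and every value in `tracts` are hypothetical placeholders, since
the join between HOLC grades and tracts is not described in this section.

```python
from statistics import median

# Hypothetical tract records: (tract id, historically redlined?, median home value in dollars).
# Values are placeholders for illustration, not HMDA or census figures.
tracts = [
    ("T1", True, 380_000),
    ("T2", True, 420_000),
    ("T3", True, 400_000),
    ("T4", False, 510_000),
    ("T5", False, 560_000),
    ("T6", False, 530_000),
]

def median_by_group(records):
    """Return {flag: median value} for redlined and non-redlined tracts."""
    groups = {True: [], False: []}
    for _tract_id, redlined, value in records:
        groups[redlined].append(value)
    # Skip a group that has no tracts so median() never sees an empty list.
    return {flag: median(vals) for flag, vals in groups.items() if vals}

medians = median_by_group(tracts)
gap = medians[False] - medians[True]
print(f"redlined median: {medians[True]:,}")
print(f"non-redlined median: {medians[False]:,}")
print(f"gap: {gap:,}")
```

The same grouping could be applied to any of the variables of interest, such as is_cnv or medage.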
III. Bivariate Analysis
• Scatterplots
• Interpretation of Scatterplots
1. Totpop vs. is_wnh (Total population- White Non-Hispanic)
Number of Observations (#obs): 253
R-squared (R^2): 0.072
Intercept (const a): 20.523
Standard Error of Intercept (std-err a): 1.498
t-statistic of Intercept (t-stat a): 13.700
p-value of Intercept (p-value a): 0.001
Slope (coefficient) of is_wnh (slope b): 0.000
Standard Error of Slope (std-err b): 0.000
t-statistic of Slope (t-stat b): 4.427
p-value of Slope (p-value b): 0.000
Interpretation:
1. Number of Observations (#obs): This indicates the total number of data points (observations)
used for the scatterplot. In this case, there are 253 data points.
2. R-squared (R^2): R-squared is a measure of how well the data points fit the regression line. It
represents the proportion of the variance in the dependent variable (is_wnh) that is predictable
from the independent variable (totpop). An R-squared of 0.072 suggests that about 7.2% of
the variation in is_wnh can be explained by total population (totpop).
3. Intercept (const a): This is the predicted value of the dependent variable (is_wnh) when the
independent variable (totpop) is zero. In this case, the predicted is_wnh for a tract with a
total population of zero is approximately 20.523, an extrapolation with little practical
meaning.
4. Standard Error of Intercept (std-err a): This is the standard deviation of the estimated
intercept. It measures the accuracy of the estimate and how much it varies from sample to
sample. A smaller standard error indicates a more reliable estimate.
5. t-statistic of Intercept (t-stat a): The t-statistic measures how many standard errors the
estimated intercept is away from zero. It is used to test the significance of the intercept. In
this case, the t-statistic is 13.700, which indicates that the intercept is significantly different
from zero.
6. p-value of Intercept (p-value a): The p-value is the probability of observing a t-statistic as
extreme as the one calculated, assuming that the null hypothesis is true (null hypothesis:
intercept = 0). A p-value of 0.001 suggests that the intercept is statistically significant at the
0.1% level.
7. Slope (coefficient) of is_wnh (slope b): The slope represents the change in the dependent
variable (is_wnh) for a one-unit change in the independent variable (totpop). The slope is
displayed as 0.000 because it is rounded to three decimals: total population is counted in
persons, so the change in is_wnh per additional person is very small. A slope that rounds to
zero does not by itself mean there is no relationship; significance is judged from the
t-statistic and p-value of the slope.
8. Standard Error of Slope (std-err b): This is the standard deviation of the estimated slope. Like
the standard error of the intercept, a smaller value indicates a more reliable estimate.
9. t-statistic of Slope (t-stat b): The t-statistic for the slope tests the significance of the slope
coefficient. A t-statistic of 4.427 indicates that the slope is positive and significantly
different from zero.
10. p-value of Slope (p-value b): The p-value for the slope tests the null hypothesis that the slope
coefficient is equal to zero. A p-value of 0.000 suggests that there is a statistically significant
relationship between is_wnh and totpop at the 0.1% level.
The scatterplot "totpop_is_wnh" shows the relationship between the total population (totpop) and
white non-Hispanic mortgage applicants (is_wnh). The R-squared value indicates that about 7.2%
of the variation in is_wnh can be explained by totpop. The p-values for both the intercept and
slope are very low, indicating a statistically significant positive relationship between is_wnh
and totpop. However, the low R-squared and the small slope coefficient suggest that the
relationship is weak.
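The statistics listed above all come from an ordinary least squares fit of one variable on
another. As a minimal sketch of how each number is derived (not GeoDa's implementation), the
function below computes them for a small made-up sample. The function name `simple_ols` and the
sample values are illustrative assumptions.

```python
import math

def simple_ols(x, y):
    """Fit y = a + b*x by least squares and return the usual summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    b = sxy / sxx                      # slope
    a = mean_y - b * mean_x            # intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = 1 - sse / syy                 # share of the variance in y explained by x
    sigma2 = sse / (n - 2)             # residual variance with n - 2 degrees of freedom
    se_b = math.sqrt(sigma2 / sxx)
    se_a = math.sqrt(sigma2 * (1 / n + mean_x ** 2 / sxx))
    return {
        "obs": n, "r2": r2,
        "a": a, "se_a": se_a, "t_a": a / se_a,
        "b": b, "se_b": se_b, "t_b": b / se_b,
    }

# Made-up sample: x is measured in the thousands and y in the tens,
# so the slope is numerically small even though the fit is strong.
x = [1000, 2000, 3000, 4000, 5000]
y = [21.0, 22.5, 22.0, 24.5, 25.0]
stats = simple_ols(x, y)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Each t-statistic is simply the estimate divided by its standard error, which is why a tiny slope
can still be significant when its standard error is smaller still.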
2. Medhovl vs. is_hsp (Median Home Value - Hispanic)
Number of Observations (#obs): 357
R-squared (R^2): 0.206
Intercept (const a): 10.118
Standard Error of Intercept (std-err a): 0.680
t-statistic of Intercept (t-stat a): 14.889
p-value of Intercept (p-value a): 0
Slope (coefficient) of is_hsp (slope b): -0.000
Standard Error of Slope (std-err b): 0.000
t-statistic of Slope (t-stat b): -9.609
p-value of Slope (p-value b): 0
Interpretation:
1. Number of Observations (#obs): This indicates the total number of data points (observations)
used for the scatterplot. In this case, there are 357 data points.
2. R-squared (R^2): R-squared is a measure of how well the data points fit the regression line. It
represents the proportion of the variance in the dependent variable (is_hsp) that is
predictable from the independent variable (medhomvl). An R-squared of 0.206 suggests that
about 20.6% of the variation in is_hsp can be explained by median home value (medhomvl).
3. Intercept (const a): This is the predicted value of the dependent variable (is_hsp) when the
independent variable (medhomvl) is zero. In this case, the predicted is_hsp for a tract with a
median home value of zero is approximately 10.118, an extrapolation with little practical
meaning.
4. Standard Error of Intercept (std-err a): This is the standard deviation of the estimated
intercept. It measures the accuracy of the estimate and how much it varies from sample to
sample. A smaller standard error indicates a more reliable estimate.
5. t-statistic of Intercept (t-stat a): The t-statistic measures how many standard errors the
estimated intercept is away from zero. It is used to test the significance of the intercept. In
this case, the t-statistic is 14.889, which indicates that the intercept is significantly different
from zero.
6. p-value of Intercept (p-value a): The p-value is the probability of observing a t-statistic as
extreme as the one calculated, assuming that the null hypothesis is true (null hypothesis:
intercept = 0). A p-value of 0 suggests that the intercept is statistically significant.
7. Slope (coefficient) of is_hsp (slope b): The slope represents the change in the dependent
variable (is_hsp) for a one-unit change in the independent variable (medhomvl). The slope is
displayed as -0.000 because it is rounded to three decimals: median home value is measured in
dollars, so the change in is_hsp per additional dollar is very small. The negative sign shows
that is_hsp tends to fall as median home value rises; significance is judged from the
t-statistic and p-value of the slope.
8. Standard Error of Slope (std-err b): This is the standard deviation of the estimated slope. Like
the standard error of the intercept, a smaller value indicates a more reliable estimate.
9. t-statistic of Slope (t-stat b): The t-statistic for the slope tests the significance of the slope
coefficient. A t-statistic of -9.609 suggests that there is a statistically significant negative
relationship between is_hsp and medhomvl.
10. p-value of Slope (p-value b): The p-value for the slope tests the null hypothesis that the slope
coefficient is equal to zero. A p-value of 0 suggests that there is a statistically significant
relationship between is_hsp and medhomvl.
The scatterplot "medhomvl_is_hsp" shows the relationship between median home value
(medhomvl) and Hispanic mortgage applicants (is_hsp). The R-squared value indicates that about
20.6% of the variation in is_hsp can be explained by medhomvl. The p-values for both the
intercept and slope are very low, indicating a statistically significant negative relationship
between is_hsp and medhomvl. The slope coefficient is small only because it is expressed per
dollar of median home value.
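A slope that prints as 0.000 or -0.000 can still carry a large t-statistic when the independent
variable is measured in large units, because the display is rounded to three decimals. The sketch
below uses made-up numbers and a hypothetical helper, `slope_and_t`, to show that rescaling x
(here, dividing by 1,000) multiplies the slope by the same factor while leaving the t-statistic
unchanged.

```python
import math

def slope_and_t(x, y):
    """Return (slope, t-statistic of the slope) for a least squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(sse / (n - 2) / sxx)
    return b, b / se_b

# Made-up values: x in the hundreds of thousands, y in the single digits.
x_raw = [300_000, 400_000, 500_000, 600_000, 700_000]
y = [9.0, 7.5, 8.0, 5.5, 5.0]

b_raw, t_raw = slope_and_t(x_raw, y)
x_scaled = [xi / 1000 for xi in x_raw]          # same data, expressed in thousands
b_scaled, t_scaled = slope_and_t(x_scaled, y)

print(f"raw units:    slope {b_raw:.3f}, t {t_raw:.3f}")
print(f"in thousands: slope {b_scaled:.3f}, t {t_scaled:.3f}")
```

Reporting the slope per 1,000 or per 100,000 units of x is therefore a presentation choice that
makes the coefficient readable without changing the test of significance.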
3. Additional scatterplots based on relevant variables.
A. Scatterplot: Medhovl_wnh- (Median home value applicant white non Hispanic)
In the scatterplot of "Median Home Value" (Medhovl) vs. "White Non-Hispanic Mortgage
Applicants" (is_wnh), the statistical information provided is as follows:
• Number of Observations (#obs): 357
• R-squared (R^2): 0.0000
• Intercept (Constant) (a): 24.598
• Standard Error of the Intercept (std-err a): 1.862
• t-statistic of the Intercept (t-stat a): 13.207
• p-value of the Intercept (p-value a): 0.000
• Slope (Coefficient) (b): -0.000
• Standard Error of the Slope (std-err b): 0.000
• t-statistic of the Slope (t-stat b): -0.399
• p-value of the Slope (p-value b): 0.690
Interpretation:
1. Number of Observations (#obs): This indicates the total number of data points
(observations) used for the scatterplot. In this case, there are 357 data points.
2. R-squared (R^2): The R-squared value is 0.0000, indicating that there is no linear
relationship between median home value (Medhovl) and the percentage of White Non-
Hispanic mortgage applicants (is_wnh) in the analyzed areas. The R-squared value of
0.0000 suggests that the median home value does not explain any variation in the
percentage of White Non-Hispanic mortgage applicants.
3. Intercept (a): The intercept represents the value of the dependent variable (is_wnh) when
the independent variable (Medhovl) is zero. In this case, when the median home value is
zero, the expected percentage of White Non-Hispanic mortgage applicants is 24.598.
4. Standard Error of the Intercept (std-err a): The standard error of the intercept is an
estimate of the uncertainty in the intercept's value. A value of 1.862 is small relative to
the intercept of 24.598, indicating a fairly precise estimate.
5. t-statistic of the Intercept (t-stat a): The t-statistic measures the significance of the
intercept. In this case, the t-statistic of 13.207 suggests that the intercept is highly
significant at the 0.05 significance level. This means that the intercept value of 24.598 is
statistically significant.
6. p-value of the Intercept (p-value a): The p-value represents the probability of observing
the given t-statistic or more extreme results if the null hypothesis (intercept = 0) is true. A
p-value of 0.000 indicates that the intercept is highly statistically significant at the 0.05
significance level.
7. Slope (Coefficient) (b): The slope represents the change in the dependent variable
(is_wnh) for a one-unit change in the independent variable (Medhovl). In this case, the
slope is -0.000, indicating no change in the percentage of White Non-Hispanic mortgage
applicants with a change in median home value. This result further supports the R-
squared value of 0.0000, suggesting no relationship.
8. Standard Error of the Slope (std-err b): The standard error of the slope is an estimate of
the uncertainty in the slope's value. A value of 0.000 indicates very high precision.
9. t-statistic of the Slope (t-stat b): The t-statistic measures the significance of the slope. In
this case, the t-statistic of -0.399 indicates that the slope is not significantly different from
zero at the 0.05 significance level. This means that the slope of -0.000 is not statistically
significant.
10. p-value of the Slope (p-value b): The p-value represents the probability of observing the
given t-statistic or more extreme results if the null hypothesis (slope = 0) is true. A p-
value of 0.690 indicates that the slope is not statistically significant at the 0.05
significance level.
Overall, there is no significant relationship between the median home value and the percentage
of White Non-Hispanic mortgage applicants. The scatterplot would likely show data points
scattered with no clear pattern or trend between the median home value and the percentage of
White Non-Hispanic mortgage applicants in the analyzed areas.
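The reported p-values follow from the t-statistics. With a few hundred observations the t
distribution is close to the standard normal, so a two-sided p-value can be approximated with
the complementary error function. The sketch below applies that normal approximation (an
assumption made here, not necessarily the method GeoDa uses) to the slope t-statistic of -0.399
listed above.

```python
import math

def two_sided_p_normal(t_stat):
    """Approximate two-sided p-value for a t-statistic using the standard normal."""
    # P(|Z| >= |t|) = erfc(|t| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(t_stat) / math.sqrt(2))

p_slope = two_sided_p_normal(-0.399)
print(f"approximate p-value for t = -0.399: {p_slope:.3f}")
```

For 357 observations the approximation lands very close to the reported p-value of 0.690; with
small samples the exact t distribution should be used instead.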
B. Scatterplot: totpop_is_hsp (total population applicant Hispanic)
• Number of Observations (#obs): 253
• R-squared (R^2): 0.134
• Intercept (Constant) (a): -0.839
• Standard Error of the Intercept (std-err a): 0.725
• t-statistic of the Intercept (t-stat a): -1.156
• p-value of the Intercept (p-value a): 0.249
• Slope (Coefficient) (b): 0.001
• Standard Error of the Slope (std-err b): 0.000
• t-statistic of the Slope (t-stat b): 6.244
• p-value of the Slope (p-value b): 0.000
Interpretation:
1. Number of Observations (#obs): This indicates the total number of data points (observations)
used for the scatterplot. In this case, there are 253 data points.
2. R-squared (R^2): The R-squared value indicates that approximately 13.4% of the variation in
the percentage of Hispanic mortgage applicants (is_hsp) can be explained by the variation in
the total population (Totpop).
3. Intercept (a): The intercept represents the value of the dependent variable (is_hsp) when the
independent variable (Totpop) is zero. In this case, when the total population is zero, the
expected percentage of Hispanic mortgage applicants is -0.839, which is not meaningful in
the context of the analysis.
4. Standard Error of the Intercept (std-err a): The standard error of the intercept is an estimate of
the uncertainty in the intercept's value. A higher value indicates more uncertainty.
5. t-statistic of the Intercept (t-stat a): The t-statistic measures the significance of the intercept.
In this case, the t-statistic of -1.156 suggests that the intercept is not significantly different
from zero at the 0.05 significance level.
6. p-value of the Intercept (p-value a): The p-value represents the probability of observing the
given t-statistic or more extreme results if the null hypothesis (intercept = 0) is true. A p-
value of 0.249 indicates that the intercept is not statistically significant at the 0.05
significance level.
7. Slope (Coefficient) (b): The slope represents the change in the dependent variable (is_hsp)
for a one-unit change in the independent variable (Totpop). In this case, for every one-unit
increase in the total population, the expected percentage of Hispanic mortgage applicants
increases by 0.001.
8. Standard Error of the Slope (std-err b): The standard error of the slope is an estimate of the
uncertainty in the slope's value. A very low value (0.000) suggests high precision.
9. t-statistic of the Slope (t-stat b): The t-statistic measures the significance of the slope. In this
case, the t-statistic of 6.244 suggests that the slope is significantly different from zero at the
0.05 significance level.
10. p-value of the Slope (p-value b): The p-value represents the probability of observing the
given t-statistic or more extreme results if the null hypothesis (slope = 0) is true. A very low
p-value (0.000) indicates that the slope is statistically significant at the 0.05 significance
level.
The scatterplot and the statistical information suggest that there is a significant positive
relationship between the total population and the percentage of Hispanic mortgage applicants in
the analyzed areas. As the total population increases, the percentage of Hispanic mortgage
applicants also tends to increase.
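To see what the fitted line implies in practice, the rounded coefficients reported above can be
plugged into the regression equation. This is only a sketch: the coefficients are the
three-decimal values printed in the list, so the results are approximate, and the population
values passed in are arbitrary examples.

```python
INTERCEPT = -0.839   # const a, as reported (rounded to three decimals)
SLOPE = 0.001        # slope b, as reported (rounded to three decimals)

def predicted_is_hsp(totpop):
    """Predicted is_hsp for a tract with the given total population."""
    return INTERCEPT + SLOPE * totpop

# Arbitrary example populations, chosen only to show the arithmetic.
for totpop in (0, 2_000, 5_000):
    print(f"totpop = {totpop:>5,}: predicted is_hsp = {predicted_is_hsp(totpop):.3f}")
```

The negative prediction at a population of zero is an extrapolation outside the data, which is
why the intercept has no practical meaning here.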
The conducted analyses provide valuable insights into the distribution of variables and their
potential implications for redlining effects. However, to draw definitive conclusions about
redlining effects, further spatial analysis and a comparison between historically redlined and
non-redlined areas are necessary; both are presented below.
This will involve mapping variables' distributions, generating spatial weights matrices,
calculating Moran's I values, and evaluating LISA cluster maps to identify spatial patterns and
clusters related to redlining practices. Through a comprehensive investigation, researchers can
better understand the historical and contemporary impacts of redlining on mortgage rates,
lending practices, and socioeconomic disparities in Nassau and Suffolk Counties.
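As a preview of the spatial steps named above, global Moran's I can be written as
I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, where z_i is the deviation of area i from the
mean, w_ij is the spatial weight between areas i and j, and S0 is the sum of all weights. The
sketch below implements this formula for a toy chain of four areas with binary contiguity
weights. The neighbor list and values are made up, and GeoDa's own output, which typically uses
row-standardized weights, may differ.

```python
def morans_i(values, neighbors):
    """Global Moran's I with binary weights given as {index: [neighbor indices]}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    # Cross-products of deviations over every (i, j) neighbor pair.
    cross = sum(z[i] * z[j] for i, nbrs in neighbors.items() for j in nbrs)
    s0 = sum(len(nbrs) for nbrs in neighbors.values())   # sum of all weights
    return (n / s0) * cross / sum(zi ** 2 for zi in z)

# Toy layout: four areas in a row, each touching only its immediate neighbors.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

clustered = morans_i([1, 2, 3, 4], chain)     # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)   # high and low values alternate
print(f"clustered: {clustered:.3f}, alternating: {alternating:.3f}")
```

Positive values indicate that similar values cluster in space, while negative values indicate
that neighbors tend to be dissimilar.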
1. Medhovl vs. is_hsp (Median Home Value - Hispanic) Scatterplot: R-squared (R^2) value:
0.206. Interpretation: The R-squared value of 0.206 indicates that approximately 20.6% of
the variation in median home values (Medhovl) can be explained by the variation in the
presence of Hispanic mortgage applicants (is_hsp).
• Direction of Relationship: The negative slope coefficient (-0.000) suggests a weak
inverse relationship between the percentage of Hispanic mortgage applicants and
median home values. This means that as the percentage of Hispanic mortgage
applicants increases, median home values tend to decrease slightly, and vice versa.
• Strength of Relationship: The R-squared value of 0.206 indicates a moderate
relationship between the two variables. While the relationship is statistically
significant due to the very low p-value, the effect size is relatively small, with the
presence of Hispanic mortgage applicants explaining only 20.6% of the variation in
median home values.
2. Medhovl vs. is_wnh (Median Home Value - White Non-Hispanic) Scatterplot: R-squared
(R^2) value: 0.0000. Interpretation: The R-squared value of 0.0000 suggests that there is
no linear relationship between median home values (Medhovl) and the percentage of
White Non-Hispanic mortgage applicants (is_wnh) in the analyzed areas: median home
value does not explain any of the variation in the percentage of White Non-Hispanic
mortgage applicants.
• Direction of Relationship: The slope coefficient of -0.000 indicates no change in the
percentage of White Non-Hispanic mortgage applicants with a change in median
home value. This result further supports the lack of a significant relationship between
the variables.
• Strength of Relationship: Since the R-squared value is 0.0000, it suggests that the
percentage of White Non-Hispanic mortgage applicants is not influenced by changes
in median home values. There is essentially no relationship between the two
variables.
3. Totpop vs. is_hsp (Total Population - Hispanic) Scatterplot: R-squared (R^2) value:
0.134 Interpretation: The R-squared value of 0.134 indicates that approximately 13.4% of
the variation in the total population (Totpop) can be explained by the variation in the
presence of Hispanic mortgage applicants (is_hsp).
• Direction of Relationship: The positive slope coefficient (0.001) suggests a positive
relationship between the percentage of Hispanic mortgage applicants and the total
population. This means that as the percentage of Hispanic mortgage applicants
increases, the total population tends to increase, and vice versa.
• Strength of Relationship: The R-squared value of 0.134 suggests a weak to
moderate relationship between the two variables. While the relationship is statistically
significant due to the low p-value, the small R-squared value indicates that the
presence of Hispanic mortgage applicants explains only a small portion (13.4%) of
the variation in the total population.
4. Totpop vs. is_wnh (Total Population - White Non-Hispanic) Scatterplot: R-squared (R^2)
value: 0.072 Interpretation: The R-squared value of 0.072 indicates that approximately
7.2% of the variation in the total population (Totpop) can be explained by the variation in
the presence of White Non-Hispanic mortgage applicants (is_wnh).
• Direction of Relationship: The positive slope coefficient (0.001) suggests a positive
relationship between the percentage of White Non-Hispanic mortgage applicants and
the total population. This means that as the percentage of White Non-Hispanic
mortgage applicants increases, the total population tends to increase, and vice versa.
• Strength of Relationship: The R-squared value of 0.072 suggests a weak
relationship between the two variables. While the relationship is statistically
significant due to the low p-value, the small R-squared value indicates that the
presence of White Non-Hispanic mortgage applicants explains only a small portion
(7.2%) of the variation in the total population.
Overall, the scatterplots show varying degrees of relationship strength between the studied
variables. The scatterplot of "Medhovl vs. is_hsp" indicates a moderate relationship between
median home values and Hispanic mortgage applicants. On the other hand, the scatterplot of
"Medhovl vs. is_wnh" shows no significant relationship between median home values and White
Non-Hispanic mortgage applicants.
Regarding the scatterplots of "Totpop vs. is_hsp" and "Totpop vs. is_wnh," the first shows a
weak to moderate relationship (R^2 = 0.134) and the second a weak relationship (R^2 = 0.072)
between the total population and the presence of Hispanic and White Non-Hispanic mortgage
applicants, respectively.
It's important to interpret these relationships in the context of the research question and consider
the effect sizes in relation to other potential influencing factors. Further analysis and additional
data may be necessary to fully understand the implications of these relationships for redlining
effects and housing disparities in Nassau and Suffolk Counties. Additionally, it's crucial to note
that while the model shows a significant correlation, it does not necessarily imply causation.
Therefore, caution should be exercised in drawing definitive conclusions solely based on this
analysis. A comprehensive investigation that includes multiple variables and a broader historical
context will provide a more nuanced understanding of the complex factors contributing to
housing disparities in the region.
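The interpretations above rely on two properties of simple regression that are easy to check
numerically: R-squared is the same whichever variable is placed on the x-axis, and the slope
changes with the units of measurement while R-squared does not. The sketch below illustrates
both with made-up values (not the report's data); the helper name fit is hypothetical.

```python
import math

def fit(x, y):
    """Return (slope, r_squared) for an OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

# Made-up illustrative values, NOT the report's data.
home_value_usd = [310000, 425000, 515000, 640000, 720000, 880000]
pct_hsp = [0.22, 0.18, 0.15, 0.09, 0.07, 0.04]

slope_usd, r2_usd = fit(home_value_usd, pct_hsp)

# Same data with home values expressed in thousands of dollars.
home_value_k = [v / 1000 for v in home_value_usd]
slope_k, r2_k = fit(home_value_k, pct_hsp)

# Same data with the roles of x and y swapped.
_, r2_swapped = fit(pct_hsp, home_value_usd)

print(f"slope per dollar:  {slope_usd:.3f}   (R^2 = {r2_usd:.3f})")
print(f"slope per $1,000:  {slope_k:.6f}   (R^2 = {r2_k:.3f})")
print(f"R^2 with axes swapped: {r2_swapped:.3f}")
```

A slope that prints as -0.000 is therefore compatible with a clearly non-zero R-squared; it
only reflects the units in which the x-variable is recorded.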
IV. Additional Univariate/Bivariate Analysis:
A. Scatterplot Matrix
The scatterplot matrix includes the following variables: "Medhovl" (Median Home Value),
"is_hsp" (Hispanic), "is_wnh" (White Non-Hispanic), "Totpop" (Total Population), and "is_bnh"
(Black Non-Hispanic).
Interpretation: The scatterplot matrix displays scatterplots of all possible pairwise combinations
of the selected variables. Each scatterplot will have one variable on the x-axis and another on the
y-axis. The diagonal of the matrix will show histograms of each variable.
1. Median Home Value (Medhovl) vs. Hispanic (is_hsp): This scatterplot will show the
relationship between median home value and the percentage of Hispanic mortgage
applicants. It will help us understand if there is any correlation between median home
value and the presence of Hispanic mortgage applicants in different areas.
2. Median Home Value (Medhovl) vs. White Non-Hispanic (is_wnh): This scatterplot will
show the relationship between median home value and the percentage of White Non-
Hispanic mortgage applicants. It will help us assess any correlation or disparities between
median home values and the presence of White Non-Hispanic mortgage applicants.
3. Median Home Value (Medhovl) vs. Total Population (Totpop): This scatterplot will
explore the relationship between median home value and the total population. It can help
identify any patterns or trends between housing values and the size of the population in
different areas.
4. Median Home Value (Medhovl) vs. Black Non-Hispanic (is_bnh): This scatterplot will
illustrate the relationship between median home value and the percentage of Black Non-
Hispanic mortgage applicants. It will allow us to examine any correlations or disparities
between housing values and the presence of Black Non-Hispanic mortgage applicants.
Contribution to the Research Question: The scatterplot matrix provides a visual representation of
how different variables relate to each other. By examining the scatterplots, we can identify
potential relationships and patterns between median home values, racial demographics, and total
population. The matrix allows us to quickly assess correlations or discrepancies between
variables and their potential implications for redlining effects.
This analysis contributes to the research question by providing an integrated view of the
relationships between median home values and racial demographics in Nassau and Suffolk
Counties. It helps us explore how housing values and mortgage lending patterns differ or
correlate based on racial and ethnic backgrounds. Moreover, it allows us to identify areas with
potential disparities in housing opportunities, providing further evidence for the presence of
historical redlining effects.
The scatterplot matrix complements the previous analyses by offering a comprehensive and
intuitive way to visualize the interconnections between multiple variables. It helps researchers
gain deeper insights into the relationships between median home values, race/ethnicity, and
population size, furthering our understanding of potential redlining effects and housing
disparities in the study area.
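A numeric companion to a scatterplot matrix is a pairwise correlation matrix over the same
variables. The following is a minimal sketch in plain Python; the column names follow the
report, but the values are made up for six hypothetical areas, and the helper pearson_r is
an illustrative name rather than part of any tool used in the analysis.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up values for six hypothetical areas, NOT the report's data.
data = {
    "Medhovl": [310000, 425000, 515000, 640000, 720000, 880000],
    "is_hsp": [0.22, 0.18, 0.15, 0.09, 0.07, 0.04],
    "is_wnh": [0.48, 0.55, 0.60, 0.71, 0.74, 0.83],
    "is_bnh": [0.20, 0.17, 0.14, 0.08, 0.09, 0.03],
    "Totpop": [5200, 4100, 4600, 3900, 3500, 3000],
}
names = list(data)

# One correlation per cell of the matrix; the diagonal is each variable with itself.
corr = {a: {b: pearson_r(data[a], data[b]) for b in names} for a in names}

print(" " * 8 + "".join(f"{b:>9}" for b in names))
for a in names:
    print(f"{a:<8}" + "".join(f"{corr[a][b]:>9.2f}" for b in names))
```

Each off-diagonal cell summarizes the linear association that the corresponding scatterplot
shows visually; it does not replace inspecting the plots for outliers or non-linear patterns.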
V. Spatial Analysis
A. Choropleth Maps
1. Map 1: Visualized distribution of Totpop across the study area.
2. Map 2: Visualized distribution of Medhovl across the study area.
3. Map 3: Visualized distribution of is_cnv across the study area.
B. Interpretation of Choropleth Maps
In my observation of the choropleth maps, I noticed several interesting spatial patterns and
disparities related to the selected variables.
Firstly, when examining the Totpop (Total Population) choropleth map, I observed that darker
shades represented areas with higher population concentrations. These regions
seemed to align with urban centers and densely populated areas. On the other hand, lighter
shades indicated regions with lower population concentrations, which appeared to be
more rural or less densely populated.
Moving on to the Medhovl (Median Home Value) choropleth map, I noticed that darker shades
on the map corresponded to areas with higher median home values. These regions appeared to be
affluent or desirable neighborhoods with more expensive properties. Conversely, lighter shades
indicated areas with lower median home values, suggesting more affordable or economically
challenged neighborhoods.
Regarding the is_cnv (Conventional Mortgages) choropleth map, I observed that darker shades
represented regions with a higher percentage of conventional mortgages. These areas appeared to
have a larger proportion of homeowners opting for conventional financing. On the other hand,
lighter shades indicated areas with a lower percentage of conventional mortgages, suggesting a
higher prevalence of alternative mortgage types or financing options.
Considering the spatial clusters, I identified certain regions with high population density and low
population density in the study area. These clusters could signify areas with distinct demographic
characteristics or socioeconomic conditions. I also observed spatial patterns of median home
values, with some areas forming clusters of high property values, while others showed clusters of
lower property values.
In terms of the conventional mortgages, I noticed spatial clusters of high and low percentages in
different regions, potentially reflecting disparities in mortgage lending practices or access to
conventional financing.
My observation of the choropleth maps provided valuable insights into the distribution and
disparities of total population, median home values, and conventional mortgages across the study
area. These spatial patterns and clusters may have significant implications for understanding
historical redlining effects or other socioeconomic factors influencing housing and mortgage
outcomes in the region.
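The darker and lighter shades on a choropleth map come from sorting each area into a class.
The report does not state which classification scheme was used, so the sketch below assumes
a quantile scheme (roughly equal numbers of areas per class) purely for illustration. The
function names and the home values are hypothetical.

```python
import bisect
import math

def quantile_breaks(values, k):
    """Upper bounds of k roughly equal-count classes (nearest-rank rule)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[math.ceil(n * i / k) - 1] for i in range(1, k + 1)]

def classify(value, breaks):
    """Index of the class whose upper bound is the first one >= value."""
    return min(bisect.bisect_left(breaks, value), len(breaks) - 1)

# Made-up median home values for eight hypothetical areas, NOT the report's data.
medhovl = [640000, 310000, 1150000, 425000, 880000, 515000, 1900000, 720000]

breaks = quantile_breaks(medhovl, k=4)
classes = [classify(v, breaks) for v in medhovl]

for value, cls in sorted(zip(medhovl, classes)):
    # Class 0 would be drawn in the lightest shade, class 3 in the darkest.
    print(f"{value:>9,}  ->  class {cls}")
```

Because the class boundaries depend on the scheme chosen, the same data can look more or
less clustered under a different classification, which is worth keeping in mind when reading
visual patterns off the maps.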
VI. Spatial Weights Matrix
In the spatial weights matrix analysis for the 666 areal units across Long Island (sub-county
observations such as census tracts; the study area itself comprises only Nassau and Suffolk
Counties), the following basic diagnostics were observed:
1. Number of Neighborless Observations: There were no neighborless observations in the
dataset. This means that each unit has at least one neighboring unit.
2. Mean Number of Neighbors: The mean number of neighbors across all units is
approximately 6.05. This indicates that, on average, each unit is connected to
around 6 neighboring units.
3. Maximum Number of Neighbors: The maximum number of neighbors among all units
is 22. This means that at least one unit has 22 neighbors, indicating a densely
connected part of the study area.
Interpretation:
The absence of neighborless observations suggests that there are no isolated units in Long
Island, and each unit is part of a connected network with at least one neighbor. The
average of approximately 6.05 neighbors per unit indicates a moderate level of spatial
connectivity among the units.
The presence of a unit with 22 neighbors indicates a highly interconnected area, possibly
representing a densely populated region or an area with shared characteristics and strong spatial
interactions with its neighboring units.
The spatial weights matrix analysis provides insights into the spatial relationships between the
areal units in Long Island, which can be valuable for understanding potential spatial patterns,
clusters, and trends related to the research question, such as the impact of redlining or other
socioeconomic factors across the region.
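The three diagnostics reported above (neighborless observations, mean number of neighbors,
maximum number of neighbors) can be reproduced on a toy example. The sketch below assumes
queen contiguity (units are neighbors if they share an edge or a corner) on a small regular
grid; the report does not state which contiguity rule was used, and real tract boundaries are
irregular, so this is only an illustration. All function names are hypothetical.

```python
from itertools import product

def queen_neighbors(n_rows, n_cols):
    """Neighbor lists for grid cells that touch along an edge or at a corner."""
    neighbors = {}
    for r, c in product(range(n_rows), range(n_cols)):
        neighbors[(r, c)] = [
            (r + dr, c + dc)
            for dr, dc in product((-1, 0, 1), repeat=2)
            if (dr, dc) != (0, 0)
            and 0 <= r + dr < n_rows
            and 0 <= c + dc < n_cols
        ]
    return neighbors

def weights_diagnostics(neighbors):
    """Basic connectivity summary of a neighbor structure."""
    counts = [len(nbrs) for nbrs in neighbors.values()]
    return {
        "n": len(counts),
        "neighborless": sum(1 for k in counts if k == 0),
        "mean_neighbors": sum(counts) / len(counts),
        "max_neighbors": max(counts),
    }

# A 3 x 3 toy grid: corners have 3 neighbors, edges 5, the center 8.
diag = weights_diagnostics(queen_neighbors(3, 3))
print(diag)
```

On irregular polygons the maximum can be much higher than the mean, for example when one
large unit borders many small ones, which is one way a value such as 22 can arise.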
VII. Moran's I and LISA Analysis
In the "Moran's I" dialog, I chose the appropriate variables (Totpop, Medhovl, is_cnv) for
analysis.
I used Moran's I to assess the spatial autocorrelation of three variables:
"Totpop" (Total Population), "Medhovl" (Median Home Value), and "is_cnv"
(Conventional Mortgage). Moran's I is a statistical measure that helps us understand the
degree of spatial clustering or dispersion of values within a geographic area. Moran's
I values typically range from -1 to 1, where a positive value indicates positive spatial
autocorrelation (similar values are clustered together), a negative value indicates negative
spatial autocorrelation (dissimilar values sit next to each other), and a value close to zero
indicates no spatial autocorrelation (values are randomly distributed).
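To make the sign of Moran's I concrete, the minimal sketch below computes the global statistic
with row-standardized weights on a 4 x 4 toy grid, once for a clustered pattern and once for a
checkerboard. It assumes rook contiguity (shared edges only) and that every unit has at least
one neighbor; the grid, the values, and the function names are illustrative assumptions, not
the report's data or software.

```python
from itertools import product

def rook_neighbors(n_rows, n_cols):
    """Neighbor lists for grid cells that share an edge."""
    steps = ((-1, 0), (1, 0), (0, -1), (0, 1))
    return {
        (r, c): [
            (r + dr, c + dc)
            for dr, dc in steps
            if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
        ]
        for r, c in product(range(n_rows), range(n_cols))
    }

def morans_i(values, neighbors):
    """Global Moran's I with row-standardized weights (each unit needs >= 1 neighbor)."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {unit: v - mean for unit, v in values.items()}
    # Spatial lag = average deviation of a unit's neighbors.
    cross = sum(z[i] * sum(z[j] for j in nbrs) / len(nbrs) for i, nbrs in neighbors.items())
    # With row-standardized weights the weights sum to n, so n / S0 = 1.
    return cross / sum(d * d for d in z.values())

nbrs = rook_neighbors(4, 4)
clustered = {(r, c): 1.0 if c < 2 else 0.0 for r, c in nbrs}       # left half high, right half low
checkerboard = {(r, c): float((r + c) % 2) for r, c in nbrs}       # alternating high and low

i_clustered = morans_i(clustered, nbrs)
i_checker = morans_i(checkerboard, nbrs)
print(f"clustered pattern:    I = {i_clustered:.3f}")
print(f"checkerboard pattern: I = {i_checker:.3f}")
```

The clustered layout yields a clearly positive value and the checkerboard a value at the
negative extreme, which matches the verbal definitions of positive and negative spatial
autocorrelation.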
For "Totpop," we found a Moran's I value of 0.160 with a p-value of 0.0100. This
indicates that there is a positive spatial autocorrelation in the distribution of total
population across the study area. In other words, areas with similar population sizes are
clustered together on Long Island.
For "Medhovl," the Moran's I value is 0.565 with a p-value of 0.0100. This result
indicates a strong positive spatial autocorrelation in the distribution of median home
values. It suggests that areas with similar median home values are spatially clustered, and
there might be distinct high-value and low-value clusters across the study area.
For "is_cnv," the Moran's I value is 0.350 with a p-value of 0.0100. This indicates
positive spatial autocorrelation in the distribution of conventional mortgages. It suggests
that areas with similar percentages of conventional mortgages are spatially clustered,
possibly indicating distinct regions with high or low usage of conventional mortgage
types.
These findings have significant implications for our research question, which focuses on
understanding potential redlining effects and housing disparities on Long Island. The
positive spatial autocorrelation observed in total population, median home values, and
conventional mortgages indicates the presence of spatial clusters or patterns in these
variables. This may suggest the existence of distinct regions with similar socioeconomic
characteristics or lending practices.
To gain a deeper understanding of the redlining effects, I conducted a LISA (Local
Indicators of Spatial Association) analysis to identify specific areas with significant
spatial clusters of interest. The LISA analysis identifies
hotspots (high-high clusters) and coldspots (low-low clusters) for each variable,
potentially shedding light on areas that have experienced historical redlining practices or
have notable housing disparities.
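The hotspot and coldspot labels in a LISA map come from comparing each unit's own value with
the average of its neighbors. The sketch below computes a local Moran statistic and the
quadrant label (HH, LL, LH, HL) for six hypothetical units arranged in a chain. It omits the
permutation-based significance test that cluster maps normally apply before coloring a unit,
so it shows only the classification step; the names and values are illustrative assumptions.

```python
def local_moran(values, neighbors):
    """Local Moran's I and quadrant label per unit, using row-standardized weights."""
    n = len(values)
    mean = sum(values.values()) / n
    z = {unit: v - mean for unit, v in values.items()}
    m2 = sum(d * d for d in z.values()) / n  # variance of the deviations
    results = {}
    for unit, nbrs in neighbors.items():
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # mean deviation of the neighbors
        # First letter: the unit itself; second letter: its neighbors.
        label = ("H" if z[unit] > 0 else "L") + ("H" if lag > 0 else "L")
        results[unit] = (z[unit] * lag / m2, label)
    return results

# Six hypothetical units in a row; each is a neighbor of the units next to it.
values = {"a": 9, "b": 10, "c": 1, "d": 2, "e": 1, "f": 1}
neighbors = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

lisa = local_moran(values, neighbors)
for unit, (local_i, label) in lisa.items():
    print(f"{unit}: local I = {local_i:+.3f}  quadrant = {label}")
```

Units labeled HH or LL have positive local statistics and correspond to hotspots and
coldspots, while LH and HL units are spatial outliers with negative local statistics.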
By combining the findings from Moran's I and LISA analyses, my research aims to
provide valuable insights into the spatial distribution of key variables related to housing
and lending practices on Long Island. These insights can help inform policy decisions,
urban planning, and targeted interventions to address potential redlining effects and
promote equitable housing opportunities for all residents.
VIII. Conclusion
The spatial analysis conducted in this research has revealed several significant findings
related to the distribution of key variables on Long Island. The main findings from the
spatial analysis are as follows:
1. Positive Spatial Autocorrelation: Moran's I values indicated positive spatial
autocorrelation for "Totpop" (Total Population), "Medhovl" (Median Home Value), and
"is_cnv" (Conventional Mortgage). This suggests that areas with similar values of these
variables tend to be spatially clustered, indicating the presence of distinct regions with
similar socioeconomic characteristics or lending practices.
2. Spatial Clusters: The LISA analysis identified significant hotspots (high-high clusters)
and coldspots (low-low clusters) for each variable. These clusters provide insights into
areas with concentrated high or low values of total population, median home values, and
conventional mortgages, potentially indicating areas that have experienced historical
redlining practices or housing disparities.
The results of the spatial analysis have provided valuable information to inform our
research question regarding the potential redlining effects and housing equity on Long
Island. The positive spatial autocorrelation observed in total population, median home
values, and conventional mortgages supports the hypothesis that there are spatial clusters
of similar values, signifying the presence of distinct neighborhoods or communities with
similar socio-economic characteristics.
The identification of spatial clusters through LISA analysis further strengthens the
evidence of potential redlining effects. It suggests that certain areas have experienced
historical discriminatory lending practices or have faced housing disparities, leading to
concentrated pockets of high or low total population, median home values, and
conventional mortgages.
The spatial analysis has significant implications for understanding redlining effects and
housing equity on Long Island. By identifying areas with spatial clusters of specific
variables, policymakers and urban planners can gain insights into regions that may have
been historically marginalized or disadvantaged in terms of housing opportunities and
access to credit.
The findings can help inform targeted interventions and policies aimed at promoting
housing equity and combating the legacy of redlining. Initiatives such as affordable
housing programs, increased financial literacy, support for underserved communities,
and equitable lending practices can be developed to address the disparities highlighted by
the spatial analysis.
IX. Recommendations and Future Research
A. Potential policy recommendations or interventions based on the findings:
1. Affordable Housing Initiatives: Develop and implement affordable housing programs in
areas identified as coldspots (low-low clusters) to increase housing opportunities for
disadvantaged communities.
2. Equitable Lending Practices: Encourage financial institutions to adopt equitable lending
practices to ensure that all qualified borrowers have equal access to mortgage loans,
irrespective of their geographic location.
3. Community Investment: Direct targeted community investment and resources to areas
identified as hotspots (high-high clusters) to support neighborhood revitalization and
economic development.
B. Areas for further research and exploration:
1. Long-Term Trends: Investigate the historical trends and changes in spatial clustering to
understand how redlining effects have evolved and identify areas that have experienced
gentrification or demographic shifts.
2. Demographic Changes: Analyze demographic changes in clustered regions to assess the
impact of redlining on population dynamics, racial composition, and socio-economic
profiles.
3. Rental Market Analysis: Explore spatial patterns in the rental market to understand rental
disparities and potential discrimination in housing rentals.
4. Longitudinal Analysis: Conduct longitudinal studies to examine the persistence of spatial
clusters and the effectiveness of policy interventions over time.
The detailed spatial analysis conducted in this research provides valuable insights into
housing disparities and redlining effects on Long Island. By leveraging these findings to
inform targeted policies and interventions, we can work towards achieving greater
housing equity and fostering inclusive communities for all residents.
Additional Future Recommendations:
1. Longitudinal Data Analysis: To gain a comprehensive understanding of the long-term
impacts of historical redlining on contemporary housing disparities, conducting a
longitudinal study is essential. By analyzing data spanning multiple decades, researchers
can track changes in housing patterns, mortgage rates, and lending practices, allowing for
a more nuanced assessment of the persistence of redlining effects over time.
2. Inclusion of Qualitative Data: While quantitative data provides valuable insights,
incorporating qualitative data through interviews, focus groups, or surveys with affected
communities can enrich the research findings. Understanding the lived experiences and
perspectives of individuals impacted by redlining and housing disparities can offer deeper
insights into the multifaceted nature of these issues.
3. Intersectionality and Racial Dynamics: Recognizing the intersecting identities and
experiences of individuals concerning race, gender, class, and other factors is crucial.
Future research should explore how these intersecting identities may compound or
mitigate the effects of redlining on housing disparities and access to mortgage lending.
4. Comparative Studies: Conducting comparative studies across different regions or
metropolitan areas can help identify similarities and differences in redlining effects and
housing disparities. Comparing the experiences of Nassau and Suffolk Counties with
other areas that have faced similar historical challenges can provide valuable lessons and
inform targeted policy interventions.
5. Policy Analysis and Recommendations: The research project should delve deeper into
analyzing existing housing policies and regulations in Nassau and Suffolk Counties.
Identifying gaps and potential shortcomings in current policies can inform the
development of evidence-based policy recommendations to address housing disparities
and promote housing equity.
6. Community Engagement and Advocacy: Engaging with local communities and housing
advocacy groups is essential to ensure that research findings are relevant and impactful.
Researchers should actively involve community stakeholders in the research process,
seeking their input and feedback, and collaborating with them to advocate for policy
changes and equitable housing practices.
7. Historical Preservation Efforts: Recognizing the historical significance of redlined
neighborhoods and their enduring impacts, preservation efforts should be encouraged.
Documenting and preserving the histories of affected communities can promote
awareness and foster a collective commitment to rectify past injustices.
8. Financial Education and Support: To address the barriers faced by marginalized
communities in accessing mortgage lending and homeownership opportunities, providing
financial education and support is crucial. Initiatives such as homeownership counseling
and down payment assistance programs can empower individuals and families to
navigate the housing market successfully.
9. Collaborative Research Partnerships: Establishing interdisciplinary research partnerships
with academic institutions, local governments, housing organizations, and advocacy
groups can enhance the research project's scope and impact. These partnerships can foster
knowledge exchange and mobilization, enabling collective efforts to address housing
disparities effectively.
10. Revisiting Zoning and Land Use Policies: Analyzing the impact of current zoning and
land use policies on housing affordability and access is essential. Revisiting and
reforming these policies to promote mixed-income communities and affordable housing
options can help address historical redlining's spatial consequences.
In conclusion, pursuing these additional recommendations for the research project on
redlining effects and housing disparities in Nassau and Suffolk Counties is essential for
advancing our understanding of these complex issues. By adopting an interdisciplinary
and community-centered approach, researchers can contribute to evidence-based policies
and actions that foster housing equity and strive to rectify the historical injustices
perpetuated by redlining.
X. References:
Anselin, L., Syabri, I., & Kho, Y. (2006). GeoDa: An Introduction to Spatial Data Analysis.
Geographical Analysis, 38(1), 5-22. Retrieved from
https://geodacenter.github.io/workbook/1a_introduction.html
Consumer Financial Protection Bureau. "Home Mortgage Disclosure Act (HMDA) Historic
Data." n.d. Consumer Financial Protection Bureau.
https://www.consumerfinance.gov/data-research/hmda/historic-data/.
Consumer Financial Protection Bureau. "Home Mortgage Disclosure Act (HMDA) Historic Data
Dictionary - LAR Record Codes." n.d.
https://files.consumerfinance.gov/hmda-historic-data-dictionaries/lar_record_codes.pdf.
"Mapping Inequality: Redlining in New Deal America." University of Richmond, Virginia.
https://dsl.richmond.edu/panorama/redlining/#loc=5/40.39/-74.21.
"New York Historical Society." https://www.nyhistory.org/.
"New York State Department of Financial Services (DFS)." https://www.dfs.ny.gov/.
Underwriting Manual. 1938. Underwriting and Valuation Procedure Under Title II of the
National Housing Act. Federal Housing Administration, Washington DC.
"U.S. Census Bureau." https://www.census.gov/.