Fundamentals of Crime Mapping 8

Understand the difference between qualitative and quantitative data.
Define and explain levels of measurement including nominal, ordinal, interval, and ratio.
Understand the difference between discrete and continuous variables.
Understand descriptive statistics, including typical measures of central tendency and dispersion.
Understand inferential statistics, including typical tests of significance and measures of association.
Understand what a regression model is and how it works.
Understand the limitations of statistics and how their improper application can yield misleading results.
Define and explain classification in crime mapping and be able to identify strengths and weaknesses of each method.

Qualitative
◦ Yields narrative-oriented information
▪ Park, Blue, Yes, Tall, Short, etc.
Quantitative
◦ Produces number-oriented information
Key Factors or "Variables"


Ratio
◦ Highest level
◦ Can be reclassified to any of the other levels
◦ -∞ to +∞
Interval
◦ Precise value of a measure is known and thus can also be ranked
◦ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Ordinal
◦ Rank-ordered nominal data, where the order can be important
◦ Officer, Sergeant, Lt, Commander, Major, Chief
Nominal
◦ Male, Female

Nominal
◦ African American
◦ Caucasian
◦ Hispanic
◦ Native American
◦ Asian
◦ Other
◦ Dichotomous
▪ Caucasian
▪ Non-Caucasian
Must be mutually exclusive and exhaustive

Ordinal
◦ Categorical or numerical data that can be ranked, but the precise value is not known
▪ Likert scale example
I feel safe walking in my neighborhood alone at night
1 - Strongly agree
2 - Agree
3 - Neutral
4 - Disagree
5 - Strongly disagree
6 - Don’t know
What is your annual household income?
1. Less than $20,000
2. Between $20,000 and $40,000
3. Between $40,001 and $60,000
4. Between $60,001 and $80,000
5. More than $80,000

Traits, concepts, and ideas in criminal justice can be difficult to operationalize, or measure.

Validity
◦ A variable accurately reflects the trait or concept it is measuring
Reliability
◦ The measure is consistently representative across people, places, and time

Interval
◦ What is your annual household income? __________________
▪ Ranking possible and precise value known
▪ 112 burglaries occurred in beat 32

Ratio
◦ Treated the same as interval data
▪ 112.23 burglaries occurred on average in beat 32
▪ Can we have .23 of a burglary?

Exact value    Cents dropped
$16095.32      $16095.00
$17262.67      $17262.00
$24262.78      $24262.00
$26095.32      $26095.00
$27262.67      $27262.00
$32262.78      $32262.00
$33095.32      $33095.00
$35262.67      $35262.00
$36262.78      $36262.00
$36095.32      $36095.00
$40262.67      $40262.00
$41262.78      $41262.00
$52095.32      $52095.00
$55262.67      $55262.00
$68262.78      $68262.00

Ranges: $0 - $25,000; $25,001 - $35,000; $35,001 - $45,000; $45,001 - $55,000; $55,001 - $65,000; Over $65,000
Two groups: Below $35,000; Over $35,000
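The income columns above show one exact value being recoded to coarser levels. A minimal Python sketch of that recoding follows. The function names are hypothetical, and treating each upper bound as inclusive (and exactly $35,000 as "Below") is an assumption, since the slide's ranges do not say where such values fall.

```python
import math

# Upper bounds of the six ranges on the slide, treated as inclusive
# (an assumption: the slide leaves values like $25,000.50 unassigned).
RANGES = [
    (25_000, "$0 - $25,000"),
    (35_000, "$25,001 - $35,000"),
    (45_000, "$35,001 - $45,000"),
    (55_000, "$45,001 - $55,000"),
    (65_000, "$55,001 - $65,000"),
]


def to_whole_dollars(income):
    """Drop the cents, as in the second column of the slide."""
    return math.floor(income)


def income_range(income):
    """Recode an exact income into one of the six ranked ranges."""
    for upper, label in RANGES:
        if income <= upper:
            return label
    return "Over $65,000"


def income_group(income):
    """Recode an exact income into the two-group split at $35,000."""
    return "Below $35,000" if income <= 35_000 else "Over $35,000"


for income in (16095.32, 35262.67, 68262.78):
    print(income, to_whole_dollars(income), income_range(income), income_group(income))
```

The recoding only works in one direction: once an income is stored as a range or a group, the exact value cannot be recovered.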

Discrete
◦ Variables that cannot be subdivided
▪ The number of persons living in a household is a discrete variable. For example, there cannot be 2.3 persons living in a household. There can be 2, or there can be 3, but not 2.3.
Continuous
◦ Can be subdivided; theoretically they can be subdivided an infinite number of times.
▪ Time, for example
▪ Days, Hrs, Mins, Secs, Nanosecs, etc.

Rates
◦ Violent crimes per 100,000 population
▪ Violent Crimes / (Population/100000) = Rate
Ratios
◦ Property crimes "per" violent crime
▪ Violent crimes = 10
▪ Property crimes = 300
▪ PC/VC (300/10) = 30
▪ For every one violent crime, there are 30 property crimes
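The rate and ratio formulas can be sketched in a few lines of Python. The crime counts come from the slide; the population of 250,000 and the function names are hypothetical, chosen only to make the rate computable.

```python
def crime_rate(crimes, population, per=100_000):
    """Crimes per `per` residents: crimes / (population / per)."""
    return crimes / (population / per)


def crime_ratio(numerator, denominator):
    """How many of the numerator crimes occur for each denominator crime."""
    return numerator / denominator


violent_crimes = 10      # from the slide
property_crimes = 300    # from the slide
population = 250_000     # hypothetical population, not from the slide

print(crime_rate(violent_crimes, population))        # violent crimes per 100,000
print(crime_ratio(property_crimes, violent_crimes))  # property crimes per violent crime
```

A rate standardizes a count by population so that areas of different size can be compared, while a ratio compares two counts directly.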

Percent Change
◦ For comparing time periods
▪ ((New - Old) / Old) * 100

2009 property crimes = 2567
2008 property crimes = 2655

Percent change =
▪ (2567 - 2655) / 2655
▪ or -0.033 * 100 = -3.3%
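As a quick check of the arithmetic, here is a minimal Python sketch of the percent change formula using the 2008 and 2009 property crime counts from the slide. The function name is an assumption.

```python
def percent_change(new, old):
    """Percent change between two periods: ((new - old) / old) * 100."""
    return (new - old) / old * 100


change = percent_change(new=2567, old=2655)
print(round(change, 1))  # -3.3
```

A negative result means the count fell between the two periods. The formula is undefined when the old value is 0.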

Measures of Central Tendency
◦ Mean or Average
▪ Average of a distribution of values
◦ Mode
▪ Most often found value in a distribution
◦ Median
▪ The middle value in a distribution

Values: 25, 55, 56, 65, 72, 82, 82, 84, 90, 97
Median = 72 + (82 - 72)/2 = 72 + 10/2 = 72 + 5 = 77
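The three measures can be computed with Python's standard `statistics` module. This is a small sketch using the ten values shown on this slide and the ten values from the bi-modal slide that follows it.

```python
import statistics

scores = [25, 55, 56, 65, 72, 82, 82, 84, 90, 97]    # values from this slide
bimodal = [25, 55, 55, 65, 72, 82, 82, 84, 90, 97]   # values from the bi-modal slide

print(statistics.mean(scores))        # 70.8
print(statistics.median(scores))      # 77.0, halfway between 72 and 82
print(statistics.mode(scores))        # 82
print(statistics.multimode(bimodal))  # [55, 82]
```

With an even number of values there is no single middle value, so the median is the midpoint of the two central values, which matches the 72 + 5 calculation on the slide.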

Bi-Modal

Values: 25, 55, 55, 65, 72, 82, 82, 84, 90, 97
Modes = 55 and 82
Median = 72 + (82 - 72)/2 = 72 + 5 = 77

Mean
◦ Should not be used when distribution is greatly "skewed"
▪ As with most crime data
◦ Use Median where it makes sense instead

[Figure: three distribution curves labeled Positive or Right Skewed, Almost normal, and Negative or Left Skewed]

Measures of Variance or Dispersion
◦ Range
▪ The distance between the lowest and highest score
◦ Interquartile range
▪ The distance between the 25th and 75th percentile
◦ Variance
▪ The average squared distance of each score in a distribution from the mean of the distribution
◦ Standard deviation
▪ The square root of the variance; roughly, the average distance of each score from the mean

Values: 25, 55, 55, 65, 72, 82, 82, 84, 90, 97
1st Quartile = 57.5
3rd Quartile = 83.5
Interquartile range = 83.5 - 57.5 = 26
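The dispersion measures can be sketched with Python's standard library, using the ten values on this slide. Quartile conventions differ between tools. The "inclusive" method is used here because it reproduces the slide's quartiles of 57.5 and 83.5, and the population formulas (`pvariance`, `pstdev`) are an assumption about which variant the slide intends.

```python
import math
import statistics

scores = [25, 55, 55, 65, 72, 82, 82, 84, 90, 97]  # values from the slide

value_range = max(scores) - min(scores)

# Quartiles by linear interpolation between ranked values.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

variance = statistics.pvariance(scores)  # average squared distance from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(value_range)   # 72
print(q1, q3, iqr)   # 57.5 83.5 26.0
print(variance, std_dev)
```

The interquartile range ignores the lowest and highest quarters of the data, so the outlying value of 25 affects the range but not the IQR.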


A sample is analyzed to "infer" information about the population
◦ Probability theory
▪ The number of times any given outcome will occur if the event is repeated many times.

Bell-Shaped or Normal Curve

Mode & Median same as Mean

Histogram
◦ Normal
▪ Average 20, Median 20, Mode 20
◦ Skewed
▪ Average 13.6, Median 10, Mode 1
▪ Average 26.20, Median 30, Mode 40

What variables are available?
What is the overall n?
What is the unit of analysis?
What do I want to know about the variable(s)?
What is the level of measurement of the variable(s)?
Are the variables discrete or continuous?
How many groups will be compared in the analysis?
Am I interested in just describing the data or finding inferences within it?

Dependent variable
◦ The variable that analysts are trying to explain
▪ (in crime mapping, the dependent variable is often some crime measure).
Independent variable
◦ Variables that produce a change in our dependent variable

Causal relationship
◦ Intervening variable
◦ Antecedent variable
◦ Contingent variable
◦ Multicollinearity
▪ When X, Y, and Z have overlapping measures of the same concept
◦ Spurious relationships
▪ When X and Y have no direct relationship but are both affected by Z

[Figure: diagram relating X, Z, and Y]

Chi-square

T-tests

Z-tests

ANOVA

◦ Essentially, they work by determining whether or not
variable distributions or differences between groups
or areas would be expected based on random
chance

Lambda

Gamma

Kendall’s tau statistics

Spearman’s rho

Pearson’s correlation coefficient

◦ To determine the strength and direction of a
relationship between two variables
◦ Values between -1 and +1
◦ Inverse/negative or positive relationships possible
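To make the -1 to +1 scale concrete, here is a minimal pure-Python sketch of Pearson's correlation coefficient. The function name and the two small data sets are hypothetical illustrations, not data from the slides.

```python
import math


def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


# Hypothetical values for illustration only.
variable_1 = [1, 2, 3, 4, 5]
rises_with_1 = [2, 4, 6, 8, 10]   # moves in the same direction
falls_with_1 = [10, 8, 6, 4, 2]   # moves in the opposite direction

print(pearson_r(variable_1, rises_with_1))  # close to +1: positive relationship
print(pearson_r(variable_1, falls_with_1))  # close to -1: inverse relationship
```

Pearson's r assumes interval or ratio data. The other statistics listed above cover nominal and ordinal variables.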

[Figure: two scatterplots with axes labeled Variable 1 and Variable 2]

Spatial Autocorrelation
◦ Moran’s I
▪ A value between 0 and +1 indicates positive spatial autocorrelation (or clustering).
▪ A value between -1 and 0 indicates negative spatial autocorrelation (dispersion).
▪ A value near 0 indicates a random distribution.
◦ Geary’s C
▪ Values under 1 signify positive spatial autocorrelation
▪ Values over 1 designate negative spatial autocorrelation
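A minimal pure-Python sketch of global Moran's I may help show how the sign arises. It assumes six areas arranged in a row with binary weights (1 for adjacent areas, 0 otherwise). The counts, the layout, and the function names are all hypothetical, and GIS software would normally build the weights from real geometry.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations for every pair of areas.
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


def row_adjacency(n):
    """Binary weights for n areas in a single row: neighbors are adjacent cells."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


weights = row_adjacency(6)
clustered = [1, 1, 1, 5, 5, 5]     # similar counts sit next to each other
alternating = [1, 5, 1, 5, 1, 5]   # high and low counts alternate

print(morans_i(clustered, weights))    # positive
print(morans_i(alternating, weights))  # negative
```

Neighbors on the same side of the mean add to the statistic and neighbors on opposite sides subtract from it, which is why clustering gives a positive value and a checkerboard pattern gives a negative one.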

Linear relationship
◦ (OLS) Ordinary least-squares
▪ Y = a + b1X1 + b2X2 + b3X3 + …
◦ Units of analysis
▪ Must be the same
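For intuition, the single-predictor case Y = a + bX can be fitted by hand. The sketch below is a simplified illustration with one independent variable rather than the multi-variable model on the slide, and its data and function name are hypothetical.

```python
def fit_simple_ols(x, y):
    """Least-squares estimates of intercept a and slope b for y = a + b * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx             # change in y for a one-unit change in x
    a = mean_y - b * mean_x   # predicted y when x is 0
    return a, b


# Hypothetical data lying exactly on the line y = 2 + 3x.
x = [1, 2, 3, 4]
y = [5, 8, 11, 14]

a, b = fit_simple_ols(x, y)
print(a, b)
print(a + b * 5)  # predicted y for x = 5
```

Each x and y pair must describe the same unit of analysis (the same beat, grid cell, or tract), which is the point of the "must be the same" bullet.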

Nominal (categories), Ordinal, Interval and Ratio (Quantities) can be used with different methods
Fills and outlines


[Figure: Nominal data example]

[Figure: Ratio data example]

Category data symbology comes next
◦ It displays data by unique values of a field, or multiple fields
◦ Nominal, ordinal, ratio or interval data

Next comes the quantities symbology method
◦ It uses a number field in the table to display data by classified values
◦ Ratio and interval data

Six different ways to classify data, with an added manual method for infinite freedom

Equal Interval

Defined Interval

Quantile

Natural Breaks

Geometrical Interval

Standard Deviation


Categorical (Qualitative)
◦ Grouping based on some quality
◦ Labels or categories
◦ E.g., Sex = Male or Female
◦ Nominal or Ordinal
▪ Nominal: the order is not important
▪ E.g., Sex = male or female
▪ Ordinal: the order is important
▪ E.g., Rank = Officer, Sergeant, Lieutenant, etc.
◦ Can be binary or non-binary
▪ Binary = only two values (male or female)
▪ Non-Binary = more than two (red, blonde, brunette, etc.)

Measurement (Quantitative)
◦ Grouping based on some quantity or value
◦ Always numbers
◦ Discrete or continuous
▪ Discrete = only certain values are possible and data could have gaps (1, 2, 3, or 4)
▪ Continuous = any value along some interval (any value between 1 and 4, e.g., 3.24211)
◦ Interval or ratio
▪ In interval data the interval between values is important (e.g., a temperature of 30 compared to 110 means something)
▪ Ratio data is the highest level, and the "0" value can be informative (e.g., a grid can have 0 crimes, or any value up to infinity)

http://www.socialresearchmethods.net/kb/index.php

Equal Interval (ratio, interval)
◦ The range between the classifications is the same
▪ Number of classes desired determines interval

Take the high value minus the low value and divide by the number of classes; here, for each of the 5 classes, the interval is 199.61
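The equal interval calculation can be sketched in Python as follows. The counts are hypothetical (the data behind the slide's 199.61 interval is not given), and the function name is an assumption.

```python
def equal_interval_breaks(values, num_classes):
    """Upper break of each class when the data range is split into equal widths."""
    low, high = min(values), max(values)
    width = (high - low) / num_classes
    return [low + width * i for i in range(1, num_classes + 1)]


# Hypothetical crime counts per area, not the data behind the slide.
counts = [0, 3, 8, 15, 22, 41, 57, 63, 88, 100]

print(equal_interval_breaks(counts, 5))  # [20.0, 40.0, 60.0, 80.0, 100.0]
```

Every class spans the same width, but the classes can hold very different numbers of areas when the data is skewed: in this example the first class holds four areas and the last class holds two.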

Defined Interval (ratio, interval)
◦ Similar to the equal interval, but here, we define what the interval will be and thus establish the classes

In this case the interval was set to 150, and so the number of classes is determined by the interval

Quantile (ratio, interval)
◦ Each class holds the same percentage of the values. Each class contains an equal number of features.

Each of the 10 classes has the same number of features, or makes up 10% of the total records
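A quantile classification can be sketched by sorting the values and cutting the sorted list into equally sized groups. The 20 values and the function name below are hypothetical. When the number of features does not divide evenly, class sizes differ by at most one.

```python
def quantile_classes(values, num_classes):
    """Split sorted values into classes holding (nearly) equal numbers of features."""
    ordered = sorted(values)
    n = len(ordered)
    classes = []
    for i in range(num_classes):
        start = i * n // num_classes
        end = (i + 1) * n // num_classes
        classes.append(ordered[start:end])
    return classes


# Hypothetical counts for 20 areas, for illustration only.
counts = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]

for members in quantile_classes(counts, 10):
    print(members)  # each class holds 2 of the 20 areas, or 10% of the records
```

Because the cut points depend only on rank, areas with identical values can land in different classes, which is a known weakness of this method on tied or clustered data.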

Natural Breaks (ratio, interval)

◦ Breaks the data where there are natural
holes between values

Use test exam score example

Geometrical Interval (ratio, interval)
◦ This is a classification scheme where the class breaks are based on class intervals that have a geometrical series. This ensures that each class range has approximately the same number of values within each class and that the change between intervals is fairly consistent.

The interval is determined by a geometric equation (large and small changes depending on breaks in data)

Standard Deviation (ratio, interval)

◦ Classes are determined by mean and
standard deviation of values. Can display
by 1, ½, ¼ standard deviations as needed
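One way to picture standard deviation classes is to count how many steps of a chosen size (1, ½, or ¼ standard deviation) a value sits above or below the mean. This is a sketch of the idea, not the GIS software's exact rule. The values, the function name, and the use of the population standard deviation are assumptions.

```python
import math
import statistics


def std_dev_class(value, mean, std_dev, step=1.0):
    """Signed class index: 0 is the first band above the mean, -1 the first below."""
    return math.floor((value - mean) / (step * std_dev))


# Hypothetical counts with a mean of 5 and a population standard deviation of 2.
counts = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(counts)
std_dev = statistics.pstdev(counts)

for value in (2, 4.5, 5.5, 8):
    print(value, std_dev_class(value, mean, std_dev), std_dev_class(value, mean, std_dev, step=0.5))
```

Smaller steps produce more, narrower classes around the mean, which is what choosing ½ or ¼ standard deviations does on the map.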

Getting to know your data, and the factors that influence crime, can help analysts create more useful maps and analysis products and do problem solving
Handling data properly will keep you from making incorrect assumptions and coming to unrealistic conclusions
Remember the wheel of science

