2. METHODOLOGY USED
• Most of the variables were categorical in nature, so they had to be aggregated into counts before the clustering analysis could be performed.
• The data was then uploaded into the database and aggregated by DISTRICT, WARD and COMMUNITY AREA
to find the count of crimes by category, after which hierarchical clustering was performed.
• Tools used:
• TERADATA
• TABLEAU
• SPOTFIRE
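The aggregation step described above can be sketched as follows. This is a minimal illustration in Python, standing in for the database aggregation; the records, district codes and the function name aggregate_counts are hypothetical and are not taken from the study's data.

```python
from collections import Counter

# Hypothetical incident records: (district, crime_category). Not real data.
records = [
    ("D01", "THEFT"),
    ("D01", "THEFT"),
    ("D01", "HOMICIDE"),
    ("D02", "THEFT"),
    ("D02", "PROSTITUTION"),
    ("D03", "HOMICIDE"),
]

def aggregate_counts(rows):
    """Count incidents per (area, category); return one count vector per area."""
    counts = Counter(rows)  # missing (area, category) pairs count as 0
    categories = sorted({cat for _, cat in rows})
    areas = sorted({area for area, _ in rows})
    matrix = {area: [counts[(area, cat)] for cat in categories] for area in areas}
    return categories, matrix

categories, matrix = aggregate_counts(records)
for area, vector in matrix.items():
    print(area, dict(zip(categories, vector)))
```

Each area ends up with one numeric vector of crime counts, which is the form a clustering step needs. The same function could be applied to WARD or COMMUNITY AREA keys instead of districts.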
4. INSIGHT: WHERE THE MALE TO FEMALE RATIO IS LOWER, MORE CRIMES
ARE COMMITTED.
MAP SHOWING MALE TO FEMALE RATIO
BUBBLE SIZE SHOWS THE TOTAL NUMBER OF CRIMES COMMITTED
5. MAP SHOWING DISTRIBUTION OF THE HOUSEHOLD INCOME
INSIGHT: WHERE HOUSEHOLD INCOME IS LOW, MORE CRIMES ARE COMMITTED
6. MAP SHOWING DISTRIBUTION OF THE PER CAPITA INCOME
INSIGHT: WHERE PER CAPITA INCOME IS HIGHER, MORE THEFTS WERE
COMMITTED
7. MAP SHOWING MALE TO FEMALE RATIO
BUBBLE SIZE SHOWS THE NUMBER OF HOMICIDES, LABELLED WITH DISTRICT NUMBERS
INSIGHT: WHERE THE MALE TO FEMALE RATIO IS LOWER, THERE ARE MORE
HOMICIDES
8. MAP SHOWING MALE TO FEMALE RATIO
BUBBLE SIZE SHOWS THE NUMBER OF PROSTITUTION CRIMES.
INSIGHT: WHERE THE MALE TO FEMALE RATIO IS LOWER, THERE IS MORE
PROSTITUTION
9. HEAT MAP – DISTRICT
INSIGHT: ON CLUSTERING THE CRIME COUNTS, WE OBSERVE THAT THE
DISTRICTS FALL INTO 3 CATEGORIES: HIGH, INTERMEDIATE AND LOW.
PARALLEL COORDINATE – DISTRICT
INSIGHT: WE SEE THAT A FEW DISTRICTS IN THE TOP TWO CLUSTERS SHOW
AN INVERSE PATTERN ON A FEW VARIABLES.
THE SAME PATTERN CAN BE OBSERVED FOR WARD AND COMMUNITY AREA
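The grouping of districts into High, Intermediate and Low can be illustrated with a small agglomerative (bottom-up hierarchical) clustering sketch. The counts below are made up, and the choice of Euclidean distance with average linkage is an assumption: the slides do not state which distance or linkage was used in the actual analysis.

```python
from itertools import combinations
from math import dist

# Hypothetical per-district crime counts, one value per crime category. Not real data.
counts = {
    "D01": [120, 30, 8],
    "D02": [115, 28, 9],
    "D03": [60, 15, 4],
    "D04": [58, 14, 3],
    "D05": [10, 2, 0],
    "D06": [12, 3, 1],
}

def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs (assumed linkage)."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in points]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], points))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return clusters

def mean_total(cluster):
    """Average total crime count of the districts in a cluster."""
    return sum(sum(counts[d]) for d in cluster) / len(cluster)

# Rank the three clusters by crime volume so they can be labelled.
ranked = sorted(agglomerate(counts, 3), key=mean_total, reverse=True)
for label, cluster in zip(["HIGH", "INTERMEDIATE", "LOW"], ranked):
    print(label, sorted(cluster))
```

Stopping the merges at three clusters corresponds to cutting the dendrogram at the level where three groups remain. With real data, the count variables would normally be scaled first so that high-volume categories do not dominate the distance.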
10. FUTURE SCOPE OF STUDY
• Application of Dummy Variables can be explored
• Association rules among crime types can be applied
• Location type based clustering can be performed
• Network analysis: to identify the closeness (distance) between two crimes
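As one possible starting point for the association-rule idea listed above, the sketch below computes the two basic rule measures, support and confidence. The "baskets" are hypothetical sets of crime types recorded together in one area and period; how such baskets would be built from the real data is left open here.

```python
# Hypothetical baskets of crime types observed together. Not real data.
baskets = [
    {"THEFT", "PROSTITUTION"},
    {"THEFT", "PROSTITUTION", "HOMICIDE"},
    {"THEFT"},
    {"HOMICIDE", "PROSTITUTION"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(A and B together) / support(A), i.e. an estimate of P(B | A)."""
    return (support(antecedent | consequent, transactions)
            / support(antecedent, transactions))

pair_support = support({"THEFT", "PROSTITUTION"}, baskets)
rule_confidence = confidence({"THEFT"}, {"PROSTITUTION"}, baskets)
print(pair_support, rule_confidence)
```

A rule such as THEFT -> PROSTITUTION would only be worth reporting if both measures clear thresholds chosen by the analyst.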