- The document describes a project to analyze Chicago Police Department data to develop a model that predicts an individual's risk level of being involved in a shooting based on various factors.
- An artificial neural network model was found to best predict the risk score, achieving an R2 of 0.9234 and mean average error of 10.583.
- K-means clustering identified 3 clusters that characterize individuals based on attributes like gender, race, age, and criminal history with drugs and weapons.
psy 303 new,argosy psy 303 new,argosy psy 303 new complete course,argosy psy 303 new entire course,psy 303 all assignments new,argosy psy 303 new module 1,argosy psy 303 new module 2,argosy psy 303 new module 3m argosy psy 303 new module 4,argosy psy 303 new module 5,argosy psy 303 new tutorials,psy 303 new assignments,psy 303 new help
This document summarizes a crime analysis project conducted by a university team. The team analyzed multiple data sets to build models predicting crime rates based on factors like population, weather, daylight hours, and economic indicators. They created binary, numeric, and crime ratio models and found the crime ratio model was most statistically significant. The team's analyses found crime rates generally increased with daylight savings time and increased slightly with higher temperatures. Their best model could predict monthly crime totals by city for most crime types except rare crimes like homicide and sexual assault.
This document summarizes a study that used a regression equation to identify at-risk youth for violence and then provided those youth with evidence-based treatments. The regression equation incorporated demographic, behavioral, and test score data from past perpetrators of violent crimes. At-risk youth were identified in several urban Midwestern high schools and received anger management training, job opportunities, and mentoring. After treatment, homicides decreased by 32%, shootings by 46%, and assaults by 77%, saving approximately 104 lives and $492 million, with a return on investment of 6.42. The study showed promise for using a predictive model along with proven interventions to reduce violence and associated costs.
This document summarizes a student's research analyzing self-reported crime data to understand the decline in youth violence. The student used data from the GREAT surveys from 1995-1999 and 2006-2010. The results showed that self-reported simple assault decreased consistently across both time periods, matching conclusions from police and victimization data. Specifically, the prevalence and frequency of reported simple assault declined, suggesting this contributed most to the overall drop in youth violence.
The goal of this project is to build a model using multiple linear regression that accurately predicts patterns between various socioeconomic factors that affect hate crime in the US before the 2016 US presidential elections and how these factors tied to the success of the president Donald Trump using MS Excel and SAS Studio.
Towards a More Holistic Approach on Online Abuse and AntisemitismIIIT Hyderabad
In our first work, we take a holistic approach towards analysing the different forms of abusive behaviours found in the web communities. We introduce three abuse detection tasks -- 1) presence of abuse, 2) severity of abuse, 3) target of abuse. Due to the absence of a rich abuse-based dataset of considerable size, labeled across all aspects -- presence, severity, and target, we provide a corpus with 7,601 posts collected from a popular alt-right social media platform Gab, each of which is manually labeled comprehensively across all such aspects. We also propose a Transformer based text classifier which outperforms the existing baselines on each of the three proposed tasks on the presented corpus. Our proposed classifier obtains an accuracy of 80% for abuse presence, 82% for abuse target detection, and 64% for abuse severity detection.
To the best of our knowledge, both of the presented works are first in the respective directions. Through our studies we aim to lay foundation for future research works to explore the area of hate speech and online abuse in a more holistic and complete manner.
This document discusses using data mining techniques like clustering to detect crime patterns from crime data. It proposes using a k-means clustering algorithm with attribute weighting to group similar crimes. Testing on real crime data from a sheriff's office, it was able to identify crime patterns that detectives could validate matched actual crime sprees. The method provides an automated way to detect patterns and help detectives solve crimes faster by focusing on clustered groups of related incidents.
psy 303 new,argosy psy 303 new,argosy psy 303 new complete course,argosy psy 303 new entire course,psy 303 all assignments new,argosy psy 303 new module 1,argosy psy 303 new module 2,argosy psy 303 new module 3m argosy psy 303 new module 4,argosy psy 303 new module 5,argosy psy 303 new tutorials,psy 303 new assignments,psy 303 new help
This document summarizes a crime analysis project conducted by a university team. The team analyzed multiple data sets to build models predicting crime rates based on factors like population, weather, daylight hours, and economic indicators. They created binary, numeric, and crime ratio models and found the crime ratio model was most statistically significant. The team's analyses found crime rates generally increased with daylight savings time and increased slightly with higher temperatures. Their best model could predict monthly crime totals by city for most crime types except rare crimes like homicide and sexual assault.
This document summarizes a study that used a regression equation to identify at-risk youth for violence and then provided those youth with evidence-based treatments. The regression equation incorporated demographic, behavioral, and test score data from past perpetrators of violent crimes. At-risk youth were identified in several urban Midwestern high schools and received anger management training, job opportunities, and mentoring. After treatment, homicides decreased by 32%, shootings by 46%, and assaults by 77%, saving approximately 104 lives and $492 million, with a return on investment of 6.42. The study showed promise for using a predictive model along with proven interventions to reduce violence and associated costs.
This document summarizes a student's research analyzing self-reported crime data to understand the decline in youth violence. The student used data from the GREAT surveys from 1995-1999 and 2006-2010. The results showed that self-reported simple assault decreased consistently across both time periods, matching conclusions from police and victimization data. Specifically, the prevalence and frequency of reported simple assault declined, suggesting this contributed most to the overall drop in youth violence.
The goal of this project is to build a model using multiple linear regression that accurately predicts patterns between various socioeconomic factors that affect hate crime in the US before the 2016 US presidential elections and how these factors tied to the success of the president Donald Trump using MS Excel and SAS Studio.
Towards a More Holistic Approach on Online Abuse and AntisemitismIIIT Hyderabad
In our first work, we take a holistic approach towards analysing the different forms of abusive behaviours found in the web communities. We introduce three abuse detection tasks -- 1) presence of abuse, 2) severity of abuse, 3) target of abuse. Due to the absence of a rich abuse-based dataset of considerable size, labeled across all aspects -- presence, severity, and target, we provide a corpus with 7,601 posts collected from a popular alt-right social media platform Gab, each of which is manually labeled comprehensively across all such aspects. We also propose a Transformer based text classifier which outperforms the existing baselines on each of the three proposed tasks on the presented corpus. Our proposed classifier obtains an accuracy of 80% for abuse presence, 82% for abuse target detection, and 64% for abuse severity detection.
To the best of our knowledge, both of the presented works are first in the respective directions. Through our studies we aim to lay foundation for future research works to explore the area of hate speech and online abuse in a more holistic and complete manner.
This document discusses using data mining techniques like clustering to detect crime patterns from crime data. It proposes using a k-means clustering algorithm with attribute weighting to group similar crimes. Testing on real crime data from a sheriff's office, it was able to identify crime patterns that detectives could validate matched actual crime sprees. The method provides an automated way to detect patterns and help detectives solve crimes faster by focusing on clustered groups of related incidents.
This paper focuses on finding spatial and temporal criminal hotspots. It analyses two different real-world crimes datasets for Denver, CO and Los Angeles, CA and provides a comparison between the two datasets through a statistical analysis supported by several graphs. Then, it clarifies how we conducted Apriori algorithm to produce interesting frequent patterns for criminal hotspots. In addition, the paper shows how we used Decision Tree classifier and Naïve Bayesian classifier in order to predict potential crime types. To further analyse crimes’ datasets, the paper introduces an analysis study by combining our findings of Denver crimes’ dataset with its demographics information in order to capture the factors that might affect the safety of neighborhoods. The results of this solution could be used to raise people’s awareness regarding the dangerous locations and to help agencies to predict future crimes in a specific location within
a particular time.
This study examines the relationship between domestic violence, college campuses, and poverty in Chicago. Crime and campus data from 2014 are analyzed spatially. Map 1 shows the distribution of different types of domestic violence crimes across the city. Map 2 indicates that campus domestic violence rates do not seem directly connected to neighborhood rates. Map 3 reveals that areas with lower median incomes have higher clustering of domestic violence crimes, suggesting income and domestic violence are inversely related. The study aims to better understand how risk factors like poverty influence domestic violence rates.
This summary provides an overview of a research study examining factors that could predict police officers' attitudes towards the Black Lives Matter movement. The study surveyed 68 police officers across two departments, collecting data on demographics, preferred news sources, and moral beliefs. Significant positive correlations were found between survey items. Hispanic officers were more likely than white officers to perceive Black Lives Matter as legitimate. The use of conservative vs. neutral news sources also predicted some differences in attitudes. The study aimed to understand how individual police officers' characteristics may influence their interactions with communities and views of the Black Lives Matter movement.
Behind the Badge: A Nationally Representative Survey of Police OfficersRenee Stepler
Presentation of finding from Pew Research Center's survey of police officers and the general public delivered at the American Association for Public Opinion Research (AAPOR) 2017 Annual Conference
This document provides an overview and introduction to a hedonic pricing study analyzing relationships between home attributes and median home sale prices in Chicago and surrounding areas using data from the Chicago Tribune Homes website. The data includes physical home characteristics, community attributes such as crime rates and education quality, and location details for over 300 communities. The study aims to quantify how attributes in these categories contribute to home prices using regression analysis.
Electronic Media And United States TerrorismThomas Riner
This document analyzes the relationship between increased availability of electronic media and domestic terrorism in the United States over the last 20 years. It finds that while electronic media access has greatly increased, the number of indicted or killed domestic terrorists has not shown a clear rising trend. Statistical analysis finds no significant difference between the number of terrorist cases in recent years compared to earlier years, despite vast growth in internet usage. While electronic tools may aid terrorist communications, the data does not support the hypothesis that increased information access has led to more domestic terror cells or activities in the U.S.
This presentation was a lightening talk (15 min) at the #WIDS2017 conference in Sarasota, FL. It explains how we address intimate partner violence by developing data-driven solutions.
Using Data Mining Techniques to Analyze Crime PatternZakaria Zubi
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
Correlates of criminal behavior among female prisonersashtinadkins
The study examined relationships between attachment, childhood experiences, sensation seeking, and criminal behavior among 348 female inmates. Measures of attachment, adverse childhood events, alcohol use, and sensation seeking were administered and correlated with criminal behaviors. Preliminary results found several scales significantly correlated with number of crimes, including risk taking behaviors, adverse childhood events, experiences of partner abuse, and insecure attachments. Further path analyses may provide a more complex model of criminality accounting for different types of crimes.
This document provides an introduction and literature review for a research project examining factors that influence perceptions of victimization and fear of crime in Sydney. The introduction discusses fear of crime as an important issue and outlines the research questions. The literature review covers previous research finding demographic variables like age, sex, and race correlate with fear of crime. Studies also link perceptions of neighborhood safety and disorder to fear. The methodology section describes the sample, variables, and statistical analysis that will be used to analyze the relationships between demographics, environment, and fear of crime. Tables 1 and 2.1-2.3 provide sample characteristics and preliminary results for research questions 1-3 on correlations between age, race, sex and indicators of fear.
IRJET- Detecting Criminal Method using Data MiningIRJET Journal
The document discusses using data mining techniques like clustering algorithms to help detect criminal methods and patterns from crime data. It proposes applying a weighted k-means clustering approach to group similar crimes together based on important attributes. This would help identify potential criminal patterns and present them to detectives in a geospatial plot, highlighting crime clusters. The results were checked against court case outcomes and some criminal patterns were confirmed. The authors conclude the method helps detectives by organizing large crime datasets but requires close collaboration and domain knowledge to effectively map real crime data for mining.
This chapter discusses the main techniques used to measure crime levels and trends, including official statistics collected by law enforcement and victimization surveys. It examines the strengths and weaknesses of each approach and how they can be compared. The chapter also looks at recent crime trends shown by different data sources and why this remains a controversial area given limitations in the various sources. It assesses how criminologists deal with competing claims from different measurement approaches.
There are three main ways to measure crime: official crime statistics collected by police, victim surveys which ask people about crimes committed against them, and self-report surveys which anonymously ask people about their own criminal behavior. Each method has advantages and disadvantages, such as official statistics only capturing reported crimes while victim surveys provide information about unreported crimes but have smaller sample sizes. Understanding these different measurement methods is important for gaining an accurate picture of crime in a society.
Cja 105 Enthusiastic Study / snaptutorial.comStephenson20
Crime rates are of great interest to policy makers as well as citizens. After all, who wants to live in an area with a high crime rate? When you calculate crime rates, you are able to compare cities, states, and countries of different sizes. For example, suppose city A has a population of 120,000 and city B has a population of 500,000. The preceding year, city A had 250 incidents of murder and city B had 400. On reading this, you may think that you want to live in city A because it was host to fewer murders. However, when you calculate the murder rates, you will see that the murder rate for city A is 208 per 100,000 citizens and that for city B is 80 per 100,000 citizens, which is much lower.
This white paper describes the analysis and models developed to predict crimes in Chicago city. Further, the models are compared and most effective and simple model is recommended with the conclusion.
Crime Data Analysis and Prediction for city of Los AngelesHeta Parekh
This document analyzes crime data from Los Angeles from 2010-2020 to identify trends, predict future crime rates, and make recommendations to law enforcement. Key findings include:
- Crime rates have generally declined over the past decade but dropped significantly in 2020 due to the pandemic.
- Robbery, burglary, and vandalism are the most common crimes.
- Areas with lower median household incomes tend to have higher crime rates.
- Females are consistently the most impacted victims of crime over the past 10 years.
- Southwest LA and other areas have been identified as "hot spots" for criminal activity.
Predictive analysis indicates crime rates will continue increasing post-lockdown in
Christian Castillo D2 Summer 2018Outline of Complete Research.docxchristinemaritza
Christian Castillo
D2 Summer 2018
Outline of Complete Research Proposal 1
RACE AND CRIME 5
1. Introduction
a. Topic introduction and context.
Racial discrimination, which is the way of targeting accused based on race prominence, could be responsible for the increase in rate of crime arrest. Subcultural theorists argue that poor people, also referred to as have-nots, normally reside in areas where the social respect is subject to violence and physical strength and this habit promote crime. More to this, race impacts who gets arrested, and some pieces of evidence show that minorities are disproportionate form crime statistics (Walker, Spohn, & DeLone, 2012). It is official that high rates of arrest, conviction, and incarceration of these minorities may be as a result of criminal justice actors. It is interesting to note that race and social stratification are related in the aspect that nonwhite form lower class and this poorer class lack the genuine ways to obtain goods and they choose to join crime.
b. Significance
Physical injury and death are grouped as homicide, and known as the biggest cause of death amongst the youths. According to the “U.S Public health” brutality is a chief health issues which is challenging the Americans. Crime is intertwined with acts of violence. Secondly, crime is associated with loss, such as vandalism, arson, and environmental destruction. Crimes also pose economic cost through expense linked with transfer of property through robbery, during crime, criminal violence brings about additional medical cost of attending to the victims. There is another form of cost: cost of protection, which includes funds used to guard dogs and surveillance systems. According to studies, race has a huge impact on crime, thus scientists and scholars have tried to uncover what triggers people with different skin color to engage in criminal acts. All these implications make it important to study the relation of race to crime with a mind of reducing the cost.
c. Research question and hypothesis.
Arguably, black people are more likely to engage in criminal activities than white people. Does this stereotype have any relevance to it? A black man in the US today has three times a higher chance of going to prison as well. There has to be a relationship that supports both statements. Comment by Microsoft Office User: State your hypothesis clearly.
d. Proposed research design
The research will use data collected by different institution to evaluate the relationship between crime and race. It will describe offending action of different races within the sample population and this will be used for descriptive purpose. Second, the explanation will predict race pattern in relation to crime. The analysis builds on existing records of crime documenting racial pattern. Comment by Microsoft Office User: How will the data be collected for analysis? Existing statistics?
e. Roadmap
The remainder of this research paper stereotypes the conce ...
Analysis of the Factors Affecting Violent Crime Rates in the USDr. Amarjeet Singh
The goal of this study is to analyze the factors affecting violent crime rates in the US. It is hypothesized that an increase in the gun ownership rate tends to increase violent crimes in the US. It is hypothesized that urban areas in the US tend to have more violent crimes than rural areas. An OLS regression model is formulated using cross-sectional data set across 50 states and the District of Columbia for the year 2019. The endogenous variable is the violent crime rates per 100,000 inhabitants across 50 states and the District of Columbia. The independent variables used in the OLS regression model are population density per square mile, unemployment rate, percentage of the population living in poverty, and gun ownership rate. The four exogenous variables that are found to be statistically significant are gun ownership, unemployment rate, population density per square mile, and percentage of population living in property. An attempt is also made to formulate strategies that would help in reducing violent crime rates in the US.
Psy 303 Effective Communication / snaptutorial.comHarrisGeorg37
This document discusses various assignments for a psychology course on crime. It includes assignments on measuring crime, analyzing theories of crime causation, examining the influence of peer pressure and media on crime, and exploring society's responses to crime. Students are asked to research crime statistics, apply psychological theories to explain specific crimes, analyze media portrayals of crime, and evaluate different concepts of justice in relation to lowering recidivism rates.
This study uses regression analysis to examine the relationship between state-level firearm death rates in the US and several independent variables representing prevailing theories about the causes of gun deaths. The analysis finds that states with weaker gun laws and higher unemployment rates have statistically significant higher firearm death rates, while personal income, mental illness rates, and income inequality were not significant predictors. This provides support for the argument that lax gun regulation and poor economic conditions contribute to higher rates of gun deaths in the US.
This paper focuses on finding spatial and temporal criminal hotspots. It analyses two different real-world crimes datasets for Denver, CO and Los Angeles, CA and provides a comparison between the two datasets through a statistical analysis supported by several graphs. Then, it clarifies how we conducted Apriori algorithm to produce interesting frequent patterns for criminal hotspots. In addition, the paper shows how we used Decision Tree classifier and Naïve Bayesian classifier in order to predict potential crime types. To further analyse crimes’ datasets, the paper introduces an analysis study by combining our findings of Denver crimes’ dataset with its demographics information in order to capture the factors that might affect the safety of neighborhoods. The results of this solution could be used to raise people’s awareness regarding the dangerous locations and to help agencies to predict future crimes in a specific location within
a particular time.
This study examines the relationship between domestic violence, college campuses, and poverty in Chicago. Crime and campus data from 2014 are analyzed spatially. Map 1 shows the distribution of different types of domestic violence crimes across the city. Map 2 indicates that campus domestic violence rates do not seem directly connected to neighborhood rates. Map 3 reveals that areas with lower median incomes have higher clustering of domestic violence crimes, suggesting income and domestic violence are inversely related. The study aims to better understand how risk factors like poverty influence domestic violence rates.
This summary provides an overview of a research study examining factors that could predict police officers' attitudes towards the Black Lives Matter movement. The study surveyed 68 police officers across two departments, collecting data on demographics, preferred news sources, and moral beliefs. Significant positive correlations were found between survey items. Hispanic officers were more likely than white officers to perceive Black Lives Matter as legitimate. The use of conservative vs. neutral news sources also predicted some differences in attitudes. The study aimed to understand how individual police officers' characteristics may influence their interactions with communities and views of the Black Lives Matter movement.
Behind the Badge: A Nationally Representative Survey of Police OfficersRenee Stepler
Presentation of finding from Pew Research Center's survey of police officers and the general public delivered at the American Association for Public Opinion Research (AAPOR) 2017 Annual Conference
This document provides an overview and introduction to a hedonic pricing study analyzing relationships between home attributes and median home sale prices in Chicago and surrounding areas using data from the Chicago Tribune Homes website. The data includes physical home characteristics, community attributes such as crime rates and education quality, and location details for over 300 communities. The study aims to quantify how attributes in these categories contribute to home prices using regression analysis.
Electronic Media And United States TerrorismThomas Riner
This document analyzes the relationship between increased availability of electronic media and domestic terrorism in the United States over the last 20 years. It finds that while electronic media access has greatly increased, the number of indicted or killed domestic terrorists has not shown a clear rising trend. Statistical analysis finds no significant difference between the number of terrorist cases in recent years compared to earlier years, despite vast growth in internet usage. While electronic tools may aid terrorist communications, the data does not support the hypothesis that increased information access has led to more domestic terror cells or activities in the U.S.
This presentation was a lightening talk (15 min) at the #WIDS2017 conference in Sarasota, FL. It explains how we address intimate partner violence by developing data-driven solutions.
Using Data Mining Techniques to Analyze Crime PatternZakaria Zubi
Our proposed model will be able to extract crime patterns by using association rule mining and clustering to classify crime records on the basis of the values of crime attributes.
Correlates of criminal behavior among female prisonersashtinadkins
The study examined relationships between attachment, childhood experiences, sensation seeking, and criminal behavior among 348 female inmates. Measures of attachment, adverse childhood events, alcohol use, and sensation seeking were administered and correlated with criminal behaviors. Preliminary results found several scales significantly correlated with number of crimes, including risk taking behaviors, adverse childhood events, experiences of partner abuse, and insecure attachments. Further path analyses may provide a more complex model of criminality accounting for different types of crimes.
This document provides an introduction and literature review for a research project examining factors that influence perceptions of victimization and fear of crime in Sydney. The introduction discusses fear of crime as an important issue and outlines the research questions. The literature review covers previous research finding demographic variables like age, sex, and race correlate with fear of crime. Studies also link perceptions of neighborhood safety and disorder to fear. The methodology section describes the sample, variables, and statistical analysis that will be used to analyze the relationships between demographics, environment, and fear of crime. Tables 1 and 2.1-2.3 provide sample characteristics and preliminary results for research questions 1-3 on correlations between age, race, sex and indicators of fear.
IRJET- Detecting Criminal Method using Data MiningIRJET Journal
The document discusses using data mining techniques like clustering algorithms to help detect criminal methods and patterns from crime data. It proposes applying a weighted k-means clustering approach to group similar crimes together based on important attributes. This would help identify potential criminal patterns and present them to detectives in a geospatial plot, highlighting crime clusters. The results were checked against court case outcomes and some criminal patterns were confirmed. The authors conclude the method helps detectives by organizing large crime datasets but requires close collaboration and domain knowledge to effectively map real crime data for mining.
This chapter discusses the main techniques used to measure crime levels and trends, including official statistics collected by law enforcement and victimization surveys. It examines the strengths and weaknesses of each approach and how they can be compared. The chapter also looks at recent crime trends shown by different data sources and why this remains a controversial area given limitations in the various sources. It assesses how criminologists deal with competing claims from different measurement approaches.
There are three main ways to measure crime: official crime statistics collected by police, victim surveys which ask people about crimes committed against them, and self-report surveys which anonymously ask people about their own criminal behavior. Each method has advantages and disadvantages, such as official statistics only capturing reported crimes while victim surveys provide information about unreported crimes but have smaller sample sizes. Understanding these different measurement methods is important for gaining an accurate picture of crime in a society.
Cja 105 Enthusiastic Study / snaptutorial.comStephenson20
Crime rates are of great interest to policy makers as well as citizens. After all, who wants to live in an area with a high crime rate? When you calculate crime rates, you are able to compare cities, states, and countries of different sizes. For example, suppose city A has a population of 120,000 and city B has a population of 500,000. The preceding year, city A had 250 incidents of murder and city B had 400. On reading this, you may think that you want to live in city A because it was host to fewer murders. However, when you calculate the murder rates, you will see that the murder rate for city A is 208 per 100,000 citizens and that for city B is 80 per 100,000 citizens, which is much lower.
This white paper describes the analysis and models developed to predict crimes in Chicago city. Further, the models are compared and most effective and simple model is recommended with the conclusion.
Crime Data Analysis and Prediction for city of Los AngelesHeta Parekh
This document analyzes crime data from Los Angeles from 2010-2020 to identify trends, predict future crime rates, and make recommendations to law enforcement. Key findings include:
- Crime rates have generally declined over the past decade but dropped significantly in 2020 due to the pandemic.
- Robbery, burglary, and vandalism are the most common crimes.
- Areas with lower median household incomes tend to have higher crime rates.
- Females are consistently the most impacted victims of crime over the past 10 years.
- Southwest LA and other areas have been identified as "hot spots" for criminal activity.
Predictive analysis indicates crime rates will continue increasing post-lockdown in
Christian Castillo D2 Summer 2018Outline of Complete Research.docxchristinemaritza
Christian Castillo
D2 Summer 2018
Outline of Complete Research Proposal 1
RACE AND CRIME 5
1. Introduction
a. Topic introduction and context.
Racial discrimination, which is the way of targeting accused based on race prominence, could be responsible for the increase in rate of crime arrest. Subcultural theorists argue that poor people, also referred to as have-nots, normally reside in areas where the social respect is subject to violence and physical strength and this habit promote crime. More to this, race impacts who gets arrested, and some pieces of evidence show that minorities are disproportionate form crime statistics (Walker, Spohn, & DeLone, 2012). It is official that high rates of arrest, conviction, and incarceration of these minorities may be as a result of criminal justice actors. It is interesting to note that race and social stratification are related in the aspect that nonwhite form lower class and this poorer class lack the genuine ways to obtain goods and they choose to join crime.
b. Significance
Physical injury and death are grouped as homicide, and known as the biggest cause of death amongst the youths. According to the “U.S Public health” brutality is a chief health issues which is challenging the Americans. Crime is intertwined with acts of violence. Secondly, crime is associated with loss, such as vandalism, arson, and environmental destruction. Crimes also pose economic cost through expense linked with transfer of property through robbery, during crime, criminal violence brings about additional medical cost of attending to the victims. There is another form of cost: cost of protection, which includes funds used to guard dogs and surveillance systems. According to studies, race has a huge impact on crime, thus scientists and scholars have tried to uncover what triggers people with different skin color to engage in criminal acts. All these implications make it important to study the relation of race to crime with a mind of reducing the cost.
c. Research question and hypothesis.
Arguably, black people are more likely to engage in criminal activities than white people. Does this stereotype have any relevance to it? A black man in the US today has three times a higher chance of going to prison as well. There has to be a relationship that supports both statements. Comment by Microsoft Office User: State your hypothesis clearly.
d. Proposed research design
The research will use data collected by different institution to evaluate the relationship between crime and race. It will describe offending action of different races within the sample population and this will be used for descriptive purpose. Second, the explanation will predict race pattern in relation to crime. The analysis builds on existing records of crime documenting racial pattern. Comment by Microsoft Office User: How will the data be collected for analysis? Existing statistics?
e. Roadmap
The remainder of this research paper stereotypes the conce ...
Analysis of the Factors Affecting Violent Crime Rates in the USDr. Amarjeet Singh
The goal of this study is to analyze the factors affecting violent crime rates in the US. It is hypothesized that an increase in the gun ownership rate tends to increase violent crimes in the US. It is hypothesized that urban areas in the US tend to have more violent crimes than rural areas. An OLS regression model is formulated using cross-sectional data set across 50 states and the District of Columbia for the year 2019. The endogenous variable is the violent crime rates per 100,000 inhabitants across 50 states and the District of Columbia. The independent variables used in the OLS regression model are population density per square mile, unemployment rate, percentage of the population living in poverty, and gun ownership rate. The four exogenous variables that are found to be statistically significant are gun ownership, unemployment rate, population density per square mile, and percentage of population living in property. An attempt is also made to formulate strategies that would help in reducing violent crime rates in the US.
Psy 303 Effective Communication / snaptutorial.comHarrisGeorg37
This document discusses various assignments for a psychology course on crime. It includes assignments on measuring crime, analyzing theories of crime causation, examining the influence of peer pressure and media on crime, and exploring society's responses to crime. Students are asked to research crime statistics, apply psychological theories to explain specific crimes, analyze media portrayals of crime, and evaluate different concepts of justice in relation to lowering recidivism rates.
This study uses regression analysis to examine the relationship between state-level firearm death rates in the US and several independent variables representing prevailing theories about the causes of gun deaths. The analysis finds that states with weaker gun laws and higher unemployment rates have statistically significant higher firearm death rates, while personal income, mental illness rates, and income inequality were not significant predictors. This provides support for the argument that lax gun regulation and poor economic conditions contribute to higher rates of gun deaths in the US.
This document provides guidance on developing the methodology section of a research proposal. It discusses including descriptions of the research type (qualitative, quantitative, mixed), population and sampling method, and data collection and analysis tools and procedures. For the research type, the population should be defined along with the sampling strategy and sample size. Common data collection methods include surveys, interviews, and experiments. It is important to explain why the chosen methods are appropriate for answering the research question. The methodology allows readers to evaluate the reliability and validity of the study.
This document analyzes gun-related crime data using big data tools like Apache Hive and Pig. It summarizes the deadliest US mass shootings from 2016 to 2015. It then outlines the tools, data specifications, and workflow used to analyze gun sales rates, gun ownership rates, and crime rates over 2014-2015. Visualizations created in Excel, Tableau and 3D maps show trends in gun crimes in different areas for those years. In conclusion, it finds higher gun crime in central LA, guns comprising 28% of total crimes in 2015, and areas with higher income reporting less gun crimes in New York. Suggestions include using financial stability to predict gun crime likelihood.
Written Response Submission FormYour Name First and last.docxodiliagilby
Written Response Submission Form
Your Name: First and last
Your E-Mail Address: Your e-mail hereInstructions
Write your responses where it reads “Enter your response here....” Write as much as needed to satisfy the requirements indicated. Each item contains the Rubric that will be used to evaluate your responses.
At the end of the template, you will list the references you used to support your responses.
Item 1
Examine the past 10 years of crime data in the Uniform Crime Report (UCR) and the National Crime Victimization Survey (NCVS). Identify three crime trends from the data (2–4 paragraphs).Your Response
Enter your response here.Rubric
013
Failing
1415
Needs Improvement
1620
Meets Expectations
Sub-Competency 1: Analyze crime and victimization data
Identify three trends in crime from the past 10 years of UCR and NCVS data.
Response is missing.
Response describes fewer than three trends, or the description of trends contains inaccuracies or is vague.
Response includes a clear explanation of three crime/victimization trends from the past decade identified from the UCR and/or NCVS databases.
Item 2
What are the strengths and weaknesses of each data source (the UCR and NCVS databases)? Consider their accuracy, coverage, applicability, collection methods, and any other characteristic you think is important to consider. Explain at least two strengths and two weaknesses of the databases (2–3 paragraphs).
Your Response
Enter your response here.Rubric
013
Failing
1415
Needs Improvement
1620
Meets Expectations
Sub-Competency 2: Assess strengths and weaknesses of criminological data sources.
Explain two strengths of the UCR and/or NCVS databases.
Response is missing.
Response is vague, lacking detail, or contains inaccuracies.
Response provides an explanation of at least two strengths of the databases. The explanation demonstrates thorough consideration of research design, collection techniques, dissemination strategies, and applicability within the field.
Explain two weaknesses of the UCR and/or NCVS databases.
Response is missing.
Response is vague, lacking detail, or contains inaccuracies.
Response provides an explanation of at least two weaknesses of the databases. The explanation demonstrates thorough consideration of research design, collection techniques, dissemination strategies, and applicability within the field.
Item 3
What criminological explanations for your identified crime trends can be derived from the UCR and/or NCVS databases? Describe at least two. Then provide an argument for other factors and variables (biological, social, structural, economic, etc.) that cause or influence your identified crime trends that are not present in the UCR/NCVS data. Reference theoretical and scholarly resources that support your criminological explanations (3–5 paragraphs).
Your Response
Enter your response here.Rubric
013
Failing
1415
Needs Improvement
1620
Meets Expectations
Sub-Competency 3: Assess correlative and causative relation ...
Assignment 2 Measures of CrimeMeasuring crime can be a difficul.docxsherni1
Assignment 2: Measures of Crime
Measuring crime can be a difficult process. By its very nature, crime is something that goes undetected. Law enforcement has developed a variety of techniques to track crime, such as police reports and victim reports. The Federal Bureau of Investigation (FBI) uses the Uniform Crime Reporting (UCR) Program for tracking crime; it reports crime in more than one way. All crime reporting and tracking systems categorize crime and have certain limitations.
Measuring crime involves tracking statistics such as demographic information and moderator variables related to the crimes. Moderator variables are any third variable in a correlation that affects the relationship between the first two variables. For example, we may find that gender is related to violent crime with a higher percentage of males engaging in violent behavior. However, a moderator variable would be age, with the highest percentage of violent offenders being below the age of 30.
Research US crime statistics using the Internet. You can also use the following:
· US Department of Justice, The Federal Bureau of Investigation. (2009). 2008 crime in the United States: About crime in the U.S. Retrieved from http://www.fbi.gov/about-us/cjis/ucr/crime-in-the-u.s/2008
Select a crime and write a report addressing the following:
· Summarize the statistics from the last two reporting years. Be sure to include demographic information such as ethnicity, race, age, gender, marital status, employment status, socioeconomic group, etc., and moderator variables related to the crime.
· Examine the reliability and validity of these statistics. Are they accurate? Why or why not? Be sure to discuss how age, gender, race, ethnicity, and socioeconomic level are related to offending and representation in the criminal justice system.
· Explain whether certain populations are overrepresented in the statistics. If so, why?
Use the textbook and peer-reviewed sources to support your arguments.
Write a 3 full-pages report in Word format. Apply APA standards to citation of sources.
Assignment 3 Grading Criteria
Maximum Points
Summarized statistics for the previous two reporting years including demographic information.
20
Explained the accuracy, reliability, and validity of the statistics.
14
Discussed how age, gender, race, ethnicity, and socioeconomic level are related to offending and representation in the criminal justice system.
14
Analyzed the statistics to determine whether certain populations are over-represented.
32
Provided well-researched, authoritative sources in support of the arguments.
12
Wrote in a clear, concise, and organized manner; demonstrated ethical scholarship in accurate representation and attribution of sources; displayed accurate spelling, grammar, and punctuation.
8
Total:
100
...
This document provides an analysis of the effectiveness of gun restrictions in reducing crime rates through a study examining data from U.S. states from 2007-2013 and comparisons to other countries. The analysis finds that while education level highly correlates with lower crime rates, gun restrictions and poverty levels have weaker correlations. It also examines the black market for guns and how this impacts the results of restriction policies. The conclusion is that while gun restrictions may lower crime to a small extent, one's environment and illegal gun access are more influential on crime rates.
This thesis examines various variable selection methods for predicting crime rates using a 1990 US crime database. The author applies stepwise selection, lasso, elastic net, principal component regression, best random subset selection, and regression trees to select predictive variables and compare model performance. Variables consistently associated with crime rates include measures of employment, urbanization, income, poverty, ethnicity, and family structure. The best random subset selection method produced the highest out-of-sample R^2 and an unbiased assessment of model fit when applied to test data, performing better than other methods like lasso and elastic net. Predicting crime is difficult but variable selection techniques help identify important predictive factors and evaluate model performance.
Research Proposal Code#EB0011820201592277528Wordcount 1100 words.docxgholly1
Research Proposal Code#EB0011820201592277528
Wordcount 1100 words /4 pages
Urgency : 12 to 18 hours
Citation : APA
****************
The purpose of this assignment is to be in the role of an amateur researcher, coming up with a research question, and making decisions regarding what they are going to measure and how they are going to do it. Please provide complete answers to the following questions.
Answers should be single spaced, typed in 12pt Times New Roman, and no more than 2 pages.
1.
What is your research question?
Is there a difference between neighborhoods
where officer-involved shootings occur
and neighborhoods
where they do not occur
in terms of their level of social disorganization (measured as 0-100)? In other words, are officer-involved shootings more likely to occur in neighborhoods that are socially disorganized?
2.
Why did you choose this research question?
Officer-involved shootings have become a highly discussed topic in the aftermath of various high-profile killings of men of color. Understanding if a difference exists in the locations where these shootings occur, and where they do not, are important considerations in the officer’s decision to shoot. Places with high levels of social disorganization may lead an officer to be more likely to shoot (for several reasons that we will be unable to control for), whereas places with lower levels may make it less likely.
3.
Describe the dependent variable and the independent variable.
Dependent Variable
– The level of social disorganization, ratio, continuous
Independent Variable
– Census tracts, nominal, two categories
1. those that have experienced an OIS event,
2. those that have not experienced an OIS event.
4.
State the appropriate statistical test and explain why:
For my analysis, an
independent sample
t-test
is appropriate because my
independent variable is categorical
and my
dependent variable is continuous,
and I am
comparing two groups within the same variable
.
5.
State the null and alternative hypotheses:
H0: There
is no difference
between census tracts level of social disorganization based on if the census tract experienced an OIS event or not.
H1: There
is a difference
between census tracts which have experienced an OIS event and those that have not.
6.
Are there any other variables that you think will be related to the outcome? Describe at least 3 and explain why they are relevant.
1) Characteristics of police officers (sex, age, race, how many, etc.): Some officers may be less likely to use lethal force and controlling for that will lead to a stronger outcome.
2) Characteristics of suspects (sex, age, race, how many, did they have a weapon, etc.): Some characteristics of suspects may make officers more likely to shoot, whether they are legal factors or extra-legal factors.
3) Levels of firearm violence per neighborhood: neighborhoods which are at increased risk for firearm violence may make police more likely to per.
This document analyzes the relationship between age and gun ownership in the U.S. using GSS survey data from 1972-2012. It finds that the proportion of people who own guns increases with age up to ages 60-70, then decreases for those 71-80. A chi-square test shows a highly significant relationship (p<0.00001), indicating age and gun ownership are dependent. However, the study notes this relationship may be due to other factors like marriage, children, or peer influence, rather than directly caused by age. More research is needed to understand the deeper reasons individuals choose to own guns.
This investigation analyzed the relationship between a country's GDP per capita and its male suicide rate per 100,000 people. Data on GDP and male suicide rates for 39 countries was collected from NationMaster.com and analyzed using statistical tests. A scatter plot, least squares regression, Pearson's correlation coefficient, and Chi-square test showed little to no correlation. The Chi-square test result supported the conclusion that male suicide rates and relative individual wealth of countries are independent factors. Limitations include potential inaccuracies in suicide rate data collected by some countries.
Taylor & Francis, Ltd. and American Statistical Association a.docxMARRY7
Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to digitize, preserve and extend
access to Journal of the American Statistical Association.
http://www.jstor.org
Taylor & Francis, Ltd.
American Statistical Association
Hierarchical Models for the Probabilities of a Survey Classification and Nonresponse: An
Example from the National Crime Survey
Author(s): Elizabeth A. Stasny
Source: Journal of the American Statistical Association, Vol. 86, No. 414 (Jun., 1991), pp. 296-
303
Published by: on behalf of the Taylor & Francis, Ltd. American Statistical Association
Stable URL: http://www.jstor.org/stable/2290561
Accessed: 19-10-2015 20:08 UTC
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at http://www.jstor.org/page/
info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact [email protected]
This content downloaded from 35.46.24.102 on Mon, 19 Oct 2015 20:08:52 UTC
All use subject to JSTOR Terms and Conditions
http://www.jstor.org
http://www.jstor.org/action/showPublisher?publisherCode=taylorfrancis
http://www.jstor.org/action/showPublisher?publisherCode=astata
http://www.jstor.org/stable/2290561
http://www.jstor.org/page/info/about/policies/terms.jsp
http://www.jstor.org/page/info/about/policies/terms.jsp
http://www.jstor.org/page/info/about/policies/terms.jsp
Hierarchical Models for the Probabilities of a Survey
Classification and Nonresponse: An Example
From the National Crime Survey
ELIZABETH A. STASNY*
A goal in many survey sampling problems is to estimate the probability that elements of the population within various small
areas or domains have some characteristic or fall in some particular survey classification. The estimation problem is typically
complicated by nonrandom nonresponse in that the probability that a unit responds to the survey may be related to the char-
acteristic of interest. This article presents a random parameter or hierarchical model approach to modeling the small-domain
probabilities of the characteristic of interest and the probabilities of nonresponse. The general model allows nonresponse prob-
abilities to depend on a unit's survey classification. A special case of the model treats nonresponse as occurring at random.
Empirical Bayes methods are used to obtain parameter estimates under the hierarchical models. The method is illustrated using
data from the National Crime Survey.
KEY WORDS: Categorical data; Empirical Bayes; Nonrandom nonresponse; Small-domain estimation.
1. INTRODUCTION
Data from large-scale sample surveys are often used to
estimate the probability that an individual falls into a par ...
This study examines factors that impact crime rates across major U.S. cities using data from 251 cities. Ordinary least squares regression is used to analyze the relationship between crime rates and variables like ethnicity, income, gender, age, population density, and political affiliation. Preliminary results found higher crime rates were associated with higher percentages of the population aged 25-34, higher population density, and higher percentages of divorced individuals. Lower crime rates were linked to more vacant housing units, higher median household incomes, and cities that lean Democratic politically. The study aims to help understand why crime rates vary significantly between cities with seemingly similar populations.
Similar to Helping Chicago Communities Identify Subjects Who Are Likely to be Involved in Shootings (20)
06-18-2024-Princeton Meetup-Introduction to MilvusTimothy Spann
06-18-2024-Princeton Meetup-Introduction to Milvus
tim.spann@zilliz.com
https://www.linkedin.com/in/timothyspann/
https://x.com/paasdev
https://github.com/tspannhw
https://github.com/milvus-io/milvus
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/142-17June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
Expand LLMs' knowledge by incorporating external data sources into LLMs and your AI applications.
Build applications with generative AI on Google CloudMárton Kodok
We will explore Vertex AI - Model Garden powered experiences, we are going to learn more about the integration of these generative AI APIs. We are going to see in action what the Gemini family of generative models are for developers to build and deploy AI-driven applications. Vertex AI includes a suite of foundation models, these are referred to as the PaLM and Gemini family of generative ai models, and they come in different versions. We are going to cover how to use via API to: - execute prompts in text and chat - cover multimodal use cases with image prompts. - finetune and distill to improve knowledge domains - run function calls with foundation models to optimize them for specific tasks. At the end of the session, developers will understand how to innovate with generative AI and develop apps using the generative ai industry trends.
We are pleased to share with you the latest VCOSA statistical report on the cotton and yarn industry for the month of March 2024.
Starting from January 2024, the full weekly and monthly reports will only be available for free to VCOSA members. To access the complete weekly report with figures, charts, and detailed analysis of the cotton fiber market in the past week, interested parties are kindly requested to contact VCOSA to subscribe to the newsletter.
PyData London 2024: Mistakes were made (Dr. Rebecca Bilbro)Rebecca Bilbro
To honor ten years of PyData London, join Dr. Rebecca Bilbro as she takes us back in time to reflect on a little over ten years working as a data scientist. One of the many renegade PhDs who joined the fledgling field of data science of the 2010's, Rebecca will share lessons learned the hard way, often from watching data science projects go sideways and learning to fix broken things. Through the lens of these canon events, she'll identify some of the anti-patterns and red flags she's learned to steer around.
Discover the cutting-edge telemetry solution implemented for Alan Wake 2 by Remedy Entertainment in collaboration with AWS. This comprehensive presentation dives into our objectives, detailing how we utilized advanced analytics to drive gameplay improvements and player engagement.
Key highlights include:
Primary Goals: Implementing gameplay and technical telemetry to capture detailed player behavior and game performance data, fostering data-driven decision-making.
Tech Stack: Leveraging AWS services such as EKS for hosting, WAF for security, Karpenter for instance optimization, S3 for data storage, and OpenTelemetry Collector for data collection. EventBridge and Lambda were used for data compression, while Glue ETL and Athena facilitated data transformation and preparation.
Data Utilization: Transforming raw data into actionable insights with technologies like Glue ETL (PySpark scripts), Glue Crawler, and Athena, culminating in detailed visualizations with Tableau.
Achievements: Successfully managing 700 million to 1 billion events per month at a cost-effective rate, with significant savings compared to commercial solutions. This approach has enabled simplified scaling and substantial improvements in game design, reducing player churn through targeted adjustments.
Community Engagement: Enhanced ability to engage with player communities by leveraging precise data insights, despite having a small community management team.
This presentation is an invaluable resource for professionals in game development, data analytics, and cloud computing, offering insights into how telemetry and analytics can revolutionize player experience and game performance optimization.
06-20-2024-AI Camp Meetup-Unstructured Data and Vector DatabasesTimothy Spann
Tech Talk: Unstructured Data and Vector Databases
Speaker: Tim Spann (Zilliz)
Abstract: In this session, I will discuss the unstructured data and the world of vector databases, we will see how they different from traditional databases. In which cases you need one and in which you probably don’t. I will also go over Similarity Search, where do you get vectors from and an example of a Vector Database Architecture. Wrapping up with an overview of Milvus.
Introduction
Unstructured data, vector databases, traditional databases, similarity search
Vectors
Where, What, How, Why Vectors? We’ll cover a Vector Database Architecture
Introducing Milvus
What drives Milvus' Emergence as the most widely adopted vector database
Hi Unstructured Data Friends!
I hope this video had all the unstructured data processing, AI and Vector Database demo you needed for now. If not, there’s a ton more linked below.
My source code is available here
https://github.com/tspannhw/
Let me know in the comments if you liked what you saw, how I can improve and what should I show next? Thanks, hope to see you soon at a Meetup in Princeton, Philadelphia, New York City or here in the Youtube Matrix.
Get Milvused!
https://milvus.io/
Read my Newsletter every week!
https://github.com/tspannhw/FLiPStackWeekly/blob/main/141-10June2024.md
For more cool Unstructured Data, AI and Vector Database videos check out the Milvus vector database videos here
https://www.youtube.com/@MilvusVectorDatabase/videos
Unstructured Data Meetups -
https://www.meetup.com/unstructured-data-meetup-new-york/
https://lu.ma/calendar/manage/cal-VNT79trvj0jS8S7
https://www.meetup.com/pro/unstructureddata/
https://zilliz.com/community/unstructured-data-meetup
https://zilliz.com/event
Twitter/X: https://x.com/milvusio https://x.com/paasdev
LinkedIn: https://www.linkedin.com/company/zilliz/ https://www.linkedin.com/in/timothyspann/
GitHub: https://github.com/milvus-io/milvus https://github.com/tspannhw
Invitation to join Discord: https://discord.com/invite/FjCMmaJng6
Blogs: https://milvusio.medium.com/ https://www.opensourcevectordb.cloud/ https://medium.com/@tspann
https://www.meetup.com/unstructured-data-meetup-new-york/events/301383476/?slug=unstructured-data-meetup-new-york&eventId=301383476
https://www.aicamp.ai/event/eventdetails/W2024062014
Helping Chicago Communities Identify Subjects Who Are Likely to be Involved in Shootings
1. Helping Chicago Communities Identify Subjects Who Are
Likely To Be Involved in Shootings
Group 6:
Josiah Argo, Brendan Sigale, Dylan Leigh, Connor McDevitt, Keenan Daly
2. MSCI:3500:0BBB Group 6
Table of Contents
Overview 2
Background Information 2
Exploratory Analysis 2
Predictors 2
Racial Biases 3
Data Cleaning 4
Data Science 5
Modeling 5
Evaluation 6
Clustering 6
Business Insights 9
Limitations and Improvements 9
Conclusion 9
References 10
Page 1
3. MSCI:3500:0BBB Group 6
Overview
Shooting rates in Chicago are among the highest nationwide. We will be using the Chicago Police
Department’s Strategic Subject List to develop a model which accurately predicts the score for a person
based on various factors in order to extrapolate the data to new people, as the data only contains listings
for people from 2012-2016. We will also generate clusters to estimate the likelihood of a person being
involved in a shooting if the user has a lack of detailed information. This data consists of almost 400,000
anonymized arrest records from the City of Chicago and a score for each person as to whether they are
likely to be involved in a shooting, either as a victim or a perpetrator. The SSL score found in the data
represents the person’s propensity to be involved in a shooting on a scale of 0 (extremely low risk) to 500
(extremely high risk). The original model only used eight features, but we plan to use more, such as their
neighborhood or TRAP (Targeted Repeat Offenders Program) status. This data can be used by community
activists to identify people in their reach who they can specifically target for their programs.
This report will detail our exploratory analysis of the data, followed by which models we chose to test,
what parameters we used in those models, and which model we chose to move forward with. Then we
will evaluate our model, including its weaknesses and possible improvements. We will then describe our
clusters, which can be used to generalize people based on factors which can be learned easily. Finally, we
will go over what insights can be gained from our model and how it can be used in the future. “Inner-city
gun violence mostly occurs within small networks of people who know each other, as allies or enemies,
and are at high risk for becoming violent. Those groups typically account for 0.5 percent or less of a city’s
population” (Bower).
Background Information
Chicago has always been known for its gun violence, but definitely has of late. “The Chicago Police
Department has recovered 7,000 guns per year that had been illegally owned or associated with a crime
between 2013 and 2016” (Kight and Sykes). BUILD is one of many community groups trying to stop the
violence and we believe our model can help them. Which it deeply needs right now because in 2017 only
17% of murders in Chicago were solved (Kight and Sykes). A 27-year-old man was killed at a house
party in South Side Chicago because of a “war of words over social media” (Bower). Our model would be
a stepping stone in the right direction for the Chicago Police Department and community organizations
because they will be able to identify who is at risk to be involved with gun violence. This June in
Chicago fifty-six people were shot in one weekend alone and more than 100 people got shot two years
ago on the Fourth of July weekend (Pascus). This is just a couple of examples of gun violence in Chicago
that we are trying to help prevent or help solve crimes.
Exploratory Analysis
Predictors
Age group was the first attribute we analyzed to determine its influence on the score. The histograms of
the factor show extremely little overlap in the distributions of each level. From this visual, we can
Page 2
4. MSCI:3500:0BBB Group 6
conclude that age will be a great predictor of score. Furthermore, there is a perfect inverse relationship
between age and score. Ages under 20 consistently yield scores closer 500 and as the age group increases,
the scores proportionally decrease. This is evidence that our predictive model will be heavily influenced
by the age of an offender.
Figure 1 Histograms of Age Group
Next, we focused on the previous weapon charge attribute. This feature holds a boolean value of “Yes”,
the offender has unlawfully been charged of possessing a weapon, or “No”, they have not been charged.
One would logically assume that if a past-convict has been charged with such a crime, that they would be
more likely to commit a violent crime in the future. The cumulative curve for this feature certainly backs
this assertion, as it reveals the “Y” group does generally have a higher score than the “N” group. In
conclusion, previous weapon charge can be labeled as a good predictor of the score.
Fig. 2 Cumulative Curve of Age
Racial Biases
In our exploratory analysis, we observed a bias large enough in scale that it warrants acknowledgement.
In the City of Chicago, Whites and African Americans use drugs at similar rates (NAACP). While the
demographic breakdown of the city is 49.14% white and 30.51% African American (World Population
Page 3
5. MSCI:3500:0BBB Group 6
Review), there are significantly more African Americans in our dataset who have been convicted of a
drug offense. Furthermore, the proportion of African Americans charged with a drug offense is
approximately five times higher than any other race group.
Fig. 3 Bar Plots of Race Group
An analysis of the drug offense histogram reveals that high scores can be influenced by a prior drug
charge. In conclusion, the bias of African Americans disproportionately being persecuted for drug
offenses may affect the accuracy of our predictive model because high scores will not be assigned from a
objectively-convicted collection of observations.
Fig. 4 Histograms of Prior Drug Offense
Data Cleaning
The data cleaning process took place in both Excel and Rattle.The original dataset had over 40 features,
of which many were deleted because they would not directly pertain to our inquiries.
Page 4
6. MSCI:3500:0BBB Group 6
Once the dataset was imported into the rattle application with a 70/15/15 partition, it held 279,078 rows
and 15 features. Eight of the features were categorical and 7 were numeric. with our target variable, SSL
score, being numeric. The numeric variables of domestic arrest count, weapon arrest count, and narcotics
arrest count all contained null values for instances where the subject had 0 (e.g. no narcotics arrests).
Therefore, null values were imputed as a 0 so that these features would be accurately represented. After
imputation, the dataset contained 0 missing values so there was no need to delete observations with
missing data.
Lastly, there was additional data transformation that took place before the ANN model. All categorical
variables were transformed into indicator variables to ensure that the dataset would be compatible with a
neural network model. In addition, numeric variables were re-scaled from 0-1 so that high-values would
not compromise the integrity of the model.
Data Science
Modeling
We used Rattle to determine the best model to predict a person’s score based on several input variables.
We tested all four regression models: Decision Tree, Random Forest, Neural Networks, and Linear
Regressions with various tuning parameters to identify the best combination of regression type and
parameters.
The results (R2
and MAE) with their tested parameters were as follows:
Fig. 5 Optimal Tuning Parameters
The Decision Tree’s ideal complexity was 0.0025. Random Forest’s ideal amount of Trees was 100.
ANN’s ideal amount of nodes was 20. Linear Regression didn’t have different parameters to test.
Page 5
7. MSCI:3500:0BBB Group 6
Evaluation
After selecting the best parameters we evaluated each model with different seeds:
Fig. 6 Model Testing
ANN was selected as the best model as it had the lowest MAE - 10.583 and highest R2 at 0.9234. An R2
of 0.9234 and a MAE of 10.583 is extremely accurate for this data, as the score can be anywhere within
0-500. A mean average error of 10 translates to a 2% difference in score, which is not indicative of a huge
shift in a subject’s likelihood to be involved in a shooting. The difference between a subject with a score
of 380 and a subject with a score of 370 is essentially negligible. However, we are not able to determine if
this is the best model achievable. Due to technical limitations, we were not able to test the model
parameters to the full extent we would have liked. For example, we were limited to only two variables
and a small amount of trees for the random forest testing, as tests with three variables spent over an hour
processing with no progress, even with only 10 trees. That being said, achieving a more accurate fits runs
the risk of overfitting, which would be even worse than having inaccurate data. For these reasons, we
were satisfied with the results of our regression models.
Clustering
We sought to generate clusters for the various subjects to learn if there were any generalizable results.
This would allow community organizations to estimate the likelihood of a member being involved in a
shooting based on facts they might know about the person, such as their arrest history, gender, race, and
involvement with weapons and drugs.
Using k-means clustering, we determined that three clusters was the optimal amount of clusters for this
data. This was done by using the cleaned data in rattle and using the iterate function to generate an elbow
plot.
Page 6
9. MSCI:3500:0BBB Group 6
TIN_IMO_AGE.CURR_40.50 0.100494 0.000020 0.421586
TIN_IMO_AGE.GROUP_50.60 0.169136 0.000000 0.093995
TIN_IMO_AGE.CURR_50.60 0.143071 0.000000 0.186580
TIN_IMO_AGE.GROUP_60.70 0.043402 0.000000 0.015757
TIN_IMO_AGE.CURR_60.70 0.065061 0.000000 0.026760
TIN_IMO_AGE.GROUP_70.80 0.005911 0.000623 0.000644
TIN_IMO_AGE.CURR_70.80 0.010318 0.000010 0.001654
TIN_IMO_AGE.GROUP_less.than.20 0.000042 0.408847 0.072487
TIN_IMO_AGE.CURR_less.than.20 0.000000 0.236792 0.035198
TIN_ICN_PAROLEE.I_Y 0.027580 0.012486 0.064657
TIN_ICN_PAROLEE.I_N 0.972420 0.987514 0.935343
TIN_TFC_ICN_TRAP.STATUS_.0.0. 0.995583 0.994558 0.985264
TIN_TFC_ICN_TRAP.STATUS_.0.3. 0.003540 0.004571 0.011660
TIN_TFC_ICN_TRAP.STATUS_.3.4. 0.000877 0.000871 0.003076
R01_YEARS_SINCE_LAST_CONTACT 0.196619 0.169575 0.172311
R01_ICN_WEAPONS.ARR.CNT 0.006833 0.008944 0.013802
R01_ICN_NARCOTICS.ARR.CNT 0.011985 0.000187 0.063653
R01_ICN_DOMESTIC.ARR.CNT 0.012463 0.007255 0.012284
Fig. 8 Feature Means of Clusters
These feature means allow us to separate the clusters into three distinctly different groups of people,
based on gender, race, age, and narcotics and weapons history.
● Cluster 1 – Some Drug Involvement
○ Most varied race, primarily male, least victimized by weapons, some drug use, 30-40
years old, some arrests
● Cluster 2 – Some Weapons Involvement
○ Less varied race, more females, some victims of and involvement with weapons, few
drugs, under 30, large population, very few arrests
● Cluster 3 – High Weapons and Drug Involvement
○ Notably more black people, overwhelmingly male, not victims of weapons but arrested
for weapons, mostly in 40-60 years old range, proportionately higher on TRAP,
significantly more narcotics arrests.
These clusters will make it easy for community organizations to determine how likely any new member is
to be involved in a shooting. An interesting factor is that the mean of the SSL score does not vary
significantly between the three clusters. This indicates the model predicts subjects on the list are
somewhat equally likely to be involved in a shooting based simply on the fact that they have been
arrested before. A list that included all residents of Chicago, regardless of whether they have been
arrested before, would naturally create new clusters which would boost the means in the score category of
the clusters above. This can be seen in microcosms within these clusters, such as the TRAP score. The
Targeted Repeat Offenders (TRAP) List has only 3118 members, representing under 1% of subjects on
the list. Because the TRAP status was processed as an indicator variable, we can easily see that subjects
Page 8
10. MSCI:3500:0BBB Group 6
in Cluster 3 have more people on the TRAP list. Due to the slight difference in the cluster mean. Were
there are new cluster of people not on the list with extremely low scores, the score means of the three
clusters above would increase significantly.
Business Insights
This model tells us what factors often lead an individual to be involved in a shooting. Community
Organizations in Chicago and specialized crime prevention units in the Chicago Police Department can
use these insights to help predict who will be involved in shootings. They could even extend that beyond
predicting shootings into starting to predict perpetrators and victims of other similar violent crimes such
as assault or battery, provided they have more data to add to the model.
Limitations and Improvements
This project has implications for communities across the board. However, we were limited by some
factors, such as processing power and the quality of the data. We would have loved to break this data
down by location. The police data had the district of their last arrest, but this data was difficult to work
with because much of the data was missing. In addition, this data is currently limited to people who
already have an arrest history. While this works if the subject has an arrest history, the model and clusters
are not generalizable to those without an arrest history. In addition, we would have liked to test additional
model parameters. While this would have run the risk of overfitting our model, we can not know without
seeing the data. Some of our models described above already took too long to run, such as any random
forest with three variables. With more processing power, we could run more accurate models and assess
them for overall fit and the risk of them overfitting our data.
Conclusion
After running our analysis, we have determined two different methods of estimating someone’s likelihood
of being involved in a shooting, which can be used by Community Organizations in Chicago and
specialized crime prevention units in the Chicago Police Department. We are confident that a neural
network with 20 nodes is the best method we were capable of testing to predict a community member’s
score given a large amount of information on that person, such as their age, race, gender, and prior
involvement with illegal activity. This allows them to model their current community members to obtain
accurate scores for these people, allowing them to more selectively target their community outreach
programs. The second method we identified was to develop clusters based on their demographics and
criminal history, allowing community organizations to make more general inferences about their
community members. However, there are limitations to this method, as we are only able to cluster people
who have an arrest history to begin with. While this helps cluster people who have an arrest history, it is
difficult to generalize a determination of who has been arrested before, making this method difficult to
implement on community members whom the organization has had little to no contact with. These tools
can be used across Chicago to make their efforts to better their community easier, reducing shootings in
Chicago and bringing communities out of violence across the board.
Page 9
11. MSCI:3500:0BBB Group 6
References
Bower, Bruce. “Can Neighborhood Outreach Reduce Inner-City Gun Violence in the U.S.?” Science
News, 5 Nov. 2019,
www.sciencenews.org/article/neighborhood-outreach-can-reduce-inner-city-gun-violence.
Chicago Tribune. “Tracking Chicago Homicide Victims.” Chicagotribune.com, Chicago Tribune, 3 Dec.
2019, www.chicagotribune.com/news/breaking/ct-chicago-homicides-data-tracker-htmlstory.html.
Chicago Tribune. “Tracking Chicago Shooting Victims.” Chicagotribune.com, Chicago Tribune, 3 Dec.
2019, www.chicagotribune.com/data/ct-shooting-victims-map-charts-htmlstory.html.
“Chicago, Illinois Population 2019.” Chicago, Illinois Population 2019 (Demographics, Maps, Graphs),
worldpopulationreview.com/us-cities/chicago-population/.
Chicago, City of. “Strategic Subject List: City of Chicago: Data Portal.” Chicago Data Portal, 1 May
2017, data.cityofchicago.org/Public-Safety/Strategic-Subject-List/4aki-r3np.
“Criminal Justice Fact Sheet.” NAACP, www.naacp.org/criminal-justice-fact-sheet/.
Kight, Stef W., and Michael Sykes. “The Deadliest City: Behind Chicago's Segregated Shooting Sprees.”
Axios,
www.axios.com/chicago-gun-violence-murder-rate-statistics-4addeeec-d8d8-4ce7-a26b-81d428c1
4836.html.
Pascus, Brian. “56 people were shot in Chicago over the weekend, including four fatally.” CBSNEWS,
www.cbsnews.com/news/chicago-gun-violence-56-shot-this-weekend-four-killed/
Page 10