
Chicago City Sexual Assault - Data Analysis

  • Writer: Paras
  • Aug 3, 2023
  • 2 min read

Updated: Aug 3, 2023



Here's a data analysis of the City of Chicago's sex offender data set, done in Python with pandas and the Matplotlib and seaborn plotting libraries.


The data set comes from the Chicago Police Department and is downloaded from the U.S. government's open data catalog at https://catalog.data.gov/dataset/sex-offenders


At the time of writing, the data set lists 3,255 sex offenders. It does not state the range of years over which these records were collected.


• The code to import libraries is:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

• The code to import the dataset is:

df=pd.read_csv('Sex_Offenders.csv')
df.head()
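
Before plotting, it can help to check which columns exist and whether any values are missing. The sketch below is one way to do that; the helper name summarize_columns and the sample rows are made up for illustration and are not part of the original analysis.

```python
import pandas as pd

# Made-up rows standing in for Sex_Offenders.csv; these are not real records.
sample = pd.DataFrame({
    "GENDER": ["MALE", "MALE", "FEMALE", "MALE"],
    "RACE": ["BLACK", "WHITE", "BLACK", None],
    "VICTIM MINOR": ["Y", "N", "Y", "Y"],
})

def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, non-null count and missing count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "missing": df.isna().sum(),
    })

summary = summarize_columns(sample)
print(summary)
```

With the real file, summarize_columns(df) would be called on the DataFrame loaded above.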

• To find the Gender of the perpetrators:


Type:

data=df['GENDER'].value_counts()
data.plot(kind='pie',autopct="%0.1f%%")

Which yields:


This tells us that 98.2% of the listed offenders are male. Female offenders account for the remaining 1.8%.
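
The percentages on a pie chart can also be read off as numbers. A minimal sketch, assuming a helper called percentage_share (a made-up name) and toy values instead of the real column:

```python
import pandas as pd

def percentage_share(column: pd.Series) -> pd.Series:
    """Percentage of rows per category, rounded to one decimal place."""
    return (column.value_counts(normalize=True) * 100).round(1)

# Made-up values for illustration; with the real data pass df['GENDER'].
toy = pd.Series(["MALE"] * 9 + ["FEMALE"], name="GENDER")
shares = percentage_share(toy)
print(shares)
```

The same helper should work for any categorical column, for example df['VICTIM MINOR'] or df['RACE'].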



• To Find if the victims are Minors:


Type:

data=df['VICTIM MINOR'].value_counts()
data.plot(kind='pie',autopct="%0.1f%%")

Which will yield the pie chart:


This tells us that for almost three-quarters of the listed offenders, the victim was a minor, i.e. under 18 years of age.



• To categorize the Sex Offenders by Race:


Type:

df['RACE'].describe()

The Output will be:

count      3255
unique        7
top       BLACK
freq       1922
Name: RACE, dtype: object

Thus we can see that:

1) All 3,255 offenders have a RACE entry (the count equals the total number of rows),

2) There are 7 unique values, such as WHITE and BLACK,

3) The most frequent value is BLACK,

4) It appears 1,922 times out of the total 3,255,

5) The RACE column has dtype object, i.e. its values are stored as text strings.
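
The 'freq' and 'count' values from the describe() output can be turned into a percentage with a little arithmetic. The variable names below are chosen for illustration only:

```python
top_count = 1922   # 'freq' from df['RACE'].describe()
total = 3255       # 'count' from the same output

# Share of the most frequent RACE value, in percent.
top_share = round(top_count / total * 100, 1)
print(f"{top_share}%")   # prints 59.0%
```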



• To Plot the Sex Offence Frequency by Race:


Type:

ax=sns.countplot(x='RACE',data=df,width=.5)
for bars in ax.containers:
    ax.bar_label(bars)

Which will yield the following plot:
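
In a count plot of a text column, the bars normally appear in the order the categories are first seen in the data. Sorting them by frequency can make the plot easier to read. The sketch below uses the order parameter of sns.countplot for that; the toy rows and the name race_order are made up for illustration.

```python
import pandas as pd
import seaborn as sns

# Made-up rows for illustration; with the real data use the df loaded above.
toy = pd.DataFrame({
    "RACE": ["WHITE", "BLACK", "BLACK", "WHITE HISPANIC", "BLACK", "WHITE"],
})

# Categories sorted from most to least frequent.
race_order = toy["RACE"].value_counts().index.tolist()

ax = sns.countplot(x="RACE", data=toy, order=race_order)
for bars in ax.containers:
    ax.bar_label(bars)                       # write the count above each bar
ax.tick_params(axis="x", labelrotation=45)   # keep long labels readable
```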



• The Distribution of the Sexual Offenders' Race:


Typing

data=df['RACE'].value_counts()
data.plot(kind='pie',autopct="%0.1f%%")

will yield the pie chart:


Thus we see that Black offenders make up the largest share of the list.


• The Inference:


From the above data analysis, we can infer that:

"The largest group of listed sex offenders in Chicago is Black males. About 3 out of 4 times, the victim is under 18 years of age."
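
This statement combines RACE and GENDER, which were summarised one column at a time. A cross-tabulation looks at the two columns together and is one way to check the combination directly. The rows below are made up for illustration, and the variable names are not from the original analysis.

```python
import pandas as pd

# Made-up rows for illustration; with the real data use the df loaded above.
toy = pd.DataFrame({
    "RACE":   ["BLACK", "BLACK", "BLACK", "WHITE", "BLACK", "WHITE"],
    "GENDER": ["MALE", "MALE", "MALE", "MALE", "FEMALE", "MALE"],
})

# Percentage of all rows falling in each RACE x GENDER combination.
joint = pd.crosstab(toy["RACE"], toy["GENDER"], normalize="all") * 100
print(joint.round(1))

# The (RACE, GENDER) pair with the most rows.
largest_group = toy.groupby(["RACE", "GENDER"]).size().idxmax()
print(largest_group)
```

With the real data, toy would be replaced by df in both calls.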

This is not a jab at any community or group of people. It is part of my data analysis portfolio.

