diff --git a/.ipynb_checkpoints/argument-checkpoint.ipynb b/.ipynb_checkpoints/argument-checkpoint.ipynb
new file mode 100644
index 0000000..20be531
--- /dev/null
+++ b/.ipynb_checkpoints/argument-checkpoint.ipynb
@@ -0,0 +1,625 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "worldwide-blood",
+ "metadata": {},
+ "source": [
+ "# Introduction"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "understanding-numbers",
+ "metadata": {},
+ "source": [
+ "*✏️ Write 2-3 sentences describing your research.*\n",
+ "\n",
+ "This project uses state-level data on the factors involved in fatal car crashes across the United States to determine which region of the country is the deadliest for drivers."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "greater-circular",
+ "metadata": {},
+ "source": [
+ "## Overarching Question: Which region of the United States is the deadliest to drive in?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "appreciated-testimony",
+ "metadata": {},
+ "source": [
+ "*✏️ Write 2-3 sentences explaining why this question.*\n",
+ "\n",
+ "I am interested in this question because I live on the Northeast Coast, where we have a lot of car \n",
+ "accidents. People drive very fast here, and the roads are not always properly paved or maintained. I want to know whether accidents come down to bad luck or to the drivers' own behavior."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "permanent-pollution",
+ "metadata": {},
+ "source": [
+ "# Data"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "technical-evans",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#Include any import statements you will need\n",
+ "import pandas as pd\n",
+ "import matplotlib.pyplot as plt"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "overhead-sigma",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
+ "\n",
+ "file_name = \"bad-drivers.csv\"\n",
+ "dataset_path = \"data/\" + file_name\n",
+ "\n",
+ "df = pd.read_csv(dataset_path)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "heated-blade",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>State</th>\n",
+ "      <th>Number of drivers involved in fatal collisions per billion miles</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents</th>\n",
+ "      <th>Car Insurance Premiums ($)</th>\n",
+ "      <th>Losses incurred by insurance companies for collisions per insured driver ($)</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>Alabama</td>\n",
+ "      <td>18.8</td>\n",
+ "      <td>39</td>\n",
+ "      <td>30</td>\n",
+ "      <td>96</td>\n",
+ "      <td>80</td>\n",
+ "      <td>784.55</td>\n",
+ "      <td>145.08</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>Alaska</td>\n",
+ "      <td>18.1</td>\n",
+ "      <td>41</td>\n",
+ "      <td>25</td>\n",
+ "      <td>90</td>\n",
+ "      <td>94</td>\n",
+ "      <td>1053.48</td>\n",
+ "      <td>133.93</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>Arizona</td>\n",
+ "      <td>18.6</td>\n",
+ "      <td>35</td>\n",
+ "      <td>28</td>\n",
+ "      <td>84</td>\n",
+ "      <td>96</td>\n",
+ "      <td>899.47</td>\n",
+ "      <td>110.35</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>3</th>\n",
+ "      <td>Arkansas</td>\n",
+ "      <td>22.4</td>\n",
+ "      <td>18</td>\n",
+ "      <td>26</td>\n",
+ "      <td>94</td>\n",
+ "      <td>95</td>\n",
+ "      <td>827.34</td>\n",
+ "      <td>142.39</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>4</th>\n",
+ "      <td>California</td>\n",
+ "      <td>12.0</td>\n",
+ "      <td>35</td>\n",
+ "      <td>28</td>\n",
+ "      <td>91</td>\n",
+ "      <td>89</td>\n",
+ "      <td>878.41</td>\n",
+ "      <td>165.63</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " State \\\n",
+ "0 Alabama \n",
+ "1 Alaska \n",
+ "2 Arizona \n",
+ "3 Arkansas \n",
+ "4 California \n",
+ "\n",
+ " Number of drivers involved in fatal collisions per billion miles \\\n",
+ "0 18.8 \n",
+ "1 18.1 \n",
+ "2 18.6 \n",
+ "3 22.4 \n",
+ "4 12.0 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding \\\n",
+ "0 39 \n",
+ "1 41 \n",
+ "2 35 \n",
+ "3 18 \n",
+ "4 35 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired \\\n",
+ "0 30 \n",
+ "1 25 \n",
+ "2 28 \n",
+ "3 26 \n",
+ "4 28 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted \\\n",
+ "0 96 \n",
+ "1 90 \n",
+ "2 84 \n",
+ "3 94 \n",
+ "4 91 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents \\\n",
+ "0 80 \n",
+ "1 94 \n",
+ "2 96 \n",
+ "3 95 \n",
+ "4 89 \n",
+ "\n",
+ " Car Insurance Premiums ($) \\\n",
+ "0 784.55 \n",
+ "1 1053.48 \n",
+ "2 899.47 \n",
+ "3 827.34 \n",
+ "4 878.41 \n",
+ "\n",
+ " Losses incurred by insurance companies for collisions per insured driver ($) \n",
+ "0 145.08 \n",
+ "1 133.93 \n",
+ "2 110.35 \n",
+ "3 142.39 \n",
+ "4 165.63 "
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "df.head()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "continental-franklin",
+ "metadata": {},
+ "source": [
+ "**Data Overview**\n",
+ "\n",
+ "*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*\n",
+ "\n",
+ "I got the dataset from FiveThirtyEight. It was used for an article called\n",
+ "\"Dear Mona, Which state has the worst drivers?\" in October 2014. The article was written by Mona Chalabi, who is a data editor at the Guardian US, \n",
+ "a columnist at New York Magazine, and a lead news writer for FiveThirtyEight.\n",
+ "\n",
+ "The data is about fatal collisions in each state. There are 51 rows (the 50 states plus the District of Columbia) and 8 columns: the state name and 7 measures. Two of the columns\n",
+ "are \"Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired\" and \"Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted\".\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ba44c9c-60d1-46a4-8257-b4e8eeea348d",
+ "metadata": {},
+ "source": [
+ "I will recategorise the data so that each state's data is grouped into one of the five regions of the United States."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "id": "f7bba5f3-5911-4a76-ad43-f6ce78cd4fb3",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>State</th>\n",
+ "      <th>Number of drivers involved in fatal collisions per billion miles</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents</th>\n",
+ "      <th>Car Insurance Premiums ($)</th>\n",
+ "      <th>Losses incurred by insurance companies for collisions per insured driver ($)</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>32</th>\n",
+ "      <td>New York</td>\n",
+ "      <td>12.3</td>\n",
+ "      <td>32</td>\n",
+ "      <td>29</td>\n",
+ "      <td>88</td>\n",
+ "      <td>80</td>\n",
+ "      <td>1234.31</td>\n",
+ "      <td>150.01</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " State \\\n",
+ "32 New York \n",
+ "\n",
+ " Number of drivers involved in fatal collisions per billion miles \\\n",
+ "32 12.3 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding \\\n",
+ "32 32 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired \\\n",
+ "32 29 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted \\\n",
+ "32 88 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents \\\n",
+ "32 80 \n",
+ "\n",
+ " Car Insurance Premiums ($) \\\n",
+ "32 1234.31 \n",
+ "\n",
+ " Losses incurred by insurance companies for collisions per insured driver ($) \n",
+ "32 150.01 "
+ ]
+ },
+ "execution_count": 40,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Start the Northeast group with New York only; the other Northeast states are not included yet\n",
+ "Northeast = df[df.State == \"New York\"]\n",
+ "Northeast"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "infinite-instrument",
+ "metadata": {},
+ "source": [
+ "# Methods and Results"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "basic-canadian",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#Import any helper files you need here"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "recognized-positive",
+ "metadata": {},
+ "source": [
+ "## First Research Question: Is drinking and driving the biggest cause of fatal collisions?\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "graduate-palmer",
+ "metadata": {},
+ "source": [
+ "### Methods"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "endless-variation",
+ "metadata": {},
+ "source": [
+ "*Explain how you will approach this research question below. Consider the following:* \n",
+ " - *Which aspects of the dataset will you use?* \n",
+ " - *How will you reorganize/store the data?* \n",
+ " - *What data science tools/functions will you use and why?* \n",
+ " \n",
+ "✏️ *Write your answer below:*\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "portuguese-japan",
+ "metadata": {},
+ "source": [
+ "### Results "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "id": "negative-highlight",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#######################################################################\n",
+ "### 💻 YOUR WORK GOES HERE TO ANSWER THE FIRST RESEARCH QUESTION 💻 \n",
+ "### \n",
+ "### Your data analysis may include a statistic and/or a data visualization\n",
+ "#######################################################################"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "id": "victorian-burning",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "collectible-puppy",
+ "metadata": {},
+ "source": [
+ "## Second Research Question: Which state is the unluckiest for fatal collisions?\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "demographic-future",
+ "metadata": {},
+ "source": [
+ "### Methods"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "incorporate-roller",
+ "metadata": {},
+ "source": [
+ "*Explain how you will approach this research question below. Consider the following:* \n",
+ " - *Which aspects of the dataset will you use?* \n",
+ " - *How will you reorganize/store the data?* \n",
+ " - *What data science tools/functions will you use and why?* \n",
+ "\n",
+ "✏️ *Write your answer below:*\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "juvenile-creation",
+ "metadata": {},
+ "source": [
+ "### Results "
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "pursuant-surrey",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "#######################################################################\n",
+ "### 💻 YOUR WORK GOES HERE TO ANSWER THE SECOND RESEARCH QUESTION 💻 \n",
+ "###\n",
+ "### Your data analysis may include a statistic and/or a data visualization\n",
+ "#######################################################################"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "located-night",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "infectious-symbol",
+ "metadata": {},
+ "source": [
+ "# Discussion"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "furnished-camping",
+ "metadata": {
+ "code_folding": []
+ },
+ "source": [
+ "## Considerations"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "bearing-stadium",
+ "metadata": {},
+ "source": [
+ "*It's important to recognize the limitations of our research.\n",
+ "Consider the following:*\n",
+ "\n",
+ "- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
+ "- *What were limitations of your dataset?*\n",
+ "- *Are there any known biases in the data?*\n",
+ "\n",
+ "✏️ *Write your answer below:*"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "beneficial-invasion",
+ "metadata": {},
+ "source": [
+ "## Summary"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "about-raise",
+ "metadata": {},
+ "source": [
+ "*Summarize what you discovered through the research. Consider the following:*\n",
+ "\n",
+ "- *What did you learn about your media consumption/digital habits?*\n",
+ "- *Did the results make sense?*\n",
+ "- *What was most surprising?*\n",
+ "- *How will this project impact you going forward?*\n",
+ "\n",
+ "✏️ *Write your answer below:*"
+ ]
+ }
+ ],
+ "metadata": {
+ "jupytext": {
+ "cell_metadata_json": true,
+ "text_representation": {
+ "extension": ".Rmd",
+ "format_name": "rmarkdown",
+ "format_version": "1.2",
+ "jupytext_version": "1.9.1"
+ }
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.13.7"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ },
+ "varInspector": {
+ "cols": {
+ "lenName": 16,
+ "lenType": 16,
+ "lenVar": 40
+ },
+ "kernels_config": {
+ "python": {
+ "delete_cmd_postfix": "",
+ "delete_cmd_prefix": "del ",
+ "library": "var_list.py",
+ "varRefreshCmd": "print(var_dic_list())"
+ },
+ "r": {
+ "delete_cmd_postfix": ") ",
+ "delete_cmd_prefix": "rm(",
+ "library": "var_list.r",
+ "varRefreshCmd": "cat(var_dic_list()) "
+ }
+ },
+ "types_to_exclude": [
+ "module",
+ "function",
+ "builtin_function_or_method",
+ "instance",
+ "_Feature"
+ ],
+ "window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/argument.ipynb b/argument.ipynb
index 4ed27b4..20be531 100644
--- a/argument.ipynb
+++ b/argument.ipynb
@@ -13,7 +13,9 @@
"id": "understanding-numbers",
"metadata": {},
"source": [
- "*✏️ Write 2-3 sentences describing your research.*"
+ "*✏️ Write 2-3 sentences describing your research.*\n",
+ "\n",
+ "This project uses state-level data on the factors involved in fatal car crashes across the United States to determine which region of the country is the deadliest for drivers."
]
},
{
@@ -21,7 +23,7 @@
"id": "greater-circular",
"metadata": {},
"source": [
- "## Overarching Question: [✏️ PUT YOUR QUESTION HERE ✏️]"
+ "## Overarching Question: Which region of the United States is the deadliest to drive in?"
]
},
{
@@ -29,7 +31,10 @@
"id": "appreciated-testimony",
"metadata": {},
"source": [
- "*✏️ Write 2-3 sentences explaining why this question.*"
+ "*✏️ Write 2-3 sentences explaining why this question.*\n",
+ "\n",
+ "I am interested in this question because I live on the Northeast Coast, where we have a lot of car \n",
+ "accidents. People drive very fast here, and the roads are not always properly paved or maintained. I want to know whether accidents come down to bad luck or to the drivers' own behavior."
]
},
{
@@ -42,7 +47,7 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 3,
"id": "technical-evans",
"metadata": {},
"outputs": [],
@@ -54,14 +59,14 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 5,
"id": "overhead-sigma",
"metadata": {},
"outputs": [],
"source": [
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
"\n",
- "file_name = \"YOUR_DATASET_FILE_NAME.csv\"\n",
+ "file_name = \"bad-drivers.csv\"\n",
"dataset_path = \"data/\" + file_name\n",
"\n",
"df = pd.read_csv(dataset_path)"
@@ -69,10 +74,164 @@
},
{
"cell_type": "code",
- "execution_count": null,
+ "execution_count": 8,
"id": "heated-blade",
"metadata": {},
- "outputs": [],
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>State</th>\n",
+ "      <th>Number of drivers involved in fatal collisions per billion miles</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents</th>\n",
+ "      <th>Car Insurance Premiums ($)</th>\n",
+ "      <th>Losses incurred by insurance companies for collisions per insured driver ($)</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>Alabama</td>\n",
+ "      <td>18.8</td>\n",
+ "      <td>39</td>\n",
+ "      <td>30</td>\n",
+ "      <td>96</td>\n",
+ "      <td>80</td>\n",
+ "      <td>784.55</td>\n",
+ "      <td>145.08</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>Alaska</td>\n",
+ "      <td>18.1</td>\n",
+ "      <td>41</td>\n",
+ "      <td>25</td>\n",
+ "      <td>90</td>\n",
+ "      <td>94</td>\n",
+ "      <td>1053.48</td>\n",
+ "      <td>133.93</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>Arizona</td>\n",
+ "      <td>18.6</td>\n",
+ "      <td>35</td>\n",
+ "      <td>28</td>\n",
+ "      <td>84</td>\n",
+ "      <td>96</td>\n",
+ "      <td>899.47</td>\n",
+ "      <td>110.35</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>3</th>\n",
+ "      <td>Arkansas</td>\n",
+ "      <td>22.4</td>\n",
+ "      <td>18</td>\n",
+ "      <td>26</td>\n",
+ "      <td>94</td>\n",
+ "      <td>95</td>\n",
+ "      <td>827.34</td>\n",
+ "      <td>142.39</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>4</th>\n",
+ "      <td>California</td>\n",
+ "      <td>12.0</td>\n",
+ "      <td>35</td>\n",
+ "      <td>28</td>\n",
+ "      <td>91</td>\n",
+ "      <td>89</td>\n",
+ "      <td>878.41</td>\n",
+ "      <td>165.63</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " State \\\n",
+ "0 Alabama \n",
+ "1 Alaska \n",
+ "2 Arizona \n",
+ "3 Arkansas \n",
+ "4 California \n",
+ "\n",
+ " Number of drivers involved in fatal collisions per billion miles \\\n",
+ "0 18.8 \n",
+ "1 18.1 \n",
+ "2 18.6 \n",
+ "3 22.4 \n",
+ "4 12.0 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding \\\n",
+ "0 39 \n",
+ "1 41 \n",
+ "2 35 \n",
+ "3 18 \n",
+ "4 35 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired \\\n",
+ "0 30 \n",
+ "1 25 \n",
+ "2 28 \n",
+ "3 26 \n",
+ "4 28 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted \\\n",
+ "0 96 \n",
+ "1 90 \n",
+ "2 84 \n",
+ "3 94 \n",
+ "4 91 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents \\\n",
+ "0 80 \n",
+ "1 94 \n",
+ "2 96 \n",
+ "3 95 \n",
+ "4 89 \n",
+ "\n",
+ " Car Insurance Premiums ($) \\\n",
+ "0 784.55 \n",
+ "1 1053.48 \n",
+ "2 899.47 \n",
+ "3 827.34 \n",
+ "4 878.41 \n",
+ "\n",
+ " Losses incurred by insurance companies for collisions per insured driver ($) \n",
+ "0 145.08 \n",
+ "1 133.93 \n",
+ "2 110.35 \n",
+ "3 142.39 \n",
+ "4 165.63 "
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
"source": [
"df.head()"
]
@@ -84,7 +243,113 @@
"source": [
"**Data Overview**\n",
"\n",
- "*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*"
+ "*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*\n",
+ "\n",
+ "I got the dataset from FiveThirtyEight. It was used for an article called\n",
+ "\"Dear Mona, Which state has the worst drivers?\" in October 2014. The article was written by Mona Chalabi, who is a data editor at the Guardian US, \n",
+ "a columnist at New York Magazine, and a lead news writer for FiveThirtyEight.\n",
+ "\n",
+ "The data is about fatal collisions in each state. There are 51 rows (the 50 states plus the District of Columbia) and 8 columns: the state name and 7 measures. Two of the columns\n",
+ "are \"Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired\" and \"Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted\".\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6ba44c9c-60d1-46a4-8257-b4e8eeea348d",
+ "metadata": {},
+ "source": [
+ "I will recategorise the data so that each state's data is grouped into one of the five regions of the United States."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "id": "f7bba5f3-5911-4a76-ad43-f6ce78cd4fb3",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>State</th>\n",
+ "      <th>Number of drivers involved in fatal collisions per billion miles</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted</th>\n",
+ "      <th>Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents</th>\n",
+ "      <th>Car Insurance Premiums ($)</th>\n",
+ "      <th>Losses incurred by insurance companies for collisions per insured driver ($)</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>32</th>\n",
+ "      <td>New York</td>\n",
+ "      <td>12.3</td>\n",
+ "      <td>32</td>\n",
+ "      <td>29</td>\n",
+ "      <td>88</td>\n",
+ "      <td>80</td>\n",
+ "      <td>1234.31</td>\n",
+ "      <td>150.01</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " State \\\n",
+ "32 New York \n",
+ "\n",
+ " Number of drivers involved in fatal collisions per billion miles \\\n",
+ "32 12.3 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding \\\n",
+ "32 32 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired \\\n",
+ "32 29 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted \\\n",
+ "32 88 \n",
+ "\n",
+ " Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents \\\n",
+ "32 80 \n",
+ "\n",
+ " Car Insurance Premiums ($) \\\n",
+ "32 1234.31 \n",
+ "\n",
+ " Losses incurred by insurance companies for collisions per insured driver ($) \n",
+ "32 150.01 "
+ ]
+ },
+ "execution_count": 40,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Start the Northeast group with New York only; the other Northeast states are not included yet\n",
+ "Northeast = df[df.State == \"New York\"]\n",
+ "Northeast"
]
},
{
@@ -110,7 +375,7 @@
"id": "recognized-positive",
"metadata": {},
"source": [
- "## First Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
+ "## First Research Question: Is drinking and driving the biggest cause of fatal collisions?\n"
]
},
{
@@ -172,7 +437,7 @@
"id": "collectible-puppy",
"metadata": {},
"source": [
- "## Second Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
+ "## Second Research Question: Which state is the unluckiest for fatal collisions?\n"
]
},
{
@@ -310,7 +575,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.9.7"
+ "version": "3.13.7"
},
"toc": {
"base_numbering": 1,
diff --git a/data/bad-drivers.csv b/data/bad-drivers.csv
new file mode 100644
index 0000000..d90f8ec
--- /dev/null
+++ b/data/bad-drivers.csv
@@ -0,0 +1,52 @@
+State,Number of drivers involved in fatal collisions per billion miles,Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding,Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired,Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted,Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents,Car Insurance Premiums ($),Losses incurred by insurance companies for collisions per insured driver ($)
+Alabama,18.8,39,30,96,80,784.55,145.08
+Alaska,18.1,41,25,90,94,1053.48,133.93
+Arizona,18.6,35,28,84,96,899.47,110.35
+Arkansas,22.4,18,26,94,95,827.34,142.39
+California,12,35,28,91,89,878.41,165.63
+Colorado,13.6,37,28,79,95,835.5,139.91
+Connecticut,10.8,46,36,87,82,1068.73,167.02
+Delaware,16.2,38,30,87,99,1137.87,151.48
+District of Columbia,5.9,34,27,100,100,1273.89,136.05
+Florida,17.9,21,29,92,94,1160.13,144.18
+Georgia,15.6,19,25,95,93,913.15,142.8
+Hawaii,17.5,54,41,82,87,861.18,120.92
+Idaho,15.3,36,29,85,98,641.96,82.75
+Illinois,12.8,36,34,94,96,803.11,139.15
+Indiana,14.5,25,29,95,95,710.46,108.92
+Iowa,15.7,17,25,97,87,649.06,114.47
+Kansas,17.8,27,24,77,85,780.45,133.8
+Kentucky,21.4,19,23,78,76,872.51,137.13
+Louisiana,20.5,35,33,73,98,1281.55,194.78
+Maine,15.1,38,30,87,84,661.88,96.57
+Maryland,12.5,34,32,71,99,1048.78,192.7
+Massachusetts,8.2,23,35,87,80,1011.14,135.63
+Michigan,14.1,24,28,95,77,1110.61,152.26
+Minnesota,9.6,23,29,88,88,777.18,133.35
+Mississippi,17.6,15,31,10,100,896.07,155.77
+Missouri,16.1,43,34,92,84,790.32,144.45
+Montana,21.4,39,44,84,85,816.21,85.15
+Nebraska,14.9,13,35,93,90,732.28,114.82
+Nevada,14.7,37,32,95,99,1029.87,138.71
+New Hampshire,11.6,35,30,87,83,746.54,120.21
+New Jersey,11.2,16,28,86,78,1301.52,159.85
+New Mexico,18.4,19,27,67,98,869.85,120.75
+New York,12.3,32,29,88,80,1234.31,150.01
+North Carolina,16.8,39,31,94,81,708.24,127.82
+North Dakota,23.9,23,42,99,86,688.75,109.72
+Ohio,14.1,28,34,99,82,697.73,133.52
+Oklahoma,19.9,32,29,92,94,881.51,178.86
+Oregon,12.8,33,26,67,90,804.71,104.61
+Pennsylvania,18.2,50,31,96,88,905.99,153.86
+Rhode Island,11.1,34,38,92,79,1148.99,148.58
+South Carolina,23.9,38,41,96,81,858.97,116.29
+South Dakota,19.4,31,33,98,86,669.31,96.87
+Tennessee,19.5,21,29,82,81,767.91,155.57
+Texas,19.4,40,38,91,87,1004.75,156.83
+Utah,11.3,43,16,88,96,809.38,109.48
+Vermont,13.6,30,30,96,95,716.2,109.61
+Virginia,12.7,19,27,87,88,768.95,153.72
+Washington,10.6,42,33,82,86,890.03,111.62
+West Virginia,23.8,34,28,97,87,992.61,152.56
+Wisconsin,13.8,36,33,39,84,670.31,106.62
+Wyoming,17.4,42,32,81,90,791.14,122.04
\ No newline at end of file
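
The notebook says the states will be grouped into five regions, but so far the `Northeast` frame holds only New York. One way the grouping could be done is sketched below. The five-region scheme (Northeast, Southeast, Midwest, Southwest, West), the partial `STATE_TO_REGION` mapping, and the helper names `add_region` and `mean_rate_by_region` are illustrative assumptions, not part of the notebook. The ten sample rows are copied from `data/bad-drivers.csv` so the sketch runs without the file.

```python
import pandas as pd

# Column name exactly as it appears in data/bad-drivers.csv.
RATE = "Number of drivers involved in fatal collisions per billion miles"

# Ten rows copied from the CSV so the example is self-contained.
sample = pd.DataFrame(
    {
        "State": ["New York", "Massachusetts", "Alabama", "Florida", "Ohio",
                  "Iowa", "Texas", "Arizona", "California", "Oregon"],
        RATE: [12.3, 8.2, 18.8, 17.9, 14.1, 15.7, 19.4, 18.6, 12.0, 12.8],
    }
)

# ASSUMPTION: a partial, hypothetical state-to-region mapping.
# The full notebook would need an entry for every state in the dataset.
STATE_TO_REGION = {
    "New York": "Northeast",
    "Massachusetts": "Northeast",
    "Alabama": "Southeast",
    "Florida": "Southeast",
    "Ohio": "Midwest",
    "Iowa": "Midwest",
    "Texas": "Southwest",
    "Arizona": "Southwest",
    "California": "West",
    "Oregon": "West",
}


def add_region(frame, mapping):
    """Return a copy of frame with a 'Region' column looked up from 'State'."""
    labeled = frame.copy()
    labeled["Region"] = labeled["State"].map(mapping)  # unmapped states become NaN
    return labeled


def mean_rate_by_region(frame):
    """Average the per-state fatal collision rate within each region, highest first."""
    return (
        frame.dropna(subset=["Region"])  # skip states that have no region yet
        .groupby("Region")[RATE]
        .mean()
        .sort_values(ascending=False)
    )


regional = mean_rate_by_region(add_region(sample, STATE_TO_REGION))
print(regional)
```

In the notebook, `sample` would be replaced by the `df` loaded from the CSV. This is an unweighted mean of state rates, so a small state counts as much as a large one; it is not the same as a regional rate computed from total drivers and total miles, which this dataset does not provide.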