generated from mwc/project_argument
842 lines
27 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "worldwide-blood",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Introduction"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "understanding-numbers",
|
|
"metadata": {},
|
|
"source": [
|
|
"*✏️ Write 2-3 sentences describing your research.*\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "appreciated-testimony",
|
|
"metadata": {},
|
|
"source": [
|
|
"*✏️ Write 2-3 sentences explaining why this question.*\n",
|
|
"\n",
|
|
"Birth rates in the US generate a large amount of data, particularly when broken down by additional variables such as state and date. \n",
|
|
"Tracking birth rates across the country can open the door to many other lines of questioning, including whether certain states are \n",
|
|
"creating environments that are more or less conducive to having children. It could also be combined with other data sets to explain whether these births are in\n",
|
|
"families with multiple children or if they are single-child households.\n",
|
|
" "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "permanent-pollution",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Data"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ed4fcba0",
|
|
"metadata": {},
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"id": "technical-evans",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "ModuleNotFoundError",
|
|
"evalue": "No module named 'pandas'",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[13], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m#Include any import statements you will need\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mpandas\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mpd\u001b[39;00m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mmatplotlib\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mpyplot\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mplt\u001b[39;00m\n",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pandas'"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"#Include any import statements you will need\n",
|
|
"import pandas as pd\n",
|
|
"import matplotlib.pyplot as plt\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"id": "overhead-sigma",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "SyntaxError",
|
|
"evalue": "invalid decimal literal (1694504035.py, line 3)",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;36m Cell \u001b[0;32mIn[16], line 3\u001b[0;36m\u001b[0m\n\u001b[0;31m file_name = births/US_births_1994-2003_CDC_NCHS.csv\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid decimal literal\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
|
|
"\n",
|
|
"file_name = \"births/US_births_1994-2003_CDC_NCHS.csv\"\n",
|
|
"dataset_path = \"data/\" + file_name\n",
|
|
"\n",
|
|
"df = pd.read_csv(dataset_path) \n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"id": "heated-blade",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'df' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead()\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"df.head()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "continental-franklin",
|
|
"metadata": {},
|
|
"source": [
|
|
"**Data Overview**\n",
|
|
"\n",
|
|
"*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "infinite-instrument",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Methods and Results"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"id": "basic-canadian",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'df' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"mean_birth_rate = df['birth_rate'].mean()\n",
|
|
"\n",
|
|
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
|
"\n",
|
|
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 18,
|
|
"id": "49c5bade",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "ModuleNotFoundError",
|
|
"evalue": "No module named 'birth_rate'",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[18], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mbirth_rate\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01msns\u001b[39;00m\n\u001b[1;32m 2\u001b[0m sns\u001b[38;5;241m.\u001b[39mset_theme()\n",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'birth_rate'"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import seaborn as sns\n",
|
|
"sns.set_theme()\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"id": "7a453fdc",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'sns' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[19], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43msns\u001b[49m\u001b[38;5;241m.\u001b[39mhistplot(data\u001b[38;5;241m=\u001b[39mbirth_rate, x\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'sns' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"sns.histplot(data=df, x=\"state\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "29debf8d",
|
|
"metadata": {},
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a487cffe",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Introduction"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "160f0c6d",
|
|
"metadata": {},
|
|
"source": [
|
|
"*✏️ Write 2-3 sentences describing your research.*\n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "20c53259",
|
|
"metadata": {},
|
|
"source": [
|
|
"*✏️ Write 2-3 sentences explaining why this question.*\n",
|
|
"\n",
|
|
"Birth rates in the US generate a large amount of data, particularly when broken down by additional variables such as state and date. \n",
|
|
"Tracking birth rates across the country can open the door to many other lines of questioning, including whether certain states are \n",
|
|
"creating environments that are more or less conducive to having children. It could also be combined with other data sets to explain whether these births are in\n",
|
|
"families with multiple children or if they are single-child households.\n",
|
|
" "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "271ec8fd",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Data"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "46bfa281",
|
|
"metadata": {},
|
|
"source": []
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "7ea35d71",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "ModuleNotFoundError",
|
|
"evalue": "No module named 'pandas'",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)\n",
|
|
"Cell \u001b[0;32mIn[13], line 2\u001b[0m\n",
|
|
"\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m#Include any import statements you will need\u001b[39;00m\n",
|
|
"\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mpandas\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mpd\u001b[39;00m\n",
|
|
"\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mmatplotlib\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mpyplot\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mplt\u001b[39;00m\n",
|
|
"\n",
|
|
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pandas'"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"#Include any import statements you will need\n",
|
|
"import pandas as pd\n",
|
|
"import matplotlib.pyplot as plt\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "84eb129b",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "SyntaxError",
|
|
"evalue": "invalid decimal literal (1694504035.py, line 3)",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;36m Cell \u001b[0;32mIn[16], line 3\u001b[0;36m\u001b[0m\n",
|
|
"\u001b[0;31m file_name = births/US_births_1994-2003_CDC_NCHS.csv\u001b[0m\n",
|
|
"\u001b[0m ^\u001b[0m\n",
|
|
"\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid decimal literal\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
|
|
"\n",
|
|
"file_name = \"births/US_births_1994-2003_CDC_NCHS.csv\"\n",
|
|
"dataset_path = \"data/\" + file_name\n",
|
|
"\n",
|
|
"df = pd.read_csv(dataset_path) \n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "259a51aa",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'df' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\n",
|
|
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n",
|
|
"\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead()\n",
|
|
"\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"df.head()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ab0f92cf",
|
|
"metadata": {},
|
|
"source": [
|
|
"**Data Overview**\n",
|
|
"\n",
|
|
"*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "0012c102",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Methods and Results"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "0444a28d",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'df' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\n",
|
|
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n",
|
|
"\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n",
|
|
"\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n",
|
|
"\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
|
"\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"mean_birth_rate = df['birth_rate'].mean()\n",
|
|
"\n",
|
|
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
|
"\n",
|
|
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "c3350546",
|
|
"metadata": {},
|
|
"source": [
|
|
"## First Research Question: What is the mean birth rate in the United States over time? \n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "5bc48dd0",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Methods"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "53b0e329",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Explain how you will approach this research question below. Consider the following:* \n",
|
|
" - *Which aspects of the dataset will you use?* \n",
|
|
" \n",
|
|
" - *How will you reorganize/store the data?* \n",
|
|
"\n",
|
|
" - *What data science tools/functions will you use and why?* \n",
|
|
"\n",
|
|
" \n",
|
|
"✏️ *Write your answer below:*\n",
|
|
" I will use the totals of the data set. When analyzing data, it seems that the first place to start would be to calculate the overall averages that the data provides. Once one has these averages, then they can distribute them and analyze them further with more detailed research questions. \n",
|
|
"   I will reorganize the data into graphs that break down the data into visual representations. Once these are created, the steps for further analysis can be better visualized.  \n",
|
|
"   I will start by using pd.read_csv() to read the file with the data and analyze it. Then I'll use np.mean() to calculate the mean of the data set and plt.bar() to create bar graphs that reflect the data and lend themselves to further analysis."
|
|
]
|
|
},
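{
"cell_type": "markdown",
"id": "methods-sketch",
"metadata": {},
"source": [
"The approach described above could be sketched roughly as follows. This is only an illustration: it assumes the CSV loads from the same path used in the Data section and that it contains the `year` and `birth_rate` columns referenced by the other cells in this notebook, neither of which has been verified against the file.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# Same path the Data section builds from 'data/' + file_name.\n",
"df = pd.read_csv('data/births/US_births_1994-2003_CDC_NCHS.csv')\n",
"\n",
"# Assumed column names: 'year' and 'birth_rate'.\n",
"yearly = df.groupby('year')['birth_rate'].mean().reset_index()\n",
"\n",
"# One bar per year, showing that year's mean value.\n",
"plt.bar(yearly['year'], yearly['birth_rate'])\n",
"plt.xlabel('Year')\n",
"plt.ylabel('Mean birth_rate')\n",
"plt.title('Mean birth_rate per year')\n",
"plt.show()\n",
"```\n",
"\n",
"Grouping by year before plotting keeps the chart to one bar per year, which makes changes over time easier to compare than a single overall mean."
]
},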
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "b225bb8c",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Results "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 21,
|
|
"id": "73c194a2",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'df' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[21], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"mean_birth_rate = df['birth_rate'].mean()\n",
|
|
"\n",
|
|
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
|
"\n",
|
|
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f4958d53",
|
|
"metadata": {},
|
|
"source": [
|
|
"The mean birth rate in the US from 1994 to 2003 is 6026.24. "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 22,
|
|
"id": "84813d3e",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "72b520b7",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Second Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1c9b2c9c",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Methods"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "0c92fb7c",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Explain how you will approach this research question below. Consider the following:* \n",
|
|
" - *Which aspects of the dataset will you use?* \n",
|
|
" - *How will you reorganize/store the data?* \n",
|
|
" - *What data science tools/functions will you use and why?* \n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "00337aaa",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Results "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "ad3ad12f",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"#######################################################################\n",
|
|
"### 💻 YOUR WORK GOES HERE TO ANSWER THE SECOND RESEARCH QUESTION 💻 \n",
|
|
"###\n",
|
|
"### Your data analysis may include a statistic and/or a data visualization\n",
|
|
"#######################################################################"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "1e994e16",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ca9ba8c2",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Discussion"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "519421fe",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Considerations"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "96702591",
|
|
"metadata": {},
|
|
"source": [
|
|
"*It's important to recognize the limitations of our research.\n",
|
|
"Consider the following:*\n",
|
|
"\n",
|
|
"- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
|
|
"- *What were limitations of your dataset?*\n",
|
|
"The dataset does not include the circumstances of the births themselves. If one were attempting to analyze the population based on these birth rates, data covering the period after the births would need to be included. The viability of the births or the infants would need to be a factor as well. \n",
|
|
"- *Are there any known biases in the data?*\n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "43719761",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Summary"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "7840a317",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Summarize what you discovered through the research. Consider the following:*\n",
|
|
"\n",
|
|
"- *What did you learn about your media consumption/digital habits?*\n",
|
|
"- *Did the results make sense?*\n",
|
|
"- *What was most surprising?*\n",
|
|
"- *How will this project impact you going forward?*\n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "recognized-positive",
|
|
"metadata": {},
|
|
"source": [
|
|
"## First Research Question: What is the mean birth rate in the United States over time? \n",
|
|
"\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "graduate-palmer",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Methods"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "endless-variation",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Explain how you will approach this research question below. Consider the following:* \n",
|
|
" - *Which aspects of the dataset will you use?* \n",
|
|
" \n",
|
|
" - *How will you reorganize/store the data?* \n",
|
|
"\n",
|
|
" - *What data science tools/functions will you use and why?* \n",
|
|
"\n",
|
|
" \n",
|
|
"✏️ *Write your answer below:*\n",
|
|
" I will use the totals of the data set. When analyzing data, it seems that the first place to start would be to calculate the overall averages that the data provides. Once one has these averages, then they can distribute them and analyze them further with more detailed research questions. \n",
|
|
"   I will reorganize the data into graphs that break down the data into visual representations. Once these are created, the steps for further analysis can be better visualized.  \n",
|
|
"   I will start by using pd.read_csv() to read the file with the data and analyze it. Then I'll use np.mean() to calculate the mean of the data set and plt.bar() to create bar graphs that reflect the data and lend themselves to further analysis."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "portuguese-japan",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Results "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"id": "negative-highlight",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"#######################################################################\n",
|
|
"### 💻 YOUR WORK GOES HERE TO ANSWER THE FIRST RESEARCH QUESTION 💻 \n",
|
|
"### \n",
|
|
"### Your data analysis may include a statistic and/or a data visualization\n",
|
|
"#######################################################################"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"id": "victorian-burning",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "collectible-puppy",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Second Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "demographic-future",
|
|
"metadata": {},
|
|
"source": [
|
|
"### Methods"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "incorporate-roller",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Explain how you will approach this research question below. Consider the following:* \n",
|
|
" - *Which aspects of the dataset will you use?* \n",
|
|
" - *How will you reorganize/store the data?* \n",
|
|
" - *What data science tools/functions will you use and why?* \n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"id": "pursuant-surrey",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"#######################################################################\n",
|
|
"### 💻 YOUR WORK GOES HERE TO ANSWER THE SECOND RESEARCH QUESTION 💻 \n",
|
|
"###\n",
|
|
"### Your data analysis may include a statistic and/or a data visualization\n",
|
|
"#######################################################################"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"id": "located-night",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "infectious-symbol",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Discussion"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "furnished-camping",
|
|
"metadata": {
|
|
"code_folding": []
|
|
},
|
|
"source": [
|
|
"## Considerations"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "bearing-stadium",
|
|
"metadata": {},
|
|
"source": [
|
|
"*It's important to recognize the limitations of our research.\n",
|
|
"Consider the following:*\n",
|
|
"\n",
|
|
"- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
|
|
"- *What were limitations of your dataset?*\n",
|
|
"- *Are there any known biases in the data?*\n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "beneficial-invasion",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Summary"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "about-raise",
|
|
"metadata": {},
|
|
"source": [
|
|
"*Summarize what you discovered through the research. Consider the following:*\n",
|
|
"\n",
|
|
"- *What did you learn about your media consumption/digital habits?*\n",
|
|
"- *Did the results make sense?*\n",
|
|
"- *What was most surprising?*\n",
|
|
"- *How will this project impact you going forward?*\n",
|
|
"\n",
|
|
"✏️ *Write your answer below:*"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"jupytext": {
|
|
"cell_metadata_json": true,
|
|
"text_representation": {
|
|
"extension": ".Rmd",
|
|
"format_name": "rmarkdown",
|
|
"format_version": "1.2",
|
|
"jupytext_version": "1.9.1"
|
|
}
|
|
},
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.12.6"
|
|
},
|
|
"toc": {
|
|
"base_numbering": 1,
|
|
"nav_menu": {},
|
|
"number_sections": false,
|
|
"sideBar": true,
|
|
"skip_h1_title": false,
|
|
"title_cell": "Table of Contents",
|
|
"title_sidebar": "Contents",
|
|
"toc_cell": false,
|
|
"toc_position": {},
|
|
"toc_section_display": true,
|
|
"toc_window_display": false
|
|
},
|
|
"varInspector": {
|
|
"cols": {
|
|
"lenName": 16,
|
|
"lenType": 16,
|
|
"lenVar": 40
|
|
},
|
|
"kernels_config": {
|
|
"python": {
|
|
"delete_cmd_postfix": "",
|
|
"delete_cmd_prefix": "del ",
|
|
"library": "var_list.py",
|
|
"varRefreshCmd": "print(var_dic_list())"
|
|
},
|
|
"r": {
|
|
"delete_cmd_postfix": ") ",
|
|
"delete_cmd_prefix": "rm(",
|
|
"library": "var_list.r",
|
|
"varRefreshCmd": "cat(var_dic_list()) "
|
|
}
|
|
},
|
|
"types_to_exclude": [
|
|
"module",
|
|
"function",
|
|
"builtin_function_or_method",
|
|
"instance",
|
|
"_Feature"
|
|
],
|
|
"window_display": false
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|