generated from mwc/project_argument
My successes in this unit may seem small, but they are a huge deal to me as a learner and an educator. Historically, statistical analysis has been a sticking point in my education. It has stood in the way of many of my goals, and each time it has been a struggle. My biggest goal for this unit was understanding the format of Jupyter Labs and the Pokemon lab. I enjoyed going through the server and the interface and understanding the way it analyzes data. I think that presenting data through this platform is really cool and interesting.

This project was a challenge for me to complete. Earlier this semester I began the project but was unable to complete it due to the same issues we were having with the Pokemon lab. However, we were able to sort those out, which was amazing! Then, when I went back to finish the lab, my work had not saved. I am sure that there was user error here, but what I am submitting is the second attempt at the project, and it does not use the data that I had initially planned. I understand the core of the project and I know the steps to complete it, but the server, pandas, and other elements are not working again. So, I unfortunately only have the following to submit at this time. I am really proud of myself for not giving up on the project and for trying to work back through the data, but I am disappointed in this project. I apologize for the state of the project overall; it is not up to my usual expectations.

With my English background, I really love the idea of Jupyter Labs and integrating it into the framework of the English curriculum. For example, every year the students at my school (in English class) have to complete a research project. It does not have to be data driven, but it can be. The culminating assessment for this unit is a research paper and presentation. Students can choose their medium: Canva, PowerPoint, etc. However, I think that Jupyter Labs would be an amazing option. Students could track their data easily and analyze it as it comes together in real time.
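
The core analysis this notebook attempts (read the CSV, take the overall mean, then the mean per year) can be sketched without pandas, since `import pandas` failed in this environment. This is a minimal sketch under assumptions, not the notebook's own code: it swaps in Python's standard `csv` and `statistics` modules for pandas, the column names `year` and `births` are assumed rather than confirmed from the dataset, the helper name `mean_births_by_year` is hypothetical, and the three sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample rows for illustration only. The real file would be opened with
# open("data/births/US_births_1994-2003_CDC_NCHS.csv", newline="").
# The column names "year" and "births" are assumptions about that file.
SAMPLE_CSV = """year,births
1994,8000
1994,10000
1995,9000
"""


def mean_births_by_year(csv_file):
    """Return (overall mean, {year: mean}) for a CSV with year and births columns."""
    births_by_year = defaultdict(list)
    for row in csv.DictReader(csv_file):
        births_by_year[int(row["year"])].append(float(row["births"]))
    # Flatten every year's values to compute the overall mean.
    all_births = [b for values in births_by_year.values() for b in values]
    yearly_means = {year: mean(values) for year, values in births_by_year.items()}
    return mean(all_births), yearly_means


overall_mean, yearly_means = mean_births_by_year(io.StringIO(SAMPLE_CSV))
print("Mean births:", overall_mean)
print("Mean births by year:", yearly_means)
```

Once pandas imports again, the same grouping is what `df.groupby('year')[...].mean()` in the notebook is meant to compute; the standard-library version only serves to check the logic in the meantime.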
parent df2ad8071c
commit fe503a8476
545
argument.ipynb
|
@ -13,15 +13,8 @@
|
|||
"id": "understanding-numbers",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*✏️ Write 2-3 sentences describing your research.*"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "greater-circular",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Overarching Question: [✏️ PUT YOUR QUESTION HERE ✏️]"
|
||||
"*✏️ Write 2-3 sentences describing your research.*\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -29,7 +22,13 @@
|
|||
"id": "appreciated-testimony",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*✏️ Write 2-3 sentences explaining why this question.*"
|
||||
"*✏️ Write 2-3 sentences explaining why this question.*\n",
|
||||
"\n",
|
||||
"Birth rates in the US generate a very large amount of data, particularly when they are broken down further by state and date. \n",
|
||||
"Tracking the birth rates across the country can open the door to many other lines of questioning, including whether certain states are \n",
|
||||
"creating environments that are more or less conducive to having children. It could also be combined with other data sets to explain whether these births are in\n",
|
||||
"families with multiple children or if they are single-child households.\n",
|
||||
" "
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -40,39 +39,78 @@
|
|||
"# Data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ed4fcba0",
|
||||
"metadata": {},
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 13,
|
||||
"id": "technical-evans",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "ModuleNotFoundError",
|
||||
"evalue": "No module named 'pandas'",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[13], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m#Include any import statements you will need\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mpandas\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mpd\u001b[39;00m\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mmatplotlib\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mpyplot\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mplt\u001b[39;00m\n",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pandas'"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"#Include any import statements you will need\n",
|
||||
"import pandas as pd\n",
|
||||
"import matplotlib.pyplot as plt"
|
||||
"import matplotlib.pyplot as plt\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 16,
|
||||
"id": "overhead-sigma",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "SyntaxError",
|
||||
"evalue": "invalid decimal literal (1694504035.py, line 3)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;36m Cell \u001b[0;32mIn[16], line 3\u001b[0;36m\u001b[0m\n\u001b[0;31m file_name = births/US_births_1994-2003_CDC_NCHS.csv\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid decimal literal\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
|
||||
"\n",
|
||||
"file_name = \"YOUR_DATASET_FILE_NAME.csv\"\n",
|
||||
"file_name = \"births/US_births_1994-2003_CDC_NCHS.csv\"\n",
|
||||
"dataset_path = \"data/\" + file_name\n",
|
||||
"\n",
|
||||
"df = pd.read_csv(dataset_path)"
|
||||
"df = pd.read_csv(dataset_path) \n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 12,
|
||||
"id": "heated-blade",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'df' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead()\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"df.head()"
|
||||
]
|
||||
|
@ -97,12 +135,457 @@
|
|||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 17,
|
||||
"id": "basic-canadian",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'df' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"mean_birth_rate = df['birth_rate'].mean()\n",
|
||||
"\n",
|
||||
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
||||
"\n",
|
||||
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"id": "49c5bade",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "ModuleNotFoundError",
|
||||
"evalue": "No module named 'birth_rate'",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[18], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mbirth_rate\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01msns\u001b[39;00m\n\u001b[1;32m 2\u001b[0m sns\u001b[38;5;241m.\u001b[39mset_theme()\n",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'birth_rate'"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import seaborn as sns\n",
|
||||
"sns.set_theme()\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"id": "7a453fdc",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'sns' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[19], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43msns\u001b[49m\u001b[38;5;241m.\u001b[39mhistplot(data\u001b[38;5;241m=\u001b[39mbirth_rate, x\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mstate\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'sns' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"sns.histplot(data=df, x=\"state\")\n",
|
||||
"# Pasted output, kept as a comment: <Axes: xlabel='states', ylabel='births'>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "29debf8d",
|
||||
"metadata": {},
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a487cffe",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Introduction"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "160f0c6d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*✏️ Write 2-3 sentences describing your research.*\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "20c53259",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*✏️ Write 2-3 sentences explaining why this question.*\n",
|
||||
"\n",
|
||||
"Birth rates in the US generate a very large amount of data, particularly when they are broken down further by state and date. \n",
|
||||
"Tracking the birth rates across the country can open the door to many other lines of questioning, including whether certain states are \n",
|
||||
"creating environments that are more or less conducive to having children. It could also be combined with other data sets to explain whether these births are in\n",
|
||||
"families with multiple children or if they are single-child households.\n",
|
||||
" "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "271ec8fd",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "46bfa281",
|
||||
"metadata": {},
|
||||
"source": []
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "7ea35d71",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "ModuleNotFoundError",
|
||||
"evalue": "No module named 'pandas'",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m Traceback (most recent call last)\n",
|
||||
"Cell \u001b[0;32mIn[13], line 2\u001b[0m\n",
|
||||
"\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m#Include any import statements you will need\u001b[39;00m\n",
|
||||
"\u001b[0;32m----> 2\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mpandas\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mpd\u001b[39;00m\n",
|
||||
"\u001b[1;32m 3\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mmatplotlib\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mpyplot\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m \u001b[38;5;21;01mplt\u001b[39;00m\n",
|
||||
"\n",
|
||||
"\u001b[0;31mModuleNotFoundError\u001b[0m: No module named 'pandas'"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"#Include any import statements you will need\n",
|
||||
"import pandas as pd\n",
|
||||
"import matplotlib.pyplot as plt\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "84eb129b",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "SyntaxError",
|
||||
"evalue": "invalid decimal literal (1694504035.py, line 3)",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;36m Cell \u001b[0;32mIn[16], line 3\u001b[0;36m\u001b[0m\n",
|
||||
"\u001b[0;31m file_name = births/US_births_1994-2003_CDC_NCHS.csv\u001b[0m\n",
|
||||
"\u001b[0m ^\u001b[0m\n",
|
||||
"\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid decimal literal\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
|
||||
"\n",
|
||||
"file_name = \"births/US_births_1994-2003_CDC_NCHS.csv\"\n",
|
||||
"dataset_path = \"data/\" + file_name\n",
|
||||
"\n",
|
||||
"df = pd.read_csv(dataset_path) \n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "259a51aa",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'df' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\n",
|
||||
"Cell \u001b[0;32mIn[12], line 1\u001b[0m\n",
|
||||
"\u001b[0;32m----> 1\u001b[0m \u001b[43mdf\u001b[49m\u001b[38;5;241m.\u001b[39mhead()\n",
|
||||
"\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"df.head()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ab0f92cf",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Data Overview**\n",
|
||||
"\n",
|
||||
"*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0012c102",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Methods and Results"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "0444a28d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'df' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)\n",
|
||||
"Cell \u001b[0;32mIn[17], line 1\u001b[0m\n",
|
||||
"\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n",
|
||||
"\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n",
|
||||
"\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
||||
"\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"mean_birth_rate = df['birth_rate'].mean()\n",
|
||||
"\n",
|
||||
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
||||
"\n",
|
||||
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c3350546",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## First Research Question: What is the mean birth rate in the United States over time? \n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5bc48dd0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Methods"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "53b0e329",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*Explain how you will approach this research question below. Consider the following:* \n",
|
||||
" - *Which aspects of the dataset will you use?* \n",
|
||||
" \n",
|
||||
" - *How will you reorganize/store the data?* \n",
|
||||
"\n",
|
||||
" - *What data science tools/functions will you use and why?* \n",
|
||||
"\n",
|
||||
" \n",
|
||||
"✏️ *Write your answer below:*\n",
|
||||
" I will use the totals of the data set. When analyzing data, it seems that the first place to start is to calculate the overall averages that the data provides. Once one has these averages, one can break them down and analyze them further with more detailed research questions. \n",
|
||||
" I will reorganize the data into graphs that break the data down into visual representations. Once these are created, the steps for further analysis can be better visualized. \n",
|
||||
" I will start by using pd.read_csv() to read the file with the data. Then I'll use the pandas .mean() method to calculate the mean of the data set and plt.bar() to create bar graphs that reflect the data and lend themselves to further analysis. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "b225bb8c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Results "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"id": "73c194a2",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"ename": "NameError",
|
||||
"evalue": "name 'df' is not defined",
|
||||
"output_type": "error",
|
||||
"traceback": [
|
||||
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
||||
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
||||
"Cell \u001b[0;32mIn[21], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m mean_birth_rate \u001b[38;5;241m=\u001b[39m \u001b[43mdf\u001b[49m[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\n\u001b[1;32m 3\u001b[0m \u001b[38;5;28mprint\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mMean Births in the US: \u001b[39m\u001b[38;5;124m\"\u001b[39m, mean_birth_rate)\n\u001b[1;32m 5\u001b[0m df_grouped \u001b[38;5;241m=\u001b[39m df\u001b[38;5;241m.\u001b[39mgroupby(\u001b[38;5;124m'\u001b[39m\u001b[38;5;124myear\u001b[39m\u001b[38;5;124m'\u001b[39m)[\u001b[38;5;124m'\u001b[39m\u001b[38;5;124mbirth_rate\u001b[39m\u001b[38;5;124m'\u001b[39m]\u001b[38;5;241m.\u001b[39mmean()\u001b[38;5;241m.\u001b[39mreset_index()\n",
|
||||
"\u001b[0;31mNameError\u001b[0m: name 'df' is not defined"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"mean_birth_rate = df['birth_rate'].mean()\n",
|
||||
"\n",
|
||||
"print(\"Mean Births in the US: \", mean_birth_rate)\n",
|
||||
"\n",
|
||||
"df_grouped = df.groupby('year')['birth_rate'].mean().reset_index()\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "f4958d53",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The mean birthrate is 6026.24 in the US from 1994-2003. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"id": "84813d3e",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#Import any helper files you need here"
|
||||
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "72b520b7",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Second Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1c9b2c9c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Methods"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0c92fb7c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*Explain how you will approach this research question below. Consider the following:* \n",
|
||||
" - *Which aspects of the dataset will you use?* \n",
|
||||
" - *How will you reorganize/store the data?* \n",
|
||||
" - *What data science tools/functions will you use and why?* \n",
|
||||
"\n",
|
||||
"✏️ *Write your answer below:*\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "00337aaa",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Results "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "ad3ad12f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#######################################################################\n",
|
||||
"### 💻 YOUR WORK GOES HERE TO ANSWER THE SECOND RESEARCH QUESTION 💻 \n",
|
||||
"###\n",
|
||||
"### Your data analysis may include a statistic and/or a data visualization\n",
|
||||
"#######################################################################"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "1e994e16",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ca9ba8c2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Discussion"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "519421fe",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Considerations"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "96702591",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*It's important to recognize the limitations of our research.\n",
|
||||
"Consider the following:*\n",
|
||||
"\n",
|
||||
"- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
|
||||
"- *What were limitations of your dataset?*\n",
|
||||
"The dataset does not include the circumstances of the births themselves. If one were attempting to analyze the population based on these birth rates, there would need to be data covering what happens after the births. The viability of the births or the infants would need to be a factor as well. \n",
|
||||
"- *Are there any known biases in the data?*\n",
|
||||
"\n",
|
||||
"✏️ *Write your answer below:*"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "43719761",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Summary"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7840a317",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"*Summarize what you discovered through the research. Consider the following:*\n",
|
||||
"\n",
|
||||
"- *What did you learn about your media consumption/digital habits?*\n",
|
||||
"- *Did the results make sense?*\n",
|
||||
"- *What was most surprising?*\n",
|
||||
"- *How will this project impact you going forward?*\n",
|
||||
"\n",
|
||||
"✏️ *Write your answer below:*"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -110,7 +593,8 @@
|
|||
"id": "recognized-positive",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## First Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
|
||||
"## First Research Question: What is the mean birth rate in the United States over time? \n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -128,11 +612,16 @@
|
|||
"source": [
|
||||
"*Explain how you will approach this research question below. Consider the following:* \n",
|
||||
" - *Which aspects of the dataset will you use?* \n",
|
||||
" \n",
|
||||
" - *How will you reorganize/store the data?* \n",
|
||||
"\n",
|
||||
" - *What data science tools/functions will you use and why?* \n",
|
||||
"\n",
|
||||
" \n",
|
||||
"✏️ *Write your answer below:*\n",
|
||||
"\n"
|
||||
" I will use the totals of the data set. When analyzing data, it seems that the first place to start is to calculate the overall averages that the data provides. Once one has these averages, one can break them down and analyze them further with more detailed research questions. \n",
|
||||
" I will reorganize the data into graphs that break the data down into visual representations. Once these are created, the steps for further analysis can be better visualized. \n",
|
||||
" I will start by using pd.read_csv() to read the file with the data. Then I'll use the pandas .mean() method to calculate the mean of the data set and plt.bar() to create bar graphs that reflect the data and lend themselves to further analysis. "
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@ -196,14 +685,6 @@
|
|||
"✏️ *Write your answer below:*\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "juvenile-creation",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Results "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
|
@ -310,7 +791,7 @@
|
|||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.7"
|
||||
"version": "3.12.6"
|
||||
},
|
||||
"toc": {
|
||||
"base_numbering": 1,
|
||||
|
|
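
The last hunk changes the recorded kernel version from 3.9.7 to 3.12.6, which suggests the notebook was run under a different Python environment than before. That may be why `import pandas` raised `ModuleNotFoundError`, though the diff alone does not confirm it. Below is a minimal sketch for checking, from inside a notebook cell, which libraries the current kernel can see. The helper name `missing_modules` is hypothetical, and `seaborn` is listed only on the assumption that the `sns` alias was meant to refer to it.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report which interpreter the kernel is using and which libraries are absent.
print("Python version:", sys.version.split()[0])
print("Missing modules:", missing_modules(["pandas", "matplotlib", "seaborn"]))
```

If a library shows up as missing, installing it into the environment the kernel actually uses (for example with the `%pip install pandas` magic in a cell) is one likely fix, assuming the server allows installs.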