{
"cells": [
{
"cell_type": "markdown",
"id": "worldwide-blood",
"metadata": {},
"source": [
"# Introduction"
]
},
{
"cell_type": "markdown",
"id": "understanding-numbers",
"metadata": {},
"source": [
"*✏️ Write 2-3 sentences describing your research.*"
]
},
{
"cell_type": "markdown",
"id": "greater-circular",
"metadata": {},
"source": [
"## Overarching Question: How has government redlining affected generational wealth by race and socioeconomic standing over time?"
]
},
{
"cell_type": "markdown",
"id": "appreciated-testimony",
"metadata": {},
"source": [
"This question matters because many of our cities are still heavily segregated today, and that segregation must have a root cause. We hope this data set reveals a clear pattern that brings clarity to the overarching question."
]
},
{
"cell_type": "markdown",
"id": "permanent-pollution",
"metadata": {},
"source": [
"# Data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "technical-evans",
"metadata": {},
"outputs": [],
"source": [
"#Include any import statements you will need\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "overhead-sigma",
"metadata": {},
"outputs": [],
"source": [
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
"\n",
"file_name = \"metro-grades.csv\"\n",
"dataset_path = \"data/\" + file_name\n",
"\n",
"df = pd.read_csv(dataset_path)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "heated-blade",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>metro_area</th>\n",
" <th>holc_grade</th>\n",
" <th>white_pop</th>\n",
" <th>black_pop</th>\n",
" <th>hisp_pop</th>\n",
" <th>asian_pop</th>\n",
" <th>other_pop</th>\n",
" <th>total_pop</th>\n",
" <th>pct_white</th>\n",
" <th>pct_black</th>\n",
" <th>...</th>\n",
" <th>surr_area_white_pop</th>\n",
" <th>surr_area_black_pop</th>\n",
" <th>surr_area_hisp_pop</th>\n",
" <th>surr_area_asian_pop</th>\n",
" <th>surr_area_other_pop</th>\n",
" <th>surr_area_pct_white</th>\n",
" <th>surr_area_pct_black</th>\n",
" <th>surr_area_pct_hisp</th>\n",
" <th>surr_area_pct_asian</th>\n",
" <th>surr_area_pct_other</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Akron, OH</td>\n",
" <td>A</td>\n",
" <td>24702</td>\n",
" <td>8624</td>\n",
" <td>956</td>\n",
" <td>688</td>\n",
" <td>1993</td>\n",
" <td>36963</td>\n",
" <td>66.83</td>\n",
" <td>23.33</td>\n",
" <td>...</td>\n",
" <td>304399</td>\n",
" <td>70692</td>\n",
" <td>11037</td>\n",
" <td>17295</td>\n",
" <td>23839</td>\n",
" <td>71.24</td>\n",
" <td>16.55</td>\n",
" <td>2.58</td>\n",
" <td>4.05</td>\n",
" <td>5.58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Akron, OH</td>\n",
" <td>B</td>\n",
" <td>41531</td>\n",
" <td>16499</td>\n",
" <td>2208</td>\n",
" <td>3367</td>\n",
" <td>4211</td>\n",
" <td>67816</td>\n",
" <td>61.24</td>\n",
" <td>24.33</td>\n",
" <td>...</td>\n",
" <td>304399</td>\n",
" <td>70692</td>\n",
" <td>11037</td>\n",
" <td>17295</td>\n",
" <td>23839</td>\n",
" <td>71.24</td>\n",
" <td>16.55</td>\n",
" <td>2.58</td>\n",
" <td>4.05</td>\n",
" <td>5.58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Akron, OH</td>\n",
" <td>C</td>\n",
" <td>73105</td>\n",
" <td>22847</td>\n",
" <td>3149</td>\n",
" <td>6291</td>\n",
" <td>7302</td>\n",
" <td>112694</td>\n",
" <td>64.87</td>\n",
" <td>20.27</td>\n",
" <td>...</td>\n",
" <td>304399</td>\n",
" <td>70692</td>\n",
" <td>11037</td>\n",
" <td>17295</td>\n",
" <td>23839</td>\n",
" <td>71.24</td>\n",
" <td>16.55</td>\n",
" <td>2.58</td>\n",
" <td>4.05</td>\n",
" <td>5.58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Akron, OH</td>\n",
" <td>D</td>\n",
" <td>6179</td>\n",
" <td>6921</td>\n",
" <td>567</td>\n",
" <td>455</td>\n",
" <td>1022</td>\n",
" <td>15144</td>\n",
" <td>40.80</td>\n",
" <td>45.70</td>\n",
" <td>...</td>\n",
" <td>304399</td>\n",
" <td>70692</td>\n",
" <td>11037</td>\n",
" <td>17295</td>\n",
" <td>23839</td>\n",
" <td>71.24</td>\n",
" <td>16.55</td>\n",
" <td>2.58</td>\n",
" <td>4.05</td>\n",
" <td>5.58</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Albany-Schenectady-Troy, NY</td>\n",
" <td>A</td>\n",
" <td>16989</td>\n",
" <td>1818</td>\n",
" <td>1317</td>\n",
" <td>1998</td>\n",
" <td>1182</td>\n",
" <td>23303</td>\n",
" <td>72.91</td>\n",
" <td>7.80</td>\n",
" <td>...</td>\n",
" <td>387016</td>\n",
" <td>68371</td>\n",
" <td>42699</td>\n",
" <td>41112</td>\n",
" <td>40596</td>\n",
" <td>66.75</td>\n",
" <td>11.79</td>\n",
" <td>7.36</td>\n",
" <td>7.09</td>\n",
" <td>7.00</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 28 columns</p>\n",
"</div>"
],
"text/plain": [
" metro_area holc_grade white_pop black_pop hisp_pop \\\n",
"0 Akron, OH A 24702 8624 956 \n",
"1 Akron, OH B 41531 16499 2208 \n",
"2 Akron, OH C 73105 22847 3149 \n",
"3 Akron, OH D 6179 6921 567 \n",
"4 Albany-Schenectady-Troy, NY A 16989 1818 1317 \n",
"\n",
" asian_pop other_pop total_pop pct_white pct_black ... \\\n",
"0 688 1993 36963 66.83 23.33 ... \n",
"1 3367 4211 67816 61.24 24.33 ... \n",
"2 6291 7302 112694 64.87 20.27 ... \n",
"3 455 1022 15144 40.80 45.70 ... \n",
"4 1998 1182 23303 72.91 7.80 ... \n",
"\n",
" surr_area_white_pop surr_area_black_pop surr_area_hisp_pop \\\n",
"0 304399 70692 11037 \n",
"1 304399 70692 11037 \n",
"2 304399 70692 11037 \n",
"3 304399 70692 11037 \n",
"4 387016 68371 42699 \n",
"\n",
" surr_area_asian_pop surr_area_other_pop surr_area_pct_white \\\n",
"0 17295 23839 71.24 \n",
"1 17295 23839 71.24 \n",
"2 17295 23839 71.24 \n",
"3 17295 23839 71.24 \n",
"4 41112 40596 66.75 \n",
"\n",
" surr_area_pct_black surr_area_pct_hisp surr_area_pct_asian \\\n",
"0 16.55 2.58 4.05 \n",
"1 16.55 2.58 4.05 \n",
"2 16.55 2.58 4.05 \n",
"3 16.55 2.58 4.05 \n",
"4 11.79 7.36 7.09 \n",
"\n",
" surr_area_pct_other \n",
"0 5.58 \n",
"1 5.58 \n",
"2 5.58 \n",
"3 5.58 \n",
"4 7.00 \n",
"\n",
"[5 rows x 28 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "1378c142-b0ef-4048-aad0-3674ec12c532",
"metadata": {},
"outputs": [],
"source": [
"df[\"state\"] = df.metro_area.str.split(\", \").str.get(1)"
]
},
{
"cell_type": "markdown",
"id": "970ac473-ad2f-47df-b57d-f843992e2543",
"metadata": {},
"source": [
"# Messing Around"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "74a2f075-6acb-4d7c-b3b9-20d9bda274fe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='holc_grade', ylabel='pct_white'>"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(data=df, y=\"pct_white\", x=\"holc_grade\")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "f71bb640-ea17-43c5-ba5e-0b785b7c1e63",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='holc_grade', ylabel='pct_other'>"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 1500x200 with 5 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# One bar plot per racial group, sharing a y-axis so the panels are directly comparable\n",
"figure, (ax0, ax1, ax2, ax3, ax4) = plt.subplots(1, 5, figsize=(15, 2), sharey=True)\n",
"sns.barplot(data=df, y=\"pct_white\", x=\"holc_grade\", ax=ax0)\n",
"sns.barplot(data=df, y=\"pct_black\", x=\"holc_grade\", ax=ax1)\n",
"sns.barplot(data=df, y=\"pct_hisp\", x=\"holc_grade\", ax=ax2)\n",
"sns.barplot(data=df, y=\"pct_asian\", x=\"holc_grade\", ax=ax3)\n",
"sns.barplot(data=df, y=\"pct_other\", x=\"holc_grade\", ax=ax4)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "84231de8-ce69-4909-b69e-62a1744bad01",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: xlabel='holc_grade'>"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Average each racial group's population percentage across metro areas, per HOLC grade\n",
"racial_perct = df.groupby(\"holc_grade\")[[\"pct_white\", \"pct_black\", \"pct_hisp\", \"pct_asian\", \"pct_other\"]].mean()\n",
"racial_perct.plot.bar()"
]
},
{
"cell_type": "markdown",
"id": "continental-franklin",
"metadata": {},
"source": [
"**Data Overview**\n",
"\n",
"The data above comes from a 2020 data set built on the original redlining maps from the 1930s. Simply put, the question is whether the damage done by these maps, whose use was outlawed in 1968, still exists today. The data is relatively straightforward: the maps divided cities into graded zones, with A being the most favorable and D being the least favorable. In the 1930s, the maps showed clear racial bias. Residents of the D zones were heavily people of color who were unable to obtain mortgages and therefore had no way to build equity in property, which most consider the best path to wealth across generations within a family. Because of this practice, the question is whether it still affects the makeup of communities today. By combining racial population information from 2020 with the 1930s maps, we can see whether or not things have changed."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "a1614a90-f412-43a2-a246-eeb4df9ee9eb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([<Axes: >, <Axes: >, <Axes: >, <Axes: >, <Axes: >], dtype=object)"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"axes"
]
},
{
"cell_type": "markdown",
"id": "infinite-instrument",
"metadata": {},
"source": [
"# Methods and Results"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "basic-canadian",
"metadata": {},
"outputs": [],
"source": [
"#Import any helper files you need here"
]
},
{
"cell_type": "markdown",
"id": "recognized-positive",
"metadata": {},
"source": [
"## First Research Question: What does the data set show about the percentages of white and black residents living in each graded zone?\n"
]
},
{
"cell_type": "markdown",
"id": "graduate-palmer",
"metadata": {},
"source": [
"### Methods"
]
},
{
"cell_type": "markdown",
"id": "endless-variation",
"metadata": {},
"source": [
"*Explain how you will approach this research question below. Consider the following:* \n",
" - *Which aspects of the dataset will you use?*\n",
" We will graph the percentages of white and black residents living in each graded zone (A, B, C, D).\n",
" - *How will you reorganize/store the data?*\n",
" The data will be organized into two bar plots, side by side, showing where white and black residents live today relative to the zones drawn on the 1930s maps.\n",
" - *What data science tools/functions will you use and why?* \n",
" We will use seaborn's `barplot` to plot this data. One important note is that we will keep the y-axis uniform across the plots so that the bars can be compared directly and the comparison is not skewed."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d33c1cd3-a4a3-4f17-ad5f-3d27fdc29022",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "portuguese-japan",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "negative-highlight",
"metadata": {},
"outputs": [],
"source": [
"#The key takeaway from the graphs above is the clear distinction between white and black residents\n",
"#in who lives in each zone today. While things have certainly changed, the A zones\n",
"#are still predominantly white, and the data shows that as the zones go from A to D, the white share of the population\n",
"#decreases while the black share increases. This shows that, despite some improvements,\n",
"#we are still living in a very racially segregated society 90 years later."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "victorian-burning",
"metadata": {},
"outputs": [],
"source": [
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
]
},
{
"cell_type": "markdown",
"id": "collectible-puppy",
"metadata": {},
"source": [
"## Second Research Question: What does this data suggest about generational wealth across racial groups?\n"
]
},
{
"cell_type": "markdown",
"id": "demographic-future",
"metadata": {},
"source": [
"### Methods"
]
},
{
"cell_type": "markdown",
"id": "incorporate-roller",
"metadata": {},
"source": [
"*Explain how you will approach this research question below. Consider the following:* \n",
" - *Which aspects of the dataset will you use?*\n",
" We will look again at the A-D zones and treat the most sought-after zones as the most economically viable. Given that the data set is from 2020 and the maps were first created 90 years prior, it is fair to ask how well groups outside the white population are faring in the A and B zones.\n",
" - *How will you reorganize/store the data?*\n",
" A bar plot with all racial groups side by side to show the distinctions.\n",
" - *What data science tools/functions will you use and why?*\n",
" The mean population percentage of each racial group per zone, shown both in separate bar plots and in one combined plot, to see patterns between zones.\n",
"\n",
"✏️ *Write your answer below:*\n"
]
},
{
"cell_type": "markdown",
"id": "juvenile-creation",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "pursuant-surrey",
"metadata": {},
"outputs": [],
"source": [
"# Immediately, one can tell that the white share of the population goes down as the zone letters move further down the alphabet,\n",
"# whereas the opposite is true of the black and Hispanic shares. The inference that can be drawn from this data is that\n",
"# redlining created an inequitable system within US cities, in which those in charge made it easier for residents of certain areas to gain\n",
"# access to things like home ownership. Given that home ownership is considered a great tool for building generational wealth, it can\n",
"# also be seen that the effects of the racially biased maps are still here today, 90 years after they were implemented. Because this wealth\n",
"# is in fact generational, it is clear that those in zones A and B have had an easier time growing it than those in zones C and D;\n",
"# the racial makeup of those zones shows this as well."
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "located-night",
"metadata": {},
"outputs": [],
"source": [
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
]
},
{
"cell_type": "markdown",
"id": "infectious-symbol",
"metadata": {},
"source": [
"# Discussion"
]
},
{
"cell_type": "markdown",
"id": "furnished-camping",
"metadata": {
"code_folding": []
},
"source": [
"## Considerations"
]
},
{
"cell_type": "markdown",
"id": "bearing-stadium",
"metadata": {},
"source": [
"*It's important to recognize the limitations of our research.\n",
"Consider the following:*\n",
"\n",
"- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
" The results do give an accurate answer to this question. The major focus is the 90-year gap between the original maps and the data that now incorporates them.\n",
"- *What were limitations of your dataset?*\n",
" It would be nice to have the same population figures from the 1930s maps for a side-by-side comparison.\n",
"- *Are there any known biases in the data?*\n",
" The limitation above could introduce bias: because we are guessing at the original figures, we could overstate the degree of racial segregation and be farther off than we would like.\n"
]
},
{
"cell_type": "markdown",
"id": "beneficial-invasion",
"metadata": {},
"source": [
"## Summary"
]
},
{
"cell_type": "markdown",
"id": "about-raise",
"metadata": {},
"source": [
"*Summarize what you discovered through the research. Consider the following:*\n",
"\n",
"- *What did you learn about your media consumption/digital habits?*\n",
" I enjoy it when data is compiled so cleanly. It is really interesting to be able to take such varied pieces of the given data and plot them. While I enjoyed putting this together, I can only imagine the complexity of the analyses that are possible here.\n",
"- *Did the results make sense?*\n",
" The results made sense, and while I needed some help putting them together, I enjoyed the different views and approaches that could be assembled quickly and efficiently.\n",
"- *What was most surprising?*\n",
" I am not sure the results were very surprising, but I was surprised by how pandas, when not given specifics, will just take the data and graph it how \"it\" sees fit.\n",
"- *How will this project impact you going forward?*\n",
" I have always liked FiveThirtyEight's analytical data for certain things, and I enjoy that we are able to take that data and make more sense of it. I would like to learn more about the complexities within this data to see how much deeper we can go with it.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6ee86a83-ccb5-47c0-9b44-6f0be3ef4c68",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"jupytext": {
"cell_metadata_json": true,
"text_representation": {
"extension": ".Rmd",
"format_name": "rmarkdown",
"format_version": "1.2",
"jupytext_version": "1.9.1"
}
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 5
}