This session, I started working on my Argument file.

I added the introduction and the data overview. I started the
data analysis, but I got a little frustrated with the way my data is formatted
and the way indexing works in pandas. I figured out that I need to use
'loc' and/or 'iloc' to access my data the way I want. Tomorrow I will
attempt to use these to start my analysis.
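
For reference, a minimal sketch of the `loc`/`iloc` distinction mentioned above. It assumes the table loads with string column labels and the default integer row index, as the `df.head(6)` output suggests. The small `df` here is a hand-built stand-in holding only the 2023 and 2022 columns, not the full file.

```python
import pandas as pd

# Hypothetical stand-in for the table read from data/algebra.csv
# (only two of the six year columns, values copied from the CSV).
df = pd.DataFrame(
    {
        "2023": [164, 26, 26, 68, 37, 7],
        "2022": [264, 60, 51, 112, 20, 21],
    }
)

# loc selects by LABEL: row label 0, column label "2023".
total_2023 = df.loc[0, "2023"]

# iloc selects by integer POSITION: first row, second column.
total_2022 = df.iloc[0, 1]

# loc slices include the end label; iloc slices exclude the end position.
levels_3_to_5_by_label = df.loc[3:5, "2023"]
levels_3_to_5_by_position = df.iloc[3:6, 0]
```

Because the row labels here happen to equal the row positions, both slices pick the same three rows; the two accessors would diverge as soon as the index is relabeled (for example, with level names).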
caglazir
2025-11-11 22:42:14 -05:00
parent 72c61b7490
commit 10d5e59d61
6 changed files with 692 additions and 23 deletions


@@ -0,0 +1,490 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "worldwide-blood",
"metadata": {},
"source": [
"# Introduction"
]
},
{
"cell_type": "markdown",
"id": "understanding-numbers",
"metadata": {},
"source": [
"*In my teaching residency placement, I have noticed that the students in my physics class lack the fundamental mathematical and computational reasoning skills that they should have achieved by their grade level. I have been hearing from veteran teachers that the cohorts who were in middle school during the pandemic are the most affected. I want to compare the Algebra I Regents performance of cohorts that were 8th graders before, during, and after the COVID-19 pandemic. I chose 8th grade as the anchoring grade level because this is traditionally when students should have learned and practiced the fundamentals of algebra.*"
]
},
{
"cell_type": "markdown",
"id": "greater-circular",
"metadata": {},
"source": [
"## Overarching Question: [✏️ PUT YOUR QUESTION HERE ✏️]"
]
},
{
"cell_type": "markdown",
"id": "appreciated-testimony",
"metadata": {},
"source": [
"*✏️ Write 2-3 sentences explaining why this question.*"
]
},
{
"cell_type": "markdown",
"id": "permanent-pollution",
"metadata": {},
"source": [
"# Data"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "technical-evans",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import math\n",
"import statistics\n",
"import csv\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "overhead-sigma",
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n",
"algebra = \"algebra.csv\"\n",
"dataset_path = \"data/\" + algebra\n",
"\n",
"df = pd.read_csv(dataset_path)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "heated-blade",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>2023</th>\n",
" <th>2022</th>\n",
" <th>2021</th>\n",
" <th>2020</th>\n",
" <th>2018</th>\n",
" <th>2017</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>164</td>\n",
" <td>264</td>\n",
" <td>137</td>\n",
" <td>41</td>\n",
" <td>218</td>\n",
" <td>256</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>26</td>\n",
" <td>60</td>\n",
" <td>27</td>\n",
" <td>21</td>\n",
" <td>23</td>\n",
" <td>23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26</td>\n",
" <td>51</td>\n",
" <td>19</td>\n",
" <td>10</td>\n",
" <td>40</td>\n",
" <td>51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>68</td>\n",
" <td>112</td>\n",
" <td>67</td>\n",
" <td>10</td>\n",
" <td>102</td>\n",
" <td>130</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>37</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>0</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>7</td>\n",
" <td>21</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>18</td>\n",
" <td>17</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 2023 2022 2021 2020 2018 2017\n",
"0 164 264 137 41 218 256\n",
"1 26 60 27 21 23 23\n",
"2 26 51 19 10 40 51\n",
"3 68 112 67 10 102 130\n",
"4 37 20 20 0 35 35\n",
"5 7 21 4 0 18 17"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head(6)"
]
},
{
"cell_type": "markdown",
"id": "continental-franklin",
"metadata": {},
"source": [
"**Data Overview**\n",
"\n",
"*The data is taken from the NYSED website. It is pulled from Cheektowaga High School's recent and archived School Report Cards. The report cards can contain very detailed information, but the website lets users filter for what they need. In this case, the report card was generated on the NYSED website to present the annual Regents examinations. From these examinations, I handpicked the relevant data for the Algebra examination.\n",
"\n",
"The header row represents the **year** of the examination. Since Algebra I is the foundational math examination in the Regents standards, we will assume that the year also represents the high school entrance cohort year for *most, if not all* of the students.\n",
"- The first two cohorts (2023 & 2022) are considered the post-COVID-19 cohorts, as they were introduced to the fundamentals of algebra in 8th grade, a year prior to their cohort entry as tabulated.\n",
"- The following two cohorts (2021 & 2020) are considered the COVID-19 cohorts, as they were in 8th grade *during* the pandemic years.\n",
"- The last two cohorts (2018 & 2017) are considered the pre-COVID-19 cohorts.\n",
"\n",
"The 0th data row represents the **total number of students who took the Algebra Regents**.\\\n",
"The 1st data row represents the **number of students who performed at Level 1 (lowest level).**\\\n",
"The 2nd data row represents the **number of students who performed at Level 2.**\\\n",
"The 3rd data row represents the **number of students who performed at Level 3.**\\\n",
"The 4th data row represents the **number of students who performed at Level 4.**\\\n",
"The 5th data row represents the **number of students who performed at Level 5 (highest level).**\n",
"\n",
"*Regents defines proficiency in Algebra as performing at Level 3 or above*. This categorization will inform our data analysis.\n",
"*"
]
},
{
"cell_type": "markdown",
"id": "infinite-instrument",
"metadata": {},
"source": [
"# Methods and Results"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "basic-canadian",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "recognized-positive",
"metadata": {},
"source": [
"## First Research Question:\n",
"## How has the COVID-19 pandemic impacted students' mathematical skills? Exploring the high school Algebra Regents examination performance of pre-, during-, and post-COVID-19 middle schoolers.\n"
]
},
{
"cell_type": "markdown",
"id": "graduate-palmer",
"metadata": {},
"source": [
"### Methods"
]
},
{
"cell_type": "markdown",
"id": "endless-variation",
"metadata": {},
"source": [
"*Explain how you will approach this research question below. Consider the following:* \n",
" - *Which aspects of the dataset will you use?* \n",
" - *How will you reorganize/store the data?* \n",
" - *What data science tools/functions will you use and why?* \n",
" \n",
"✏️ *Write your answer below:*\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "portuguese-japan",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "negative-highlight",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 180.000000\n",
"1 30.000000\n",
"2 32.833333\n",
"3 81.500000\n",
"4 24.500000\n",
"5 11.166667\n",
"dtype: float64\n"
]
}
],
"source": [
"avg_per_level"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "victorian-burning",
"metadata": {},
"outputs": [],
"source": [
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
]
},
{
"cell_type": "markdown",
"id": "collectible-puppy",
"metadata": {},
"source": [
"## Second Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
]
},
{
"cell_type": "markdown",
"id": "demographic-future",
"metadata": {},
"source": [
"### Methods"
]
},
{
"cell_type": "markdown",
"id": "incorporate-roller",
"metadata": {},
"source": [
"*Explain how you will approach this research question below. Consider the following:* \n",
" - *Which aspects of the dataset will you use?* \n",
" - *How will you reorganize/store the data?* \n",
" - *What data science tools/functions will you use and why?* \n",
"\n",
"✏️ *Write your answer below:*\n"
]
},
{
"cell_type": "markdown",
"id": "juvenile-creation",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "pursuant-surrey",
"metadata": {},
"outputs": [],
"source": [
"#######################################################################\n",
"### 💻 YOUR WORK GOES HERE TO ANSWER THE SECOND RESEARCH QUESTION 💻 \n",
"###\n",
"### Your data analysis may include a statistic and/or a data visualization\n",
"#######################################################################"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "located-night",
"metadata": {},
"outputs": [],
"source": [
"# 💻 YOU CAN ADD NEW CELLS WITH THE \"+\" BUTTON "
]
},
{
"cell_type": "markdown",
"id": "infectious-symbol",
"metadata": {},
"source": [
"# Discussion"
]
},
{
"cell_type": "markdown",
"id": "furnished-camping",
"metadata": {
"code_folding": []
},
"source": [
"## Considerations"
]
},
{
"cell_type": "markdown",
"id": "bearing-stadium",
"metadata": {},
"source": [
"*It's important to recognize the limitations of our research.\n",
"Consider the following:*\n",
"\n",
"- *Do the results give an accurate depiction of your research question? Why or why not?*\n",
"- *What were the limitations of your dataset?*\n",
"- *Are there any known biases in the data?*\n",
"\n",
"✏️ *Write your answer below:*"
]
},
{
"cell_type": "markdown",
"id": "beneficial-invasion",
"metadata": {},
"source": [
"## Summary"
]
},
{
"cell_type": "markdown",
"id": "about-raise",
"metadata": {},
"source": [
"*Summarize what you discovered through the research. Consider the following:*\n",
"\n",
"- *What did you learn about your media consumption/digital habits?*\n",
"- *Did the results make sense?*\n",
"- *What was most surprising?*\n",
"- *How will this project impact you going forward?*\n",
"\n",
"✏️ *Write your answer below:*"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_json": true,
"text_representation": {
"extension": ".Rmd",
"format_name": "rmarkdown",
"format_version": "1.2",
"jupytext_version": "1.9.1"
}
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.3"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,35 @@
# Project proposal
This planning document will also form the introduction of your
argument.
## Overarching Question
### What central question are you interested in exploring? Why are you interested in exploring this question?
*The question I'm interested in exploring is 'How has the COVID-19 pandemic impacted students' mathematical skills?' I am interested in this question because in my teaching residency placement, I have noticed that the students in my physics class lack the fundamental mathematical and computational reasoning skills that they should have achieved by their grade level. I have been hearing from veteran teachers that the cohorts who were in middle school during the pandemic are the most affected. I want to compare data for a cohort that went through the pandemic with data for a cohort that did not.*
### What specific research questions will you investigate?
*1. How has the COVID-19 pandemic impacted students' mathematical skills? 2. Is there a decrease in mathematical proficiency for the COVID-19 middle school cohorts? 3. How does the mathematical proficiency of high school students differ between the COVID-19 middle school cohorts and the cohorts before them?*
## Data source
### What data set will you use to answer your overarching question?
*I will use NYSED data as my source, specifically the data for my residency high school. I will use the archived report cards and pull from Section 2: Student Performance - Regents Exams in these report cards.
https://data.nysed.gov/archive.php?instid=800000052345*
### Where is this data from?
*As mentioned above, the data is from NYSED. I trust this data because it comes directly from the State and includes data from its official statewide examinations that have set standards and *
### What is this data about?
*This data set is about the Regents test scores of Cheektowaga High School. It includes all subjects (10 rows); however, I am only interested in the math-related tests (rows 2-4). I am also focusing on general education students (column 2) instead of all students (column 1) or special education students (column 3). Each column has sub-columns for specific data, such as the total tested and the percentage of students scoring at or above 55, 65, and 85.*
## Methods
### How will you use your data set to answer your quantitative questions?
*First, I will choose 3-4 cohorts: at least one that was in middle school before COVID-19, one during, and one after. I will look at each cohort's performance across the middle and high school years. I plan to create a line graph for each and then compare these with each other. I also want to compare the percentages of below-passing and above-passing scores in each cohort. This can be tabulated.*
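
The percentage comparison described above could be sketched as follows. This is only an illustration under stated assumptions: it uses the level counts from `data/algebra.csv` (typed in by hand here so the block is self-contained) rather than the 55/65/85 score cutoffs, and it treats Level 3 or above as passing, following the proficiency definition in the notebook's Data Overview. The variable names are hypothetical.

```python
import pandas as pd

# Counts copied from data/algebra.csv.
# Row 0 = total tested; rows 1-5 = students at Levels 1-5.
df = pd.DataFrame(
    {
        "2023": [164, 26, 26, 68, 37, 7],
        "2022": [264, 60, 51, 112, 20, 21],
        "2021": [137, 27, 19, 67, 20, 4],
        "2020": [41, 21, 10, 10, 0, 0],
        "2018": [218, 23, 40, 102, 35, 18],
        "2017": [256, 23, 51, 130, 35, 17],
    }
)

total_tested = df.iloc[0]          # first row, one value per year
proficient = df.iloc[3:6].sum()    # Levels 3, 4 and 5 summed per year

# Assumption: "passing" means Level 3 or above.
pct_proficient = (proficient / total_tested * 100).round(1)
pct_below = (100 - pct_proficient).round(1)

# Tabulate both percentages side by side, one row per exam year.
summary = pd.DataFrame(
    {"pct_proficient": pct_proficient, "pct_below": pct_below}
)
print(summary)
```

Dividing by the total for each year, instead of comparing raw counts, matters here because the number of students tested varies widely (41 in 2020 versus 264 in 2022).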


@@ -13,7 +13,7 @@
"id": "understanding-numbers",
"metadata": {},
"source": [
"*✏️ Write 2-3 sentences describing your research.*"
"*In my teaching residency placement, I have noticed that the students in my physics class lack the fundamental mathematical and computational reasoning skills that they should have achieved by their grade level. I have been hearing from veteran teachers that the cohorts who were in middle school during the pandemic are the most affected. I want to compare the Algebra I Regents performance of cohorts that were 8th graders before, during, and after the COVID-19 pandemic. I chose 8th grade as the anchoring grade level because this is traditionally when students should have learned and practiced the fundamentals of algebra.*"
]
},
{
@@ -42,39 +42,145 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "technical-evans",
"metadata": {},
"outputs": [],
"source": [
"#Include any import statements you will need\n",
"import numpy as np\n",
"import math\n",
"import statistics\n",
"import csv\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "overhead-sigma",
"metadata": {},
"outputs": [],
"source": [
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
"\n",
"file_name = \"YOUR_DATASET_FILE_NAME.csv\"\n",
"dataset_path = \"data/\" + file_name\n",
"\n",
"algebra = \"algebra.csv\"\n",
"dataset_path = \"data/\" + algebra\n",
"\n",
"df = pd.read_csv(dataset_path)"
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"id": "heated-blade",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>2023</th>\n",
" <th>2022</th>\n",
" <th>2021</th>\n",
" <th>2020</th>\n",
" <th>2018</th>\n",
" <th>2017</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>164</td>\n",
" <td>264</td>\n",
" <td>137</td>\n",
" <td>41</td>\n",
" <td>218</td>\n",
" <td>256</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>26</td>\n",
" <td>60</td>\n",
" <td>27</td>\n",
" <td>21</td>\n",
" <td>23</td>\n",
" <td>23</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>26</td>\n",
" <td>51</td>\n",
" <td>19</td>\n",
" <td>10</td>\n",
" <td>40</td>\n",
" <td>51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>68</td>\n",
" <td>112</td>\n",
" <td>67</td>\n",
" <td>10</td>\n",
" <td>102</td>\n",
" <td>130</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>37</td>\n",
" <td>20</td>\n",
" <td>20</td>\n",
" <td>0</td>\n",
" <td>35</td>\n",
" <td>35</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>7</td>\n",
" <td>21</td>\n",
" <td>4</td>\n",
" <td>0</td>\n",
" <td>18</td>\n",
" <td>17</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" 2023 2022 2021 2020 2018 2017\n",
"0 164 264 137 41 218 256\n",
"1 26 60 27 21 23 23\n",
"2 26 51 19 10 40 51\n",
"3 68 112 67 10 102 130\n",
"4 37 20 20 0 35 35\n",
"5 7 21 4 0 18 17"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
"df.head(6)"
]
},
{
@@ -84,7 +190,22 @@
"source": [
"**Data Overview**\n",
"\n",
"*✏️ Write 2-3 sentences describing this dataset. Be sure to include where the data comes from and what it contains.*"
"*The data is taken from the NYSED website. It is pulled from Cheektowaga High School's recent and archived School Report Cards. The report cards can contain very detailed information, but the website lets users filter for what they need. In this case, the report card was generated on the NYSED website to present the annual Regents examinations. From these examinations, I handpicked the relevant data for the Algebra examination.\n",
"\n",
"The header row represents the **year** of the examination. Since Algebra I is the foundational math examination in the Regents standards, we will assume that the year also represents the high school entrance cohort year for *most, if not all* of the students.\n",
"- The first two cohorts (2023 & 2022) are considered the post-COVID-19 cohorts, as they were introduced to the fundamentals of algebra in 8th grade, a year prior to their cohort entry as tabulated.\n",
"- The following two cohorts (2021 & 2020) are considered the COVID-19 cohorts, as they were in 8th grade *during* the pandemic years.\n",
"- The last two cohorts (2018 & 2017) are considered the pre-COVID-19 cohorts.\n",
"\n",
"The 0th data row represents the **total number of students who took the Algebra Regents**.\\\n",
"The 1st data row represents the **number of students who performed at Level 1 (lowest level).**\\\n",
"The 2nd data row represents the **number of students who performed at Level 2.**\\\n",
"The 3rd data row represents the **number of students who performed at Level 3.**\\\n",
"The 4th data row represents the **number of students who performed at Level 4.**\\\n",
"The 5th data row represents the **number of students who performed at Level 5 (highest level).**\n",
"\n",
"*Regents defines proficiency in Algebra as performing at Level 3 or above*. This categorization will inform our data analysis.\n",
"*"
]
},
{
@@ -101,16 +222,15 @@
"id": "basic-canadian",
"metadata": {},
"outputs": [],
"source": [
"#Import any helper files you need here"
]
"source": []
},
{
"cell_type": "markdown",
"id": "recognized-positive",
"metadata": {},
"source": [
"## First Research Question: [✏️ PUT YOUR QUESTION HERE ✏️]\n"
"## First Research Question:\n",
"## How has the COVID-19 pandemic impacted students' mathematical skills? Exploring the high school Algebra Regents examination performance of pre-, during-, and post-COVID-19 middle schoolers.\n"
]
},
{
@@ -145,16 +265,26 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 18,
"id": "negative-highlight",
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 180.000000\n",
"1 30.000000\n",
"2 32.833333\n",
"3 81.500000\n",
"4 24.500000\n",
"5 11.166667\n",
"dtype: float64\n"
]
}
],
"source": [
"#######################################################################\n",
"### 💻 YOUR WORK GOES HERE TO ANSWER THE FIRST RESEARCH QUESTION 💻 \n",
"### \n",
"### Your data analysis may include a statistic and/or a data visualization\n",
"#######################################################################"
"avg_per_level"
]
},
{
@@ -310,7 +440,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
"version": "3.12.3"
},
"toc": {
"base_numbering": 1,


@@ -0,0 +1,7 @@
2023,2022,2021,2020,2018,2017
164,264,137,41,218,256
26,60,27,21,23,23
26,51,19,10,40,51
68,112,67,10,102,130
37,20,20,0,35,35
7,21,4,0,18,17

7
data/algebra.csv Normal file

@@ -0,0 +1,7 @@
2023,2022,2021,2020,2018,2017
164,264,137,41,218,256
26,60,27,21,23,23
26,51,19,10,40,51
68,112,67,10,102,130
37,20,20,0,35,35
7,21,4,0,18,17

Binary file not shown.