project_argument/argument.ipynb

{
"cells": [
{
"cell_type": "markdown",
"id": "worldwide-blood",
"metadata": {},
"source": [
"# Introduction"
]
},
{
"cell_type": "markdown",
"id": "understanding-numbers",
"metadata": {},
"source": [
"This research examines the relationship between how often instructors report focusing on developing mathematical and conceptual models and how often they report using computer-interfacing measurement devices, using the course information survey responses that accompany the E-CLASS student data. Understanding this relationship may help inform which purposes instructors believe these technologies are best suited for. Because these are general ratings of a course across an entire semester, the data are likely too coarse-grained to answer this question on their own, but they offer a useful first look."
]
},
{
"cell_type": "markdown",
"id": "greater-circular",
"metadata": {},
"source": [
"## Overarching Question: How do the reported uses of real-time data visualization devices in physics labs relate to the development of models?"
]
},
{
"cell_type": "markdown",
"id": "43640d34-8f4b-41c3-bdbd-38dbcf51b5d5",
"metadata": {},
"source": [
"The central aim is to examine how computer-interfacing measurement devices are used in introductory college physics laboratories, and how that use is connected to changes in students' epistemological beliefs and attitudes toward experimentation and laboratory work. Computer-interfacing measurement devices are commonly used to collect and visualize real-time physical data. Most physics instructors accept that these experiences can facilitate the development of both mathematical and conceptual models. However, few studies have explored the learning processes underlying this assumption, and the learning experiences that arise from using these devices in laboratory settings are not well understood.\n",
"\n",
"Additionally, a growing body of research has shown that the roles learners adopt in physics laboratory settings are inequitably distributed by gender, with women taking on more secretarial and management roles and men taking on more tinkering and experimenting roles. Although the connection between gender and learners' experiences with computer-interfacing measurement devices is not well documented, similar dynamics may affect the use of these technologies and, consequently, how much learners benefit from them."
]
},
{
"cell_type": "markdown",
"id": "permanent-pollution",
"metadata": {},
"source": [
"# Data"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "technical-evans",
"metadata": {},
"outputs": [],
"source": [
"#Import libraries\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "overhead-sigma",
"metadata": {},
"outputs": [],
"source": [
"### 💻 FILL IN YOUR DATASET FILE NAME BELOW 💻 ###\n",
"\n",
"file_name = \"ECLASS_anon_cis.csv\"\n",
"# build the path from the file name so that only file_name needs editing\n",
"dataset_path = \"data/\" + file_name\n",
"\n",
"df = pd.read_csv(dataset_path)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "f1affbce-2a6e-429b-8a58-c337c6fdcd13",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Q5</th>\n",
" <th>Q52</th>\n",
" <th>Q53</th>\n",
" <th>Q18</th>\n",
" <th>Q27</th>\n",
" <th>Q6</th>\n",
" <th>Q11</th>\n",
" <th>Q19</th>\n",
" <th>Q20</th>\n",
" <th>Q15</th>\n",
" <th>...</th>\n",
" <th>Q38_4</th>\n",
" <th>Q41</th>\n",
" <th>Q42</th>\n",
" <th>Q43</th>\n",
" <th>StartDate</th>\n",
" <th>anon_instructor_id</th>\n",
" <th>anon_university_id</th>\n",
" <th>ResponseId</th>\n",
" <th>pre_survey_id</th>\n",
" <th>post_survey_id</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>semester</td>\n",
" <td>Fall</td>\n",
" <td>NaN</td>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Calculus-based</td>\n",
" <td>9/9/2016</td>\n",
" <td>No incentive</td>\n",
" <td>55</td>\n",
" <td>3</td>\n",
" <td>4 year college</td>\n",
" <td>...</td>\n",
" <td>Never</td>\n",
" <td>11.0</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>29/07/2016 09:35</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>R_9TQGuSY5m31uCgp</td>\n",
" <td>eb70O8rWoi7TeEl</td>\n",
" <td>bdrBNZhS5ctkrtj</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>semester</td>\n",
" <td>Fall</td>\n",
" <td>NaN</td>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Calculus-based</td>\n",
" <td>9/3/2016</td>\n",
" <td>Credit for completion (like an assignment)</td>\n",
" <td>600</td>\n",
" <td>37</td>\n",
" <td>PhD granting institution</td>\n",
" <td>...</td>\n",
" <td>Never</td>\n",
" <td>6.0</td>\n",
" <td>12.0</td>\n",
" <td>0.0</td>\n",
" <td>30/07/2016 15:16</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>R_WcbonOUwNhq1aTv</td>\n",
" <td>enXrvYRTHSwaJiR</td>\n",
" <td>3I6hLxNFWEUu5Ex</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>semester</td>\n",
" <td>Fall</td>\n",
" <td>NaN</td>\n",
" <td>Beyond the first year lab</td>\n",
" <td>NaN</td>\n",
" <td>8/22/2016</td>\n",
" <td>Credit for completion (like an assignment)</td>\n",
" <td>23</td>\n",
" <td>1</td>\n",
" <td>PhD granting institution</td>\n",
" <td>...</td>\n",
" <td>Always</td>\n",
" <td>1.0</td>\n",
" <td>3.0</td>\n",
" <td>3.0</td>\n",
" <td>14/08/2016 20:45</td>\n",
" <td>2</td>\n",
" <td>2</td>\n",
" <td>R_6QckmxZUXlkQ5z3</td>\n",
" <td>bK2V8FlTDsDGtJX</td>\n",
" <td>bpVIjAwWGMGtekJ</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>semester</td>\n",
" <td>Fall</td>\n",
" <td>NaN</td>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Calculus-based</td>\n",
" <td>9/14/2016</td>\n",
" <td>Credit for completion (like an assignment)</td>\n",
" <td>25</td>\n",
" <td>2</td>\n",
" <td>4 year college</td>\n",
" <td>...</td>\n",
" <td>Sometimes</td>\n",
" <td>7.0</td>\n",
" <td>3.0</td>\n",
" <td>10.0</td>\n",
" <td>15/08/2016 08:32</td>\n",
" <td>3</td>\n",
" <td>3</td>\n",
" <td>R_31hCg7myirwz5YQ</td>\n",
" <td>bJlVpGxcYoFnxWd</td>\n",
" <td>3vDaZhGQglbcdiR</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>semester</td>\n",
" <td>Fall</td>\n",
" <td>NaN</td>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Calculus-based</td>\n",
" <td>9/9/2016</td>\n",
" <td>No incentive</td>\n",
" <td>40</td>\n",
" <td>3</td>\n",
" <td>4 year college</td>\n",
" <td>...</td>\n",
" <td>Never</td>\n",
" <td>11.0</td>\n",
" <td>11.0</td>\n",
" <td>0.0</td>\n",
" <td>16/08/2016 10:39</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>R_2uUuv0cqKVTrZPd</td>\n",
" <td>8elF9Rxw2rrXOvP</td>\n",
" <td>7Vetb0Z6aGRzirH</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 49 columns</p>\n",
"</div>"
],
"text/plain": [
" Q5 Q52 Q53 Q18 Q27 \\\n",
"0 semester Fall NaN First year (introductory) lab Calculus-based \n",
"1 semester Fall NaN First year (introductory) lab Calculus-based \n",
"2 semester Fall NaN Beyond the first year lab NaN \n",
"3 semester Fall NaN First year (introductory) lab Calculus-based \n",
"4 semester Fall NaN First year (introductory) lab Calculus-based \n",
"\n",
" Q6 Q11 Q19 Q20 \\\n",
"0 9/9/2016 No incentive 55 3 \n",
"1 9/3/2016 Credit for completion (like an assignment) 600 37 \n",
"2 8/22/2016 Credit for completion (like an assignment) 23 1 \n",
"3 9/14/2016 Credit for completion (like an assignment) 25 2 \n",
"4 9/9/2016 No incentive 40 3 \n",
"\n",
" Q15 ... Q38_4 Q41 Q42 Q43 \\\n",
"0 4 year college ... Never 11.0 11.0 0.0 \n",
"1 PhD granting institution ... Never 6.0 12.0 0.0 \n",
"2 PhD granting institution ... Always 1.0 3.0 3.0 \n",
"3 4 year college ... Sometimes 7.0 3.0 10.0 \n",
"4 4 year college ... Never 11.0 11.0 0.0 \n",
"\n",
" StartDate anon_instructor_id anon_university_id ResponseId \\\n",
"0 29/07/2016 09:35 0 0 R_9TQGuSY5m31uCgp \n",
"1 30/07/2016 15:16 1 1 R_WcbonOUwNhq1aTv \n",
"2 14/08/2016 20:45 2 2 R_6QckmxZUXlkQ5z3 \n",
"3 15/08/2016 08:32 3 3 R_31hCg7myirwz5YQ \n",
"4 16/08/2016 10:39 0 0 R_2uUuv0cqKVTrZPd \n",
"\n",
" pre_survey_id post_survey_id \n",
"0 eb70O8rWoi7TeEl bdrBNZhS5ctkrtj \n",
"1 enXrvYRTHSwaJiR 3I6hLxNFWEUu5Ex \n",
"2 bK2V8FlTDsDGtJX bpVIjAwWGMGtekJ \n",
"3 bJlVpGxcYoFnxWd 3vDaZhGQglbcdiR \n",
"4 8elF9Rxw2rrXOvP 7Vetb0Z6aGRzirH \n",
"\n",
"[5 rows x 49 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.head()"
]
},
{
"cell_type": "markdown",
"id": "1875294d-24d1-4efa-9042-7d9aa75fe41b",
"metadata": {},
"source": [
"## Data Cleaning"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "44855b13-cfb3-429e-8ddb-d356fd63f3d8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Q18</th>\n",
" <th>Q36_1</th>\n",
" <th>Q36_2</th>\n",
" <th>Q37_4</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Rarely</td>\n",
" <td>Rarely</td>\n",
" <td>Always</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Rarely</td>\n",
" <td>Rarely</td>\n",
" <td>Never</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Beyond the first year lab</td>\n",
" <td>Often</td>\n",
" <td>Often</td>\n",
" <td>Often</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Sometimes</td>\n",
" <td>Often</td>\n",
" <td>Sometimes</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>Sometimes</td>\n",
" <td>Rarely</td>\n",
" <td>Often</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Q18 Q36_1 Q36_2 Q37_4\n",
"0 First year (introductory) lab Rarely Rarely Always\n",
"1 First year (introductory) lab Rarely Rarely Never\n",
"2 Beyond the first year lab Often Often Often\n",
"3 First year (introductory) lab Sometimes Often Sometimes\n",
"4 First year (introductory) lab Sometimes Rarely Often"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#new dataframe with columns of interest (.copy() makes it independent of df, so later column assignments do not act on a slice)\n",
"df_reg_ana = df[['Q18', 'Q36_1', 'Q36_2', 'Q37_4']].copy()\n",
"df_reg_ana.head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "258dd944-c520-4599-b7a7-a5fff19dc728",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:2: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
" df_reg_ana['Q36_1'] = df_reg_ana['Q36_1'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n",
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:2: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_reg_ana['Q36_1'] = df_reg_ana['Q36_1'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n",
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:3: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
" df_reg_ana['Q36_2'] = df_reg_ana['Q36_2'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n",
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:3: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_reg_ana['Q36_2'] = df_reg_ana['Q36_2'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n",
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:4: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
" df_reg_ana['Q37_4'] = df_reg_ana['Q37_4'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n",
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\2660362453.py:4: SettingWithCopyWarning: \n",
"A value is trying to be set on a copy of a slice from a DataFrame.\n",
"Try using .loc[row_indexer,col_indexer] = value instead\n",
"\n",
"See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
" df_reg_ana['Q37_4'] = df_reg_ana['Q37_4'].replace({'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5})\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Q18</th>\n",
" <th>Q36_1</th>\n",
" <th>Q36_2</th>\n",
" <th>Q37_4</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Beyond the first year lab</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>3.0</td>\n",
" <td>4.0</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Q18 Q36_1 Q36_2 Q37_4\n",
"0 First year (introductory) lab 2.0 2.0 5.0\n",
"1 First year (introductory) lab 2.0 2.0 1.0\n",
"2 Beyond the first year lab 4.0 4.0 4.0\n",
"3 First year (introductory) lab 3.0 4.0 3.0\n",
"4 First year (introductory) lab 3.0 2.0 4.0"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#recode frequency responses to a numerical 1-5 scale\n",
"freq_scale = {'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5}\n",
"for col in ['Q36_1', 'Q36_2', 'Q37_4']:\n",
"    df_reg_ana[col] = df_reg_ana[col].replace(freq_scale)\n",
"df_reg_ana.head()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "f4c7de1f-dea4-435d-b86a-7a99283919eb",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>level</th>\n",
" <th>math_models_freq</th>\n",
" <th>conc_models_freq</th>\n",
" <th>comp_int_freq</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>Beyond the first year lab</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>3.0</td>\n",
" <td>4.0</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>First year (introductory) lab</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" level math_models_freq conc_models_freq \\\n",
"0 First year (introductory) lab 2.0 2.0 \n",
"1 First year (introductory) lab 2.0 2.0 \n",
"2 Beyond the first year lab 4.0 4.0 \n",
"3 First year (introductory) lab 3.0 4.0 \n",
"4 First year (introductory) lab 3.0 2.0 \n",
"\n",
" comp_int_freq \n",
"0 5.0 \n",
"1 1.0 \n",
"2 4.0 \n",
"3 3.0 \n",
"4 4.0 "
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#rename columns\n",
"df_reg_ana = df_reg_ana.rename(columns={'Q18': 'level', 'Q36_1': 'math_models_freq', 'Q36_2': 'conc_models_freq', 'Q37_4': 'comp_int_freq'})\n",
"df_reg_ana.head()"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "3855ee16-deff-4572-aafe-4bcd8cc32cde",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\Users\\stacy\\AppData\\Local\\Temp\\ipykernel_33472\\4007376843.py:2: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`\n",
" df_reg_ana['level'] = df_reg_ana['level'].replace({'First year (introductory) lab': 1, 'Beyond the first year lab': 0})\n"
]
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>level</th>\n",
" <th>math_models_freq</th>\n",
" <th>conc_models_freq</th>\n",
" <th>comp_int_freq</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>5.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1</td>\n",
" <td>2.0</td>\n",
" <td>2.0</td>\n",
" <td>1.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1</td>\n",
" <td>3.0</td>\n",
" <td>4.0</td>\n",
" <td>3.0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1</td>\n",
" <td>3.0</td>\n",
" <td>2.0</td>\n",
" <td>4.0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" level math_models_freq conc_models_freq comp_int_freq\n",
"0 1 2.0 2.0 5.0\n",
"1 1 2.0 2.0 1.0\n",
"2 0 4.0 4.0 4.0\n",
"3 1 3.0 4.0 3.0\n",
"4 1 3.0 2.0 4.0"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#recode level (1 = first year, 0 = beyond first year)\n",
"df_reg_ana['level'] = df_reg_ana['level'].replace({'First year (introductory) lab': 1, 'Beyond the first year lab': 0})\n",
"df_reg_ana.head()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "89122494-60c3-4230-b292-abec915e0f6c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 494 entries, 0 to 493\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 level 494 non-null int64 \n",
" 1 math_models_freq 220 non-null float64\n",
" 2 conc_models_freq 220 non-null float64\n",
" 3 comp_int_freq 220 non-null float64\n",
"dtypes: float64(3), int64(1)\n",
"memory usage: 15.6 KB\n"
]
}
],
"source": [
"df_reg_ana.info()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "475f73b7-f2a7-45b8-9b1d-6389d8291e96",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Index: 220 entries, 0 to 493\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 level 220 non-null int64 \n",
" 1 math_models_freq 220 non-null float64\n",
" 2 conc_models_freq 220 non-null float64\n",
" 3 comp_int_freq 220 non-null float64\n",
"dtypes: float64(3), int64(1)\n",
"memory usage: 8.6 KB\n"
]
}
],
"source": [
"#drop missing values\n",
"df_reg_ana.dropna(inplace=True)\n",
"df_reg_ana.info()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d16c8ce1-a114-46c3-af02-588dca3a7717",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"Index: 93 entries, 2 to 483\n",
"Data columns (total 4 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 level 93 non-null int64 \n",
" 1 math_models_freq 93 non-null float64\n",
" 2 conc_models_freq 93 non-null float64\n",
" 3 comp_int_freq 93 non-null float64\n",
"dtypes: float64(3), int64(1)\n",
"memory usage: 3.6 KB\n"
]
}
],
"source": [
"#drop non-introductory lab courses (keep level == 1, i.e. first-year labs)\n",
"df_reg_ana = df_reg_ana[df_reg_ana['level'] == 1]\n",
"df_reg_ana.info()"
]
},
{
"cell_type": "markdown",
"id": "continental-franklin",
"metadata": {},
"source": [
"**Data Overview**\n",
"\n",
"The Lewandoski Lab at the University of Colorado Boulder developed and validated the Colorado Learning Attitudes about Science Survey for Experimental Physics (E-CLASS) data collection instrument, and made the dataset publicly accessible and available for additional research use (https://github.com/Lewandowski-Labs-PER/eclass-public/tree/master). This is a well-established Physics Education Research (and Experimental Cold Molecular Physics) lab at an R1 university that has published many studies concerning this and other datasets (https://jila.colorado.edu/lewandowski/publications/scientific/year).\n",
"\n",
"The course information survey of the E-CLASS contains 494 responses (rows) from instructors (pre-cleaning), for each class they administered the E-CLASS survey to. Columns which will be important to me are Q18 (level of the course), Q36_1 (Modeling - Develop mathematical models for the system being studied), Q36_2 (Modeling - Develop conceptual models for the system being studied), and Q37_4 (Data analysis and visualization - Use computers to interface with measurement devices)."
]
},
{
"cell_type": "markdown",
"id": "e1561989-c267-413a-9510-c003609f4ca5",
"metadata": {},
"source": [
"## Data Distributions"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "7289c0e1-8b9d-4625-aae1-f6abeec4c91d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"level 0.000000\n",
"math_models_freq -0.025612\n",
"conc_models_freq 0.019958\n",
"comp_int_freq 0.416858\n",
"dtype: float64\n"
]
}
],
"source": [
"fischer_kurtosis = df_reg_ana.kurtosis()\n",
"print(fischer_kurtosis)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "dc1776a6-c39e-41d4-81a1-8e587ce2842d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"level 0.000000\n",
"math_models_freq -0.202256\n",
"conc_models_freq -0.407829\n",
"comp_int_freq -0.648785\n",
"dtype: float64\n"
]
}
],
"source": [
"skewness = df_reg_ana.skew()\n",
"print(skewness)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "c6c88247-f676-4101-a24a-96b94b21d644",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: >"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAh8AAAGdCAYAAACyzRGfAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjguNCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8fJSN1AAAACXBIWXMAAA9hAAAPYQGoP6dpAAAmfUlEQVR4nO3df2zU52HH8c8ZH19wsEmIa/ssPJdQky5xQBGkYPIDCLETJ0MQuqmbqwi2tc0WwsK8igQQyrEEk6KJ0QrVW5aIkU2W0cbIOiUB37TYNKNMNgGFsJZRxSFeYseCgs/Y5Djwsz86n2JszH3N3eP7fvt+SSdy33vuueeTxz5/9L2zL2CMMQIAALAka7wXAAAAfrNQPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYlT3eC7jWwMCAPvvsM+Xm5ioQCIz3cgAAQBKMMert7VVxcbGyskY/t5Fx5eOzzz5TSUnJeC8DAACMQUdHh6ZPnz7qmIwrH7m5uZJ+vfi8vLyUzh2Px9XU1KSqqioFg8GUzp0J/J5P8n9G8nmf3zP6PZ/k/4zpyheNRlVSUpL4OT6ajCsfgy+15OXlpaV85OTkKC8vz7dfUH7OJ/k/I/m8z+8Z/Z5P8n/GdOdL5i0TvOEUAABYdVPlY9u2bQoEAlq3bl3imDFG4XBYxcXFmjx5shYvXqyTJ0/e7DoBAIBPjLl8tLa26tVXX9Xs2bOHHN++fbt27NihXbt2qbW1VUVFRaqsrFRvb+9NLxYAAHjfmMrHxYsX9e1vf1t/93d/p9tuuy1x3BijnTt3atOmTVq5cqXKy8u1Z88e9ff3q6GhIWWLBgAA3jWmN5yuWbNGTzzxhB555BG9/PLLiePt7e3q6upSVVVV4pjjOFq0aJEOHz6sp59+ethcsVhMsVgscT0ajUr69Rti4vH4WJZ3XYPzpXreTOH3fJL/M5LP+/ye0e/5JP9nTFc+N/O5Lh+NjY16//331draOuy2rq4uSVJhYeGQ44WFhTpz5syI823btk1btmwZdrypqUk5OTlul5eUSCSSlnkzhd/zSf7PSD7v83tGv+eT/J8x1fn6+/uTHuuqfHR0dOi5555TU1OTJk2adN1x1/6ajTHmur96s2HDBtXW1iauD/6ecFVVVVp+1TYSiaiystK3vz7l53yS/zOSz/v8ntHv+ST/Z0xXvsFXLpLhqnwcPXpU3d3dmjt3buLY1atXdejQIe3atUunTp2S9OszIKFQKDGmu7t72NmQQY7jyHGcYceDwWDaNj2dc2cCv+eT/J+RfN7n94x+zyf5P2Oq87mZy9UbTpcuXaoTJ07o+PHjicu8efP07W9/W8ePH9cdd9yhoqKiIadyLl++rJaWFi1cuNDNQwEAAJ9ydeYjNzdX5eXlQ47dcsstuv322xPH161bp7q6OpWVlamsrEx1dXXKyclRTU1N6lYNAAA8K+V/Xn39+vW6dOmSnnnmGZ0/f17z589XU1NTUn/rHQAA+N9Nl4/m5uYh1wOBgMLhsMLh8M1ODQAAfIjPdgEAAFZRPgAAgFUpf88HAHu++sJb470EV5wJRtu/IZWHDyp29cYfu51JPn7lifFeAuAbnPkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYJWr8lFfX6/Zs2crLy9PeXl5qqio0DvvvJO4ffXq1QoEAkMuCxYsSPmiAQCAd2W7GTx9+nS98sor+trXviZJ2rNnj5YvX65jx47p7rvvliQ99thj2r17d+I+EydOTOFyAQ
CA17kqH8uWLRtyfevWraqvr9eRI0cS5cNxHBUVFaVuhQAAwFdclY8vu3r1qv7pn/5JfX19qqioSBxvbm5WQUGBbr31Vi1atEhbt25VQUHBdeeJxWKKxWKJ69FoVJIUj8cVj8fHurwRDc6X6nkzhd/zSf7P6DafM8Gkczkp52SZIf96SbJ7wteo9/k9Y7ryuZkvYIxx9Sxw4sQJVVRU6IsvvtCUKVPU0NCgxx9/XJK0d+9eTZkyRaWlpWpvb9fmzZt15coVHT16VI7jjDhfOBzWli1bhh1vaGhQTk6Om6UBAIBx0t/fr5qaGvX09CgvL2/Usa7Lx+XLl/XJJ5/owoUL2rdvn1577TW1tLTorrvuGja2s7NTpaWlamxs1MqVK0ecb6QzHyUlJTp79uwNF+9WPB5XJBJRZWWlgsFgSufOBH7PJ/k/o9t85eGDFlaVOk6W0UvzBrS5LUuxgcB4L8eVD8OPJjWOr1Hv83vGdOWLRqPKz89Pqny4ftll4sSJiTeczps3T62trfrhD3+ov/3bvx02NhQKqbS0VKdPn77ufI7jjHhWJBgMpm3T0zl3JvB7Psn/GZPNF7vqrR/gg2IDAc+t3e3XG1+j3uf3jKnO52aum/47H8aYIWcuvuzcuXPq6OhQKBS62YcBAAA+4erMx8aNG1VdXa2SkhL19vaqsbFRzc3NOnDggC5evKhwOKxvfvObCoVC+vjjj7Vx40bl5+frySefTNf6AQCAx7gqH59//rmeeuopdXZ2aurUqZo9e7YOHDigyspKXbp0SSdOnNAbb7yhCxcuKBQKacmSJdq7d69yc3PTtX4AAOAxrsrH66+/ft3bJk+erIMHvfXmNwAAYB+f7QIAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACsonwAAACrKB8AAMAqygcAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACsonwAAACrKB8AAMAqygcAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACsonwAAACrXJWP+vp6zZ49W3l5ecrLy1NFRYXeeeedxO3GGIXDYRUXF2vy5MlavHixTp48mfJFAwAA73JVPqZPn65XXnlFbW1tamtr08MPP6zly5cnCsb27du1Y8cO7dq1S62trSoqKlJlZaV6e3vTsngAAOA9rsrHsmXL9Pjjj2vWrFmaNWuWtm7dqilTpujIkSMyxmjnzp3atGmTVq5cqfLycu3Zs0f9/f1qaGhI1/oBAIDHjPk9H1evXlVjY6P6+vpUUVGh9vZ2dXV1qaqqKjHGcRwtWrRIhw8fTsliAQCA92W7vcOJEydUUVGhL774QlOmTNH+/ft11113JQpGYWHhkPGFhYU6c+bMdeeLxWKKxWKJ69FoVJIUj8cVj8fdLm9Ug/Olet5M4fd8kv8zus3nTDDpXE7KOVlmyL9ekuye8DXqfX7PmK58buYLGGNcPQtcvnxZn3zyiS5cuKB9+/bptddeU0tLiy5cuKD7779fn332mUKhUGL8d7/7XXV0dOjAgQMjzhcOh7Vly5ZhxxsaGpSTk+NmaQAAYJz09/erpqZGPT09ysvLG3Ws6/JxrUceeUQzZ87U888/r5kzZ+r999/Xvffem7h9+fLluvXWW7Vnz54R7z/SmY+SkhKdPXv2hot3Kx6PKxKJqLKyUsFgMKVzZwK/55P8n9FtvvLwQQurSh0ny+ileQPa3Jal2EBgvJeTFn7PmO58H4YfTfmcbvE8MzbRaFT5+flJlQ/XL7tcyxijWCymGTNmqKioSJFIJFE+Ll++rJaWFv3gBz+47v0dx5HjOMOOB4PBtG16OufOBH7PJ/k/Y7L5Yle9+cMtNhDw7NqT5feM6cqXSd/XPM+4ny9ZrsrHxo0bVV1drZKSEvX29q
qxsVHNzc06cOCAAoGA1q1bp7q6OpWVlamsrEx1dXXKyclRTU2N6xAAAMCfXJWPzz//XE899ZQ6Ozs1depUzZ49WwcOHFBlZaUkaf369bp06ZKeeeYZnT9/XvPnz1dTU5Nyc3PTsngAAOA9rsrH66+/PurtgUBA4XBY4XD4ZtYEAAB8jM92AQAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWuyse2bdt03333KTc3VwUFBVqxYoVOnTo1ZMzq1asVCASGXBYsWJDSRQMAAO9yVT5aWlq0Zs0aHTlyRJFIRFeuXFFVVZX6+vqGjHvsscfU2dmZuLz99tspXTQAAPCubDeDDxw4MOT67t27VVBQoKNHj+qhhx5KHHccR0VFRalZIQAA8JWbes9HT0+PJGnatGlDjjc3N6ugoECzZs3Sd7/7XXV3d9/MwwAAAB9xdebjy4wxqq2t1QMPPKDy8vLE8erqav3e7/2eSktL1d7ers2bN+vhhx/W0aNH5TjOsHlisZhisVjiejQalSTF43HF4/GxLm9Eg/Olet5M4fd8kv8zus3nTDDpXE7KOVlmyL9+5PeM6c6XCd/bPM/c3LzJCBhjxvQVtGbNGr311lt67733NH369OuO6+zsVGlpqRobG7Vy5cpht4fDYW3ZsmXY8YaGBuXk5IxlaQAAwLL+/n7V1NSop6dHeXl5o44dU/lYu3at3nzzTR06dEgzZsy44fiysjJ95zvf0fPPPz/stpHOfJSUlOjs2bM3XLxb8XhckUhElZWVCgaDKZ07E/g9n+T/jG7zlYcPWlhV6jhZRi/NG9DmtizFBgLjvZy08HvGdOf7MPxoyud0i+eZsYlGo8rPz0+qfLh62cUYo7Vr12r//v1qbm5OqnicO3dOHR0dCoVCI97uOM6IL8cEg8G0bXo6584Efs8n+T9jsvliV735wy02EPDs2pPl94zpypdJ39c8z7ifL1mu3nC6Zs0a/eM//qMaGhqUm5urrq4udXV16dKlS5Kkixcv6vvf/75+9rOf6eOPP1Zzc7OWLVum/Px8Pfnkk+5SAAAAX3J15qO+vl6StHjx4iHHd+/erdWrV2vChAk6ceKE3njjDV24cEGhUEhLlizR3r17lZubm7JFAwAA73L9sstoJk+erIMHvfUaNAAAsIvPdgEAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGCVq/Kxbds23XfffcrNzVVBQYFWrFihU6dODRljjFE4HFZxcbEmT56sxYsX6+TJkyldNAAA8C5X5aOlpUVr1qzRkSNHFIlEdOXKFVVVVamvry8xZvv27dqxY4d27dql1tZWFRUVqbKyUr29vSlfPAAA8J5sN4MPHDgw5Pru3btVUFCgo0eP6qGHHpIxRjt37tSmTZu0cuVKSdKePXtUWFiohoYGPf3006lbOQAA8CRX5eNaPT09kqRp06ZJktrb29XV1aWqqqrEGMdxtGjRIh0+fHjE8hGLxRSLxRLXo9GoJCkejysej9/M8oYZnC/V82YKv+eT/J/RbT5ngknnclLOyTJD/v
Ujv2dMd75M+N7meebm5k1GwBgzpq8gY4yWL1+u8+fP66c//akk6fDhw7r//vv16aefqri4ODH2e9/7ns6cOaODBw8OmyccDmvLli3Djjc0NCgnJ2csSwMAAJb19/erpqZGPT09ysvLG3XsmM98PPvss/rggw/03nvvDbstEAgMuW6MGXZs0IYNG1RbW5u4Ho1GVVJSoqqqqhsu3q14PK5IJKLKykoFg8GUzp0J/J5P8n9Gt/nKw8MLfSZzsoxemjegzW1Zig2M/JzgdX7PmO58H4YfTfmcbvE8MzaDr1wkY0zlY+3atfrJT36iQ4cOafr06YnjRUVFkqSuri6FQqHE8e7ubhUWFo44l+M4chxn2PFgMJi2TU/n3JnA7/kk/2dMNl/sqjd/uMUGAp5de7L8njFd+TLp+5rnGffzJcvVb7sYY/Tss8/qX/7lX/Qf//EfmjFjxpDbZ8yYoaKiIkUikcSxy5cvq6WlRQsXLnTzUAAAwKdcnflYs2aNGhoa9K//+q/Kzc1VV1eXJGnq1KmaPHmyAoGA1q1bp7q6OpWVlamsrEx1dXXKyclRTU1NWgIAAABvcVU+6uvrJUmLFy8ecnz37t1avXq1JGn9+vW6dOmSnnnmGZ0/f17z589XU1OTcnNzU7JgAADgba7KRzK/GBMIBBQOhxUOh8e6JgAA4GN8tgsAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACsonwAAACrKB8AAMAqygcAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACsonwAAACrKB8AAMAqygcAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKsoHAACwivIBAACscl0+Dh06pGXLlqm4uFiBQEBvvvnmkNtXr16tQCAw5LJgwYJUrRcAAHic6/LR19enOXPmaNeuXdcd89hjj6mzszNxefvtt29qkQAAwD+y3d6hurpa1dXVo45xHEdFRUVjXhQAAPAv1+UjGc3NzSooKNCtt96qRYsWaevWrSooKBhxbCwWUywWS1yPRqOSpHg8rng8ntJ1Dc6X6nkzhd/zSf7P6DafM8Gkczkp52SZIf/6kd8zpjtfJnxv8zxzc/MmI2CMGfNXUCAQ0P79+7VixYrEsb1792rKlCkqLS1Ve3u7Nm/erCtXrujo0aNyHGfYHOFwWFu2bBl2vKGhQTk5OWNdGgAAsKi/v181NTXq6elRXl7eqGNTXj6u1dnZqdLSUjU2NmrlypXDbh/pzEdJSYnOnj17w8W7FY/HFYlEVFlZqWAwmNK5M4Hf80n+z+g2X3n4oIVVpY6TZfTSvAFtbstSbCAw3stJC79nTHe+D8OPpnxOt3ieGZtoNKr8/PykykdaXnb5slAopNLSUp0+fXrE2x3HGfGMSDAYTNump3PuTOD3fJL/MyabL3bVmz/cYgMBz649WX7PmK58mfR9zfOM+/mSlfa/83Hu3Dl1dHQoFAql+6EAAIAHuD7zcfHiRf3yl79MXG9vb9fx48c1bdo0TZs2TeFwWN/85jcVCoX08ccfa+PGjcrPz9eTTz6Z0oUDAABvcl0+2tratGTJksT12tpaSdKqVatUX1+vEydO6I033tCFCxcUCoW0ZMkS7d27V7m5ualbNQAA8CzX5WPx4sUa7T2qBw966w1wAADALj7bBQAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGBV9ngvAP701RfeSsu8zgSj7d+QysMHFbsaSMtjjCe/5wMAiTMfAADAMsoHAACwivIBAACsonwAAACrKB8AAMAqyg
cAALCK8gEAAKyifAAAAKsoHwAAwCrKBwAAsIryAQAArKJ8AAAAqygfAADAKtfl49ChQ1q2bJmKi4sVCAT05ptvDrndGKNwOKzi4mJNnjxZixcv1smTJ1O1XgAA4HGuy0dfX5/mzJmjXbt2jXj79u3btWPHDu3atUutra0qKipSZWWlent7b3qxAADA+7Ld3qG6ulrV1dUj3maM0c6dO7Vp0yatXLlSkrRnzx4VFhaqoaFBTz/99M2tFgAAeJ7r8jGa9vZ2dXV1qaqqKnHMcRwtWrRIhw8fHrF8xGIxxWKxxPVoNCpJisfjisfjqVxeYr5Uz5spMimfM8GkZ94sM+RfvyGf9/k9Y7rzZcLzVyY9l6ZDuvK5mS9gjBnzV1AgEND+/fu1YsUKSdLhw4d1//3369NPP1VxcXFi3Pe+9z2dOXNGBw8eHDZHOBzWli1bhh1vaGhQTk7OWJcGAAAs6u/vV01NjXp6epSXlzfq2JSe+RgUCASGXDfGDDs2aMOGDaqtrU1cj0ajKikpUVVV1Q0X71Y8HlckElFlZaWCwWBK584EmZSvPDy8aKaCk2X00rwBbW7LUmxg5K8pLyOf9/k9Y7rzfRh+NOVzupVJz6XpkK58g69cJCOl5aOoqEiS1NXVpVAolDje3d2twsLCEe/jOI4cxxl2PBgMpm3T0zl3JsiEfLGr6X3SjQ0E0v4Y44l83uf3jOnKN97PXV+WCc+l6ZTqfG7mSunf+ZgxY4aKiooUiUQSxy5fvqyWlhYtXLgwlQ8FAAA8yvWZj4sXL+qXv/xl4np7e7uOHz+uadOm6bd+67e0bt061dXVqaysTGVlZaqrq1NOTo5qampSunAAAOBNrstHW1ublixZkrg++H6NVatW6e///u+1fv16Xbp0Sc8884zOnz+v+fPnq6mpSbm5ualbNQAA8CzX5WPx4sUa7RdkAoGAwuGwwuHwzawLAAD4FJ/tAgAArKJ8AAAAq9Lydz4AABiLr77w1ngvQc4Eo+3f+PXfK/Ljr0sP5htPnPkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYFXKy0c4HFYgEBhyKSoqSvXDAAAAj8pOx6R33323/v3f/z1xfcKECel4GAAA4EFpKR/Z2dmc7QAAACNKS/k4ffq0iouL5TiO5s+fr7q6Ot1xxx0jjo3FYorFYonr0WhUkhSPxxWPx1O6rsH5Uj1vpsikfM4Ek555s8yQf/2GfN7n94x+zyf5P+NgrnT9jE1GwBiT0v+777zzjvr7+zVr1ix9/vnnevnll/WLX/xCJ0+e1O233z5sfDgc1pYtW4Ydb2hoUE5OTiqXBgAA0qS/v181NTXq6elRXl7eqGNTXj6u1dfXp5kzZ2r9+vWqra0ddvtIZz5KSkp09uzZGy7erXg8rkgkosrKSgWDwZTOnQkyKV95+GBa5nWyjF6aN6DNbVmKDQTS8hjjiXze5/eMfs8n+T/jYL5U/6yIRqPKz89Pqnyk5WWXL7vlllt0zz336PTp0yPe7jiOHMcZdjwYDKbtB2g6584EmZAvdjW937CxgUDaH2M8kc/7/J7R7/kk/2dM9c8KN3Ol/e98xGIx/fznP1coFEr3QwEAAA9Iefn4/ve/r5aWFrW3t+u//uu/9Lu/+7uKRqNatWpVqh8KAAB4UMpfdvnf//1f/cEf/IHOnj2rr3zlK1qwYIGOHDmi0tLSVD8UAADwoJSXj8bGxlRPCQAAfITPdgEAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAA
BWUT4AAIBVlA8AAGBVyj/bxQvKwwcVuxoY72WknDPBaPs3/JsPAOAPnPkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBVlA8AAGAV5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVWkrHz/+8Y81Y8YMTZo0SXPnztVPf/rTdD0UAADwkLSUj71792rdunXatGmTjh07pgcffFDV1dX65JNP0vFwAADAQ9JSPnbs2KE//uM/1ne+8x399m//tnbu3KmSkhLV19en4+EAAICHZKd6wsuXL+vo0aN64YUXhhyvqqrS4cOHh42PxWKKxWKJ6z09PZKkX/3qV4rH4yldWzweV39/v7LjWbo6EEjp3Jkge8Cov3/At/kk/2ckn/f5PaPf80n+zziY79y5cwoGgymbt7e3V5JkjLnxYJNin376qZFk/vM//3PI8a1bt5pZs2YNG//iiy8aSVy4cOHChQsXH1w6Ojpu2BVSfuZjUCAwtC0aY4Ydk6QNGzaotrY2cX1gYEC/+tWvdPvtt484/mZEo1GVlJSoo6NDeXl5KZ07E/g9n+T/jOTzPr9n9Hs+yf8Z05XPGKPe3l4VFxffcGzKy0d+fr4mTJigrq6uIce7u7tVWFg4bLzjOHIcZ8ixW2+9NdXLGiIvL8+XX1CD/J5P8n9G8nmf3zP6PZ/k/4zpyDd16tSkxqX8DacTJ07U3LlzFYlEhhyPRCJauHBhqh8OAAB4TFpedqmtrdVTTz2lefPmqaKiQq+++qo++eQT/cmf/Ek6Hg4AAHhIWsrHt771LZ07d05/+Zd/qc7OTpWXl+vtt99WaWlpOh4uaY7j6MUXXxz2Mo9f+D2f5P+M5PM+v2f0ez7J/xkzIV/AmGR+JwYAACA1+GwXAABgFeUDAABYRfkAAABWUT4AAIBVviofhw4d0rJly1RcXKxAIKA333zzhvdpaWnR3LlzNWnSJN1xxx36m7/5m/QvdIzc5mtublYgEBh2+cUvfmFnwS5t27ZN9913n3Jzc1VQUKAVK1bo1KlTN7yfV/ZwLPm8tIf19fWaPXt24g8XVVRU6J133hn1Pl7Zu0FuM3pp/0aybds2BQIBrVu3btRxXtvHQcnk89oehsPhYWstKioa9T7jsX++Kh99fX2aM2eOdu3aldT49vZ2Pf7443rwwQd17Ngxbdy4UX/2Z3+mffv2pXmlY+M236BTp06ps7MzcSkrK0vTCm9OS0uL1qxZoyNHjigSiejKlSuqqqpSX1/fde/jpT0cS75BXtjD6dOn65VXXlFbW5va2tr08MMPa/ny5Tp58uSI4720d4PcZhzkhf27Vmtrq1599VXNnj171HFe3Ecp+XyDvLSHd99995C1njhx4rpjx23/UvJpchlIktm/f/+oY9avX2++/vWvDzn29NNPmwULFqRxZamRTL53333XSDLnz5+3sqZU6+7uNpJMS0vLdcd4eQ+Tyef1PbztttvMa6+9NuJtXt67Lxsto1f3r7e315SVlZlIJGIWLVpknnvuueuO9eI+usnntT188cUXzZw5c5IeP17756szH2797Gc/U1VV1ZBjjz76qNra2hSPx8dpVal37733KhQKaenSpXr33XfHezlJ6+npkSRNmzbtumO8vIfJ5BvktT28evWqGhsb1dfXp4qKihHHeHnvpOQyDvLa/q1Zs0ZPPPGEHnnkkRuO9eI+usk3yEt7ePr0aRUXF2vGjBn6/d//fX300UfXHTte+5e2T7X1gq6urmEfdldYWKgrV67o7NmzCoVC47Sy1AiFQnr11Vc1d+5cxWIx/cM//IOWLl2q5uZmPfTQQ+O9vFEZY1RbW6sHHnhA5eXl1x3n1T1MNp/X9vDEiROqqKjQF198oSlTpmj//v266667Rhzr1b1zk9Fr+ydJjY2Nev/999
Xa2prUeK/to9t8XtvD+fPn64033tCsWbP0+eef6+WXX9bChQt18uRJ3X777cPGj9f+/UaXD0kKBAJDrpv//4Ov1x73ojvvvFN33nln4npFRYU6Ojr0V3/1Vxn5TfNlzz77rD744AO99957NxzrxT1MNp/X9vDOO+/U8ePHdeHCBe3bt0+rVq1SS0vLdX84e3Hv3GT02v51dHToueeeU1NTkyZNmpT0/byyj2PJ57U9rK6uTvz3Pffco4qKCs2cOVN79uxRbW3tiPcZj/37jX7ZpaioSF1dXUOOdXd3Kzs7e8SG6AcLFizQ6dOnx3sZo1q7dq1+8pOf6N1339X06dNHHevFPXSTbySZvIcTJ07U1772Nc2bN0/btm3TnDlz9MMf/nDEsV7cO8ldxpFk8v4dPXpU3d3dmjt3rrKzs5Wdna2Wlhb96Ec/UnZ2tq5evTrsPl7ax7HkG0km7+G1brnlFt1zzz3XXe947d9v9JmPiooK/du//duQY01NTZo3b56CweA4rSq9jh07lnGnQQcZY7R27Vrt379fzc3NmjFjxg3v46U9HEu+kWTyHl7LGKNYLDbibV7au9GMlnEkmbx/S5cuHfabEX/4h3+or3/963r++ec1YcKEYffx0j6OJd9IMnkPrxWLxfTzn/9cDz744Ii3j9v+pfXtrJb19vaaY8eOmWPHjhlJZseOHebYsWPmzJkzxhhjXnjhBfPUU08lxn/00UcmJyfH/Pmf/7n57//+b/P666+bYDBo/vmf/3m8IozKbb6//uu/Nvv37zf/8z//Yz788EPzwgsvGElm37594xVhVH/6p39qpk6dapqbm01nZ2fi0t/fnxjj5T0cSz4v7eGGDRvMoUOHTHt7u/nggw/Mxo0bTVZWlmlqajLGeHvvBrnN6KX9u55rfxvED/v4ZTfK57U9/Iu/+AvT3NxsPvroI3PkyBHzO7/zOyY3N9d8/PHHxpjM2T9flY/BX4m69rJq1SpjjDGrVq0yixYtGnKf5uZmc++995qJEyear371q6a+vt7+wpPkNt8PfvADM3PmTDNp0iRz2223mQceeMC89dZb47P4JIyUTZLZvXt3YoyX93As+by0h3/0R39kSktLzcSJE81XvvIVs3Tp0sQPZWO8vXeD3Gb00v5dz7U/nP2wj192o3xe28NvfetbJhQKmWAwaIqLi83KlSvNyZMnE7dnyv4FjPn/d5YAAABY8Bv9hlMAAGAf5QMAAFhF+QAAAFZRPgAAgFWUDwAAYBXlAwAAWEX5AAAAVlE+AACAVZQPAABgFeUDAABYRfkAAABWUT4AAIBV/wfJ9iVJ5LkfaAAAAABJRU5ErkJggg==",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df_reg_ana['math_models_freq'].hist(bins=5)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "6bf316b7-2d94-445e-9b25-1317cf64fd31",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: >"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df_reg_ana['conc_models_freq'].hist(bins=5)"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "1889f08e-e0cc-4702-9f50-c0ae9dcdf0be",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Axes: >"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"df_reg_ana['comp_int_freq'].hist(bins=5)"
]
},
{
"cell_type": "markdown",
"id": "infinite-instrument",
"metadata": {},
"source": [
"# Methods and Results"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "basic-canadian",
"metadata": {},
"outputs": [],
"source": [
"#Import any helper files you need here\n",
"import scipy as sp\n",
"import statsmodels.miscmodels.ordinal_model as om\n",
"import seaborn as sns"
]
},
{
"cell_type": "markdown",
"id": "recognized-positive",
"metadata": {},
"source": [
"## First Research Question: Is there a significant relationship between the instructor's rating of mathematical focus frequency in introductory physics labs and the reported frequency of use of computer to interface with measurement devices?"
]
},
{
"cell_type": "markdown",
"id": "graduate-palmer",
"metadata": {},
"source": [
"### Methods"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "endless-variation",
"metadata": {},
"source": [
"I will perform an ordinal regression to model this data (using the Statsmodels library), since the numerical values are derived from a Likert scale, and each point on the scale is ordered but the differences between each of the points are not necessarily equal. The distribution of math_models_freq was approximately normal for (based on -1.0 < kurtosis < 1.0 and -0.5 < skewness < 0.5). The distribution of comp_int_freq is slightly left-skewed, so logit transformation will be used. I will present the answer in a scatterplot (using the Seaborn library) and with ordinal regression statistics.\n",
"\n",
"I will use math_models_freq and comp_int_freq, which are the instructor-reported frequencies of how often students develop mathematical models for the system being studied and use computers to interface with measurement devices.\r\n",
")\n",
"\n",
"I cleaned the data by first creating a new dataframe, df_reg_ana, containing just the variables of interest, renaming the columns with more descriptive names, recoding the Likert-scale for the reported frequencies (Never, Rarely, Sometimes, Often, Always) to a 5-point numerical scale (1, 2, 3, 4, 5). Then I dropped missing values and kept only the rows for introductory lab coursive"
]
},
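{
"cell_type": "markdown",
"id": "sketch-likert-recoding-note",
"metadata": {},
"source": [
"The recoding and distribution check described above can be sketched as follows. This is a minimal illustration on a small invented table, not the code that produced df_reg_ana: the names `raw`, `toy`, and `likert_map` and all of the responses are assumptions made for the example."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-likert-recoding-code",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from scipy import stats\n",
"\n",
"# Hypothetical raw responses (invented for illustration; not the survey data)\n",
"raw = pd.DataFrame({\n",
"    'math_models_freq': ['Never', 'Often', 'Sometimes', None, 'Always', 'Rarely', 'Often'],\n",
"    'comp_int_freq': ['Often', 'Always', 'Sometimes', 'Often', None, 'Rarely', 'Always'],\n",
"})\n",
"\n",
"# Ordered mapping from the five Likert labels to the 1-5 scale\n",
"likert_map = {'Never': 1, 'Rarely': 2, 'Sometimes': 3, 'Often': 4, 'Always': 5}\n",
"\n",
"# Recode every column, then drop rows that have a missing response in either column\n",
"toy = raw.apply(lambda col: col.map(likert_map)).dropna()\n",
"\n",
"# Sample skewness and excess kurtosis, the two statistics used in the rule of thumb\n",
"skewness = stats.skew(toy['math_models_freq'])\n",
"excess_kurtosis = stats.kurtosis(toy['math_models_freq'])\n",
"print(toy.shape, skewness, excess_kurtosis)"
]
},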
{
"cell_type": "markdown",
"id": "portuguese-japan",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "3ccedb81-36bf-4c80-a0f5-5a883662ac8f",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x = df_reg_ana['math_models_freq']\n",
"y = df_reg_ana['comp_int_freq']\n",
"sns.scatterplot(x=x, y=y, alpha=0.1)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "5f430c68-7a38-4c75-ba09-4ce2e21c27c7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimization terminated successfully.\n",
" Current function value: 1.278472\n",
" Iterations: 199\n",
" Function evaluations: 323\n"
]
},
{
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OrderedModel Results</caption>\n",
"<tr>\n",
" <th>Dep. Variable:</th> <td>comp_int_freq</td> <th> Log-Likelihood: </th> <td> -118.90</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Model:</th> <td>OrderedModel</td> <th> AIC: </th> <td> 247.8</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Method:</th> <td>Maximum Likelihood</td> <th> BIC: </th> <td> 260.5</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Date:</th> <td>Sat, 07 Dec 2024</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Time:</th> <td>16:34:52</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>No. Observations:</th> <td> 93</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Residuals:</th> <td> 88</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Model:</th> <td> 1</td> <th> </th> <td> </td> \n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <td></td> <th>coef</th> <th>std err</th> <th>z</th> <th>P>|z|</th> <th>[0.025</th> <th>0.975]</th> \n",
"</tr>\n",
"<tr>\n",
" <th>math_models_freq</th> <td> 0.1932</td> <td> 0.120</td> <td> 1.608</td> <td> 0.108</td> <td> -0.042</td> <td> 0.429</td>\n",
"</tr>\n",
"<tr>\n",
" <th>1.0/2.0</th> <td> -1.4304</td> <td> 0.472</td> <td> -3.034</td> <td> 0.002</td> <td> -2.355</td> <td> -0.506</td>\n",
"</tr>\n",
"<tr>\n",
" <th>2.0/3.0</th> <td> -0.3945</td> <td> 0.393</td> <td> -1.005</td> <td> 0.315</td> <td> -1.164</td> <td> 0.375</td>\n",
"</tr>\n",
"<tr>\n",
" <th>3.0/4.0</th> <td> -0.0038</td> <td> 0.183</td> <td> -0.021</td> <td> 0.984</td> <td> -0.363</td> <td> 0.356</td>\n",
"</tr>\n",
"<tr>\n",
" <th>4.0/5.0</th> <td> 0.2176</td> <td> 0.129</td> <td> 1.685</td> <td> 0.092</td> <td> -0.036</td> <td> 0.471</td>\n",
"</tr>\n",
"</table>"
],
"text/latex": [
"\\begin{center}\n",
"\\begin{tabular}{lclc}\n",
"\\toprule\n",
"\\textbf{Dep. Variable:} & comp\\_int\\_freq & \\textbf{ Log-Likelihood: } & -118.90 \\\\\n",
"\\textbf{Model:} & OrderedModel & \\textbf{ AIC: } & 247.8 \\\\\n",
"\\textbf{Method:} & Maximum Likelihood & \\textbf{ BIC: } & 260.5 \\\\\n",
"\\textbf{Date:} & Sat, 07 Dec 2024 & \\textbf{ } & \\\\\n",
"\\textbf{Time:} & 16:34:52 & \\textbf{ } & \\\\\n",
"\\textbf{No. Observations:} & 93 & \\textbf{ } & \\\\\n",
"\\textbf{Df Residuals:} & 88 & \\textbf{ } & \\\\\n",
"\\textbf{Df Model:} & 1 & \\textbf{ } & \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n",
"\\begin{tabular}{lcccccc}\n",
" & \\textbf{coef} & \\textbf{std err} & \\textbf{z} & \\textbf{P$> |$z$|$} & \\textbf{[0.025} & \\textbf{0.975]} \\\\\n",
"\\midrule\n",
"\\textbf{math\\_models\\_freq} & 0.1932 & 0.120 & 1.608 & 0.108 & -0.042 & 0.429 \\\\\n",
"\\textbf{1.0/2.0} & -1.4304 & 0.472 & -3.034 & 0.002 & -2.355 & -0.506 \\\\\n",
"\\textbf{2.0/3.0} & -0.3945 & 0.393 & -1.005 & 0.315 & -1.164 & 0.375 \\\\\n",
"\\textbf{3.0/4.0} & -0.0038 & 0.183 & -0.021 & 0.984 & -0.363 & 0.356 \\\\\n",
"\\textbf{4.0/5.0} & 0.2176 & 0.129 & 1.685 & 0.092 & -0.036 & 0.471 \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n",
"%\\caption{OrderedModel Results}\n",
"\\end{center}"
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",
"\"\"\"\n",
" OrderedModel Results \n",
"==============================================================================\n",
"Dep. Variable: comp_int_freq Log-Likelihood: -118.90\n",
"Model: OrderedModel AIC: 247.8\n",
"Method: Maximum Likelihood BIC: 260.5\n",
"Date: Sat, 07 Dec 2024 \n",
"Time: 16:34:52 \n",
"No. Observations: 93 \n",
"Df Residuals: 88 \n",
"Df Model: 1 \n",
"====================================================================================\n",
" coef std err z P>|z| [0.025 0.975]\n",
"------------------------------------------------------------------------------------\n",
"math_models_freq 0.1932 0.120 1.608 0.108 -0.042 0.429\n",
"1.0/2.0 -1.4304 0.472 -3.034 0.002 -2.355 -0.506\n",
"2.0/3.0 -0.3945 0.393 -1.005 0.315 -1.164 0.375\n",
"3.0/4.0 -0.0038 0.183 -0.021 0.984 -0.363 0.356\n",
"4.0/5.0 0.2176 0.129 1.685 0.092 -0.036 0.471\n",
"====================================================================================\n",
"\"\"\""
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#ordinal regression analysis for math_models_freq and comp_int_freq\n",
"X = df_reg_ana[\"math_models_freq\"]\n",
"y = df_reg_ana[\"comp_int_freq\"]\n",
"model = om.OrderedModel(y, X, dist=\"logit\")\n",
"result = model.fit()\n",
"result.summary()"
]
},
{
"cell_type": "markdown",
"id": "3e554d38-cf35-417b-aee3-5228e3bb048f",
"metadata": {},
"source": [
"The coefficient indicates a 0.1932 unit increase in comp_int_freq for a one unit increase in math_models_freq. The P>|z| values indicate the probability of observing a coefficient as extreme as the calculated coefficient, with the threshhold for significance at alpha. Based on alpha of 0.05, the observed relationship is not statistically significant (P>|z| = 0.108)."
]
},
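{
"cell_type": "markdown",
"id": "sketch-odds-ratio-note",
"metadata": {},
"source": [
"When an ordinal model is fitted with a logit link, each slope is a log-odds, so exponentiating it gives an odds ratio that is often easier to read. The sketch below illustrates this on simulated data only: the names `predictor` and `outcome`, the sample size, the cut points, and the true effect of 0.8 are all assumptions made for the example, so its numbers say nothing about the survey results."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "sketch-odds-ratio-code",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from statsmodels.miscmodels.ordinal_model import OrderedModel\n",
"\n",
"# Simulated 1-5 predictor and a latent outcome with logistic noise (illustrative only)\n",
"rng = np.random.default_rng(0)\n",
"n = 300\n",
"x = rng.integers(1, 6, size=n).astype(float)\n",
"latent = 0.8 * x + rng.logistic(size=n)\n",
"\n",
"# Cut the latent variable into five ordered categories coded 1-5\n",
"codes = pd.cut(latent, bins=[-np.inf, 1.0, 2.0, 3.0, 4.0, np.inf], labels=False) + 1\n",
"outcome = pd.Series(codes, name='outcome')\n",
"X_sim = pd.DataFrame({'predictor': x})\n",
"\n",
"# Proportional-odds (ordered logit) fit; the link is chosen with `distr`\n",
"res = OrderedModel(outcome, X_sim, distr='logit').fit(method='bfgs', disp=False)\n",
"\n",
"# Exponentiate the slope and its 95% confidence limits to get an odds ratio\n",
"odds_ratio = float(np.exp(res.params['predictor']))\n",
"ci_low, ci_high = np.exp(res.conf_int().loc['predictor'])\n",
"print(odds_ratio, ci_low, ci_high)"
]
},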
{
"cell_type": "markdown",
"id": "collectible-puppy",
"metadata": {},
"source": [
"## Second Research Question: Is there a significant relationship between the instructor's rating of conceptual focus frequency in introductory physics labs and the reported frequency of use of computer to interface with measurement devices?"
]
},
{
"cell_type": "markdown",
"id": "demographic-future",
"metadata": {},
"source": [
"### Methods"
]
},
{
"cell_type": "markdown",
"id": "incorporate-roller",
"metadata": {},
"source": [
"I will perform an ordinal regression to model this data (using the Statsmodels library), since the numerical values are derived from a Likert scale, and each point on the scale is ordered but the differences between each of the points are not necessarily equal. The distribution of conc_models_freq was approximately normal for (based on -1.0 < kurtosis < 1.0 and -0.5 < skewness < 0.5). The distribution of comp_int_freq is slightly left-skewed, so logit transformation will be used. I will present the answer in a scatterplot (using the Seaborn library) and with ordinal regression statistics.\n",
"\n",
"I will use conc_models_freq and comp_int_freq, which are the instructor-reported frequencies of how often students develop conceptual models for the system being studied and use computers to interface with measurement devices.\n",
"\n",
"I cleaned the data by first creating a new dataframe, df_reg_ana, containing just the variables of interest, renaming the columns with more descriptive names, recoding the Likert-scale for the reported frequencies (Never, Rarely, Sometimes, Often, Always) to a 5-point numerical scale (1, 2, 3, 4, 5). Then I dropped missing values and kept only the rows for introductory lab courses."
]
},
{
"cell_type": "markdown",
"id": "juvenile-creation",
"metadata": {},
"source": [
"### Results "
]
},
{
"cell_type": "code",
"execution_count": 37,
"id": "e4496366-5aec-4ab1-8425-a19099a808df",
"metadata": {},
"outputs": [
{
"data": {
"image/png": "",
"text/plain": [
"<Figure size 640x480 with 1 Axes>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"x = df_reg_ana['conc_models_freq']\n",
"y = df_reg_ana['comp_int_freq']\n",
"sns.scatterplot(x=x, y=y, alpha=0.1)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "fc1b7a33-ae18-47e6-8cd2-b67b6169e6d1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimization terminated successfully.\n",
" Current function value: 1.292302\n",
" Iterations: 147\n",
" Function evaluations: 243\n"
]
},
{
"data": {
"text/html": [
"<table class=\"simpletable\">\n",
"<caption>OrderedModel Results</caption>\n",
"<tr>\n",
" <th>Dep. Variable:</th> <td>comp_int_freq</td> <th> Log-Likelihood: </th> <td> -120.18</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Model:</th> <td>OrderedModel</td> <th> AIC: </th> <td> 250.4</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Method:</th> <td>Maximum Likelihood</td> <th> BIC: </th> <td> 263.0</td>\n",
"</tr>\n",
"<tr>\n",
" <th>Date:</th> <td>Sat, 07 Dec 2024</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Time:</th> <td>16:34:53</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>No. Observations:</th> <td> 93</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Residuals:</th> <td> 88</td> <th> </th> <td> </td> \n",
"</tr>\n",
"<tr>\n",
" <th>Df Model:</th> <td> 1</td> <th> </th> <td> </td> \n",
"</tr>\n",
"</table>\n",
"<table class=\"simpletable\">\n",
"<tr>\n",
" <td></td> <th>coef</th> <th>std err</th> <th>z</th> <th>P>|z|</th> <th>[0.025</th> <th>0.975]</th> \n",
"</tr>\n",
"<tr>\n",
" <th>conc_models_freq</th> <td> 0.0195</td> <td> 0.131</td> <td> 0.149</td> <td> 0.882</td> <td> -0.237</td> <td> 0.276</td>\n",
"</tr>\n",
"<tr>\n",
" <th>1.0/2.0</th> <td> -1.9548</td> <td> 0.547</td> <td> -3.577</td> <td> 0.000</td> <td> -3.026</td> <td> -0.884</td>\n",
"</tr>\n",
"<tr>\n",
" <th>2.0/3.0</th> <td> -0.4198</td> <td> 0.395</td> <td> -1.064</td> <td> 0.287</td> <td> -1.193</td> <td> 0.353</td>\n",
"</tr>\n",
"<tr>\n",
" <th>3.0/4.0</th> <td> -0.0362</td> <td> 0.183</td> <td> -0.198</td> <td> 0.843</td> <td> -0.395</td> <td> 0.323</td>\n",
"</tr>\n",
"<tr>\n",
" <th>4.0/5.0</th> <td> 0.2055</td> <td> 0.129</td> <td> 1.590</td> <td> 0.112</td> <td> -0.048</td> <td> 0.459</td>\n",
"</tr>\n",
"</table>"
],
"text/latex": [
"\\begin{center}\n",
"\\begin{tabular}{lclc}\n",
"\\toprule\n",
"\\textbf{Dep. Variable:} & comp\\_int\\_freq & \\textbf{ Log-Likelihood: } & -120.18 \\\\\n",
"\\textbf{Model:} & OrderedModel & \\textbf{ AIC: } & 250.4 \\\\\n",
"\\textbf{Method:} & Maximum Likelihood & \\textbf{ BIC: } & 263.0 \\\\\n",
"\\textbf{Date:} & Sat, 07 Dec 2024 & \\textbf{ } & \\\\\n",
"\\textbf{Time:} & 16:34:53 & \\textbf{ } & \\\\\n",
"\\textbf{No. Observations:} & 93 & \\textbf{ } & \\\\\n",
"\\textbf{Df Residuals:} & 88 & \\textbf{ } & \\\\\n",
"\\textbf{Df Model:} & 1 & \\textbf{ } & \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n",
"\\begin{tabular}{lcccccc}\n",
" & \\textbf{coef} & \\textbf{std err} & \\textbf{z} & \\textbf{P$> |$z$|$} & \\textbf{[0.025} & \\textbf{0.975]} \\\\\n",
"\\midrule\n",
"\\textbf{conc\\_models\\_freq} & 0.0195 & 0.131 & 0.149 & 0.882 & -0.237 & 0.276 \\\\\n",
"\\textbf{1.0/2.0} & -1.9548 & 0.547 & -3.577 & 0.000 & -3.026 & -0.884 \\\\\n",
"\\textbf{2.0/3.0} & -0.4198 & 0.395 & -1.064 & 0.287 & -1.193 & 0.353 \\\\\n",
"\\textbf{3.0/4.0} & -0.0362 & 0.183 & -0.198 & 0.843 & -0.395 & 0.323 \\\\\n",
"\\textbf{4.0/5.0} & 0.2055 & 0.129 & 1.590 & 0.112 & -0.048 & 0.459 \\\\\n",
"\\bottomrule\n",
"\\end{tabular}\n",
"%\\caption{OrderedModel Results}\n",
"\\end{center}"
],
"text/plain": [
"<class 'statsmodels.iolib.summary.Summary'>\n",
"\"\"\"\n",
" OrderedModel Results \n",
"==============================================================================\n",
"Dep. Variable: comp_int_freq Log-Likelihood: -120.18\n",
"Model: OrderedModel AIC: 250.4\n",
"Method: Maximum Likelihood BIC: 263.0\n",
"Date: Sat, 07 Dec 2024 \n",
"Time: 16:34:53 \n",
"No. Observations: 93 \n",
"Df Residuals: 88 \n",
"Df Model: 1 \n",
"====================================================================================\n",
" coef std err z P>|z| [0.025 0.975]\n",
"------------------------------------------------------------------------------------\n",
"conc_models_freq 0.0195 0.131 0.149 0.882 -0.237 0.276\n",
"1.0/2.0 -1.9548 0.547 -3.577 0.000 -3.026 -0.884\n",
"2.0/3.0 -0.4198 0.395 -1.064 0.287 -1.193 0.353\n",
"3.0/4.0 -0.0362 0.183 -0.198 0.843 -0.395 0.323\n",
"4.0/5.0 0.2055 0.129 1.590 0.112 -0.048 0.459\n",
"====================================================================================\n",
"\"\"\""
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#ordinal regression analysis for conc_models_freq and comp_int_freq\n",
"X = df_reg_ana[\"conc_models_freq\"]\n",
"y = df_reg_ana[\"comp_int_freq\"]\n",
"model = om.OrderedModel(y, X, dist=\"logit\")\n",
"result = model.fit()\n",
"result.summary()"
]
},
{
"cell_type": "markdown",
"id": "3fc177af-cd92-494a-b3e0-3bf5d7e7f91b",
"metadata": {},
"source": [
"The coefficient indicates a 0.0195 unit increase in comp_int_freq for a one unit increase in conc_models_freq. The P>|z| values indicate the probability of observing a coefficient as extreme as the calculated coefficient, with the threshhold for significance at alpha. Based on alpha of 0.05, the observed relationship is not statistically significant (P>|z| = 0.882)."
]
},
{
"cell_type": "markdown",
"id": "infectious-symbol",
"metadata": {},
"source": [
"# Discussion"
]
},
{
"cell_type": "markdown",
"id": "furnished-camping",
"metadata": {
"code_folding": []
},
"source": [
"## Considerations"
]
},
{
"cell_type": "markdown",
"id": "bearing-stadium",
"metadata": {},
"source": [
"These results indicate that instructor rated frequency of developing neither mathematical nor conceptual models in lab are strong or significant predictors of their use of computers to interface with data collection devices. The research question is probably better answered through interviews (qualitative) or observation or document analysis of laboratory manuals (qualitative or quantitative).\n",
"\n",
"The dataset has some limitations, including the construction and validation of the course information survey, which was not validated as part of the student survey for E-CLASS. While some of the items are objective, such as the level and math-basis of the course, the Likert scale used to assess frequency of laboratory instructional choices should have been validated. After cleaning the dataset and omitting rows with missing values, there were only 93 entries included in the analysis.\n",
"\n",
"Self-selection bias is a flaw of this data, as the survey was administered to classes based on the instructor choosing to do so."
]
},
{
"cell_type": "markdown",
"id": "beneficial-invasion",
"metadata": {},
"source": [
"## Summary"
]
},
{
"cell_type": "markdown",
"id": "about-raise",
"metadata": {},
"source": [
"I think the results made sense, based on the way the data was collected. I hadn't used this statistical model before (so my execution could also be flawed), and previously I've mainly used linear regression for analysis of interval or continuous data. However, although sometimes linear regression is used for Likert scale data (as many commercial statistical softwares do not have ordinal regression capabilities), it seemed like a better choice to use the ordinal regression model. \n",
"\n",
"Generally, I'm not a huge fan of Likert scale surveys. I would probably not use one in my own research. However, I think it was a good experience to look at this dataset more deeply and do some analysis with it. I'm looking the other side of the data (the student responses to the epistemological items) in another class, and that has been validated so maybe there will be something there."
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_json": true,
"text_representation": {
"extension": ".Rmd",
"format_name": "rmarkdown",
"format_version": "1.2",
"jupytext_version": "1.9.1"
}
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"varInspector": {
"cols": {
"lenName": 16,
"lenType": 16,
"lenVar": 40
},
"kernels_config": {
"python": {
"delete_cmd_postfix": "",
"delete_cmd_prefix": "del ",
"library": "var_list.py",
"varRefreshCmd": "print(var_dic_list())"
},
"r": {
"delete_cmd_postfix": ") ",
"delete_cmd_prefix": "rm(",
"library": "var_list.r",
"varRefreshCmd": "cat(var_dic_list()) "
}
},
"types_to_exclude": [
"module",
"function",
"builtin_function_or_method",
"instance",
"_Feature"
],
"window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 5
}