project_argument/proposal.md
2025-11-03 09:37:14 -05:00

# Project proposal
Argument Project
Nelson Mason - Date: 11/3/2025
## Overarching Question
### What central question are you interested in exploring? Why are you interested in exploring this question?
I want to know what relationship exists, if any, between an adult's (18+)
age and their weight (I'll use metric units).
I'm trying to find out at what age, on average, people experience a dramatic
weight gain or loss, if at all.
I'm curious whether such a dramatic increase or decrease in weight can
be captured in a one-time snapshot (cross-sectional) data set, where individuals
are NOT tracked over a period of time, but surveyed ONLY once.
### What specific research questions will you investigate?
1) What number or percentage can be used to indicate a dramatic change in
weight by age? How do I determine what "dramatic" is?
2) How many people are there at each age? Is this distribution significant in any way?
3) What does the distribution of weight look like at each age, and where are
the probable outliers (Box Plot)?
4) What does the probability density of weight look like at each age
(Violin Plot)?
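
One possible way to approach questions 1 and 2 is sketched below. It counts people at each age, takes the mean weight per age, and flags ages where the mean differs from the previous age by at least a chosen percentage. The function names, the 5% threshold, and the sample records are all assumptions made for illustration. None of them come from the BRFSS data, and picking a defensible threshold is part of what question 1 asks.

```python
from collections import defaultdict
from statistics import mean


def mean_weight_by_age(records):
    """Return {age: (count, mean_weight_kg)} from (age, weight_kg) pairs."""
    groups = defaultdict(list)
    for age, weight_kg in records:
        groups[age].append(weight_kg)
    return {age: (len(w), mean(w)) for age, w in sorted(groups.items())}


def dramatic_changes(summary, threshold_pct=5.0):
    """List (age, pct_change) where the mean weight differs from the
    previous observed age by at least threshold_pct percent.

    The 5% default is a placeholder assumption, not a value from the proposal.
    """
    flagged = []
    ages = list(summary)
    for prev, curr in zip(ages, ages[1:]):
        prev_mean = summary[prev][1]
        curr_mean = summary[curr][1]
        pct = (curr_mean - prev_mean) / prev_mean * 100
        if abs(pct) >= threshold_pct:
            flagged.append((curr, round(pct, 2)))
    return flagged


# Made-up records for illustration only (not BRFSS values).
sample = [(18, 70.0), (18, 74.0), (19, 72.0), (19, 76.0), (20, 80.0), (20, 84.0)]
summary = mean_weight_by_age(sample)
flagged = dramatic_changes(summary)
print(summary)
print(flagged)
```

Because each person is measured only once, a flagged age here describes a difference between age groups, not a change within any one individual.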
## Data source
https://www.cdc.gov/brfss/annual_data/annual_2020.htm
### What data set will you use to answer your overarching question?
brfss_2020_cleaned.csv
### Where is this data from?
BRFSS 2020
This lab uses a simplified subset of the BRFSS 2020 dataset, brfss_2020.csv. This notebook explains the variables included as well as the process used to produce this file. Read more about BRFSS at https://www.cdc.gov/brfss/annual_data/annual_2020.htm
“The link brfss/annual_data/annual_2020.htm directs to the 2020 Behavioral Risk Factor Surveillance System (BRFSS) annual survey data from the Centers for Disease Control and Prevention (CDC). This dataset includes data from 50 states, the District of Columbia, Guam, and Puerto Rico, collected through a combination of landline and cell phone interviews. The 2020 data reflect changes in the weighting methodology and the inclusion of cell phone respondents that began in 2011, making it non-comparable to data from before that year.

What the 2020 BRFSS data includes:

- Survey data: Includes approximately 401,958 records and 279 variables.
- Data files: Available in ASCII and SAS Transport formats.
- Geographic scope: Data collected from all 50 states, the District of Columbia, Guam, and Puerto Rico.
- Methodology: A combination of landline and cell phone data, using updated weighting methods.
- Documentation: Includes a codebook, survey description, and information on data collection and processing.

Key differences from prior years:

- The 2020 data is not directly comparable to BRFSS data from before 2011 due to the inclusion of cell-phone-only respondents and a revised weighting methodology known as "raking".

How to use the data:

- Users can access the 2020 survey data and accompanying documentation through the CDC's BRFSS website.
- Researchers can use this public data for various studies on health-related behaviors and chronic conditions, as shown in the example research that analyzed the association between sleep, exercise, and coronary heart disease in the 2020 BRFSS data.” Google Search
### What is this data about?
The cleaned data set has 2 columns, Age and Weight (metric), and 166,426 rows.
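
A minimal sketch of reading a two-column file like this one with Python's standard library follows. The column names `age` and `weight_kg` and the helper `load_adults` are assumptions, since the proposal does not give the actual header names in brfss_2020_cleaned.csv. The demo uses an in-memory string, not the real file.

```python
import csv
import io


def load_adults(csv_file, age_col="age", weight_col="weight_kg", min_age=18):
    """Read (age, weight_kg) pairs from an open CSV file.

    Rows with missing or non-numeric values are skipped, as are rows
    below min_age. The column names are hypothetical defaults.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        try:
            age = int(float(row[age_col]))
            weight = float(row[weight_col])
        except (TypeError, ValueError):
            continue  # blank or non-numeric cell
        if age >= min_age and weight > 0:
            rows.append((age, weight))
    return rows


# In-memory stand-in for the real file (made-up values).
demo = io.StringIO("age,weight_kg\n25,70.5\n17,60.0\n40,\n52,88.2\n")
adults = load_adults(demo)
print(adults)
```

With the real file, the same function could be called inside `with open("brfss_2020_cleaned.csv", newline="") as f:`, after adjusting the column names to the actual header. Comparing `len(adults)` with the 166,426 rows stated above would be a quick sanity check.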
## Methods
I will use quantitative analysis methods.
### How will you use your data set to answer your quantitative questions?
I will create 5 charts.
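
For the box plot in particular, the numbers behind the chart can be checked by hand. The sketch below computes quartiles and the common 1.5 × IQR fences for one age group, which is one conventional way to mark "probable outliers". The choice of the 1.5 multiplier, the quartile method, and the sample weights are assumptions for illustration. A plotting library may compute quartiles slightly differently.

```python
from statistics import quantiles


def box_plot_stats(weights):
    """Return quartiles, 1.5 * IQR fences, and points outside the fences."""
    q1, median, q3 = quantiles(weights, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [w for w in weights if w < lower_fence or w > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Made-up weights (kg) for a single age group, for illustration only.
weights_age_30 = [60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 160.0]
summary_30 = box_plot_stats(weights_age_30)
print(summary_30)
```

Running this once per age gives the values a box plot would draw for each age, which makes it easier to see whether an apparent jump between ages comes from the bulk of the distribution or from a few extreme values.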