project_argument/proposal.md

2.8 KiB

Project proposal

This planning document will also form the introduction of your argument.

Overarching Question

What central question are you interested in exploring? Why are you interested in exploring this question?

The central question I am interested in exploring is whether The Phantom Menace and A New Hope are good "first" movies to become interested in Star Wars. Star Wars is a franchise that continues to be popular, as seen through continued exploration of its universe through various media since its introduction in 1977. Even among those who are not fans of the series, many are aware of different characters, debates, lines, or situations from the series, e.g. the line "No, I am your father" or it's common misquote, "Luke, I am your father." Given the popularity of the series and its universe as a whole, it might safely be assumed that at least one of The Phantom Menace or A New Hope, movies that were the first in their respective trilogies, was a sufficiently compelling entry in the series to induce an appetite for more. Given the availability of some data on people who have watched Star Wars films, I would like to see if that assumption holds any merit.

What specific research questions will you investigate?

  1. If a person watches either The Phantom Menace or A New Hope, how likely is the person to be a fan?
  2. If a person watches either The Phantom Menace or A New Hope, how likely is the person to watch the film following it?

Data source

What data set will you use to answer your overarching question?

My data set is the Star Wars Survey, which can be found at https://github.com/fivethirtyeight/data/tree/master/star-wars-survey

Where is this data from?

The data comes from a survey performed by FiveThirtyEight for an article, "America's Favorite 'Star Wars' Movies (And Least Favorite Characters)." FiveThirtyEight is a website that aims to deliver journalism informed by data. I trust them because the project website recommended using one of their data sets.

What is this data about?

Describe the nature of the data in the dataset, including the number of rows and some of the columns which will be important to you. There are 1188 rows in the dataset. Of these, 1186 represent respondents. Respondents indicated, among other things, whether they were fans and which of the movies from the first two trilogies they'd watched.

Methods

How will you use your data set to answer your quantitative questions?

  1. I will look at people who have watched The Phantom Menace and from there see how many of those identify as fans. I will do the same with A New Hope. I will present this as a table.

  2. I will look at people who have watched The Phantom Menace and from there see how many of those have watched The Clone Wars. I will do the same with A New Hope and ThE Empire Strikes Back. I will present this as a table.