This is my initial proposal. While the data set I found seems large, I think it is manageable, and I look forward to getting started on the project.