You will be expected to:
- Select an area of interest in public policy or social science. Define some questions you’d like to answer using data. Write a short (1 page or less) proposal describing the project you plan to do and potential data sources.
- Collect datasets through some combination of open data and data collection (e.g. web scraping) and vet them.
- Perform computational analysis to answer your questions. This work will be done in a git repo that will be shared with the class.
- Present your results to the class during finals week.
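
As one illustration of what "vetting" a dataset in the list above might involve, here is a minimal Python sketch using only the standard library. It assumes your data is CSV text; the function name `vet_csv`, the column names, and the sample rows are all made up for illustration and are not a required approach.

```python
import csv
import io
from collections import Counter


def vet_csv(text, required_columns):
    """Run basic sanity checks on CSV text (hypothetical helper).

    Reports the row count, any required columns missing from the header,
    the number of blank values per column, and the number of rows that
    exactly repeat an earlier row.
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    rows = list(reader)

    # Columns the analysis needs but the file does not provide.
    missing_columns = [c for c in required_columns if c not in header]

    # Count empty or whitespace-only cells in each column.
    blank_counts = Counter()
    for row in rows:
        for col in header:
            if (row.get(col) or "").strip() == "":
                blank_counts[col] += 1

    # Count rows that duplicate an earlier row exactly.
    seen = Counter(tuple(row.get(c) for c in header) for row in rows)
    duplicate_rows = sum(n - 1 for n in seen.values() if n > 1)

    return {
        "row_count": len(rows),
        "missing_columns": missing_columns,
        "blank_counts": dict(blank_counts),
        "duplicate_rows": duplicate_rows,
    }


# Made-up sample data, for illustration only.
sample = """city,year,population
Springfield,2010,1000
Springfield,2010,1000
Shelbyville,2010,
"""

report = vet_csv(sample, ["city", "year", "population", "income"])
print(report)
```

Checks like these are only a first pass. You should also confirm where each dataset came from, how it was collected, and whether its definitions match the questions you are asking.
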
- Proposal: Due Friday, Feb 12th at 5pm by email to the mailing list. Should include:
  - your group,
  - your area of interest,
  - the data sources you will use, and
  - the questions you hope to answer.
- Git repository: Due 2/17. We will be grading you based on:
  - (30%) Clear, well-documented code. Unless you used private data that you cannot commit to your repository, we should be able to run your code to replicate your analysis.
  - (10%) README.md: Use Markdown to create a README.md file in your repository that briefly (equivalent of 1 page) summarizes the purpose and findings of your project, including some tables and visualizations.
- Project presentation: During finals week. At least 10 minutes. For groups of 2 or more, at least 20 minutes but no more than 30 minutes. We will be looking for the following (but feel free to use a different outline):
  - Introduction and motivation: Why did you select this problem?
  - Data: Describe your data, including how you selected and collected it.
  - (30%) Analysis and Visualization
  - (30%) Result: What does your work mean? What questions does it answer?
  - Further work: If you were to continue this work, what extensions would you pursue?