Criteria for an appropriate journal article are:
I will let you propose a journal article with a dataset, but I retain the right to decide if the dataset is “relatively complicated”.
PLoS Journals are all required to have a data availability statement, so this is a good place to start searching.
You can get a sense of the type of articles I am looking for (or use one of these) based on the following that I picked out of a search from PLoS One. These aren’t necessarily good or interesting articles, just ones that satisfy the above requirements and were released when I checked for the newest releases in PLoS One. I don’t guarantee that these articles are fully reproducible. They just looked like the data were available and the methods were regression.
If the data are in a weird format (like “sav” or “dta”), then try to use the {haven} package to load it into R.
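A minimal sketch of loading such a file with {haven}, assuming the package is installed. The file here is only a stand-in, written from the built-in `mtcars` data to a temporary path so the example runs on its own; in practice you would pass the path to the article's data file instead.

```r
library(haven)

# Stand-in for a downloaded Stata file: write a built-in data set to a
# temporary ".dta" file so this example is self-contained.
dta_path <- tempfile(fileext = ".dta")
write_dta(mtcars, dta_path)

# read_dta() reads Stata files; read_sav() is the counterpart for SPSS ".sav" files.
dat <- read_dta(dta_path)

# Labelled columns (common in SPSS/Stata exports) can be converted to factors.
dat <- as_factor(dat)

str(dat)
```

If a file fails to load this way, that is worth finding out early, since it may mean you need to switch papers.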
(Links at bottom of page in the References section)
If you want to use one of the articles above, please provide me with rankings of your top five so I can apply Rank-maximal allocation to assign journal articles.
If you want to use one that is not on this list, please send me an email before the deadline so I can pre-approve it.
It often (not always) takes about two weeks to reproduce the results of a paper.
Sometimes, you find out right away that reproducing the paper is impossible, and you need to switch papers. This takes up time.
For the progress report, you will turn in an R Markdown file that loads the data (already cleaned by a separate script), fits a regression, and reproduces the tables and figures in the paper you have selected.
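As a rough sketch of what the code in such a file might look like, the example below loads a cleaned data set, fits a regression, and prints a coefficient table and a residual plot. Everything specific here is a placeholder: `mtcars` stands in for the cleaned data your other script would save, and the model formula is hypothetical, not taken from any of the listed papers.

```r
# Placeholder for the cleaned data; in a real report this might be
# something like readRDS("cleaned_data.rds") (hypothetical file name).
dat <- mtcars

# Fit the regression reported in the paper (this formula is a stand-in).
fit <- lm(mpg ~ wt + hp, data = dat)

# Coefficient table: estimates, standard errors, t statistics, p-values.
coef_table <- summary(fit)$coefficients
print(round(coef_table, 3))

# 95% confidence intervals, which papers often report alongside estimates.
print(confint(fit))

# Residuals versus fitted values.
plot(fitted(fit), resid(fit), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

Comparing each printed number against the corresponding table in the paper, one table at a time, makes it easier to spot where a reproduction diverges.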
If your numbers and figures are not the same as in the paper, your options are:
From past experience, students don’t believe me when I say it often (though not always) takes two weeks to reproduce the results of a paper, so they start the day before the progress report is due and then ask for extensions.
There will be no extensions on the progress report, even if you need to switch papers. Instead, there will be an upper bound on your progress report grade based on how late you turn it in.
You will provide a presentation and a written report. They should both follow this outline:
Your presentation should be 12 minutes long. I will be strict on time.
All group members should speak.
I will ask questions for 3 minutes at the end of the presentation.
You should turn in all of your code to reproduce the results of the paper in a single zipped folder.
I should be able to run your code without modification and obtain your results.
There should be no errors in your code.
You should use the methods/packages we’ve learned in class.
At the end of the project, I will have your teammates rate your contribution to the project. I will adjust your grade according to their comments. If they all say that you didn’t do anything, I will give you a failing grade.
You will get 2 points for filling out the assessment.