Source: https://drive.google.com/file/d/1w5XnqaJP_DZWwVsVnRyzTQ7BsZn-3M7H/view?usp=sharing
Reproducibility and Bias
- Define what constitutes reproducible & replicable data science
- Explain the challenges and limitations of reproducibility & replicability
- Understand confirmation bias & identify cases of it
Reproducibility
re-performing the same analysis (with the same code and data) by a different analyst
Reasons for a study to fail to be reproducible
- data not provided, unable to be shared
- lack of peer review, or a review process that was not transparent
- missing code, failure to publish code, trade secrets
- different data provided
- lack of computational literacy
- different software versions, deprecated software, or legacy systems (see the sketch after this list)
- lack of statistical literacy or maliciousness (p-hacking)
- poorly written or incomplete documentation
- lack of funding
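Several of the failure modes above (unpinned seeds, missing version information) can be headed off with basic hygiene. A minimal sketch, assuming a NumPy/pandas workflow; the environment.json filename and the seed value are illustrative choices, not a standard:

```python
# Minimal reproducibility hygiene: pin randomness and record versions.
import json
import platform
import random
import sys

import numpy as np
import pandas as pd

# Fix every source of randomness so a different analyst gets the same numbers.
SEED = 42
random.seed(SEED)
np.random.seed(SEED)

# Record the software the analysis actually ran under, since "different
# software versions" is a common reason a reproduction attempt fails.
environment = {
    "python": sys.version,
    "platform": platform.platform(),
    "numpy": np.__version__,
    "pandas": pd.__version__,
    "seed": SEED,
}
with open("environment.json", "w") as f:
    json.dump(environment, f, indent=2)
```

Publishing a record like this alongside the code and data addresses several items on the list at once.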
The Replicability Crisis
Replicable
re-performing the experiment and collecting new data
Replicability is harder than reproducibility
because, by definition, the underlying data are different, and data are variable.
Replicability
“The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials.”
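A toy simulation of that definition, with all numbers invented for illustration: here the “measurement” is the mean of 100 draws from a Normal(10, 2) population, and two “teams” run the same procedure on new data.

```python
# Two teams measure the same quantity with the same procedure but new data.
import numpy as np

def measure(rng, n=100):
    """One trial: sample the population, report the mean and its precision."""
    sample = rng.normal(loc=10.0, scale=2.0, size=n)
    mean = sample.mean()
    # Stated precision: an approximate 95% confidence half-width.
    precision = 1.96 * sample.std(ddof=1) / np.sqrt(n)
    return mean, precision

mean_a, prec_a = measure(np.random.default_rng(1))
mean_b, prec_b = measure(np.random.default_rng(2))  # a different team
print(f"Team A: {mean_a:.2f} ± {prec_a:.2f}")
print(f"Team B: {mean_b:.2f} ± {prec_b:.2f}")
# The estimates agree within stated precision on most trials; a replication
# "succeeds" when they do.
```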
Reasons for a study failing to replicate
- Finding was not “real”, or the effect size was small (see the simulation after this list)
- Measurement error
- Variable finding (things change over time)
- Samples come from different populations
- Different experimental design or conditions
- Fraud
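The first reason lends itself to a quick simulation (assuming SciPy is installed; group sizes are arbitrary). With pure noise and a significance level of 0.05, roughly one comparison in twenty looks “significant”; reporting only that one is p-hacking, and it is exactly the kind of finding that fails to replicate.

```python
# Why a finding that is not "real" fails to replicate: false positives.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
false_positives = 0

for _ in range(20):
    # Both groups come from the SAME population: there is no real effect.
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print(f"{false_positives} of 20 null comparisons were 'significant' at alpha={alpha}")
```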
The Unicorn Test
If you find a “unicorn” result, return to your data several times, from different viewpoints, until you are fully convinced.
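One concrete way to take those different viewpoints is resampling. A minimal sketch, assuming a generic one-dimensional dataset; the exponential draws below stand in for real data, and the median is the hypothetical “unicorn” statistic:

```python
# Re-examine a surprising statistic from two viewpoints before believing it.
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=200)  # stand-in for the real dataset
observed = np.median(data)  # the surprising "unicorn" result

# Viewpoint 1: bootstrap. Does the result survive resampling?
boot = [np.median(rng.choice(data, size=data.size, replace=True))
        for _ in range(1000)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"median = {observed:.2f}, 95% bootstrap interval [{lo:.2f}, {hi:.2f}]")

# Viewpoint 2: split the data. Does each half tell the same story?
first, second = data[:100], data[100:]
print(f"first half: {np.median(first):.2f}, second half: {np.median(second):.2f}")
```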
Confirmation Bias
A cognitive bias in which people tend to search for, interpret, favor, and recall information in a way that confirms their preexisting beliefs or hypotheses.