Source: https://drive.google.com/file/d/1deehwP5Pl9Ofl9B6l-yaKJ8WaI6pH0Ox/view

Version Control

What is version control?

A way to manage the evolution of a set of files

Repository (repo)

The set of files under version control, together with their history of changes

Check out

Selecting a commit (a saved snapshot of the files) and switching your working files to it

Branch

A separate line of commits where you can make changes and experiment without affecting the main branch

Main Branch

The default branch, where the primary line of commits lives

Merge

Combines the commits from one branch into another, typically into the main branch
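
The terms so far (check out, branch, main branch, merge) can be pictured with a toy in-memory model. This is only a sketch: `ToyRepo`, `Commit`, and the counter-style ids are hypothetical names invented for illustration, it is not how Git stores data, and it always records a merge commit, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    id: str              # real systems use a hash here; a counter keeps the toy simple
    message: str
    parents: tuple = ()  # ids of the commits this one builds on


class ToyRepo:
    """Hypothetical model: a branch is just a name pointing at a commit."""

    def __init__(self):
        root = Commit("c0", "initial commit")
        self.commits = {root.id: root}
        self.branches = {"main": root.id}
        self.current = "main"  # the branch that is checked out

    def _add(self, message, parents):
        commit = Commit(f"c{len(self.commits)}", message, parents)
        self.commits[commit.id] = commit
        self.branches[self.current] = commit.id  # move the current branch forward
        return commit.id

    def commit(self, message):
        return self._add(message, (self.branches[self.current],))

    def branch(self, name):
        # A new branch starts at the commit that is currently checked out
        self.branches[name] = self.branches[self.current]

    def checkout(self, name):
        if name not in self.branches:
            raise KeyError(f"no such branch: {name}")
        self.current = name

    def merge(self, other):
        # A merge commit has two parents: the current tip and the other branch's tip
        return self._add(
            f"merge {other} into {self.current}",
            (self.branches[self.current], self.branches[other]),
        )

    def history(self):
        """Messages of every commit reachable from the checked-out branch."""
        seen, stack = set(), [self.branches[self.current]]
        while stack:
            cid = stack.pop()
            if cid not in seen:
                seen.add(cid)
                stack.extend(self.commits[cid].parents)
        return {self.commits[cid].message for cid in seen}


repo = ToyRepo()
repo.branch("experiment")
repo.checkout("experiment")
repo.commit("try a new plot")
repo.checkout("main")
before = repo.history()  # the experiment is not on main yet
repo.merge("experiment")
after = repo.history()   # main can now reach the experimental commit
print(sorted(before))
print(sorted(after))
```

Before the merge, main only knows about the initial commit; afterwards, the experimental commit is part of main's history.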

Fork

Making your own copy of someone else’s repo, which you can change without affecting the original

Pull Request (PR)

A request to merge your changes (for example, a new feature or a fix) into the original repo; the maintainers review the request before merging it

Issue

Bug reports, feature requests, or other requests and suggestions filed against a repo

Hash

The unique identifier for each commit
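
As a rough illustration of where such an identifier comes from, the sketch below hashes file contents the way Git hashes a file ("blob"), assuming Git's default SHA-1 object format. Commit hashes are computed in the same spirit, over the commit's own data rather than a single file. The function name `git_blob_hash` is made up for this example.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    # Git prefixes the content with a small header ("blob <size>\0") before hashing
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


a = git_blob_hash(b"x = 1\n")
b = git_blob_hash(b"x = 2\n")
print(a)
print(b)
print(a == b)  # False: different content gives a different identifier
```

Because the identifier is derived from the content, the same content always produces the same hash, and any change produces a different one.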

File Structure

Data Science Files Should Be

  • Easy to find
  • Easy to share
  • Easy to understand
  • Easy to update

Project files can be grouped into

  • Data
      • Raw data
      • Tidy data
  • Figures
      • Exploratory figures
      • Explanatory figures
  • Code
      • Raw code
      • Final code
  • Products
      • Papers
      • Reports
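
One way to set up the groups listed above (data, figures, code, products) is to create the folders with a short script. This is a minimal sketch: the folder names in `LAYOUT`, the `create_project` helper, and the `my_project` root are assumptions chosen for illustration, and the demo writes to a temporary directory.

```python
from pathlib import Path
import tempfile

# Assumed folder names, mirroring the groups in the notes
LAYOUT = {
    "data": ["raw_data", "tidy_data"],
    "figures": ["exploratory_figures", "explanatory_figures"],
    "code": ["raw_code", "final_code"],
    "products": ["papers", "reports"],
}


def create_project(root: Path) -> list[Path]:
    """Create the nested folders under root and return them."""
    created = []
    for parent, children in LAYOUT.items():
        for child in children:
            folder = root / parent / child
            folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
            created.append(folder)
    return created


demo_root = Path(tempfile.mkdtemp()) / "my_project"
folders = create_project(demo_root)
for folder in folders:
    print(folder.relative_to(demo_root))
```

Keeping raw and tidy data, or exploratory and explanatory figures, in separate folders makes it clear which files are inputs and which are finished outputs.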

File Naming

File names should be

  • Machine readable
  • Human readable
  • Nicely ordered
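
A naming scheme that meets all three goals can be sketched as a small helper. The specific convention here (ISO date, zero-padded sequence number, lowercase words joined by hyphens, fields separated by underscores) is one common choice assumed for illustration, and `make_file_name` is a hypothetical function.

```python
import re
from datetime import date


def make_file_name(order: int, description: str, extension: str, when: date) -> str:
    # Machine readable: lowercase, no spaces or punctuation, "_" separates fields
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # Nicely ordered: ISO dates and zero-padded numbers sort correctly as text
    return f"{when.isoformat()}_{order:02d}_{slug}.{extension}"


names = [
    make_file_name(10, "Final Report", "pdf", date(2024, 3, 5)),
    make_file_name(2, "Exploratory Plots", "py", date(2024, 3, 5)),
    make_file_name(1, "Clean Survey Data!", "py", date(2024, 3, 5)),
]
for name in sorted(names):
    print(name)

# Machine readable also means easy to split back into fields
stem = sorted(names)[0].rsplit(".", 1)[0]
file_date, file_order, file_slug = stem.split("_")
print(file_date, file_order, file_slug)
```

The description stays human readable, while the fixed field order lets both people and scripts sort and parse the names.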