Local git control

By Thomas Gumbricht

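NOTE: On 1 October 2020 the label ‘master’ was replaced by ‘main’ as the name of the default branch in new GitHub repositories. The examples in this post predate that change and use ‘master’.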

Introduction

This post covers the git Version Control System (VCS) for local machines. If you are a git (command line) newbie, you can watch the YouTube video Introduction to Git - Core Concepts by David Mahler or, if you prefer to read, the blog post Tutorial: Git for Absolutely Everyone by Michelle Gienow. This post is inspired by these two sources.

The tutorial series Learn Git with Bitbucket Cloud is more in depth and links to relevant pages in that tutorial series are given in different sections below. The most comprehensive source available is probably the book Pro Git - Everything you need to know about git by Scott Chacon and Ben Straub.

Every git project is bound to a repository (or repo for short) that always resides inside a directory (synonym: folder). A repo can either be a master (~original) or a clone (~copy). Unfortunately the term master can mean several different things in git, but you just have to learn to live with it. The difference between an ordinary directory and a repository is that the repository contains a system of, usually hidden, files and folders that track, log and save changes in the directory. These hidden git files allow the use of the git toolbox for tracking, merging and restoring different versions of the documents in the working directory.

A repository consists of three “trees” maintained and tracked by git. The first tree is the working directory, which holds the “normal” files; it is like an ordinary directory with sub-directories and files. The second tree is the index (or staging area) and the third tree is the head (or history) tree, which contains all commits and also points to the last committed version. stage and commit are git jargon, and you can picture them as meaning adding and locking (saving) changes. In git you can always go back to locked savings and explore who made what changes at which stage.

Two useful git commands for keeping track of what has been done and what is pending are git log, which lists the commit history (git reflog additionally lists every position that HEAD has had), and git status, which shows what is pending.
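As a minimal sketch of these commands, assuming git is installed, the script below builds a throwaway repository in a temporary directory and runs all three. The directory, the file name notes.txt and the user name and email are placeholders of mine, not part of this post.

```shell
# Throwaway repo in a temporary directory (all names are placeholders)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes.txt"

git log --oneline      # commit history, one line per commit
git reflog             # every position HEAD has had
echo "second line" >> notes.txt
git status --short     # pending changes; " M" marks a modified, unstaged file
```

The --oneline and --short options only shorten the output; the plain commands used in the rest of this post report the same information in a longer form.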

Prerequisites

git must be installed; if you need assistance, look at the post on install git for command line. This tutorial relies heavily on the Terminal command tool; if you are not acquainted with the command line, the Command line crash course (pdf) will teach you most things you need to know.

Create directory with git control

Create a new, empty, directory on a local drive. To use the Terminal, first change directory (cd) to the parent folder under where you want to create the new directory.

$ cd path/to/parent/directory

You can also just type cd followed by a space in the Terminal window, then open a Finder window, navigate to the parent folder and drag the directory icon of the parent to the Terminal window. Make sure the Terminal points towards the parent directory, and create the new directory with the mkdir command:

$ mkdir git-test-dir

cd to the newly created directory:

$ cd git-test-dir

git Init

The default initialisation command git init converts a directory to a master repository with a hidden .git directory. Master repos of this kind are intended for development work, with the content typically cloned by other users; edits made in cloned repos, however, cannot by default be pushed back into the branch that is checked out in the master. If you want a local master repo that is more of a container and where development actually takes place in clones, then you should have a look at the parallel post on Shared master local git control. For this post, with the master being the site of development, turn your newly created directory into a repository:

$ git init

The response will be something like

Initialized empty git repository in path/to/git-test-dir/.git/

If you now check the content of your directory by listing (ls) its content:

$ ls

it will contain no (visible) file. If you instead use list all, ls -a, you will see the hidden .git folder:

$ ls -a

You can explore the .git folder by changing directory (cd) and listing (ls), but that is beyond this tutorial. All the version control and tracking that you do using git will be registered under the hidden directory.
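The steps of this section can be sketched as one short script. It is only an illustration: it works in a temporary location (a placeholder path of mine) so that nothing on your drive is touched, and it adds git rev-parse as an extra check that the directory really is a repository.

```shell
# Create the directory in a temporary location (placeholder path)
dir=$(mktemp -d)/git-test-dir
mkdir "$dir"
cd "$dir"

git init -q                           # turn the directory into a repository
ls                                    # prints nothing: no visible files yet
ls -a                                 # lists ".", ".." and the hidden ".git"
git rev-parse --is-inside-work-tree   # prints "true" inside a repository
```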

Link to in depth exploration of git init at Atlassian Bitbucket.

Add a readme.md

A good practice is to add a README file (they are commonly named with capitals to be noted) to any repository. It will typically be a simple text file (not from any word processor), or perhaps a markdown (or md) file. The latter is a text file with some layout possibilities when published online for instance, but md files can also be just simple text files. Create a file called README.md using the touch command.

$ touch README.md

Then you can use the command line editor pico to add some information:

$ pico README.md

Repo for project on ...

Hit [ctrl]+[X] to exit pico and save the edits by pressing Y when asked.

Create a document

With your git repo (the directory with the hidden .git directory) set up, create a simple document, for instance a markdown file with an outline of chapters or sections. This first version is going to become your initial master. You can use pico without having created the file before:

$ pico Chapters.md

# Chapters in book on ...
...

Clone I

The core idea of git is to have one established master document (or project) and then share or work on copies (or clones). The differences between a clone and the master can then be reviewed and edits accepted or rejected (or edited further). Creating copies from the master is done with the command git clone master-directory [clone-directory]. If you omit the clone-directory from the command, the clone is created in the current working directory, in a new sub-directory named after the master. If your Terminal still points to the master, the clone thus ends up inside the master-directory:

$ git clone path/to/git-test-dir

You can only do this once, as the cloning process will detect that the sub-directory already exists and is not empty if you try to run the command more than once. Thus it is often better to explicitly state the target (or clone) directory; I prefer to give it both a date and a version for easier identification:

$ git clone path/to/git-test-dir path/to/git-test-dir-YYYYMMDD-vX

When you execute the git clone command you will get the following return at the terminal prompt:

warning: You appear to have cloned an empty repository.
done.

To actually get any content with the cloning you must first, in the master, add (stage in the git jargon) the files (or folders) you want to clone, and then commit the changes, while also including a message.

Link to Bitbucket Atlassian in depth page on git clone command.

git init + git clone

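As an alternative to running git clone into an empty (or non-existing) directory, you can create a git repo first and then run git clone from inside it; the sequence of commands then becomes: create a directory, initialise git and then clone. Note that git clone still creates its own sub-directory (named git-test-dir, with its own hidden .git directory) inside git-clone-dir, and it is that sub-directory that is connected to the master: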

$ mkdir git-clone-dir
$ cd git-clone-dir
$ git init
$ git clone path/to/git-test-dir

git remote

Regardless of how you created your clone repo, you can check its connections (which master it is related to) with the git remote command:

$ git remote -v

The response should be something like:

origin	path/to/git-test-dir (fetch)
origin	path/to/git-test-dir (push)

where origin is the default alias given to the master repo on which fetch and push operate. fetch and push are git commands for retrieving and sending data; they will be explained below.
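To try this without touching your own directories, here is a small sketch that creates an empty master and a clone in a temporary location and then inspects the connection. The directory names are placeholders, and git remote get-url is an additional command not used elsewhere in this post.

```shell
# Master and clone in a temporary location (placeholder names)
base=$(mktemp -d)
git init -q "$base/git-test-dir"
git clone -q "$base/git-test-dir" "$base/git-test-clone"   # warns that the master is empty

cd "$base/git-test-clone"
git remote -v                # lists origin with its fetch and push paths
git remote get-url origin    # prints only the path behind the alias origin
```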

Stage: git add

Make sure your Terminal window points to your original directory (master), then check the status:

$ git status

If you created the two markdown (.md) files outlined above, you should get the following response in the terminal window:

On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	Chapters.md
	README.md

nothing added to commit but untracked files present (use "git add" to track)

This tells us that we have two new (untracked) files that have not been added to the tracking process. In the git jargon, adding a file to the tracking process is called staging, but the command is git add. To stage the two existing documents run the commands:

$ git add Chapters.md
$ git add README.md

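or to stage all files and folders matched by the shell wildcard * (which skips hidden files, i.e. names starting with a dot):

$ git add *

or to stage everything in the current directory, hidden files included:

$ git add .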

It is enough to stage a new file once to make git track it. You can then tell commit to include the current state of all tracked files without staging them again. This will be clarified in the next section.

More details on Saving changes and git add, at Atlassian BitBucket.
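The difference between the two bulk variants of git add is easiest to see with a hidden file present. The sketch below uses a temporary repository and a made-up dotfile called .hidden-notes; both are assumptions for illustration only.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
touch README.md Chapters.md .hidden-notes   # .hidden-notes is a made-up dotfile

git add *                          # the shell expands * to the two visible files only
after_star=$(git status --short)
echo "$after_star"                 # "A " = staged, "??" = still untracked

git add .                          # stages everything in the current directory
git status --short                 # all three files are now staged
```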

git commit

If you again run the $ git status command, you will see that the file names have changed color from red to green (not shown here, but in the terminal window), and that the message is different:

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   Chapters.md
	new file:   README.md

The message tells us how to unstage a file (if we regret the staging), and that we have not done any commits, yet. To actually execute the commit you need to give a message (summary is compulsory, extended is optional).

$ git commit

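On Mac OSX this command opens the text editor Vim, and the terminal window will look similar to this (I have added my commit message at the bottom; the remarks following # on my message lines are annotations for this post and not part of the message):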

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
#
# On branch master
#
# Initial commit
#
# Changes to be committed:
#       new file:   Chapters.md
#       new file:   README.md
Initial commit # One line summary added by me
# Blank line separating summary and full message
Initial commit for test project, 20200218 # first line of full message
Only contains Chapters.md and README.md
~
~

Enter the message for your commit; typically you enter a one line summary, followed by a blank line and then the full message. The latter can be skipped. Writing informative summaries is an art, summarised(!) in the post The Art of the Commit by David Demaree.

To save the edits and exit Vim, first press [ESC] to make sure you are in command mode, then either hold down the [SHIFT] key and type [ZZ], or type [:wq] followed by [ENTER]. If you want to exit without saving, instead type [:q!] followed by [ENTER]. If you saved the edits, you should be returned to the ordinary terminal prompt, and get a message like this:

[master (root-commit) f8a8843] Initial commit
 2 files changed, 8 insertions(+)
 create mode 100644 Chapters.md
 create mode 100644 README.md

git identifies each unique commit by attaching a long (40 characters) hexadecimal number to every commit. Above, the code “f8a8843” constitutes the first 7 characters, which are enough to identify the commit.
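If you want to see both forms of the identifier, git rev-parse can print them. The following sketch assumes a throwaway repository with placeholder user details and the default SHA-1 object format; the hash values you get will differ from those shown in this post.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"

full=$(git rev-parse HEAD)            # complete 40-character id
short=$(git rev-parse --short HEAD)   # abbreviated id, as in the commit response
echo "$full"
echo "$short"
git --no-pager show --stat "$short"   # the abbreviation is enough to address the commit
```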

If you again execute the command $ git status, the response should be similar to:

On branch master
nothing to commit, working tree clean

You can simplify the commit process by adding the parameter -m followed by the message, and git will bypass the interactive message request:

$ git commit -m "Initial commit"

git commit -a[m]

You can include all files that were ever staged (added to the tracking system) in a commit without staging them again, by adding the parameter -a. However, this only affects files that are already tracked; new files that have never been staged are excluded.

$ git commit -a

To both include the staging (of all files with a history of staging) and the commit message the command becomes:

$ git commit -am "Initial commit"

More details can be found on Atlassian BitBucket git commit page.
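A short experiment shows what -a does and does not pick up. The repository, the file names tracked.txt and untracked.txt and the identity are placeholders chosen for this sketch.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "Initial commit"

echo "v2" >> tracked.txt       # modify a file that git already tracks
echo "new" > untracked.txt     # create a file that was never staged
git commit -q -am "Update tracked.txt"

git status --short             # only the never-staged file is still pending
```

The modified tracked file is committed without a separate git add, while the new file is left behind as untracked.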

git checkout

In git jargon, a “checkout” switches between different versions of either files, commits or branches - you will encounter checkout for several tasks. You can also use it to discard un-staged changes. Add some text to Chapters.md, you can use pico as shown above. Save the edits and then check status.

$ git status

On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   Chapters.md

no changes added to commit (use "git add" and/or "git commit -a")

git tells us to use "git checkout -- <file>…" to discard changes in working directory.

So let us just do that and then check the status again:

$ git checkout -- Chapters.md
$ git status

On branch master
nothing to commit, working tree clean

If you open Chapters.md you will see that the line you added is not there. The working directory version of Chapters.md has been replaced with the latest committed version. Note that edits discarded this way were never committed and cannot be recovered.
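The whole round trip can be reproduced in a throwaway repository, sketched here with placeholder content and identity. Newer git versions also offer git restore for the same task, mentioned only as a comment.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"

echo "### dummy" >> Chapters.md     # an edit we decide to throw away
git status --short                  # " M Chapters.md": modified, not staged
git checkout -- Chapters.md         # discard it (newer git: git restore Chapters.md)
git status --short                  # prints nothing: working tree clean
cat Chapters.md                     # only the committed line is left
```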

Undo stage (git reset)

Repeat the editing of the Chapters.md document (do not checkout), but this time stage the file:

$ git add Chapters.md

Try the command git diff:

$ git diff

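It should return nothing, because git diff without options compares the working directory with the stage, and you have just staged all your edits. Instead try:

$ git diff --staged

and you should see the difference between the stage and the most recent commit, or put differently, the difference between the stage tree and the history tree: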

diff --git a/Chapters.md b/Chapters.md
index 611a054..fc92164 100644
--- a/Chapters.md
+++ b/Chapters.md
@@ -13,3 +13,5 @@
 ### Cloud changes 1970 - 2020

 ### Glacier changes 1970 -2020
+
+### dummy

To unstage Chapters.md, that is to replace the staged version with the latest committed version (called HEAD, where HEAD is rather a pointer), run the git reset command:

$ git reset HEAD Chapters.md

Unstaged changes after reset:
M	Chapters.md

You can now proceed and check out Chapters.md as in the previous section.

$ git checkout -- Chapters.md
$ git status
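To make the interplay between git diff, git diff --staged and git reset concrete, this sketch records what each diff reports before and after unstaging. It runs in a temporary repository with placeholder content and identity.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"

echo "### dummy" >> Chapters.md
git add Chapters.md
unstaged_before=$(git diff)            # empty: working directory equals the stage
staged_before=$(git diff --staged)     # shows the added line

git reset HEAD Chapters.md             # unstage; the edit stays in the working directory
unstaged_after=$(git diff)             # now shows the added line
staged_after=$(git diff --staged)      # empty again
```

After the reset the edit has moved from the staged diff to the unstaged diff, which is why git checkout -- Chapters.md can then discard it.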

.gitignore

In the next section you are going to use the operating system (OS) for interfering in the working directory tree. This might leave traces from the OS, and git will acknowledge these changes as something to keep track of. To prevent that from happening you can create a .gitignore file, and in that file list the files, file types and folders that git should ignore:

$ pico .gitignore

I added the hidden Mac OSX file .DS_Store and all kinds of log files and folders:

.DS_Store
*.log
log/
logs/

Hit [ctrl]+[X] to exit pico and save the edits by pressing Y when asked.

You have to stage and commit the .gitignore file:

$ git add .gitignore
$ git commit -m ".gitignore"
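You can check that the ignore rules work by creating a few files that match them. In the sketch below the files debug.log, notes.md and logs/run.txt are made-up examples, and git check-ignore is an extra command that reports which rule matched.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
printf '.DS_Store\n*.log\nlog/\nlogs/\n' > .gitignore
git add .gitignore
git commit -q -m ".gitignore"

touch .DS_Store debug.log notes.md     # two ignored files and one ordinary file
mkdir logs
touch logs/run.txt                     # inside an ignored folder
git status --short                     # reports only notes.md as untracked
git check-ignore -v debug.log          # shows the .gitignore line that matched
```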

Delete file with operating system

Remove (delete) the file Chapters.md from the working directory. If you want to use the command line then type:

$ rm Chapters.md

To stage and commit the changes in one command:

$ git commit -am "deleting Chapters.md"

[master 41704db] deleting Chapters.md
 1 file changed, 15 deletions(-)
 delete mode 100644 Chapters.md

Restore earlier/deleted file

To restore the deleted file Chapters.md, first check out the git log for this file:

$ git log -- Chapters.md

In the returned message we can see that the most recent commit, “41704db”, is the deletion, and that the last commit that still contains Chapters.md is the one with the id starting “3f294b9”:

commit 41704dbe25e51f3fc49302c171af55dcc6b38475 (HEAD -> master)
Author: Karttur <thomas.gumbricht@karttur.com>
Date:   Thu Feb 20 11:27:23 2020 +0100

    deleting Chapters.md

commit 3f294b9e27e9184cf17b0acf4d61268181d67fe8
Author: Karttur <thomas.gumbricht@karttur.com>
Date:   Tue Feb 18 15:22:07 2020 +0100

    chapters.md extended

commit f8a88432ba928f9e8e2f245f5c13306b9641443a
Author: Karttur <thomas.gumbricht@karttur.com>
Date:   Tue Feb 18 11:23:08 2020 +0100

    Initial commit

    Initial commit for test project, 20200218
    Only contains Chapters.md and README.md

To retrieve the last version of Chapters.md from the history tree, use the command git checkout introduced above:

$ git checkout 3f294b9 -- Chapters.md

List (ls) the files in your working directory, and Chapters.md should be back. When given a commit id, git checkout restores the file to both the working directory and the staging area, so the restored file is already staged. Thus, to commit the changes, simply type

$ git commit -m "Restoring Chapters.md"
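Instead of reading the commit id from the log by hand, you can let git find it. This sketch is a variation on the steps above, not the method used in this post: it looks up the newest commit that touched the file (the deletion) and checks the file out from that commit's parent, written with the ^ suffix. Repository and identity are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"
rm Chapters.md
git commit -q -am "deleting Chapters.md"

# Newest commit that touched the file, i.e. the deletion
deleted_in=$(git rev-list -n 1 HEAD -- Chapters.md)
# Take the file from the parent (^) of the deletion commit
git checkout "$deleted_in^" -- Chapters.md
git status --short                         # "A  Chapters.md": restored and staged
git commit -q -m "Restoring Chapters.md"
```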

git rm

The command git rm is the git equivalent of the operating system rm. The difference is that with git rm you also remove the file from the staging tree.

$ git rm Chapters.md

The removal is already staged, and you can commit:

$ git commit -m "deleting Chapters.md"

Restore the file exactly as done above (and commit the restoration as before):

$ git checkout 3f294b9 -- Chapters.md

git rm manual on Atlassian Bitbucket.

Clone II

You have now both staged and committed changes made to the master repo. If you now run the git clone command, the target (or clone) directory will contain the committed files (folders). Note that you cannot clone into a target directory that already exists as a clone; in that case you instead pull the content from the master, as explained in the next section. You can only clone into an empty target directory.

$ git clone path/to/git-test-dir path/to/git-test-dir-YYYYMMDD-vY

The message returned in the terminal window should now be something like:

Cloning into 'path/to/git-test-dir-YYYYMMDD-vY'...
done.

And if you explore the clone it should contain identical copies of the two files in the master repo.

git pull

When you created your first clone (in the section Clone I towards the beginning of this post) nothing was actually copied (or cloned) as you had not yet committed any changes. Copying changes made in the master to an existing clone involves two steps: fetch and merge. As explained on the Atlassian Bitbucket in depth page on git fetch, fetch is the ‘safe’ version of pull; it downloads the new commits into the hidden .git directory of the clone but does not touch the files in the working directory. To complete the update you must run merge following fetch.

The pull command combines fetch and merge. Thus you can pull any committed changes by executing the git pull command from the clone. Technically git pull fetches the commits from the master (whether remote or local) and merges them into the current branch of your clone, updating both the history tree and the working directory tree.

Change directory (cd) to the clone you want to update:

$ cd path/to/git-test-dir-YYYYMMDD-vX

and then execute the git pull command:

$ git pull

The terminal response should be something like:

remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 4 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (4/4), done.
From /path/to/git-test-dir
 * [new branch]      master     -> origin/master

If you directly re-run the pull command:

$ git pull

The response will state that you are up to date:

Already up to date.

If you now explore the initial clone, it also should contain identical copies of the two files in the master repo.

Details on git pull at Atlassian Bitbucket.
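To see the two halves of git pull separately, the sketch below runs fetch and merge one at a time and checks the working directory in between. The repository names are placeholders, and @{u} is git shorthand for the upstream branch that the clone tracks, used here so the script does not depend on the branch being called master or main.

```shell
base=$(mktemp -d)
git init -q "$base/master-repo"
cd "$base/master-repo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"
git clone -q "$base/master-repo" "$base/clone-repo"

echo "### Albedo changes" >> Chapters.md   # new commit in the master
git commit -q -am "chapters.md extended"

cd "$base/clone-repo"
git fetch -q origin                        # step 1: download the new commit
before=$(grep -c "Albedo" Chapters.md)     # 0: working directory not yet updated
git merge -q --ff-only "@{u}"              # step 2: merge the fetched commit
after=$(grep -c "Albedo" Chapters.md)      # 1: the new line has arrived
echo "$before $after"
```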

Edit master and try clone again

Return to your master repo and edit the Chapters.md document by adding a few more chapters to your project. You can use any text editor, including pico or Vim. Save your edits.

In a terminal window, cd back to your master repo (the directory “git-test-dir”).

First stage the changes you made:

$ git add Chapters.md

then commit all added (-a) changes with a new message (-m):

$ git commit -am "chapters.md extended"

[master 628ee9d] chapters.md extended
 1 file changed, 2 insertions(+)

With your edits in master both staged and committed, cd back to your clone (the directory “git-test-dir-YYYYMMDD-vX”), and execute pull:

$ git pull

remote: Enumerating objects: 8, done.
remote: Counting objects: 100% (8/8), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 6 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From /path/to/git-test-dir
   f8a8843..628ee9d  master     -> origin/master
Updating f8a8843..628ee9d
Fast-forward
 Chapters.md | 6 ++++++
 1 file changed, 6 insertions(+)

Your local clone will now be updated to reflect the master. To check that out you can open the file “/path/to/git-test-dir-YYYYMMDD-vX/Chapters.md” to see that it contains the changes you made in the master copy.

Edit master and clone, and clone again

To see what happens when you have edited both the master and the clone, add a few more chapters to both copies of the document Chapters.md (i.e. in the master and the clone repos). Add different chapters in the two versions so that they are both changed and different.

Return to the master repo and stage and commit the changes (as in the previous section). Then again try to pull from the clone:

$ git pull

remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /path/to/git-test-dir
   628ee9d..3f294b9  master     -> origin/master
Updating 628ee9d..3f294b9
error: Your local changes to the following files would be overwritten by merge:
	Chapters.md
Please commit your changes or stash them before you merge.
Aborting

As you see from the message returned in the Terminal window, the pull was aborted because the changes you made to the clone copy would be overwritten by the master and lost. The solution is to analyse the differences with git diff and then fix them either manually or with git stash.

git diff

When you have overlapping changes in both the master and the clone this results in conflicting differences. You will not be able to pull or push any changes before resolving the differences. To analyse the difference, use the command git diff; in our example you should execute the command from the clone repository:

$ git diff

diff --git a/Chapters.md b/Chapters.md
index dbda5d7..1c9c6a8 100644
--- a/Chapters.md
+++ b/Chapters.md
@@ -11,3 +11,5 @@
 ### Albedo changes 1970 - 2020

 ### Cloud changes 1970 - 2020
+
+### Arctic sea ice changes 1970 - 2020

Run without options, git diff compares the working directory of the clone with its own stage (here identical to its last commit), not with the master. The two lines “### Albedo changes 1970 - 2020” and “### Cloud changes 1970 - 2020” are unchanged context lines, whereas the last (green fonted) line with a plus sign “+” in front (### Arctic sea ice changes 1970 - 2020) is the line I added in the clone but have not yet staged.

If you try the git status command in the clone, you will get some hints:

$ git status

On branch master
Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded.
  (use "git pull" to update your local branch)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   Chapters.md

no changes added to commit (use "git add" and/or "git commit -a")

You have neither staged nor committed the changes that you made in the clone, and when you try git pull the uncommitted changes prevent the edited master file from being copied.

You can get more details by specifying what diff you want to analyse, for instance the difference between the stage and the most recent commit:

$ git diff --staged

There is a lot more you can analyse with git diff. But the result above is sufficient for manually correcting (harmonizing or transferring all changes to one of the two copies). But you can also do it by stashing.

git stash

git stash temporarily shelves (or stashes) changes made in a repo, which allows you both to continue with other tasks and to return to the stashed changes at a later time. You can solve the conflict between the two different versions in the previous section by stashing the changes in your clone. Make sure your terminal window points to the clone and execute:

$ git stash

Saved working directory and index state WIP on master: 628ee9d chapters.md extended

You can now apply the pull command to get the changes in the master repo to the clone:

$ git pull

Updating 628ee9d..3f294b9
Fast-forward
 Chapters.md | 2 ++
 1 file changed, 2 insertions(+)

If you look at the content of Chapters.md in your clone it will (again) be identical to the original in master. But you have stashed changes that you need to attend to. As your clone is no longer identical to the stashed version, there will be a conflict when you merge them. But the conflicting lines will be clearly marked and you need to manually edit the conflicts. To retrieve the stashed version you can either choose to automatically delete (or pop) it after merging, or keep it so that it can be applied again. To keep the stash:

$ git stash apply

and to delete it after applying:

$ git stash pop

In both cases you will get the same result reported in the terminal window (and because of the conflict, git stash pop keeps the stash instead of deleting it):

Auto-merging Chapters.md
CONFLICT (content): Merge conflict in Chapters.md

If you open the copy of Chapters.md in your clone you will see how git recorded the conflicts:

<<<<<<< Updated upstream
### Glacier changes 1970 -2020
=======
### Arctic sea ice changes 1970 - 2020
>>>>>>> Stashed changes

In our simple example we want the copy to include both the upstream update and the stashed changes, so we simply delete the conflict markers (the lines starting with <<<<<<<, ======= and >>>>>>>):

### Glacier changes 1970 -2020

### Arctic sea ice changes 1970 - 2020

and save the changes.

When you are finished you can drop the stash:

$ git stash drop

The capabilities of git stash are much larger than outlined above.
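The stash workflow of this section can be replayed end to end in a temporary location. The sketch below uses placeholder repository names and identity, and shortened chapter titles; it stops right after the expected conflict so that you can inspect the markers yourself.

```shell
base=$(mktemp -d)
git init -q "$base/master-repo"
cd "$base/master-repo"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"
git clone -q "$base/master-repo" "$base/clone-repo"

echo "### Glacier changes" >> Chapters.md  # edit and commit in the master
git commit -q -am "chapters.md extended"

cd "$base/clone-repo"
git config user.name "Example User"        # git stash stores commits, so it needs an identity
git config user.email "user@example.com"
echo "### Arctic sea ice changes" >> Chapters.md   # uncommitted edit in the clone

git stash -q                       # shelve the local edit
git pull -q --ff-only              # the pull now succeeds
git stash pop || true              # conflict expected: both sides appended a line
grep -c '^<<<<<<<' Chapters.md     # counts the conflict markers in the file
git stash list                     # the stash is still listed after the failed pop
```

Once the file has been edited by hand, git stash drop removes the leftover stash as described above.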

git push

The git push command is used to transfer data from a clone repo to the master repo (or from a local to a remote repo). In this post, however, we set up the master as the primary development environment, and by default git refuses pushes from the clone to the branch that is checked out in the master. To be able to push you need to set up the master as a bare repository, that is a repository that contains only the content of the hidden .git directory and no working directory. How to go about creating such a git repository is the topic of the parallel post on Shared master local git control.
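As a rough preview only, and assuming that a repository created with git init --bare is the kind of master meant here (the parallel post may set things up differently), a push into such a repository can be sketched like this. The names shared.git and work are placeholders.

```shell
base=$(mktemp -d)
git init -q --bare "$base/shared.git"          # bare: git data only, no working directory
git clone -q "$base/shared.git" "$base/work"   # warns that the repository is empty

cd "$base/work"
git config user.name "Example User"            # placeholder identity
git config user.email "user@example.com"
echo "# Chapters" > Chapters.md
git add Chapters.md
git commit -q -m "Initial commit"

git push -q origin HEAD            # send the current branch to the bare repo
git ls-remote --heads origin       # lists the branch now stored in shared.git
```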

Resources

A Visual Git Reference by M Lodato (2010).

Pro Git - Everything you need to know about git by Scott Chacon and Ben Straub (20200219).

Learn Git with Bitbucket Cloud

YouTube tutorial Introduction to Git - Core Concepts by D. Mahler (20170621)

Better understanding Git’s work flow in order to properly deal with merge conflicts — Part I by Ted Goldfus (20160617)