[Oberon] all in one git tree
private at claudio.ch
Mon Dec 28 00:52:42 CET 2020
> Thirdly, Git is being used for a huge number of projects for which it
> is overkill. It is like Perl: a sort of Swiss Army Chainsaw with a
> tonne of functionality that most of its users don't need, and for whom
> it merely makes life overcomplicated.
Interestingly, at its core git is rather simple and robust. Whenever a
file is added or changed, the whole file is stored (compressed) in the
repository (i.e. what you find in the .git directory) in a file having
its SHA1 checksum as name.
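
A minimal Python sketch of how such an object name can be computed. One
detail beyond the description above: git does not hash the bare file
content but prepends a short header ("blob", the size in bytes, a NUL
byte) before hashing. The function names git_blob_id and
loose_object_path are made up for this illustration.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA1 name git gives to a file with this content."""
    # Header: object type, a space, the size in bytes, then a NUL byte.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

def loose_object_path(object_id: str) -> str:
    """Path below .git where a loose (not yet packed) object is kept.

    The first two hex digits are used as a directory name.
    """
    return "objects/" + object_id[:2] + "/" + object_id[2:]

if __name__ == "__main__":
    oid = git_blob_id(b"hello world\n")
    print(oid)
    print(loose_object_path(oid))
```

The result should match what git hash-object prints for the same
content.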
For every commit typically two more files are created:
* one which maps the location of all files, i.e. their original path
and file name, to the SHA1 checksum of the file stored in the
repository. This is called a tree file and its name is again the
SHA1 hash of its content.
* one which contains the hash of the above-mentioned tree file, the
hash(es) of the preceding commit file(s) and useful information like
the date, the name of the person who made the commit, a commit message
etc. This is
called a commit file and its name is again the SHA1 hash of its content.
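
To make the two file types more concrete, here is a rough Python sketch
of what goes into a tree file and a commit file before hashing. It is a
simplification under stated assumptions: only regular files (mode
100644) in a single flat directory, and the helper names object_id,
tree_body and commit_body are invented for this example.

```python
import hashlib

def object_id(kind: str, body: bytes) -> str:
    """SHA1 over '<kind> <size>' + NUL + body, as used for all object types."""
    header = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_body(entries: dict) -> bytes:
    """entries maps a file name to the hex id of the stored file.

    Assumption: regular files only, no subdirectories.
    """
    body = b""
    for name in sorted(entries):
        # Each entry: mode, space, name, NUL, then the id as 20 raw bytes.
        body += b"100644 " + name.encode("utf-8") + b"\0"
        body += bytes.fromhex(entries[name])
    return body

def commit_body(tree_id, parent_ids, who, when, message) -> bytes:
    """Text of a commit file: tree, parent(s), author, committer, message."""
    lines = ["tree " + tree_id]
    lines += ["parent " + p for p in parent_ids]
    lines.append("author " + who + " " + when)
    lines.append("committer " + who + " " + when)
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")

if __name__ == "__main__":
    blob = object_id("blob", b"hello world\n")
    tree = object_id("tree", tree_body({"hello.txt": blob}))
    # Author and timestamp are placeholder values.
    body = commit_body(tree, [], "A U Thor <author@example.com>",
                       "1600000000 +0000", "first commit")
    print(tree)
    print(object_id("commit", body))
```

Because the commit file contains the hash of the tree file, and the
tree file contains the hashes of the stored files, changing any file
changes all hashes above it.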
A branch is represented by a file having the branch name as file name
and holding just the SHA1 hash of the most recent commit file.
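
A small Python sketch of reading and moving a branch, assuming the
branch is stored as a plain file under refs/heads in the .git directory.
Git can also collect branches in a single packed-refs file, which this
sketch ignores, and the names read_branch and move_branch are
hypothetical.

```python
import os

def branch_path(git_dir: str, branch: str) -> str:
    """Location of the file that represents a branch."""
    return os.path.join(git_dir, "refs", "heads", branch)

def read_branch(git_dir: str, branch: str) -> str:
    """Return the hash of the most recent commit on the branch."""
    with open(branch_path(git_dir, branch), "r", encoding="ascii") as f:
        return f.read().strip()

def move_branch(git_dir: str, branch: str, commit_id: str) -> None:
    """Point the branch at another commit by rewriting its file.

    Real git additionally uses lock files and keeps a log of such moves.
    """
    path = branch_path(git_dir, branch)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="ascii") as f:
        f.write(commit_id + "\n")
```

Creating a new commit on a branch thus ends with overwriting one small
file with the new commit hash.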
Thus if e.g. you modify two files then commit, four new files are
created in the repository:
* Two files having the complete new content of the two files you have
modified,
* A tree file which is basically a copy of the last tree file, except
that the two entries pertaining to your two changed files now hold
the new checksums,
* A commit file which refers to the new tree file and has a reference
to the last commit file.
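
The count of four can be checked with a toy model. The Python sketch
below keeps the repository as a dictionary in memory instead of files on
disk and assumes all files sit in one directory (with subdirectories,
git writes one tree file per directory, so more tree files would
change). The names store, put and commit are invented for this example
and the author line is a fixed placeholder.

```python
import hashlib

store = {}  # object id -> raw object bytes; stands in for .git/objects

def put(kind: str, body: bytes) -> str:
    """Store an object under the SHA1 of its header plus content."""
    raw = kind.encode("ascii") + b" " + str(len(body)).encode("ascii") + b"\0" + body
    oid = hashlib.sha1(raw).hexdigest()
    store[oid] = raw  # storing identical content twice changes nothing
    return oid

def commit(files: dict, parent, message: str) -> str:
    """files maps file name -> content (bytes). Returns the new commit id."""
    tree = b""
    for name in sorted(files):
        blob_id = put("blob", files[name])
        tree += b"100644 " + name.encode("utf-8") + b"\0" + bytes.fromhex(blob_id)
    tree_id = put("tree", tree)
    text = "tree " + tree_id + "\n"
    if parent is not None:
        text += "parent " + parent + "\n"
    # Fixed, made-up author line so that the example is reproducible.
    text += "author A <a@example.com> 0 +0000\n"
    text += "committer A <a@example.com> 0 +0000\n"
    text += "\n" + message + "\n"
    return put("commit", text.encode("utf-8"))

if __name__ == "__main__":
    files = {"a.txt": b"one\n", "b.txt": b"two\n", "c.txt": b"three\n"}
    first = commit(files, None, "initial")
    before = len(store)  # 3 stored files + 1 tree + 1 commit
    files["a.txt"] = b"one, changed\n"
    files["b.txt"] = b"two, changed\n"
    commit(files, first, "change two files")
    print(len(store) - before)  # prints 4
```

The unchanged file c.txt is not stored again, since its content and
therefore its name in the repository are the same as before.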
As a side note, I found it very insightful to do an exercise where I
created an empty git repository with git init /tmp/X, then looked at all
the files created (looking at the content either with a hex editor or
with git cat-file -p), then used git add on a file and had a look at
what changed, then did a commit etc., to understand what the commands
do. Actually seeing what the git commands I use do behind the scenes
helped me a lot to understand how to use them properly.
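
For this kind of exploration it helps to know that the files under
.git/objects are zlib-compressed, so a hex editor shows compressed
bytes. The Python sketch below decompresses one such file and splits off
the header. It only handles loose objects, not objects that git has
already moved into pack files, and read_loose_object is a name made up
here, not a git API.

```python
import os
import sys
import zlib

def read_loose_object(git_dir: str, object_id: str):
    """Return (kind, content) of a loose object, e.g. ('blob', b'...')."""
    path = os.path.join(git_dir, "objects", object_id[:2], object_id[2:])
    with open(path, "rb") as f:
        raw = zlib.decompress(f.read())
    # The decompressed data is '<kind> <size>' + NUL + content.
    header, _, body = raw.partition(b"\0")
    kind, size = header.decode("ascii").split(" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match content length")
    return kind, body

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python thisfile.py /tmp/X/.git <full 40 digit object id>
    kind, body = read_loose_object(sys.argv[1], sys.argv[2])
    print(kind)
    sys.stdout.buffer.write(body)
```

For files and commits the content is readable as is. Tree files hold
the checksums as raw bytes, which is why git cat-file -p is more
convenient for them.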
With regard to something like Oberon Text files or non plain text files
in general, keeping the content in the repository to just have snapshots
of individual states of these files over time works perfectly with git.
It starts to get tricky once you want to see what changed between two
commits. git knows what files have changed and it knows for each file
the old and the new content, but then needs a way to show to the user
what has changed. An easy task for a plain text file (using the unix
diff command) but more difficult for something else. First, the diff
command needs an understanding of a more complex file format, and
furthermore it needs to be able to show the change to the user in an
understandable way. It also depends a lot on what the user is
interested in seeing. Is it ok to just show that a statement was
inserted, or is the user also interested to see that an otherwise
unchanged statement is now set in a different font?
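
For the plain text case, a line-based comparison is only a few lines of
Python using the standard difflib module. This is just an illustration
of the idea, not what git runs internally, and the Oberon snippet and
the name show_changes are invented for the example.

```python
import difflib

def show_changes(old_lines, new_lines, name):
    """Return a unified diff between two versions of a text file."""
    diff = difflib.unified_diff(old_lines, new_lines,
                                fromfile="a/" + name, tofile="b/" + name)
    return "".join(diff)

if __name__ == "__main__":
    old = ["MODULE Hello;\n", "BEGIN\n", "END Hello.\n"]
    new = ["MODULE Hello;\n", "  IMPORT Out;\n", "BEGIN\n",
           "  Out.String(\"Hello\"); Out.Ln\n", "END Hello.\n"]
    print(show_changes(old, new, "Hello.Mod"), end="")
```

Such a comparison only sees lines of characters. Attributes like the
font of a statement in an Oberon Text file have no representation here,
so a change in them would either be invisible or show up as unreadable
noise.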
It gets even more difficult, once you consider merging changes made on
one branch onto another one. Not only do you need a tool that can create
the difference between two commits, you need additionally a tool that
can detect whether the change can easily/automatically be applied to a
target file, and in that case knows how to apply the change. When it
detects that a change cannot be applied automatically, it needs a way to
present the issue to the user, so the user can see and understand what
needs to be done. Finally, it needs a way for the user to give back
their decision on what the outcome of the merge shall be.
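
The decision whether a change can be applied automatically can be
sketched for plain text as a three-way comparison against the common
ancestor. The Python sketch below is deliberately naive: it assumes all
three versions have the same number of lines, i.e. lines were only
edited in place. Real merge tools first align the lines so that
insertions and deletions are handled too, and this is not git's actual
algorithm. The name merge_lines is hypothetical.

```python
def merge_lines(base, ours, theirs):
    """Merge two edited versions of a text, line by line.

    Returns (merged, conflicts). merged holds None at every position
    where both sides changed the same line differently; conflicts lists
    those positions so they can be presented to the user.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (changed or not)
            merged.append(o)
        elif o == b:      # only their side changed the line
            merged.append(t)
        elif t == b:      # only our side changed the line
            merged.append(o)
        else:             # both changed it differently
            merged.append(None)
            conflicts.append(i)
    return merged, conflicts

if __name__ == "__main__":
    base = ["x := 1", "y := 2", "z := 3"]
    ours = ["x := 10", "y := 2", "z := 3"]
    theirs = ["x := 1", "y := 2", "z := 30"]
    print(merge_lines(base, ours, theirs))
```

Even this simple rule relies on comparing lines of text. For a richer
file format, each of the three comparisons would need a tool that
understands the format.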
By choosing to (properly) support plain text files only, you avoid these
difficulties and that is what git does, given that its first use case
was plain text files anyway.