How I Use Git to Automatically Sync My Writing
One of the annoyances I face when writing is keeping my files synced across multiple computers in several locations. I might work on tomorrow’s lecture on my office computer for 30 minutes, after which I get distracted and have to deal with other things. I get home, eat dinner, and head into my home office for an hour to finish up the lecture. There’s only one problem: I forgot to push that file to the repo, so I don’t have access to anything I did earlier in the day, including the outline of the lecture.
There are a couple of ways people avoid this scenario:
1. They use a service like Dropbox or Syncthing.
2. They work entirely in the cloud.
I don’t have much interest in either of these. Option (1) requires me to fork over large sums of cash ($12/month for Dropbox) or to set up and maintain my own syncing service. I’m too stingy for the former and have no desire to spend my time doing the latter. Option (2) is not going to happen because I want my files on my hard drive. I edit files in a text editor, not a web browser, and I want my changes version controlled.
So where does this leave me? One of the joys of being a Linux user is writing short scripts to automate tasks you don’t want to do or that you might forget to do.
My Syncing System
Here’s my three-step approach to syncing content to a Git repo. The key is that everything is done by a script. I click an icon to activate the script and it handles the rest.
Step 1. Commit all files, then pull from the server. This is a server -> local computer sync that ensures I'm building on the most recent version of the files. One downside of using Git for syncing is that it's easy to start typing without first doing a pull; then your first push is rejected, and the pull that follows can land you in a merge conflict. This step means I don't have to worry about finding myself in that ugly situation.
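Step 1 amounts to a few plain Git commands. Here's a minimal sketch as a small shell function; the function name and commit message are mine, not necessarily what the actual gitsync script uses:

```shell
# Step 1: snapshot local state first, then pull so new work
# builds on the latest version from the server.
sync_down() {
  git add -A                                 # stage new and modified files
  git commit --quiet -m "autosync" || true   # fine if there's nothing to commit
  git pull --quiet                           # server -> local: get the latest
}
```

Committing before pulling matters: it means any local edits are already recorded, so the pull can merge them rather than refuse to run over a dirty working tree.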
Step 2, repeated every 60 seconds. Add all untracked files, then commit. This is a fast operation when all you're doing is creating and editing text files. There's no technical reason I couldn't do it every 5 seconds, but I wouldn't want to do it that frequently. There's a tradeoff between fine-grained tracking of changes and being able to navigate your repo history. 60 seconds has worked fine for me, but I could also see someone committing every five or even fifteen minutes. The repeat is handled by the built-in Linux command watch (see below).
Step 3, repeated every five minutes. Push to the server. This is the local computer -> server part of the sync. It's implemented by checking the current time and doing a push if the minute is divisible by 5. I don't want to push too often. Pushing is a network operation, so it can be slow. Syncing should be done with sufficient frequency that you don't lose much if your computer dies or if you shut down your computer to go home. Five minutes is often enough for me.
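Steps 2 and 3 combine into one periodic tick: always commit, but only push when the minute is a multiple of five. A sketch, with the check split into its own function so it's testable (names are my own, not necessarily the author's):

```shell
# Decide whether to push, given the current minute (0-59).
should_push() {
  [ $(( $1 % 5 )) -eq 0 ]
}

# One tick of the sync loop: commit always, push occasionally.
sync_tick() {
  git add -A
  git commit --quiet -m "autosync" || true   # no-op when nothing changed
  if should_push "$(date +%-M)"; then        # %-M: minute without a leading zero
    git push --quiet                         # the slow network step, so throttled
  fi
}
```

Note the `%-M` format: a zero-padded minute like `08` would be rejected as an invalid octal number by shell arithmetic, so the leading zero has to go.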
The approach described above has one potential limitation. If you switch to another machine, you might not have access to changes made in the last few minutes on the previous one, including any files created at the end of that session. It also opens the door to merge conflicts.
As you can see below, my script does a final push to the server when I shut things down. Under normal usage, there's no way for changes to slip through the cracks and not be synced.
The Code
Here’s the command I run by clicking an icon when I’m ready to start writing:
xterm -hold -e 'cd ~/dendron;~/bin/gitsync f;watch -n 60 ~/bin/gitsync;~/bin/gitsync p'
And this is what each part means:
- xterm -hold -e: Open an xterm window, keep it open, and execute the command that follows.
- cd ~/dendron: Change into the directory holding my notes, in this case the directory holding my Dendron repo.
- ~/bin/gitsync f: Run the Git syncing script with option f. It's a full sync cycle: add untracked files to the repo, commit, pull from the server, push back to the server.
- watch -n 60 ~/bin/gitsync: Every 60 seconds, do a commit. If the minute is divisible by 5, the gitsync script follows up with a push.
- ~/bin/gitsync p: If I hit Ctrl-c, that kills the watch command, but the full command continues to run inside xterm. It moves on to call the gitsync script with argument p, which tells it to commit and push. In other words, when I've decided I'm done writing for the day, there's a final push to sync the most recent changes. As long as I work in the usual way, killing the watch command before I quit, everything is guaranteed to sync.
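Putting the pieces together, a gitsync-style script matching the behavior described above might look like the following. The post doesn't show the actual script, so treat this as a hypothetical reconstruction, written as a function:

```shell
# Hypothetical gitsync, mirroring the post's description (not its code):
#   gitsync f   full cycle at startup: commit, pull, push
#   gitsync p   final commit and push at shutdown
#   gitsync     periodic tick under watch: commit, push on minutes divisible by 5
gitsync() {
  git add -A
  git commit --quiet -m "autosync" || true          # fine if nothing to commit
  case "$1" in
    f) git pull --quiet && git push --quiet ;;      # sync both directions
    p) git push --quiet ;;                          # final push
    *) if [ $(( $(date +%-M) % 5 )) -eq 0 ]; then   # throttle the network step
         git push --quiet
       fi ;;
  esac
}
```

With this shape, the xterm command line above reads naturally: a full cycle on startup, a commit (and occasional push) every 60 seconds under watch, and a final push when the watch is killed.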
Pros and Cons
There are some advantages of this approach to syncing compared with alternatives like Dropbox:
- It’s built on top of Git. Few apps have achieved the level of reliability of Git. You get full version history for free. Selective sync is done by making entries in a .gitignore file.
- It has no dependencies. It’s nothing but a short script calling standard Linux commands.
- It’s future-proof. There will never be a need to upgrade any of the code - it will almost certainly work in this exact form 20 years from now.
- Simplicity. It took less than 20 minutes to write a full, working sync system. That’s less time than it would have taken me to purchase and install Dropbox. Installation on a new computer takes about 30 seconds. There are never “edge cases” where for some reason, it just doesn’t work the way it’s supposed to. It just works.
- I can use the Github web interface if I’m using a Github repo. Issues, discussions, etc. are all available as with any other Github repo.
The main limitation is that, being built on top of Git, this system is designed explicitly for syncing text files. A few small non-text files (Word documents, PDF files, screenshots) are okay as long as they don't change very often. But if I'm editing a Word document, repeatedly building a PDF from a markdown document as I make changes, or storing music and video files, this isn't a great solution.