The problem with Git Submodules

time to read 5 min | 880 words

The builtin answer for sharing code between multiple projects is quite simple…

git submodule

But it introduces several problems along the way:

  • You can’t just git clone the repository, you need to clone the repository, then call git submodule init & git submodule update.
  • You can’t just download the entire source code from github.
  • You can’t branch easily with submodules, well, you can, but you have to branch in the related projects as well. And that assumes that you have access to them.
  • You can’t fork easily with submodules, well, you can, if you really feel like updating the associations all the time. Which is really nasty.

Let me present you with a simple scenario, okay? I have two projects that share a common license. Obviously I want all projects to use the same license and the whole thing to be under source control.

Here is our basic setup:

PS C:\Work\temp> git init R1
Initialized empty Git repository in C:/Work/temp/R1/.git/
PS C:\Work\temp> git init R2
Initialized empty Git repository in C:/Work/temp/R2/.git/
PS C:\Work\temp> git init Lic
Initialized empty Git repository in C:/Work/temp/Lic/.git/
PS C:\Work\temp> cd R1
PS C:\Work\temp\R1> echo "Hello Dolly" > Dolly.txt
PS C:\Work\temp\Lic> cd ..\R1
PS C:\Work\temp\R1> git add --all
PS C:\Work\temp\R1> git commit -m "initial commit"
[master (root-commit) 498ab77] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Dolly.txt
PS C:\Work\temp\R1> cd ..\R2
PS C:\Work\temp\R2> echo "Hello Jane" > Jane.txt
PS C:\Work\temp\R2> git add --all
PS C:\Work\temp\R2> git commit -m "initial commit"
[master (root-commit) deb45bc] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Jane.txt
PS C:\Work\temp\R2> cd ..\Lic
PS C:\Work\temp\Lic> echo "Copyright Ayende (C) 2011" > license.txt
PS C:\Work\temp\Lic> git add --all
PS C:\Work\temp\Lic> git commit -m "initial commit"
[master (root-commit) 8e8b1b4] initial commit
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 license.txt

This just gives us the basics. Now I want to share the license.txt file between the projects. I can do that with submodules, like so:

PS C:\Work\temp\R1> git submodule init
PS C:\Work\temp\R1> git submodule add C:\Work\temp\Lic Legal
Cloning into Legal...
done.
PS C:\Work\temp\R1> cd ..\R2
PS C:\Work\temp\R2> git submodule init
PS C:\Work\temp\R2> git submodule add C:\Work\temp\Lic Legal
Cloning into Legal...
done.

Now, this looks nice, and it works beautifully. Until you start sharing this with other people. Then it starts to become somewhat messy.

For example, let us say that I want to add a disclaimer in R1:

PS C:\Work\temp\R1\Legal> echo "Not for Jihad use" > Disclaimer.txt
PS C:\Work\temp\R1\Legal> git add .\Disclaimer.txt
PS C:\Work\temp\R1\Legal> git commit -m "adding disclaimer"
[master db3987c] adding disclaimer
 1 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Disclaimer.txt

And here is where the problems starts. Let us assume that I want to make a change that is local to just this project.

Well, guess what, you can’t. Not if you intend to share this with other people. You need to push your changes to the submodules somewhere, and that means that if you need to fork the original project, update references to the project. Of course, if there is an update to the original submodule, you need to have two stages to update that.

And we haven’t spoken yet on the fun of pushing the main repository but forgetting to push the submodule. It gives a new meaning to “it works on my machine”.

In short, git submodules looks like a good idea, but they aren’t really workable in the real world. I’ll have a new post shortly showing how to deal with the issue