Creating a monorepo from separate repos (merging repositories)
In this post we’ll look at how to combine multiple repositories into a single repository.
A common reason to merge repos into a single repo (a monorepo) is that you realize changes in one repo affect another. It is better to have a single atomic commit with all the changes across projects than separate commits for each repo. A single repo makes it easier to keep the code across projects working together against a known set of dependencies, which reduces maintenance and makes tracking or reverting changes easier.
Usually a repo starts with a small scope, but as time passes it grows and begins to overlap with the responsibilities of another repo. Perhaps there is a common piece of code you want to move to a shared dependency, or you simply want to keep related projects together so they can be managed more easily.
In this specific example we’ll be using repos with TypeScript development on Node, but the technique can be applied to any repository because it only uses git and does not depend on the language or project type.
Walkthrough
Let’s imagine we have three separate repos, repo1, repo2, and repo3, and we want to combine them into a new monorepo.
We will use the technique described in this Stack Overflow answer:
https://stackoverflow.com/a/17373088/545566
I have created samples for you to clone and follow along:
(The article assumes all the repos are cloned side by side in the same folder.)
git clone https://github.com/mattmazzola/repo1
git clone https://github.com/mattmazzola/repo2
git clone https://github.com/mattmazzola/repo3
There is also a repo of the final result available. Instead of cloning that final result, we’ll go through the steps to better learn the process.
Creating the monorepo
There is an initial branch on the final repo you can clone to get started. Or, if you want to go completely manual, you can create it yourself: it is a single monorepo folder with a single package.json generated by npm init -y. If you don’t have npm installed but still want to follow along, cloning the branch is the option for you.
git clone https://github.com/mattmazzola/monorepo -b initial
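If you would rather take the manual route, here is a minimal sketch of what that could look like. It works in a scratch folder from mktemp, which is my assumption and not something the sample repos require, and it falls back to a hand-written package.json when npm is not installed. The fallback file contents are only illustrative.

```shell
#!/bin/sh
# Sketch: create the starting monorepo by hand instead of cloning the initial branch.
set -eu

work=$(mktemp -d)            # scratch location (assumption); any folder works
mkdir "$work/monorepo"
cd "$work/monorepo"
git init -q

# Prefer npm init -y; otherwise write a minimal package.json by hand.
if command -v npm >/dev/null 2>&1 && npm init -y >/dev/null 2>&1; then
  echo "package.json created by npm"
else
  printf '{\n  "name": "monorepo",\n  "version": "1.0.0"\n}\n' > package.json
  echo "package.json written by hand"
fi

git add package.json
# Inline identity so the sketch does not depend on your global git config.
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Initial commit"

initial_commits=$(git rev-list --count HEAD)
echo "monorepo created at $work/monorepo with $initial_commits commit"
```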
You should have a folder structure that looks like this:
Note: The individual repos 1, 2, and 3 only log out their names, since this post is focused on the git operations to merge rather than what the repos actually do.
You might not have build or node_modules folders if you haven’t installed and run them. Don’t worry, as it won’t affect your ability to continue.
We will merge repo1 first; the process is then repeated for the others.
Migrating Repo 1
First we need to add repo1 as a remote of monorepo so we can fetch the code and go through the merge process.
A remote is usually a URL, but it can also be a file path to another repo. For this tutorial we’ll use file paths to the local clones, so it doesn’t matter where the originals are hosted.
From the monorepo folder:
git remote add repo1 ../repo1
Next, we’ll fetch repo1 from the remote we previously added and attempt to merge it.
git fetch repo1
git merge repo1/master --allow-unrelated-histories
Notice the key flag --allow-unrelated-histories, which allows merging two commits that have no common ancestor. Since Git 2.9, git merge refuses such a merge unless this flag is given.
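To see what the flag changes, here is a small sketch you can run anywhere. It builds two throwaway repos with no shared history, tries the merge without the flag, then with it. The repo names a and b and the temp folder are hypothetical stand-ins, not the sample repos, and the refusal assumes Git 2.9 or newer.

```shell
#!/bin/sh
# Sketch: compare git merge with and without --allow-unrelated-histories.
set -eu

work=$(mktemp -d)
cd "$work"

# Identity for the demo commits, so nothing depends on global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two independent repos (hypothetical names), each with one commit.
for name in a b; do
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master   # force the branch name
  echo "$name" > "$name/$name.txt"
  git -C "$name" add .
  git -C "$name" commit -q -m "initial commit of $name"
done

cd a
git remote add b ../b       # a file path works as a remote
git fetch -q b

# Without the flag git refuses, because the histories share no ancestor.
if git merge --no-edit b/master >/dev/null 2>&1; then
  plain_merge=accepted
else
  plain_merge=refused
fi

# With the flag the merge goes through and both histories are kept.
git merge -q --no-edit --allow-unrelated-histories b/master
commit_count=$(git rev-list --count HEAD)

echo "plain merge: $plain_merge, commits after merge: $commit_count"
```

The files in this sketch do not overlap, so the merge completes on its own. The sample repos do overlap on package.json, which is why the next section deals with a conflict.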
Resolving The Conflicts
You may have noticed that both repo1 and monorepo have a package.json. This will cause a conflict. We actually want to keep both package.json files, because one is the repo1 package and the other is the top-level monorepo package.
After the merge, the top-level directory holds the files from repo1.
Select all the files that came from repo1 and move them to a folder called repo1. Now you can revert the changes to the top-level monorepo package.json. It should look like this:
Commit the changes and now you have repo1 in a folder inside monorepo with all its history.
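The move-and-revert step can also be done from the command line. The following is a rough sketch under a few assumptions: it uses throwaway stand-ins for monorepo and repo1 built in a temp folder, the src/index.ts file is hypothetical, and it lists the files to move by hand. MERGE_HEAD refers to the commit being merged in (repo1), and git checkout --ours restores the monorepo side of a conflicted file.

```shell
#!/bin/sh
# Sketch: merge repo1, move its files into a repo1/ folder, keep both package.json files.
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway stand-ins for the real repos, each with its own package.json.
for name in monorepo repo1; do
  git init -q "$name"
  git -C "$name" symbolic-ref HEAD refs/heads/master
  printf '{ "name": "%s" }\n' "$name" > "$name/package.json"
  git -C "$name" add .
  git -C "$name" commit -q -m "initial commit of $name"
done

# Give repo1 some source code of its own (hypothetical file).
mkdir repo1/src
echo "console.log('repo1')" > repo1/src/index.ts
git -C repo1 add .
git -C repo1 commit -q -m "add source"

cd monorepo
git remote add repo1 ../repo1
git fetch -q repo1

# Both sides added package.json, so the merge is expected to stop with a conflict.
if git merge --allow-unrelated-histories repo1/master >/dev/null 2>&1; then
  echo "unexpected: merge finished without a conflict"
fi

mkdir repo1
mv src repo1/                                           # files that came from repo1
git show MERGE_HEAD:package.json > repo1/package.json   # repo1's package.json
git checkout --ours -- package.json                     # monorepo's package.json, restored
git add -A
git commit -q -m "Merge repo1 into the repo1 folder"

commit_count=$(git rev-list --count HEAD)
echo "commits in monorepo: $commit_count"
```

Plain mv followed by git add -A mirrors the file operations described above. In a real repo you would move every top-level file and folder that came from repo1, not just src.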
Repeat the process for each repo you would like to migrate
Repeat the above steps for the other two repos and you should end up with history like this:
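Since the steps are identical for each repo, they can be wrapped in a small helper and run in a loop. This is only a sketch: make_repo and migrate are hypothetical helper names, the repos are throwaway stand-ins with just a package.json and an index.ts, and a real migration would need to move whatever files each repo actually contains.

```shell
#!/bin/sh
# Sketch: repeat the merge-then-move steps for several repos.
set -eu

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# make_repo NAME: hypothetical helper that builds a throwaway stand-in repo.
make_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/master
  printf '{ "name": "%s" }\n' "$1" > "$1/package.json"
  if [ "$1" != monorepo ]; then
    echo "console.log('$1')" > "$1/index.ts"
  fi
  git -C "$1" add .
  git -C "$1" commit -q -m "initial commit of $1"
}

# migrate NAME: hypothetical helper; merges ../NAME and moves its files into NAME/.
migrate() {
  git remote add "$1" "../$1"
  git fetch -q "$1"
  # package.json conflicts every time, so a non-zero exit is expected here.
  git merge --allow-unrelated-histories "$1/master" >/dev/null 2>&1 || true
  mkdir "$1"
  mv index.ts "$1/"
  git show MERGE_HEAD:package.json > "$1/package.json"
  git checkout --ours -- package.json
  git add -A
  git commit -q -m "Merge $1 into the $1 folder"
}

for name in monorepo repo1 repo2 repo3; do
  make_repo "$name"
done

cd monorepo
for name in repo1 repo2 repo3; do
  migrate "$name"
done

total_commits=$(git rev-list --count HEAD)
echo "commits in monorepo: $total_commits"
```

Running git log --oneline --graph in the resulting folder shows one merge commit per migrated repo, each joining that repo's original history.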
As mentioned before, this technique is very nice since it only uses git and file operations instead of relying on external tools or scripts. In my opinion, the key parts were being able to use file paths as remote locations and handling the merge conflicts. After you move the merged code into its own folder and revert any changes to the original files, git will see the merge as purely the addition of new files, which is what we want.
I remember when I was first learning Visual Studio that, by default, it creates a Solution folder with a Project folder inside it. This always seemed unnecessary at the time, but after developing more sophisticated projects it makes a lot more sense. After going through this exercise, I can see that the extra project-specific folder is great for expanding to new projects and would avoid these merge conflicts. I might start following the practice of always having a folder at the top level, regardless of whether I intend to use it right away. This would be in line with yarn and npm’s move to workspaces, which resemble Visual Studio’s projects.
Hope this helps you better understand how you might use the --allow-unrelated-histories flag to merge your repos. Let me know what you think in the comments, or suggest improvements to the technique.