Managing repositories with Git submodules
Git submodules are a powerful tool for managing dependencies in your codebase. They allow you to include one or more external Git repositories within your own repository, making it easy to keep track of and update those dependencies.
Large-scale apps are usually built with modules that are developed by different teams and maintained by different people. These modules are often kept as separate repositories and are included in the main application as dependencies. This is a common pattern in the industry and is used by many large-scale applications like Facebook, Twitter, and many others.
You might think of monorepos or package distribution systems like npm or yarn as a solution to this problem. However, these solutions are not always the best fit for every project.
In this blog post, we’ll take a look at how to use Git submodules, some of the pros and cons of using them, where they can be a better fit than a monorepo or package distribution, and a small tutorial on how to get started.
Benefits of Git submodules
One of the main benefits of submodules is that each dependency lives in its own Git repository, so it can be updated independently of your main repository. The main repository records the exact commit of each submodule it uses, which makes it easy to see which version of a dependency you are on and to move to a newer one when you are ready.
A separate Git repository means that you can branch out and work on the submodule independently of the main repository. This is especially useful if you’re working on a large-scale project with multiple teams. You can easily create a branch for each team and work on the submodule independently of the main repository.
Submodules also let developers push changes to a dependency without the hassle of going through a separate release process.
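To make "a separate repository, updated independently" concrete, here is a minimal sketch that runs entirely in a throwaway directory. The repository names (`theme`, `app`), the `g` wrapper, and the file contents are assumptions made for illustration. It shows that the main repository stores the submodule as a single entry pointing at one commit, and that a new commit in the dependency does not move that pointer by itself.

```shell
set -eu
# Throwaway workspace with hypothetical repository names; nothing here touches real repositories.
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): supplies a commit identity and allows
# local-path submodules, which newer Git versions block by default.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Stand-in dependency repository with one commit.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"

# Stand-in main repository that includes the dependency as a submodule.
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: add theme submodule"

# The main repository stores one entry of mode 160000 (a "gitlink")
# whose hash is the submodule commit it is pinned to.
g -C "$work/app" ls-tree HEAD src/theme
pinned_before="$(g -C "$work/app" rev-parse HEAD:src/theme)"

# A new commit in the dependency does not move that pin.
echo "a { color: teal; }" >> "$work/theme/base.css"
g -C "$work/theme" commit -q -am "Add link colour"
theme_head="$(g -C "$work/theme" rev-parse HEAD)"
pinned_after="$(g -C "$work/app" rev-parse HEAD:src/theme)"
echo "pinned: $pinned_after"
echo "latest: $theme_head"
```

The two hashes printed at the end differ until the main repository explicitly moves to the newer commit.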
Drawbacks of Git submodules
One of the main drawbacks of submodules comes from the same design. Since each submodule is a separate Git repository, you need to keep track of the changes in each one and update the main repository yourself whenever you want to pick up a newer version of a dependency.
Additionally, submodules can be tricky to work with if you’re not familiar with the command line, since you need to use Git commands to manage them. Many Git GUI tools offer only limited support for submodules, so make sure your team is comfortable with the command line before adopting them.
Moreover, submodules can increase the size of your local checkout, since each one is cloned with its own full history by default. Having several Git histories in one working copy can make it harder to manage and can slow down some Git operations.
Finally, submodules add complexity of their own. They are hard to manage without a good grasp of how Git works.
Most of these drawbacks can be mitigated with scripts that automate routine operations such as cloning and pulling. However, it’s still important to be aware of them before using submodules.
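As a rough sketch of what such a script might look like, the example below defines a hypothetical helper, `pull_with_submodules`, that fast-forwards the main repository and then moves every submodule to the commit the main repository now pins. The helper name, the `g` wrapper, and the repositories (`theme`, `app`, `dev`) are all assumptions for this demo, which runs in a temporary directory.

```shell
set -eu
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): commit identity plus permission for local-path submodules.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Hypothetical helper: fast-forward the main repository, then check out in
# every submodule the commit that the main repository now pins.
pull_with_submodules() {
  g -C "$1" pull -q --ff-only
  g -C "$1" submodule --quiet update --init --recursive
}

# Stand-in dependency and main repository.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: add theme submodule"

# A teammate's clone, made before the dependency changes.
g clone -q --recurse-submodules "$work/app" "$work/dev"

# Upstream: the dependency gains a commit and the main repository bumps its pin.
echo "a { color: teal; }" >> "$work/theme/base.css"
g -C "$work/theme" commit -q -am "Add link colour"
g -C "$work/app" submodule --quiet update --remote src/theme
g -C "$work/app" add src/theme
g -C "$work/app" commit -q -m "chore: bump theme submodule"

# One call brings the teammate's checkout, submodule included, up to date.
pull_with_submodules "$work/dev"
g -C "$work/dev" submodule status
```

Wrapping the two commands in one function means nobody has to remember that a plain `git pull` can leave the submodule's working tree on the old commit.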
Comparison with monorepos
Monorepos are a popular solution for managing large-scale projects. They allow you to keep all of your code in a single repository, making it easy to manage and update. However, there are some areas where submodules are a better fit than monorepos.
- Git submodules keep the commit history of the main repository clean. Since submodules are separate repositories, their commits stay out of the main repository's history, which only records which commit of each submodule is in use. This makes it easier to follow changes in the main repository and in each package.
- Git submodules allow finer-grained access control than a monorepo. Because each submodule is its own repository, read or push access can be restricted to specific users or teams, so only authorized users can view or change a given submodule.
- For apps using different languages, tools like Lerna and Yarn workspaces are not a good fit. These tools are built around the JavaScript ecosystem and are unsuitable for apps that mix several languages. Git submodules are a better fit for such apps because they are agnostic to the language used.
Comparison with package distribution
Package distribution systems like npm and yarn are popular solutions for managing dependencies in your codebase. However, there are some areas where submodules are a better fit than package distribution systems.
- Fast feedback and better developer experience. Since submodules are cloned into the project directory, you can change and test them in place, which shortens the feedback loop. With a package distribution system, by contrast, you need to publish the change and then update the package in your project before you can see its impact.
- The release process is simpler with submodules. Since submodules are separate repositories, you can push changes to them while working with the main repository, without having to go through the release process. This makes it easier to manage and release changes to submodules.
Playing with git submodules
Here we will take a look at simple operations that can be performed with Git submodules. We will be using a simple example of a frontend app that has a src/theme/ directory, which contains all the styles and assets for the app.
We want to move this directory to a separate repository and use it as a submodule in our main repository, so that it can be shared across multiple apps.
Creating a Git submodule
To begin, we will create a new repository for the theme/ directory. We will then add this repository as a submodule to our main repository.
Step 1: Remove the directory from the main repository
Before we can add the directory as a submodule, we need to stop tracking it in the main repository. Git refuses to add a submodule at a path whose files are already tracked in the index.
# Navigate to theme directory
cd src/theme
# Stop tracking the files in the main repository (they stay on disk)
git rm -r --cached .
Step 2: Create a new repository for the theme directory
While we are still in the theme/ directory in the terminal, we need to create a new repository for it. We can do this by initializing a repository here and then pushing the contents of the theme/ directory to a new remote.
# Initialize a new git repository
git init
# Point the new repository at its remote
git remote add origin <YOUR_REMOTE_URL>
# Add all files, commit, and push the current branch (master or main, depending on your Git configuration)
git add . && git commit -m 'Initial Commit' && git push --set-upstream origin HEAD
Step 3: Add the new repository as a submodule
Now that we have a new repository for the theme directory, we can add it as a submodule to our main repository. We can do this by using the git submodule add command.
# Navigate to the main repository
cd ../../
# Add the new repository as a submodule
git submodule add <YOUR_REMOTE_URL> src/theme
# Commit the changes
git commit -m 'chore: move theme to submodule'
# Push the changes
git push
Now the theme/ directory is added as a submodule to the main repository.
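To check what adding a submodule leaves behind, the sketch below repeats the `git submodule add` step in a temporary sandbox and inspects the two things the main repository now tracks: the `.gitmodules` file and the submodule entry itself. The local `theme` and `app` repositories and the `g` wrapper are stand-ins assumed for this demo; in the tutorial, the URL would be your real remote.

```shell
set -eu
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): commit identity plus permission for local-path submodules.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Stand-in theme repository and main repository.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: move theme to submodule"

# .gitmodules maps the submodule path to its URL and is committed with the main repository.
cat "$work/app/.gitmodules"
sm_url="$(g -C "$work/app" config --file .gitmodules --get submodule.src/theme.url)"

# `git submodule status` prints the pinned commit followed by the submodule path.
status_line="$(g -C "$work/app" submodule status)"
echo "$status_line"
```

Both pieces travel with the main repository, so anyone who clones it learns where the theme lives and which commit to check out.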
For adding a theme to any other repository
To add the theme to another app, clone that app's repository, run the following command inside it, and commit the result.
# Add the theme submodule
git submodule add <YOUR_REMOTE_URL> src/theme
Updating a Git Submodule
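Now that we have added the theme directory as a submodule, the main repository stays pinned to the commit that was current when we added it. To pick up newer changes, use the git submodule update command with the --remote flag, then commit the updated submodule reference in the main repository.
# Fetch the latest commit from the submodule's remote branch
git submodule update --remote src/theme
# Record the new submodule commit in the main repository
git add src/theme && git commit -m 'chore: update theme submodule'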
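The update flow can be tried end to end in a sandbox. The sketch below, which assumes local stand-in repositories (`theme`, `app`) and a small `g` wrapper that are not part of the tutorial, adds a commit to the theme, runs `git submodule update --remote`, and then commits the moved pointer in the main repository.

```shell
set -eu
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): commit identity plus permission for local-path submodules.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Stand-in theme repository and a main repository that uses it as a submodule.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: add theme submodule"

# The theme gains a new commit after the submodule was added.
echo "a { color: teal; }" >> "$work/theme/base.css"
g -C "$work/theme" commit -q -am "Add link colour"

# Fetch the submodule's remote branch and check out its newest commit.
g -C "$work/app" submodule --quiet update --remote src/theme

# The main repository now sees src/theme as modified; commit the new pin.
g -C "$work/app" add src/theme
g -C "$work/app" commit -q -m "chore: bump theme submodule"

pinned="$(g -C "$work/app" rev-parse HEAD:src/theme)"
latest="$(g -C "$work/theme" rev-parse HEAD)"
echo "pinned: $pinned"
echo "latest: $latest"
```

Until the final commit is made, other clones of the main repository keep using the older theme commit.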
Removing a Git Submodule
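To remove a submodule from your repository, first unregister it with the git submodule deinit command, then remove its entry with git rm and commit the change.
# Unregister the submodule and empty its working directory
git submodule deinit src/theme
# Remove the submodule entry and its .gitmodules section
git rm src/theme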
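Note that `git submodule deinit` only unregisters the submodule locally; the main repository keeps tracking it until the entry is removed and committed. A fuller removal might look like the sketch below, run in a temporary sandbox with stand-in repositories (`theme`, `app`) and a `g` wrapper that are assumptions for this demo. The last step, deleting the cached clone under `.git/modules`, is optional housekeeping.

```shell
set -eu
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): commit identity plus permission for local-path submodules.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Stand-in theme repository and a main repository that uses it as a submodule.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: add theme submodule"

# 1. Unregister the submodule from .git/config and empty its working tree.
g -C "$work/app" submodule --quiet deinit -f src/theme
# 2. Remove the submodule entry and the matching .gitmodules section, then commit.
g -C "$work/app" rm -q src/theme
g -C "$work/app" commit -q -m "chore: remove theme submodule"
# 3. Optionally delete the cached clone kept inside the main repository's .git directory.
rm -rf "$work/app/.git/modules/src/theme"

# Nothing is listed once the submodule is gone.
g -C "$work/app" submodule status
```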
Using Git Submodules in CI/CD
Git submodules can be used in CI/CD to manage dependencies in your codebase. Here we will take a look at how to use submodules in CI/CD.
To clone a repository together with its submodules, use the --recursive flag (a synonym for --recurse-submodules) with the git clone command.
# Clone the repository along with its submodules
git clone --recursive <YOUR_REMOTE_URL>
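Some CI systems check out the repository for you before your script runs, in which case the clone flag is never applied. A minimal sketch of the alternative, assuming local stand-in repositories (`theme`, `app`), a hypothetical checkout directory named `ci-checkout`, and a `g` wrapper for the demo, is to initialise the submodules after the fact with `git submodule update --init --recursive`.

```shell
set -eu
work="$(mktemp -d)"
# Wrapper (an assumption for this demo): commit identity plus permission for local-path submodules.
g() { git -c user.name=Demo -c user.email=demo@example.com -c commit.gpgsign=false -c protocol.file.allow=always "$@"; }

# Stand-in theme repository and a main repository that uses it as a submodule.
g init -q "$work/theme"
echo "body { margin: 0; }" > "$work/theme/base.css"
g -C "$work/theme" add base.css
g -C "$work/theme" commit -q -m "Initial commit"
g init -q "$work/app"
g -C "$work/app" submodule --quiet add "$work/theme" src/theme
g -C "$work/app" commit -q -m "chore: add theme submodule"

# A plain clone, like the one a CI system might make, leaves src/theme as an empty directory.
g clone -q "$work/app" "$work/ci-checkout"
files_before="$(ls -A "$work/ci-checkout/src/theme")"

# Initialise and fetch every submodule, including nested ones, at the pinned commits.
g -C "$work/ci-checkout" submodule --quiet update --init --recursive
files_after="$(ls -A "$work/ci-checkout/src/theme")"
echo "before: [$files_before]"
echo "after:  [$files_after]"
```

Because the submodules are checked out at the commits the main repository pins, the build sees the same dependency versions as the developer who made the commit.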
Conclusion
Git submodules are a powerful tool that can be used to manage dependencies in your codebase. But they are not a silver bullet and should be used with caution. It’s important to understand the drawbacks of submodules before using them.
Additionally, it’s important to ensure your team is familiar with the command line before using submodules. If used with care, submodules can be a powerful tool that can help you manage dependencies in your codebase and can boost developer productivity.