How to partially clone a git project

Download files from git repositories without cloning the entire project

Git is the most popular version control system and is used by almost all software projects. However, it is not the only one: other popular tools include Subversion and Fossil. You can also use Subversion on GitHub to manage your source code.

Why would you partially clone a project?

While cloning the entire repository is generally easier, it has a few drawbacks: it takes longer, uses more bandwidth, and takes up extra space on your hard drive. These issues are exacerbated if you only want a small portion of a huge repository.

If you want to download files that have since been overwritten, you would need to clone the repository and then revert the changes. This takes time and requires knowing the necessary git commands.

Although you can download single files easily from raw.github.com, downloading multiple files or entire sub-directories is not possible via this method.

How to clone a sub-directory?

There are multiple ways of downloading sub-directories. In this article, we will explore two of them:

Using the Web interface

There is an excellent tool called download-directory that can download files and folders from any GitHub repository. You can even download files from other branches and from older commits, including files that have since been overwritten. You just pass it the GitHub URL of the folder, and the contents are zipped and downloaded for you. It does not work with GitLab.

# URL for downloading a particular folder from the head of master branch
https://github.com/trekhleb/javascript-algorithms/tree/master/src/algorithms/sorting

# URL for downloading a folder from an older commit on master branch
https://github.com/trekhleb/javascript-algorithms/tree/7a37a6b86e76ee22bf93ffd9d01d7acfd79d0714/src/algorithms/ml

Using the terminal

When working with servers, I often encounter situations where I cannot run a tool that requires a desktop environment. The CLI method works in such cases. It can also be used in automated shell scripts.

We will use Subversion to achieve this. You can download Apache Subversion from its download page or, if you are on Linux, install it with your package manager.
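
Before moving on, it may help to confirm that the svn client is actually on your PATH. The snippet below is a minimal sketch; `require_cmd` is a hypothetical helper name introduced here, not part of Subversion.

```shell
# Hypothetical helper: succeed only if the named command is on the PATH.
require_cmd() {
    if ! command -v "$1" >/dev/null 2>&1; then
        printf '%s not found; install it first\n' "$1" >&2
        return 1
    fi
}

# Example usage: print the client version only when svn is available.
# require_cmd svn && svn --version --quiet
```

The same check can sit at the top of an automated script so that it fails early with a readable message.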

Step 1: Construct the URL to download

In the URL, replace tree/master with trunk if you want to clone the latest version from the master branch.

# GitHub URL
https://github.com/trekhleb/javascript-algorithms/tree/master/src/algorithms/string

# URL for SVN
https://github.com/trekhleb/javascript-algorithms/trunk/src/algorithms/string

Replace tree with branches if you want to clone a particular branch.

# GitHub URL
https://github.com/TheAlgorithms/Python/tree/Write-for-current-Python/knapsack

# URL for SVN
https://github.com/TheAlgorithms/Python/branches/Write-for-current-Python/knapsack
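
The two rewriting rules above can be sketched as a small shell function. This is only an illustration: `github_to_svn_url` is a hypothetical name, and the sketch assumes the URL has the form https://github.com/OWNER/REPO/tree/BRANCH/PATH with a branch name that contains no slashes. It does not handle commit hashes.

```shell
# Hypothetical helper: turn a GitHub "tree" URL into the SVN-style URL.
# Assumes https://github.com/OWNER/REPO/tree/BRANCH[/PATH] and a branch
# name without slashes. Plain POSIX sh, so the variables are not local.
github_to_svn_url() {
    rest=${1#https://github.com/}          # OWNER/REPO/tree/BRANCH[/PATH]
    owner=${rest%%/*}; rest=${rest#*/}
    repo=${rest%%/*};  rest=${rest#*/}
    case "$rest" in
        tree/?*) rest=${rest#tree/} ;;     # BRANCH[/PATH]
        *) printf 'not a GitHub tree URL: %s\n' "$1" >&2; return 1 ;;
    esac
    branch=${rest%%/*}
    path=${rest#"$branch"}                 # empty, or "/PATH"
    if [ "$branch" = master ]; then
        # master maps to trunk
        printf 'https://github.com/%s/%s/trunk%s\n' "$owner" "$repo" "$path"
    else
        # any other branch maps to branches/BRANCH
        printf 'https://github.com/%s/%s/branches/%s%s\n' "$owner" "$repo" "$branch" "$path"
    fi
}
```

Calling it with either of the GitHub URLs shown above should print the corresponding URL for SVN.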

Step 2: Verify the files before downloading

You can list the files you are about to download by running svn list <url> in your terminal.

svn list https://github.com/trekhleb/javascript-algorithms/trunk/src/algorithms/string

Step 3: Download the files

You can download the files using svn export <url> <local_folder>. This creates a directory locally with the content of the specified subdirectory of the project.

svn export https://github.com/trekhleb/javascript-algorithms/trunk/src/algorithms/string algorithms
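
For automated scripts, steps 2 and 3 could be wrapped together roughly as follows. This is a sketch under assumptions: `fetch_subdir` and the overridable `SVN` variable are hypothetical names introduced here, and the function simply refuses to run when the destination already exists instead of overwriting it.

```shell
# Hypothetical wrapper around the two svn commands above.
# SVN can be overridden (for example SVN=echo) to preview the commands
# without contacting the server.
SVN=${SVN:-svn}

fetch_subdir() {
    svn_url=$1
    dest=$2
    if [ -e "$dest" ]; then
        printf 'refusing to overwrite existing path: %s\n' "$dest" >&2
        return 1
    fi
    # Step 2: list the files first so a bad URL fails before any download
    "$SVN" list "$svn_url" || return 1
    # Step 3: export the subdirectory into the destination folder
    "$SVN" export "$svn_url" "$dest"
}

# Example usage:
# fetch_subdir https://github.com/trekhleb/javascript-algorithms/trunk/src/algorithms/string algorithms
```

Listing before exporting keeps the failure mode simple: if the URL is wrong, nothing is created locally.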

Summary

We can use download-directory and the svn command to download sub-directories of a git project from GitHub. This is useful if you only want a small portion of a particular repository.

I have not yet been able to download a particular commit via the CLI method, or to download files from other git hosting providers like GitLab. If you know a way to do so, feel free to share your knowledge in the comments.
