Packaging and developing python projects with nested git-submodules
by Konstantinos Demartinos
Introduction
This article discusses basic operations relevant to packaging and developing Python projects with nested git submodules. The motivation stems from an actual case that can be abstracted as follows:
We want to work with a git repository that has nested git submodules of an arbitrary depth.
This can be further analysed in the following use-cases:
- It should be easy to update the superproject after any revision in the nested submodules.
- Packaging of the superproject should depend only on the upstream repositories of the nested submodules.
Abstract structure
We present an abstract structure based on the simple example provided by the Python Packaging Authority (PyPA).
/mypackage
    /mypackage
        __init__.py
        ...
    setup.py
    Makefile
    requirements.txt
    /deps
        /package00
            /deps
                /package10
                    ...
                        ...
                            /deps
                                /packageij
        /package01
Here packageij is the jth submodule at depth i.
Packaging: setting up setup.py
One of our designated goals is to make the installation of mypackage independent of the submodules. To this end, the install_requires field in setup.py should include only the names of the submodules at the highest level (package0j).
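As a minimal sketch of what this could look like, the snippet below collects the metadata that setup.py would pass to setuptools. The distribution names package00 and package01, the version number, and the helper name setup_kwargs are placeholders chosen for illustration, not part of the original project.

```python
# Sketch of the metadata for setup.py; all names and the version are placeholders.

# Only the top-level submodules (package0j) are listed here. Deeper
# submodules (package1j, ...) are declared by their own parent packages
# and are pulled in transitively by pip.
TOP_LEVEL_DEPENDENCIES = ["package00", "package01"]


def setup_kwargs():
    """Return the keyword arguments to pass to setuptools.setup()."""
    return {
        "name": "mypackage",
        "version": "0.1.0",  # placeholder version
        "packages": ["mypackage"],
        "install_requires": list(TOP_LEVEL_DEPENDENCIES),
    }


# In setup.py itself one would finish with:
#     from setuptools import setup
#     setup(**setup_kwargs())
```

Keeping the list in one place makes it easy to check that no nested submodule has slipped into the superproject's own requirements.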
IMPORTANT NOTE: If a submodule at any depth (i.e. packageij) is not uploaded to the Python Package Index (PyPI), then a valid link (see here) should be appended to the dependency_links field in the setup function. The package should then be installed using the --process-dependency-links flag of the pip command (the flag was removed in pip 19.0, so an older pip version is required), like so:
$ pip install mypackage --process-dependency-links
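With pip 19.0 or later, where --process-dependency-links is not available, a PEP 508 direct reference of the form "name @ url" placed in install_requires can play a similar role. The helper below is only a sketch of how such a string could be assembled; the function name direct_reference and the repository URL are hypothetical.

```python
def direct_reference(name, repo_url, ref=None):
    """Build a PEP 508 direct reference to a package hosted in a git repository."""
    url = "git+" + repo_url
    if ref is not None:
        url += "@" + ref  # pin to a tag, branch or commit
    return f"{name} @ {url}"


# Hypothetical upstream URL, for illustration only.
requirement = direct_reference(
    "package10", "https://example.com/user/package10.git", ref="v1.0"
)
print(requirement)
# prints: package10 @ git+https://example.com/user/package10.git@v1.0
```

The resulting string would go into the install_requires list of the package that depends on package10.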
Development operations
Clone the repository
To initialize all the submodules while cloning the superproject, invoke the following command:
$ git clone --recurse-submodules <mypackage-URL>
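To confirm that the clone really initialized every submodule, one can inspect the output of git submodule status --recursive, which prefixes the commit hash of an uninitialized submodule with "-". The following is a rough sketch under the assumption that submodule paths contain no whitespace; the function names are illustrative.

```python
import subprocess


def uninitialized_submodules(status_output):
    """Return the paths flagged with '-' (not initialized) in the output of
    `git submodule status --recursive`."""
    paths = []
    for line in status_output.splitlines():
        if line.startswith("-"):
            # Line format: <flag><sha1> <path> [(<describe>)]
            fields = line[1:].split()
            if len(fields) >= 2:
                paths.append(fields[1])
    return paths


def check_clone(repo_dir="."):
    """Run git in repo_dir and return any uninitialized submodule paths."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return uninitialized_submodules(result.stdout)
```

An empty list from check_clone() suggests that the recursive clone picked up everything; any path it returns can be initialized afterwards with git submodule update --init --recursive.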
Using a Makefile to create the development environment
To simplify the configuration of the development environment, a Makefile is provided that performs the following operations:
1. Creates a python virtual environment.
2. Installs all dependencies in requirements.txt in the virtual environment.
3. Runs steps (1) and (2) in case of upstream or local revisions of the dependencies.
Steps (1) and (2) are executed with make install, while step (3) is executed through make reinstall.
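Typical content for such a file would be the following (recipe lines must be indented with a tab):
env_dir := venv
pip := $(env_dir)/bin/pip

.PHONY: install clean reinstall

# Create the virtual environment and install all dependencies into it.
install:
	python3 -m venv $(env_dir)
	$(pip) install -r requirements.txt --process-dependency-links
	$(pip) install --upgrade pip

# Remove the virtual environment.
clean:
	rm -rf $(env_dir)

# Rebuild the environment from scratch.
reinstall:
	$(MAKE) clean install
Notice the --process-dependency-links flag on the line that installs requirements.txt.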
Declaring dependencies
Working with submodules enables us to make revisions in the respective packages while developing the superproject. The question then arises: how can these revisions be easily exposed to the superproject?
Direct import
One solution would be to import the submodules directly through their respective paths, e.g. import deps.packageij.subpackageij. With nested submodules of arbitrary depth, however, this becomes rather tedious.
Updating upstream
Another solution would be to update the upstream repositories first and then recreate the development environment. Depending on how we actually publish the submodules, this might require one or more additional operations before we can work on the superproject.
If we publish to PyPI, a two-step process is required: first update the git repository and then the package on PyPI. Although this is a proper deployment procedure, it seems rather complex for local development purposes.
Declaring dependencies through the submodule paths
Probably the simplest solution is to declare the dependency on all submodules by explicitly referring to the respective paths in the requirements.txt file, in reverse order from the deepest to the shallowest, like so:
deps/package00/deps/package10/deps/package20
deps/package00/deps/package10
deps/package00
...
This way, the resolution of all the paths needs to be done only once. Afterwards one can make revisions directly in the nested submodules and recreate the development environment with
$ make reinstall
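The deepest-first list of paths does not have to be typed by hand. The sketch below derives it from the .gitmodules files found in the working tree, assuming that all submodules are already checked out and that each .gitmodules file records its entries as "path = ..." lines. The function name submodule_paths is illustrative.

```python
import os
import re

# Matches lines such as "    path = deps/package00" in a .gitmodules file.
_PATH_LINE = re.compile(r"^\s*path\s*=\s*(.+?)\s*$")


def submodule_paths(root, prefix=""):
    """Return all nested submodule paths under root, relative to the
    superproject, with children listed before their parent."""
    gitmodules = os.path.join(root, prefix, ".gitmodules")
    if not os.path.isfile(gitmodules):
        return []
    ordered = []
    with open(gitmodules) as handle:
        for line in handle:
            match = _PATH_LINE.match(line)
            if match:
                path = os.path.join(prefix, match.group(1)).replace(os.sep, "/")
                # Children first, mirroring the deepest-first order used
                # in requirements.txt.
                ordered.extend(submodule_paths(root, path))
                ordered.append(path)
    return ordered
```

Calling print("\n".join(submodule_paths("."))) from the root of the superproject would then produce lines suitable for pasting into requirements.txt.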