Workflow: Python

  1. Work environment
  2. Repo structure
  3. Setup
  4. Tests and coverage
  5. Docstrings and documentation

I personally like to split my Python projects into two parts, each in its own repo: one for the logic of the program, in .py modules, and the other for input and output, data processing and plotting, in Jupyter notebooks. Here I cover the first part.

Work environment

I use Visual Studio Code with the Python extension. At the moment I’m using Python 3.9.5.

Repo structure

The parent directory only contains configuration files:

  • README.md: Description of the repo, in markdown format. It can be styled using shields.
  • .gitignore: List of files and folders that aren’t tracked by git. For Python projects, it typically includes *.pyc, /dist/, *.egg-info, .env/,…
  • LICENSE: Can be created by GitHub. Recommended licenses are GPL3 and MIT.
  • CHANGELOG.md: Keeps track of the major changes to the project. Instructions here.
  • pyproject.toml, setup.py and setup.cfg: Needed to install the project with pip.
  • requirements.txt: List of all the Python packages required by the project.
  • Dockerfile: To create a docker container.

The repo contains a folder tests/ for the Python tests, a folder docs/ for the Sphinx documentation, and finally a folder myproject/ (i.e. with the same name as the repo) for the Python code. The folder myproject/ and each of its subfolders contains an __init__.py file that imports everything else.
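
As a minimal sketch of what "an __init__.py file that imports everything else" can look like, the snippet below builds a throwaway package in a temporary directory and imports it. The module name core and the function greet are hypothetical, chosen only for illustration; a real project would keep these files in the repo rather than generate them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway package so the example is self-contained.
# "core" and "greet" are hypothetical names, not part of any real project.
root = Path(tempfile.mkdtemp())
package = root / "myproject"
package.mkdir()
(package / "core.py").write_text(
    "def greet(name):\n"
    "    return f'Hello, {name}!'\n"
)

# The __init__.py re-exports the public names defined in the package's modules.
(package / "__init__.py").write_text("from .core import *\n")

# Make the parent folder importable and load the package.
sys.path.insert(0, str(root))
myproject = importlib.import_module("myproject")

# Thanks to __init__.py, greet is available directly on the package.
print(myproject.greet("world"))
```

With this layout, users of the package can write myproject.greet(...) without knowing which module the function lives in.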

Setup

Python is currently changing the way to prepare packages, so this part is a bit of a mess. The files setup.py and pyproject.toml can be as simple as the ones linked.

The meat of the setup is in setup.cfg. The data about the package, such as its version, authors and URLs, are in the [metadata] section. In the [options] section, packages defines which folders to search for modules, and install_requires lists the required packages with their minimum versions. A more complete example here.
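
To illustrate how these sections fit together, here is a small sketch that holds a setup.cfg as a string and reads it with the standard-library configparser module. All the values (name, version, author, URL, and the numpy and requests dependencies) are placeholders, not taken from a real project, and the snippet only parses the text; it does not build or install anything.

```python
import configparser

# Hypothetical contents of setup.cfg; every value here is a placeholder.
SETUP_CFG = """
[metadata]
name = myproject
version = 0.1.0
author = Jane Doe
url = https://example.com/myproject

[options]
packages = find:
python_requires = >=3.9
install_requires =
    numpy>=1.20
    requests>=2.25
"""

config = configparser.ConfigParser()
config.read_string(SETUP_CFG)

# [metadata] holds the descriptive data about the package.
name = config["metadata"]["name"]
version = config["metadata"]["version"]

# install_requires is a multi-line value: one requirement per line.
requirements = config["options"]["install_requires"].split()

print(name, version)
print(requirements)
```

The value packages = find: asks setuptools to discover the package folders automatically instead of listing them by hand.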

To generate the installation files (remember to add them to .gitignore), install the build package with pip install build and run the command

python -m build

This creates two new folders, dist/ and myproject.egg-info. To check that everything works, try installing the package in a virtual environment

python3 -m venv .env/fresh-install-test
. .env/fresh-install-test/bin/activate
pip install --force-reinstall dist/myproject-versionnum-py3-none-any.whl

If the package installs correctly, try to run it. Once everything works, exit the virtual environment with

deactivate
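
While the virtual environment is still active (that is, before running deactivate), one way to confirm from Python that the package was registered is to ask importlib.metadata for its version. A small sketch, where installed_version is a hypothetical helper and myproject stands for the name of your package:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# "myproject" is a placeholder for the package name set in setup.cfg.
found = installed_version("myproject")
if found is None:
    print("myproject is not installed in this environment")
else:
    print(f"myproject {found} is installed")
```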

Tests and coverage

Tests are a way to check that your code does what it is supposed to do. They are useful, for example, to detect an accidental change that breaks something or that is not backwards-compatible.

I use pytest, which is probably the simplest test framework. In the tests/ folder, add an empty __init__.py file and one or more .py files, with filenames starting with test_, to contain the tests. In each file, write some functions, also with names starting with test_, one for each individual test. The result of each test is determined by its assert statements: if the condition of every assert evaluates to True, the test passes; otherwise it fails. One simple example:

def test_passing():
    assert 2+2 == 4

def test_failing():
    assert 2+2 == 5

To test if the code raises the error that you expect, the syntax is a bit different:

import pytest

def test_division():
    with pytest.raises(ZeroDivisionError):
        1/0

Obviously, you’ll need tests that check your code, and not just some mathematical expressions. Don’t forget to import the modules containing the code.
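
A sketch of what such a test file might look like. The function mean is hypothetical and is defined inline only to keep the example self-contained; in a real project it would live in a module of myproject and be imported at the top of the test file.

```python
# In a real test file this would be an import, for example
# "from myproject.stats import mean" (hypothetical module and function names).
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def test_mean_of_integers():
    assert mean([1, 2, 3]) == 2


def test_mean_of_single_value():
    assert mean([5]) == 5


def test_mean_of_negative_values():
    assert mean([-1, 1]) == 0
```

Each function checks one behaviour, so a failure points directly at what broke. For results that are not exactly representable as floats, pytest.approx avoids spurious failures caused by rounding.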

To run the tests, in the root directory execute the command

python -m pytest

Alternatively, VS Code can detect the tests and run them at the press of a button.

Coverage is a measure of how many lines of your code are executed by the tests. Install the package

pip install coverage

and run the commands

coverage run -m pytest
coverage report -m

to see which lines of your code are not covered by the tests.
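
To make the idea concrete, here is a hypothetical function with a test that exercises only one of its two branches. With code like this in your project, coverage report -m should list the line marked below as missing, since no test ever executes it.

```python
def describe(number):
    """Classify a number as negative or non-negative (hypothetical example)."""
    if number < 0:
        return "negative"  # never executed by the test below
    return "non-negative"


def test_describe_non_negative():
    # Only the non-negative branch runs, so the "negative" line stays uncovered.
    assert describe(3) == "non-negative"
```

Adding a second test that calls describe(-1) would cover the remaining line.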

On top of running the tests locally, you can use GitHub Actions to execute them on every push or pull request, with this simple example. For the coverage, you can generate a Codecov report (needs a linked account) with this action.

Docstrings and documentation

Docstrings are strings located at the start of a function, class or module, used to document them. By convention they use triple quotes. For functions and classes, there are several styles to format their contents. I will use the Google style, which is very simple and legible. For example,

def myfunction(a: str, b: int, c: bool = True) -> dict:
    '''Short description of the function

    Args:
        a (str): Description of a.
        b (int): Description of b.
        c (bool, optional): Description of c. Defaults to True.

    Returns:
        dict: Description of the return value
    '''

At runtime, you can see the docstring of any function using help(myfunction).
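
Besides help(), the docstring can also be read programmatically. The sketch below repeats the function above with a placeholder body (the return statement is an assumption, added only so the function can run) and shows two ways to get at the text.

```python
import inspect


def myfunction(a: str, b: int, c: bool = True) -> dict:
    '''Short description of the function

    Args:
        a (str): Description of a.
        b (int): Description of b.
        c (bool, optional): Description of c. Defaults to True.

    Returns:
        dict: Description of the return value
    '''
    # Placeholder body, not part of the original example.
    return {"a": a, "b": b, "c": c}


# The raw docstring, including its original indentation, is stored in __doc__.
raw = myfunction.__doc__

# inspect.getdoc() returns the same text with the indentation cleaned up.
summary = inspect.getdoc(myfunction).splitlines()[0]
print(summary)
```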

There are tools that compile these docstrings into a full documentation, like Sphinx. Add to the requirements.txt the following

sphinx
sphinxcontrib-napoleon

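and run pip install -r requirements.txt. (Sphinx 1.3 and later already bundle Napoleon as sphinx.ext.napoleon, so the second entry is only needed for older versions.) Now start Sphinx with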

mkdir docs
cd docs
sphinx-quickstart

Sphinx guides you through the process. Make sure to choose the option for separate folders for source and build (we will track the source and gitignore the build), and to activate autodoc and githubpages.

Once the initial setup is ready, go to the source folder and open conf.py. Uncomment the lines import os and import sys, and tell Sphinx the location of your Python files with

sys.path.insert(0, os.path.abspath('../../myproject'))

Look for the line starting with extensions = [ and add 'sphinx.ext.napoleon' to the list.
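
After these changes, the relevant part of conf.py might look roughly like the sketch below. It assumes that autodoc and githubpages were enabled during sphinx-quickstart and that conf.py sits in docs/source; the rest of the generated file is left untouched.

```python
import os
import sys

# Make the modules inside myproject/ importable by autodoc.
# The relative path assumes conf.py lives in docs/source/.
sys.path.insert(0, os.path.abspath('../../myproject'))

extensions = [
    'sphinx.ext.autodoc',      # assumed to be enabled by sphinx-quickstart
    'sphinx.ext.githubpages',  # assumed to be enabled by sphinx-quickstart
    'sphinx.ext.napoleon',     # added by hand, parses Google-style docstrings
]
```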

Now we’re ready to start adding the documentation files. First, open index.rst and add one line for each module in your project, without the extension .py. Be careful with spaces and indentation; you should have something like this:

Welcome to myproject's documentation!
=====================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:
   
   mymodule
   anothermodule
   moremodules

Indices and tables
==================

For each line that you added, create an .rst file with the same name, containing the text

mymodule
--------

.. automodule:: mymodule
    :members:
    :undoc-members:
    :show-inheritance:
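
If the project has many modules, writing these files by hand gets tedious. The sketch below generates one stub per module with the same contents as the example above. The helper write_stubs and its arguments are hypothetical; the sphinx-apidoc tool that ships with Sphinx can produce similar files.

```python
from pathlib import Path

# Same structure as the hand-written example: title, underline, automodule.
TEMPLATE = """{name}
{underline}

.. automodule:: {name}
    :members:
    :undoc-members:
    :show-inheritance:
"""


def write_stubs(package_dir, source_dir):
    """Write one .rst stub per module in package_dir and return their names."""
    written = []
    for module in sorted(Path(package_dir).glob("*.py")):
        if module.name == "__init__.py":
            continue  # the package initializer gets no page of its own
        name = module.stem
        target = Path(source_dir) / f"{name}.rst"
        # The underline must be at least as long as the title.
        target.write_text(TEMPLATE.format(name=name, underline="-" * len(name)))
        written.append(target.name)
    return written
```

Running write_stubs("myproject", "docs/source") from the root of the repo would create the stubs next to index.rst; the toctree entries still have to be added by hand.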

From the docs folder, execute the command

make html

to generate the documentation. Open the file docs/build/html/index.html in your browser to see a local preview.

To upload the documentation to GitHub Pages, add this action to your .github/workflows folder. Each time you push, the results of make html will automatically be saved to the gh-pages branch of your repo. To publish it as a webpage, go to the Settings page of the GitHub repo, click on Pages, and select the gh-pages branch as the source. The documentation will be available at https://myusername.github.io/myproject/, always up to date with the changes you push.