In the past week I went to add Python 3.12 support to my zstandard Python package. A few hours into the unexpected yak shave / rat hole, I decided to start chronicling my experience so that I may share it with the broader Python community. My hope is that by sharing my (unfortunately painful) end-user experience I can draw attention to aspects of Python packaging that are confusing, so that better informed and empowered people can improve matters and make future Python packaging decisions that help in scenarios like the one I’m about to describe.
This blog post is purposefully verbose and contains a very lightly edited stream of my mental thoughts. Think of it as a self-assessed user experience study of Python packaging.
Some Background
I’m no stranger to the Python ecosystem or Python packaging. I’ve been programming Python for 10+ years. I’ve even authored a Python application packaging tool, PyOxidizer.
When programming, I strive to understand how things work. I try to not blindly copy-paste or cargo cult patterns unless I understand how they work. This means I often scope bloat myself and slow down velocity in the short term. But I justify this practice because I find it often pays dividends in the long term because I actually understand how things work.
I also have a passion for security and supply chain robustness. After you’ve helped maintain complex CI systems for multiple companies, you learn the hard way that it is important to do things like transitively pin dependencies and reduce surface area for failures so that build automation breaks in reaction to code changes in your version control, not spooky-action-at-a-distance when state on a third party server changes (e.g. a new package version is uploaded).
I’ve been aware of the emergence of pyproject.toml. But I’ve largely sat on the sidelines and held off adopting it, mainly for “if it isn’t broken, don’t fix it” reasons. Plus, my perception has been that the tooling still hasn’t stabilized: I’m not going to incur work now if it is going to invite churn that could be avoided by sitting on my hands a little longer.
Now, on to my user experience of adding Python 3.12 to python-zstandard and the epic packaging yak shave that entailed.
The Journey Begins
When I attempted to run CI against Python 3.12 on GitHub Actions, running python setup.py complained that setuptools couldn’t be imported.
Huh? I thought setuptools was installed in pretty much every Python distribution by default? It was certainly installed in all previous Python versions by the actions/setup-python GitHub Action. I was aware distutils was removed from the Python 3.12 standard library. But setuptools and distutils are not the same! Why did setuptools disappear?
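A quick way to see what a given interpreter actually ships with is to ask the import system directly. The following is a minimal sketch; the helper name `packaging_modules_available` is made up for illustration and is not part of any tool mentioned here.

```python
import importlib.util
import sys


def packaging_modules_available(names=("distutils", "setuptools", "pip")):
    """Map each module name to whether this interpreter can locate it.

    find_spec() only locates the module; it does not import it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Results depend entirely on the environment being inspected.
    print(sys.version_info[:2], packaging_modules_available())
```

Running this under each CI interpreter would show which of the three modules is present before any build step is attempted.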
I look at the CI logs for the passing Python 3.11 job and notice a message:
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
I had several immediate reactions:
- OK, maybe this is a sign I should be modernizing to pyproject.toml and moving away from python setup.py. Maybe the missing setuptools in the 3.12 CI environment is a side-effect of this policy shift?
- What are pypa/build and pypa/installer? I’ve never heard of them. I know pypa is the Python Packaging Authority (I suspect most Python developers don’t know this). Are these GitHub org/repo identifiers?
- What exactly is a standards-based tool? Is pip not a standards-based tool?
- Speaking of pip, why isn’t it mentioned? I thought pip was the de facto packaging tool and had been for a while!
- It’s linking a URL for more info. But why is this a link to what looks like an individual’s blog and not to some more official site, like the setuptools or pip docs? Or anything under python.org?
Learning That I Shouldn’t Invoke python setup.py
I open https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html in my browser and see a 4,000+ word blog post. Oof. Do I really want/need to read this? Fortunately, the author included a tl;dr and linked to a summary section telling me a lot of useful information! It informs me (my commentary in parentheses):
- The setuptools project has stopped maintaining all direct invocations of setup.py years ago. (What?!)
- There are undoubtedly many ways that your setup.py-based system is broken today, even if it’s not failing loudly or obviously.. (What?! Surely this can’t be true. I didn’t see any warnings from tooling until recently. How was I supposed to know this?)
- PEP 517, 518 and other standards-based packaging are the future of the Python ecosystem. (A ha - a definition of standards-based tooling. I guess I have to look at PEP 517 and PEP 518 in more detail. I’m pretty sure these are the PEPs that define pyproject.toml.)
- At this point you may be expecting me to give you a canonical list of the right way to do everything that setup.py used to do, and unfortunately the answer here is that it’s complicated. (You are telling me that we had a working python setup.py solution for 10+ years, this workflow is now quasi deprecated, and the recommended replacement is it’s complicated?! I’m just trying to get my package modernized. Why does that need to be complicated?)
- That said, I can give you some simple “works for most people” recommendations for some of the common commands. (Great, this is exactly what I was looking for!)
Then I look at the table mapping old ways to new ways. In the new column, it references the following tools: build, pytest, tox, nox, pip, and twine. That’s quite the tooling salad! (And that build tool must be the pypa/build referenced in the setuptools warning message. One mystery solved!)
I scroll back to the top of the article and notice the date: October 2021. Two years old. The summary section also mentioned that there’s been a lot of activity around packaging tooling occurring. So now I’m wondering if this blog post is outdated. Either way, it is clear I have to perform some additional research to figure out how to migrate off python setup.py so I can be compliant with the new world order.
Learning About pyproject.toml and Build Systems
I had pre-existing knowledge of pyproject.toml as the modern way to define build system metadata. So I decide to start my research by Googling pyproject.toml. The first results are:
- https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/
- https://stackoverflow.com/questions/62983756/what-is-pyproject-toml-file-for
- https://python-poetry.org/docs/pyproject/
- https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html
- https://godatadriven.com/blog/a-practical-guide-to-setuptools-and-pyproject-toml/
- https://towardsdatascience.com/pyproject-python-9df8cc092f61
I click pip’s documentation first because pip is known to me and it seems a canonical source. Pip’s documentation proceeds to link to PEP-518, PEP-517, PEP-621, and PEP-660 before telling me how projects with pyproject.toml are built, without giving me - a package maintainer - much useful advice for what to do or how to port from setup.py. This seems like a dead end.
Then I look at the Stack Overflow link. Again, telling me a lot of what I don’t really care about. (I’ve somewhat lost faith in Stack Overflow and only really skimmed this page: I would much prefer to get an answer from a first party source.)
I click on the Poetry link. It documents TOML fields. But only for the [tool.poetry] section. While I’ve heard about Poetry, I know that I probably don’t want to scope bloat myself to learn how Poetry works so I can use it. (No offence meant to the Poetry project here but I don’t perceive my project as needing whatever features Poetry provides: I’m just trying to publish a simple library package.) I go back to the search results.
I click on the setuptools link. I’m using setuptools via setup.py so this content looks promising! It gives me a nice example TOML of how to configure a [build-system] and [project] metadata. It links to PyPA’s Declaring project metadata content, which I open in a new tab, as the content seems useful. I continue reading setuptools documentation. I land on its Quickstart documentation, which seems useful. I start reading it and it links to the build tool documentation. That’s the second link to the build tool. So I open that in a new tab.
At this point, I think I have all the documentation on pyproject.toml. But I’m still trying to figure out what to replace python setup.py with. The build tool certainly seems like a contender since I’ve seen multiple references to it. But I’m still looking for modern, actively maintained documentation pointing me in a blessed direction.
The next Google link is A Practical Guide to Setuptools and Pyproject.toml. I start reading that. I’m immediately confused because it is recommending I put setuptools metadata in setup.cfg files. But I just read all about defining this metadata in pyproject.toml files in setuptools’ own documentation! Is this blog post out of date? March 12, 2022. Seems pretty modern. I look at the setuptools documentation again and see the pyproject.toml metadata pieces are in version 61.0.0 and newer. I go to https://github.com/pypa/setuptools/releases/tag/v61.0.0 and see version 61.0.0 was released on March 25, 2022. So the fifth Google link was seemingly obsoleted 13 days after it was published. Good times. I pretend I never read this content because it seems out of date.
The next Google link is https://towardsdatascience.com/pyproject-python-9df8cc092f61. I click through. But Medium wants me to log in to read it all and it is unclear it is going to tell me anything important, so I back out.
Learning About the build Tool
I give up on Google for the moment and start reading up on the build tool from its docs.
The only usage documentation for the build tool is on its root documentation page. And that documentation basically prints what python -m build --help would print: it says what the tool does but doesn’t give any guidance on where I should be using it or how to replace existing tools (like python setup.py invocations). Yes, I can piece the parts together and figure out that python -m build can be used as a replacement for python setup.py sdist and python setup.py bdist_wheel (and maybe pip wheel?). But should it be the replacement I choose? I make use of python setup.py develop and the aforementioned blog post recommended replacing that with python -m pip install -e. Perhaps I can use pip as the singular replacement for building source distributions and binary wheels so I have N-1 packaging tools? I keep researching.
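The replacements pieced together so far can be written down as a small lookup table. This is only a summary of what the text above infers, not an official mapping; the names `MODERN_EQUIVALENTS` and `modern_equivalent` are hypothetical.

```python
# Replacements inferred in the text for legacy `python setup.py <command>` calls.
MODERN_EQUIVALENTS = {
    "sdist": ["python", "-m", "build", "--sdist"],
    "bdist_wheel": ["python", "-m", "build", "--wheel"],
    "develop": ["python", "-m", "pip", "install", "-e", "."],
}


def modern_equivalent(setup_py_command):
    """Return the argv of the suggested replacement, or None if none is known."""
    replacement = MODERN_EQUIVALENTS.get(setup_py_command)
    # Hand back a copy so callers cannot mutate the shared table.
    return list(replacement) if replacement is not None else None
```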
Exploring the Python Packaging User Guide
I had previously opened https://packaging.python.org/en/latest/specifications/declaring-project-metadata/ in a browser tab without really looking at it. On second glance, I see it is part of a broader Python Packaging User Guide. Oh, this looks promising! A guide on how to do what I’m seeking maintained by the Python Packaging Authority (PyPA), the group who I know to be the, well, authorities on Python packaging. It is published under the canonical python.org domain. Surely the answer will be here.
I immediately click on the link to Packaging Python Projects to hopefully see what the PyPA folks are recommending.
Is Hatch the Answer?
I skim through. I see recommendations to use a pyproject.toml with a [build-system] to define the build backend. This matches my expectations. But they are using Hatchling as their build backend. Another tool I don’t really know about. I click through some inline links and eventually arrive at https://github.com/pypa/hatch. (I’m kind of confused why the PyPA tutorial said Hatchling when the project and tool is apparently named Hatch. But whatever.)
I skim Hatch’s GitHub README. It looks like a unified packaging tool. Build system. Package uploading/publishing. Environment management (sounds like a virtualenv alternative?). This tool actually seems quite nice! I start skimming the docs. Like Poetry, it seems like this is yet another new tool that I’d need to learn and would require me to blow up my existing setup.py in order to adopt. Do I really want to put in that effort? I’m just trying to get python-zstandard back on the paved road and avoid seemingly deprecated workflows: I’m not looking to adopt new tooling stacks.
I’m also further confused by the existence of Hatch under the PyPA GitHub Organization. That’s the same GitHub organization hosting the Python packaging tools that are known to me, namely build, pip, and setuptools. Those three projects are pinned repositories. (The other three pinned repositories are virtualenv, wheel, and twine.) Hatch is seemingly a replacement for pip, setuptools, virtualenv, twine, and possibly other tools. But it isn’t a pinned repository. Yet it is the default tool used in the PyPA maintained Packaging Python Projects guide. (That guide also suggests using other tools like setuptools, flit, and pdm. But the default is Hatch and that has me asking questions. Also, I didn’t initially notice that Creating pyproject.toml has multiple tabs for different backends.)
While Hatch looks interesting, I’m just not getting a strong signal that Hatch is sufficiently stable or warrants my time investment to switch to. So I go back to reading the Python Packaging User Guide.
The PyPA User Guide Search Continues
As I click around the User Guide, it is clear the PyPA folks really want me to use pyproject.toml for packaging. I suppose that’s the future and that’s a fair ask. But I’m still confused how I should migrate my setup.py to it. What are the risks with replacing my setup.py with pyproject.toml? Could I break someone installing my package on an old Linux distribution or old virtualenv using an older version of setuptools or pip? Will my adoption of build, hatch, poetry, whatever constitute a one way door where I lock out users in older environments? My package is downloaded over one million times per month and if I break packaging someone is likely to complain.
I’m desperately looking for guidance from the PyPA at https://packaging.python.org/ on how to manage this migration. But I just... can’t find it. Guides surprisingly has nothing on the topic.
Outdated Tool Recommendations from the PyPA
Finally I find Tool recommendations in the PyPA User Guide. Under Packaging tool recommendations it says:
- Use setuptools to define projects.
- Use build to create Source Distributions and wheels.
- If you have binary extensions and want to distribute wheels for multiple platforms, use cibuildwheel as part of your CI setup to build distributable wheels.
- Use twine for uploading distributions to PyPI.
Finally, some canonical documentation from the PyPA that comes out and suggests what to use!
But my relief immediately turns to questioning whether this tooling recommendations documentation is up to date:
- If setuptools is recommended, why does the Packaging Python Projects tutorial use Hatch?
- How exactly should I be using setuptools to define projects? Is this referring to setuptools as a [build-system] backend? The existence of define seemingly implies using setup.py or setup.cfg to define metadata. But I thought these distutils/setuptools specific mechanisms were deprecated in favor of the more generic pyproject.toml?
- Why aren’t other tools like Hatch, pip, poetry, flit, and pdm mentioned on this page? Where’s the guidance on when to use these alternative tools?
- There are footnotes referencing distutils as if it is still a modern practice. No mention that it was removed from the standard library in Python 3.12.
- But the build tool is referenced and that tool is relatively new. So the docs have to be somewhat up-to-date, right?
Sadly, I reach the conclusion that this Tool recommendations documentation is inconsistent with newer documentation and can’t be trusted. But it did mention the build tool and we now have multiple independent sources steering me in the direction of the build tool (at least for source distribution and wheel building), so it seems like we have a winner on our hands.
Initial Failures Running build
So let’s use the build tool. I remember docs saying to invoke it with python -m build, so I try that:
$ python3.12 -m build --help
No module named build.__main__; 'build' is a package and cannot be directly executed
So the build package exists but it doesn’t have a __main__. Ummm.
$ python3.12
Python 3.12.0 (main, Oct 23 2023, 19:58:35) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import build
>>> build.__spec__
ModuleSpec(name='build', loader=<_frozen_importlib_external.NamespaceLoader object at 0x10d403bc0>, submodule_search_locations=_NamespacePath(['/Users/gps/src/python-zstandard/build']))
Oh, it picked up the build directory from my source checkout because sys.path has the current directory by default. Good times.
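The shadowing seen here can be reproduced in isolation: any directory without an `__init__.py` that sits on sys.path is importable as a namespace package. A minimal sketch, where `shadow_demo_pkg` is a made-up name standing in for the `build` directory of the checkout:

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def resolves_to_bare_directory(module_name):
    """Return True if importing module_name would bind to a namespace package,
    i.e. a directory on sys.path that has no __init__.py."""
    importlib.invalidate_caches()
    spec = importlib.util.find_spec(module_name)
    return (
        spec is not None
        and spec.origin is None  # namespace packages have no origin file
        and spec.submodule_search_locations is not None
    )


def demo():
    # A temporary directory plays the role of the source checkout.
    with tempfile.TemporaryDirectory() as checkout:
        os.mkdir(os.path.join(checkout, "shadow_demo_pkg"))
        sys.path.insert(0, checkout)
        try:
            return resolves_to_bare_directory("shadow_demo_pkg")
        finally:
            sys.path.remove(checkout)
```

Calling `resolves_to_bare_directory("build")` from inside a checkout is one way to tell whether the name resolves to the real tool or to a stray directory.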
$ (cd ~ && python3.12 -m build)
/Users/gps/.pyenv/versions/3.12.0/bin/python3.12: No module named build
I guess build isn’t installed in my Python distribution / environment. You used to be able to build packages using just the Python standard library. I guess this battery is no longer included in the stdlib. I shrug and continue.
Installing build
I go to the Build installation docs. It says to pip install build. (I thought I read years ago that one should use python3 -m pip to invoke pip. Strange that a PyPA maintained tool is telling me to invoke pip directly since I’m pretty sure a lot of the reasons to use python -m to invoke tools are still valid. But I digress.)
I follow the instructions, installing it to the global site-packages because I figure I’ll use this tool a lot and I’m not a virtual environment purist:
$ python3.12 -m pip install build
Collecting build
Obtaining dependency information for build from https://files.pythonhosted.org/packages/93/dd/b464b728b866aaa62785a609e0dd8c72201d62c5f7c53e7c20f4dceb085f/build-1.0.3-py3-none-any.whl.metadata
Downloading build-1.0.3-py3-none-any.whl.metadata (4.2 kB)
Collecting packaging>=19.0 (from build)
Obtaining dependency information for packaging>=19.0 from https://files.pythonhosted.org/packages/ec/1a/610693ac4ee14fcdf2d9bf3c493370e4f2ef7ae2e19217d7a237ff42367d/packaging-23.2-py3-none-any.whl.metadata
Downloading packaging-23.2-py3-none-any.whl.metadata (3.2 kB)
Collecting pyproject_hooks (from build)
Using cached pyproject_hooks-1.0.0-py3-none-any.whl (9.3 kB)
Using cached build-1.0.3-py3-none-any.whl (18 kB)
Using cached packaging-23.2-py3-none-any.whl (53 kB)
Installing collected packages: pyproject_hooks, packaging, build
Successfully installed build-1.0.3 packaging-23.2 pyproject_hooks-1.0.0
That downloads and installs wheels for build, packaging, and pyproject_hooks.
At this point the security aware part of my brain is screaming because we didn’t pin versions or SHA-256 digests of any of these packages anywhere. So if a malicious version of any of these packages is somehow uploaded to PyPI that’s going to be a nightmare software supply chain vulnerability having similar industry impact as log4shell. Nowhere in build’s documentation does it mention this or say how to securely install build. I suppose you have to just know about the supply chain gotchas with pip install in order to mitigate this risk for yourself.
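To make concrete what pinning a SHA-256 digest means, here is a minimal sketch of the comparison involved. The function names are hypothetical, and the pinned value would have to come from a source you already trust; pip's hash-checking mode performs a check of this general kind when requirements carry hashes.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned_hex):
    """Return True only if the file's digest equals the pinned hex digest."""
    return hmac.compare_digest(sha256_of_file(path), pinned_hex.lower())
```

A downloaded wheel that fails this check should not be installed, whatever its filename or version claims.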
Initial Results With build Are Promising
After getting build installed, python3.12 -m build --help works now and I can build a wheel:
$ python3.12 -m build --wheel .
* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools >= 40.8.0, wheel)
* Getting build dependencies for wheel...
...
* Installing packages in isolated environment... (wheel)
* Building wheel...
running bdist_wheel
running build
running build_py
...
Successfully built zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
That looks promising! It seems to have invoked my setup.py without me having to define a [build-system] in my pyproject.toml! Yay for backwards compatibility.
The Mystery of the Missing cffi Package
But I notice something.
My setup.py script conditionally builds a zstandard._cffi extension module if import cffi succeeds. Building with build isn’t building this extension module.
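The conditional logic described above might look roughly like the following. This is an illustrative sketch, not the project's actual setup.py: the function names are invented, and only the module names `zstandard.backend_c` and `zstandard._cffi` come from the build output in this post.

```python
import importlib.util


def cffi_is_importable():
    """Return True if the running interpreter can locate the cffi package."""
    return importlib.util.find_spec("cffi") is not None


def extensions_to_build(have_cffi):
    """Return the extension module names a build would include."""
    extensions = ["zstandard.backend_c"]
    if have_cffi:
        # Silently skipped when cffi is absent, which is easy to miss.
        extensions.append("zstandard._cffi")
    return extensions
```

Because the check happens in whatever interpreter executes setup.py, the result depends on the environment the build frontend runs it in, not on the environment the frontend was launched from.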
Before using build, I had to run setup.py using a python having the cffi package installed, usually a project-local virtualenv. So let’s try that:
$ venv/bin/python -m pip install build cffi
...
$ venv/bin/python -m build --wheel .
...
And I get the same behavior: no CFFI extension module.
Staring at the output, I see what looks like a smoking gun:
* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools >= 40.8.0, wheel)
* Getting build dependencies for wheel...
...
* Installing packages in isolated environment... (wheel)
OK. So it looks like build is creating its own isolated environment (disregarding the invoked Python environment having cffi installed), installing setuptools >= 40.8.0 and wheel into it, and then executing the build from that environment.
So build sandboxes builds in an ephemeral build environment. This actually seems like a useful feature to help with deterministic and reproducible builds: I like it! But at this moment it stands in the way of progress. So I run python -m build --help, spot a --no-isolation argument and do the obvious:
$ venv/bin/python -m build --wheel --no-isolation .
...
building 'zstandard._cffi' extension
...
Success!
And I don’t see any deprecation warnings either. So I think I’m all good.
But obviously I’ve ventured off the paved road here, as we had to violate the default constraints of build to get things to work. I’ll get back to that later.
Reproducing Working Wheel Builds With pip
Just for good measure, let’s see if we can use pip wheel to produce wheels, as I’ve seen references that this is a supported mechanism for building wheels.
$ venv/bin/python -m pip wheel .
Processing /Users/gps/src/python-zstandard
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
Building wheel for zstandard (pyproject.toml) ... done
Created wheel for zstandard: filename=zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl size=407841 sha256=a2e1cc1ad570ab6b2c23999695165a71c8c9e30823f915b88db421443749f58e
Stored in directory: /Users/gps/Library/Caches/pip/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
That output is a bit terse, since the setuptools build logs are getting swallowed. That’s fine. Rather than run with -v to get those logs, I manually inspect the built wheel:
$ unzip -lv zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
Archive: zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
Length Method Size Cmpr Date Time CRC-32 Name
-------- ------ ------- ---- ---------- ----- -------- ----
7107 Defl:N 2490 65% 10-23-2023 08:36 7bb42fff zstandard/__init__.py
13938 Defl:N 2498 82% 10-23-2023 08:36 8d8d1316 zstandard/__init__.pyi
919352 Defl:N 366631 60% 10-26-2023 08:28 3aeefc48 zstandard/backend_c.cpython-312-darwin.so
152430 Defl:N 32528 79% 10-26-2023 05:37 fc1a3c0c zstandard/backend_cffi.py
0 Defl:N 2 0% 12-26-2020 16:12 00000000 zstandard/py.typed
1484 Defl:N 784 47% 10-26-2023 08:28 facba579 zstandard-0.22.0.dev0.dist-info/LICENSE
2863 Defl:N 847 70% 10-26-2023 08:28 b8d80875 zstandard-0.22.0.dev0.dist-info/METADATA
111 Defl:N 106 5% 10-26-2023 08:28 878098e6 zstandard-0.22.0.dev0.dist-info/WHEEL
10 Defl:N 12 -20% 10-26-2023 08:28 a5f38e4e zstandard-0.22.0.dev0.dist-info/top_level.txt
841 Defl:N 509 40% 10-26-2023 08:28 e9a804ae zstandard-0.22.0.dev0.dist-info/RECORD
-------- ------- --- -------
1098136 406407 63% 10 files
(Python wheels are just zip files with certain well-defined paths having special meanings. I know this because I wrote Rust code for parsing wheels as part of developing PyOxidizer.)
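Since a wheel is a zip archive, the same inspection can be done with the standard library instead of unzip. A minimal sketch with hypothetical helper names; the suffix check for `.so` and `.pyd` is a simplifying assumption about how extension modules are named.

```python
import zipfile


def wheel_members(wheel):
    """Return the archive member names of a wheel (path or file-like object)."""
    with zipfile.ZipFile(wheel) as zf:
        return zf.namelist()


def has_extension_module(wheel, dotted_name):
    """Check for a compiled module such as "zstandard._cffi" inside the wheel."""
    # "zstandard._cffi" is stored as "zstandard/_cffi.<platform tag>.so".
    prefix = dotted_name.replace(".", "/") + "."
    return any(
        name.startswith(prefix) and name.endswith((".so", ".pyd"))
        for name in wheel_members(wheel)
    )
```

A check like `has_extension_module(path, "zstandard._cffi")` could run in CI right after the build to catch a silently missing extension.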
Looks like the zstandard/_cffi.cpython-312-darwin.so extension module is missing. Well, at least pip is consistent with build! Although somewhat confusingly I don’t see any reference to a separate build environment in the pip output. But I suspect it is there because cffi is installed in the virtual environment I invoke pip from!
Reading pip help output, I find the relevant argument to not spawn a new environment and try again:
$ venv/bin/python -m pip wheel --no-build-isolation .
<same exact output except the wheel size and digest changes>
$ unzip -lv zstandard-0.22.0.dev0-cp312-cp312-macosx_14_0_x86_64.whl
...
1002664 Defl:N 379132 62% 10-26-2023 08:33 48afe5ba zstandard/_cffi.cpython-312-darwin.so
...
(I’m happy to see build and pip agreeing on the no isolation terminology.)
OK, so I got build and pip to behave nearly identically. I feel like I finally understand this!
I also run pip -v wheel and pip -vv wheel to peek under the covers and see what it’s doing. Interestingly, I don’t see any hint of a virtual environment or temporary directory until I go to -vv. I find it interesting that build presents details about this by default but you have to put pip in very verbose mode to get it. I’m glad I used build first because the ephemeral build environment was the source of my missing dependency and pip buried this important detail behind a ton of other output in -vv, making it much harder to discover!
Understanding How setuptools Gets Installed
When looking at pip’s verbose output, I also see references to installing the setuptools and wheel packages:
Processing /Users/gps/src/python-zstandard
Running command pip subprocess to install build dependencies
Collecting setuptools>=40.8.0
Using cached setuptools-68.2.2-py3-none-any.whl.metadata (6.3 kB)
Collecting wheel
Using cached wheel-0.41.2-py3-none-any.whl.metadata (2.2 kB)
Using cached setuptools-68.2.2-py3-none-any.whl (807 kB)
Using cached wheel-0.41.2-py3-none-any.whl (64 kB)
Installing collected packages: wheel, setuptools
Successfully installed setuptools-68.2.2 wheel-0.41.2
Installing build dependencies ... done
There’s that setuptools>=40.8.0 constraint again. (We also saw it in build.) I rg 40.8.0 my source checkout (note: the . in there are wildcard characters since 40.8.0 is a regexp so this could over match) and come up with nothing. If it’s not coming from my code, where is it coming from?
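The over-matching caveat is easy to demonstrate with Python's re module, which treats an unescaped `.` the same way. A small sketch; the `matches` helper is just for illustration.

```python
import re

# Unescaped: each "." matches any single character.
pattern_unescaped = re.compile("40.8.0")
# Escaped: each "." must be a literal dot.
pattern_literal = re.compile(re.escape("40.8.0"))


def matches(pattern, text):
    """Return True if the compiled pattern occurs anywhere in text."""
    return pattern.search(text) is not None
```

With ripgrep, passing `-F` (fixed strings) gives the literal behavior without escaping.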
In the pip documentation, Fallback behaviour says that a missing [build-system] from pyproject.toml is implicitly translated to the following:
[build-system]
requires = ["setuptools>=40.8.0", "wheel"]
build-backend = "setuptools.build_meta:__legacy__"
For build, I go to the source code and discover that similar functionality was added in May 2020.
I’m not sure if this default behavior is specified in a PEP or what. But build and pip seem to be agreeing on the behavior of adding setuptools>=40.8.0 and wheel to their ephemeral build environments and invoking setuptools.build_meta:__legacy__ as the build backend as implicit defaults if your pyproject.toml lacks a [build-system]. OK.
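The implicit default described above can be modeled in a few lines. This sketch assumes `pyproject` is the dictionary obtained by parsing pyproject.toml (for example with `tomllib.load` on Python 3.11+), and it covers only the case stated in the text: a missing [build-system] table. The function name is hypothetical.

```python
import copy

# The values pip documents as its fallback, as quoted above.
FALLBACK_BUILD_SYSTEM = {
    "requires": ["setuptools>=40.8.0", "wheel"],
    "build-backend": "setuptools.build_meta:__legacy__",
}


def effective_build_system(pyproject):
    """Return the declared [build-system] table, or the fallback if absent."""
    declared = pyproject.get("build-system")
    if declared is None:
        # Copy so callers cannot mutate the shared default.
        return copy.deepcopy(FALLBACK_BUILD_SYSTEM)
    return declared
```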
Being Explicit About The Build System
Perhaps I should consider defining [build-system] and being explicit about things? After all, the tools aren’t printing anything indicating they are assuming implicit defaults and for all I know the defaults could change in a backwards incompatible manner in any release and break my build. (Although I would hope to see a deprecation warning before that occurs.)
So I modify my pyproject.toml accordingly:
[build-system]
requires = [
"cffi==1.16.0",
"setuptools==68.2.2",
"wheel==0.41.2",
]
build-backend = "setuptools.build_meta:__legacy__"
I pinned all the dependencies to specific versions because I like determinism and reproducibility. I really don’t like when the upload of a new package version breaks my builds!
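A pin policy like this is simple to check mechanically. The sketch below uses plain string inspection rather than a real requirement parser, so treat it as a rough guard under that assumption; the function name is made up.

```python
def unpinned_requirements(requires):
    """Return the entries that are not pinned to one exact version with ==."""
    loose = []
    for requirement in requires:
        # Ignore any environment marker that follows ";".
        specifier = requirement.split(";", 1)[0]
        # Reject ranges, wildcards such as ==1.*, and multi-clause specifiers.
        if "==" not in specifier or "*" in specifier or "," in specifier:
            loose.append(requirement)
    return loose
```

An empty result means every build requirement names exactly one version; it says nothing about transitive dependencies or content digests.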
Software Supply Chain Weaknesses in pyproject.toml
As I pin dependencies in [build-system] in pyproject.toml, the security part of my brain is screaming over the lack of SHA-256 digest pinning.
How am I sure that we’re using well-known, trusted versions of these dependencies? Are all the transitive dependencies even pinned?
Before pyproject.toml, I used pip-compile from pip-tools to generate a requirements.txt containing SHA-256 digests for all transitive dependencies. I would use python3 -m venv to create a virtualenv, venv/bin/python -m pip install -r requirements.txt to materialize a (highly deterministic) set of packages, then run venv/bin/python setup.py to invoke a build in this stable and securely created environment. (Some) software supply chain risks averted! But, uh, how do I do that with pyproject.toml build-system.requires? Does it even support pinning SHA-256 digests?
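One property of that older workflow is that it can be audited: every requirement line in the generated file should carry at least one digest. A minimal sketch of such an audit, assuming the common layout where hashes follow the requirement on backslash-continued lines; the function name is hypothetical.

```python
def requirements_missing_hashes(requirements_text):
    """Return the requirements that carry no --hash=sha256: entry."""
    # Join backslash continuations so each requirement is one logical line.
    logical_lines = requirements_text.replace("\\\n", " ").splitlines()
    missing = []
    for line in logical_lines:
        line = line.strip()
        # Skip blanks, comments, and option lines such as --index-url.
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        if "--hash=sha256:" not in line:
            missing.append(line.split()[0])
    return missing
```

Running this over a requirements.txt before installing gives a quick signal that nothing slipped in without a digest.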
I skim the PEPs related to pyproject.toml and don’t see anything. Surely I’m missing something.
In desperation I check the pip-tools project and sure enough they document pyproject.toml integration. However, they tell you how to feed requirements.txt files into the dynamic dependencies consumed by the build backend: there’s nothing on how to securely install the build backend itself.
As far as I can tell pyproject.toml has no facilities for securely installing (read: pinning content digests for all transitive dependencies) the build backend itself. This is left as an exercise to the reader. But, um, the build frontend (which I was also instructed to download insecurely via python -m pip install) is the thing installing the build backend. How am I supposed to subvert the build frontend to securely install the build backend? Am I supposed to disable default behavior of using an ephemeral environment in order to get secure backend installs? Doesn’t the ephemeral environment give me additional, desired protections for build determinism and reproducibility? That seems wrong.
It kind of looks like pyproject.toml wasn’t designed with software supply chain risk mitigation as a criterion. This is extremely surprising for a build system abstraction designed in the past few years. I shrug my shoulders and move on.
Porting python setup.py develop Invocations
Now that I figure I have a working pyproject.toml, I move onto removing python setup.py invocations.
First up is a python setup.py develop --rust-backend invocation.
My setup.py performs very crude scanning of sys.argv looking for command arguments like --system-zstd and --rust-backend as a way to influence the build. We just sniff these special arguments and remove them from sys.argv so they don’t confuse the setuptools options parser. (I don’t believe this is a blessed way of doing custom options handling in distutils/setuptools. But it is simple and has worked since I introduced the pattern in 2016.)
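The argument sniffing described above could be sketched as follows. This is an illustration of the pattern, not the project's actual code: the helper name and return shape are invented, and only the two flag names come from the text.

```python
CUSTOM_FLAGS = ("--system-zstd", "--rust-backend")


def extract_custom_flags(argv):
    """Return (enabled_flags, remaining_argv) without mutating argv."""
    enabled = {flag for flag in CUSTOM_FLAGS if flag in argv}
    # Everything else is left for the setuptools option parser.
    remaining = [arg for arg in argv if arg not in CUSTOM_FLAGS]
    return enabled, remaining
```

In a setup.py following this pattern, the remaining list would replace sys.argv before setup() is called, so the custom flags never reach setuptools.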
Is --global-option the Answer?
With python setup.py invocations going away and a build frontend invoking setup.py, I need to find an alternative mechanism to pass settings into my setup.py.
Why you shouldn’t invoke setup.py directly tells me I should use pip install -e. I’m guessing there’s a way to instruct pip install to pass arguments to setup.py.
$ venv/bin/python -m pip install --help
...
-C, --config-settings <settings>
    Configuration settings to be passed to the PEP 517 build backend. Settings take the form KEY=VALUE. Use multiple --config-settings options to pass multiple keys to the backend.
--global-option <options>
    Extra global options to be supplied to the setup.py call before the install or bdist_wheel command.
...
Hmmm. Not really sure which of these to use. But --global-option mentions setup.py and I’m using setup.py. So I try that:
$ venv/bin/python -m pip install --global-option --rust-backend -e .
Usage:
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <requirement specifier> [package-index-options] ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] -r <requirements file> [package-index-options] ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <vcs project url> ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <local project path> ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <archive url/path> ...
no such option: --rust-backend
Oh, duh, --rust-backend looks like an argument and makes pip’s own argument parsing ambiguous as to how to handle it. Let’s try that again with --global-option=--rust-backend:
$ venv/bin/python -m pip install --global-option=--rust-backend -e .
DEPRECATION: --build-option and --global-option are deprecated. pip 24.0 will enforce this behaviour change. A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859
WARNING: Implying --no-binary=:all: due to the presence of --build-option / --global-option.
Obtaining file:///Users/gps/src/python-zstandard
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
WARNING: Ignoring --global-option when building zstandard using PEP 517
Building editable for zstandard (pyproject.toml) ... done
Created wheel for zstandard: filename=zstandard-0.22.0.dev0-0.editable-cp312-cp312-macosx_14_0_x86_64.whl size=4379 sha256=05669b0a5fd8951cac711923d687d9d4192f6a70a8268dca31bdf39012b140c8
Stored in directory: /private/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/pip-ephem-wheel-cache-6amdpg21/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
Installing collected packages: zstandard
Successfully installed zstandard-0.22.0.dev0
I immediately see the three DEPRECATION and WARNING lines (which are color highlighted in my terminal, yay):
DEPRECATION: --build-option and --global-option are deprecated. pip 24.0 will enforce this behaviour change. A possible replacement is to use --config-settings. Discussion can be found at https://github.com/pypa/pip/issues/11859
WARNING: Implying --no-binary=:all: due to the presence of --build-option / --global-option.
WARNING: Ignoring --global-option when building zstandard using PEP 517
Yikes. It looks like --global-option is deprecated and will be removed in pip 24.0. And, later it says --global-option was ignored. Is that true?!
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x 1 gps staff 1002680 Oct 27 11:35 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x 1 gps staff 919352 Oct 27 11:35 zstandard/backend_c.cpython-312-darwin.so
Not seeing a backend_rust library like I was expecting. So, yes, it does look like --global-option was ignored.
This behavior is actually pretty concerning to me. It certainly seems like at one time --global-option (and a --build-option which doesn’t exist on the pip install command I guess) did get threaded through to setup.py. However, it no longer does.
I find an entry in the pip 23.1 changelog: Deprecate --build-option and --global-option. Users are invited to switch to --config-settings. (#11859). Deprecate. What is pip’s definition of deprecate? I click the link to #11859. An open issue with a lot of comments. I scan the issue history to find referenced PRs and click on #11861. OK, it is just an advertisement. Maybe --global-option never got threaded through to setup.py? But its help usage text clearly says it is related to setup.py! Maybe the presence of [build-system] in pyproject.toml is somehow engaging different semantics that result in --global-option not being passed to setup.py? The warning message did say Ignoring --global-option when building zstandard using PEP 517.
I try commenting out the [build-system] section in my pyproject.toml and trying again. Same result. Huh? Reading the pip install --help output, I see --no-use-pep517 and try it:
$ venv/bin/python -m pip install --global-option=--rust-backend --no-use-pep517 -e .
...
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x 1 gps staff 1002680 Oct 27 11:35 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x 1 gps staff 919352 Oct 27 11:35 zstandard/backend_c.cpython-312-darwin.so
-rwxr-xr-x 1 gps staff 2727920 Oct 27 11:53 zstandard/backend_rust.cpython-312-darwin.so
Ahh, so pip’s default PEP-517 build mode is causing --global-option to get ignored. So I guess older versions of pip honored --global-option and when pip switched to PEP-517 build mode by default --global-option just stopped working and emitted a warning instead. That’s quite the backwards incompatible behavior break! I really wish tools would fail fast when making these kinds of breaks or at least offer a --warnings-as-errors mode so I can opt into fatal errors when these kinds of breaks / deprecations are introduced. I would 100% opt into this since these warnings are often the figurative needle in a haystack of CI logs and easy to miss. Especially if the build environment is non-deterministic and new versions of tools like pip get installed randomly without a version control commit.
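In the absence of such a mode, a CI wrapper could approximate it by scanning pip's output for the DEPRECATION: and WARNING: prefixes seen in the log above. This is only a sketch under that assumption: find_warnings and run_pip_strict are hypothetical names, and matching on line prefixes is a heuristic, not a pip feature.

```python
import re
import subprocess
import sys

# Line prefixes taken from the pip output quoted above.
_WARNING_LINE = re.compile(r"^\s*(DEPRECATION|WARNING):")


def find_warnings(log_text):
    """Return the lines of `log_text` that look like pip warnings or deprecations."""
    return [line.strip() for line in log_text.splitlines() if _WARNING_LINE.match(line)]


def run_pip_strict(pip_args):
    """Run `python -m pip <pip_args>` and raise if pip fails or prints a warning line."""
    proc = subprocess.run(
        [sys.executable, "-m", "pip", *pip_args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout so both get scanned
        text=True,
    )
    offending = find_warnings(proc.stdout)
    if proc.returncode != 0 or offending:
        raise RuntimeError("pip failed or warned:\n" + "\n".join(offending))
    return proc.stdout
```

A CI job could then call something like run_pip_strict(["install", "-e", "."]) and treat the exception as a build failure.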
Pip’s allowing me to specify --global-option but then only issuing a warning when it is ignored doesn’t sit well with me. But what can I do?
It is obvious --global-option is a non-starter here.
Attempts at Using --config-settings
Fortunately, pip’s deprecation message suggests a path forward:
A possible replacement is to use --config-settings. Discussion can be found
at https://github.com/pypa/pip/issues/11859
First, kudos for actionable warning messages. However, the wording says possible replacement. Are there other alternatives I didn’t see in the pip install --help output?
Anyway, I decide to go with that --config-settings suggestion.
$ venv/bin/python -m pip install --config-settings=--rust-backend -e .
Usage:
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <requirement specifier> [package-index-options] ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] -r <requirements file> [package-index-options] ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <vcs project url> ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] [-e] <local project path> ...
/Users/gps/src/python-zstandard/venv/bin/python -m pip install [options] <archive url/path> ...
Arguments to --config-settings must be of the form KEY=VAL
Hmmm. Let’s try adding a trailing =?
$ venv/bin/python -m pip install --config-settings=--rust-backend= -e .
Obtaining file:///Users/gps/src/python-zstandard
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: zstandard
Building editable for zstandard (pyproject.toml) ... done
Created wheel for zstandard: filename=zstandard-0.22.0.dev0-0.editable-cp312-cp312-macosx_14_0_x86_64.whl size=4379 sha256=619db9806bc4c39e973c3197a0ddb9b03b49fff53cd9ac3d7df301318d390b5e
Stored in directory: /private/var/folders/dd/xb3jz0tj133_hgnvdttctwxc0000gn/T/pip-ephem-wheel-cache-gtsvw78d/wheels/eb/6b/3e/89aae0b17b638c9cdcd2015d98b85ee7fb3ef00325bb44a572
Successfully built zstandard
Installing collected packages: zstandard
Attempting uninstall: zstandard
Found existing installation: zstandard 0.22.0.dev0
Uninstalling zstandard-0.22.0.dev0:
Successfully uninstalled zstandard-0.22.0.dev0
Successfully installed zstandard-0.22.0.dev0
No warnings or deprecations. That’s promising. Did it work?
$ ls -al zstandard/*cpython-312*.so
-rwxr-xr-x 1 gps staff 1002680 Oct 27 12:11 zstandard/_cffi.cpython-312-darwin.so
-rwxr-xr-x 1 gps staff 919352 Oct 27 12:11 zstandard/backend_c.cpython-312-darwin.so
No backend_rust extension module. Boo. So what actually happened?
$ venv/bin/python -m pip -v install --config-settings=--rust-backend= -e .
I don’t see --rust-backend anywhere in that log output. I try with more verbosity:
$ venv/bin/python -m pip -vvvvv install --config-settings=--rust-backend= -e .
Still nothing!
Maybe that -- prefix is wrong?
$ venv/bin/python -m pip -vvvvv install --config-settings=rust-backend= -e .
Still nothing!
I have no clue how --config-settings= is getting passed to setup.py nor where it is seemingly getting dropped on the floor.
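For what it is worth, the KEY=VAL form itself is simple. Here is a minimal sketch of how a frontend could turn those arguments into the dictionary handed to the backend, assuming a plain split on the first = character. That split is my assumption, not pip's verified implementation, and parse_config_settings is a hypothetical name.

```python
def parse_config_settings(values):
    """Turn ["KEY=VAL", ...] into a dict, as a frontend might before calling the backend.

    Illustrative only: repeated keys simply overwrite each other here, whereas a
    real frontend may collect them differently.
    """
    settings = {}
    for value in values:
        key, sep, val = value.partition("=")
        if not sep:
            # Mirrors the usage error quoted earlier.
            raise ValueError("Arguments to --config-settings must be of the form KEY=VAL")
        settings[key] = val
    return settings
```

Under that assumption, both --rust-backend= and rust-backend= are well-formed: the first yields the key --rust-backend and the second the key rust-backend, each with an empty string value. That would explain why neither attempt triggered a usage error. Whether the backend does anything with either key is a separate matter.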
How Does setuptools Handle --config-settings?
This must be documented in the setuptools project. So I open those docs in my web browser and do a search for settings. I open the first three results in separate tabs:
- Running setuptools commands
- Configuration File Options
- develop - Deploy the project source in “Development Mode”
That first link has docs on the deprecated setuptools commands and how to invoke python setup.py directly. (Note: there is a warning box here saying that python setup.py is deprecated. I guess I somehow missed this document when looking at setuptools documentation earlier! In hindsight, it appears to be buried at the figurative bottom of the docs tree as the last item under a Backward compatibility & deprecated practice section. Talk about burying the lede!) These docs aren’t useful.
The second link also takes me to deprecated documentation related to direct python setup.py command invocations.
The third link is also useless.
I continue opening search results in new tabs. Surely the answer is in here.
I find an Adding Arguments section telling me that Adding arguments to setup is discouraged as such arguments are only supported through imperative execution and not supported through declarative config. I think that’s an obtuse way of saying that sys.argv arguments are only supported via python setup.py invocations and not via setup.cfg or pyproject.toml? But the example only shows me how to use setup.cfg and doesn’t have any mention of pyproject.toml. So is this documentation even relevant to pyproject.toml?
Eventually I stumble across Build System Support. In the Dynamic build dependencies and other build_meta tweaks section, I notice the following example code:
from setuptools import build_meta as _orig
from setuptools.build_meta import *

def get_requires_for_build_wheel(config_settings=None):
    return _orig.get_requires_for_build_wheel(config_settings) + [...]

def get_requires_for_build_sdist(config_settings=None):
    return _orig.get_requires_for_build_sdist(config_settings) + [...]
config_settings=None. OK, this might be the --config-settings values passed to the build frontend getting fed into the build backend. I Google get_requires_for_build_wheel. One of the top results is PEP-517, which I click on.
I see that the Build backend interface consists of a handful of functions that are invoked by the build frontend. These functions all seem to take a config_settings=None argument. Great, now I know the interface between build frontends and backends at the Python API level. Where was I in this yak shave?
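To make that interface concrete, here is a toy hook with the same name and signature as PEP 517's get_requires_for_build_wheel, showing how a backend could react to a key in config_settings. It is a standalone sketch, not setuptools code: the rust-backend key handling and the requirement name hypothetical-rust-build-dep are both invented for illustration.

```python
def get_requires_for_build_wheel(config_settings=None):
    """Toy version of the PEP 517 hook of the same name.

    Returns extra build requirements depending on what the frontend passed in.
    """
    # The frontend may pass None when no settings were given, so normalise first.
    settings = config_settings or {}
    requires = []
    if "rust-backend" in settings:
        # Hypothetical requirement name, used only to show the data flow.
        requires.append("hypothetical-rust-build-dep")
    return requires
```

A frontend would call this with whatever dictionary it built from the command line, for example get_requires_for_build_wheel({"rust-backend": ""}).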
I remember from pyproject.toml that one of the lines is build-backend = "setuptools.build_meta:__legacy__". That setuptools.build_meta:__legacy__ bit looks like a Python symbol reference. Since the setuptools documentation didn’t answer my question on how to thread --config-settings into setup.py invocations, I open the build_meta.py source code. (Aside: experience has taught me that when in doubt on how something works, consult the source code: code doesn’t lie.)
I search for config_settings. I immediately see class _ConfigSettingsTranslator: whose purported job is Translate config_settings into distutils-style command arguments. Only a limited number of options is currently supported. Oh, this looks relevant. But there’s a fair bit of code in here. Do I really need to grok it