Pyinstaller打包pikepdf失败的问题排查

问题

最近在项目里用到了pikepdf这个库,用于实现pdf水印插入的一个小功能,源码调试阶段运行一切OK但是在出包时报了如下异常。

Traceback (most recent call last):
  File "pikepdf\__init__.py", line 19, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pikepdf\_version.py", line 13, in <module>
  File "importlib\metadata.py", line 530, in version
  File "importlib\metadata.py", line 503, in distribution
  File "importlib\metadata.py", line 177, in from_name
importlib.metadata.PackageNotFoundError: pikepdf

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main.py", line 1, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pikepdf\__init__.py", line 21, in <module>
ImportError: Failed to determine version
[29708] Failed to execute script 'main' due to unhandled exception!

异常定位

打印了两份堆栈的信息,翻翻炸出来点 init.py 的21行,代码如下

try:
    from ._version import __version__
except ImportError as _e:  # pragma: no cover
    raise ImportError("Failed to determine version") from _e

从.version.py文件导入__version__失败?看看_version.py

try:
    from importlib_metadata import version as _package_version  # type: ignore
except ImportError:
    from importlib.metadata import version as _package_version

__version__ = _package_version('pikepdf')

__all__ = ['__version__']

再看看上面的异常,也就再_package_version这个函数了。这边可以先写个简单的demo.py验证下,使用pyinstgaller编译后运行。

# demo.py
import pikepdf

if __name__ == '__main__':
    print("Hello World")
# 输出
Traceback (most recent call last):
  File "pikepdf\__init__.py", line 19, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pikepdf\_version.py", line 13, in <module>
  File "importlib\metadata.py", line 530, in version
  File "importlib\metadata.py", line 503, in distribution
  File "importlib\metadata.py", line 177, in from_name
importlib.metadata.PackageNotFoundError: pikepdf

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "main.py", line 1, in <module>
  File "PyInstaller\loader\pyimod03_importers.py", line 495, in exec_module
  File "pikepdf\__init__.py", line 21, in <module>
ImportError: Failed to determine version
[29708] Failed to execute script 'main' due to unhandled exception!

符合预期。所以说importlib.metadata.version 无法在pyinstaller打包后运行?

问题原因

对于pkgs_to_check_at_runtime中列出的每个包,需要通过在spec文件中使用copy_metadata(name)来收集元数据。说白了就是pyinstaller打包后缺少对应的metadata信息。

修复方案

1. 降低版本pikepdf的版本

这个库以前用过,没有出这个幺蛾子。看了下旧版本的源代码,一下是5.1.3版本之前的get_version实现,没有使用importlib库,自然也不会有问题。

from pkg_resources import DistributionNotFound
from pkg_resources import get_distribution as _get_distribution

try:
    __version__ = _get_distribution(__package__).version
except DistributionNotFound:  # pragma: no cover
    __version__ = "Not installed"

__all__ = ['__version__']

2. 在pyinstaller打包时指定copy-metadata

像这样

pyinstaller -F --copy-metadata pikepdf main.py

碎碎念

我觉得完全可以像版本5.1.3之前一样,获取不到时赋值为”Not installed”一样就行。然后提了个pr[https://github.com/pikepdf/pikepdf/pull/358]给作者,可惜作者认为这是importlib的锅(不改),倒也时说得过去。

张贴在2