Skip to content

Conversation

@paulslaby
Copy link
Contributor

Description

Adds parser for PDM dependency manager lockfile.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

I'll be honest, neither Go or Python is my primary language, but I am open to helping you. Please help me polish this PR so that it can eventually get merged into syft codebase.

Copy link
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be well on track already! I just thought I'd note you will want to add a function to create the cataloger and integrate it with the task selection, thanks for working on this!

Also make lint-fix can help with static analysis failures.

@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch 2 times, most recently from 24b5f4f to 4bb1476 Compare September 23, 2025 14:46
@paulslaby paulslaby requested a review from kzantow September 23, 2025 14:56
@paulslaby
Copy link
Contributor Author

paulslaby commented Sep 24, 2025

I see locally some failed unit tests, but it looks irrelevant to my changes.
One example:

#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 321B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23
#2 ERROR: no match for platform in manifest: not found
------
> [internal] load metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23:
------
Dockerfile:1
--------------------
  1 | >>> FROM ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23 AS builder
  2 |     
  3 |     FROM scratch
--------------------
ERROR: failed to solve: ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23: failed to resolve source metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23: no match for platform in manifest: not found
--- FAIL: Test_JavaBinaryImage (2.15s)
   --- FAIL: Test_JavaBinaryImage/image-java-zulu-8 (0.68s)
       image_fixtures.go:241: Build docker image: name="stereoscope-fixture-image-java-zulu-8" tag="39fda6a633d8f50e1f4c3a145eb5305e5abca841d6d1763fd28c1e551d709a20"
       image_fixtures.go:250: 
               Error Trace:    /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:250
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:185
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:210
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:35
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:64
                                                       /Users/pavel/Work/syft/syft/pkg/cataloger/internal/pkgtest/test_generic_parser.go:191
                                                       /Users/pavel/Work/syft/syft/pkg/cataloger/binary/classifiers_java_test.go:61
               Error:          Received unexpected error:
                               exit status 1
               Test:           Test_JavaBinaryImage/image-java-zulu-8
               Messages:       could not build docker image (shell out)

It's hard to notice what is relevant :-/

@kzantow
Copy link
Contributor

kzantow commented Sep 24, 2025

@paulslaby you can safely ignore that test failure, as you noted I really don't think it is related to your changes, it seems like something I've probably caused and I don't think will be happening in CI. I'll try to get it sorted, but it shouldn't be a blocker for this PR.

TestAllNames is a legitimate failure, though, I think that's the only test issue in CI (and static analysis)

@github-actions github-actions bot added the json-schema Changes the json schema label Sep 28, 2025
@github-actions

This comment has been minimized.

@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch 2 times, most recently from fe2c6f3 to 7bf8766 Compare September 28, 2025 09:07
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby
Copy link
Contributor Author

@kzantow I see one more issue i think it should be handled by me:

--- FAIL: Test_OriginatorSupplier (0.00s)
    originator_supplier_test.go:13: metadata type PythonPdmLockEntry is not covered by a test
FAIL
coverage: 92.8% of statements
FAIL    github.com/anchore/syft/syft/format/internal/spdxutil/helpers   0.333s

but what coverage does it expect?

@kzantow
Copy link
Contributor

kzantow commented Sep 29, 2025

@paulslaby have a look at this test, you just need to add your newly created metadata:

func Test_OriginatorSupplier(t *testing.T) {

Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@github-actions github-actions bot added json-schema Changes the json schema and removed json-schema Changes the json schema labels Sep 30, 2025
@kzantow
Copy link
Contributor

kzantow commented Sep 30, 2025

Hey @paulslaby I think I fixed up the test issue and the static analysis problem (schema file wasn't matching up). I had a couple notes:

First, what is the purpose of the Index field? It is being set to a constant and I don't see anything in the sample file that looks like it would populate this, should it get removed? It is always easier to add things later rather than remove them.

Second, it looks like files on the filesystem are referenced in the PDM file. There is a FileOwner interface that metadata can implement to return these files and indicate that the particular package "owns" them, in other words, there's a claim that the package manager is provided them. Various things happen with this information, but one is that vulnerability results can be better if, for example, one was a binary that was also surfaced by a different cataloger... the PDM package "owning" it can convey this. I think it might be better to include file references with hash values instead of just the hashes, this way it could also capture the FileOwner information and the hashes would be more useful.

Last, it looks like the PDM files have dependency information in them. The catalogers return relationships with this information, users generally want this information to be surfaced, would you want to add this? Here's an example using utility functions.

@paulslaby
Copy link
Contributor Author

First, what is the purpose of the Index field? It is being set to a constant and I don't see anything in the sample file that looks like it would populate this, should it get removed?

In the beginning I was thinking that this might be the way how syft manages URL dependencies. Now I am not sure, if we should parse package name from URL. Do you have a pattern for that already established?

I will have a look iat the rest. Thank you for your feedback.

@paulslaby
Copy link
Contributor Author

@kzantow I implemented relationship parsing. but those can be conditional based on python version (or platform). You probably know, python lockfiles are not for one python version, but for constrained python version. Those conditions are currently ignored. Do you have a different opinion?

kzantow and others added 5 commits October 6, 2025 14:22
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch from 58f278c to eb0c58a Compare October 6, 2025 12:22
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@kzantow
Copy link
Contributor

kzantow commented Oct 6, 2025

@paulslaby generally speaking, the philosophy of syft is to surface as much information as we find. In the case of conditional things, if there is a possibility that a dependency could be included, we generally would include this by default. For example: NPM currently includes devDependencies and Java currently includes test dependencies. I think the correct default behavior is to include the dependencies, regardless of python version constraints. I don't think this necessarily needs to be configurable as part of this PR, but we could an option to exclude things at a later time. Am I understanding the question correctly?

@paulslaby
Copy link
Contributor Author

@kzantow Yes, you understand it correctly. That's the way it working now.

Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch from bde780c to 6abc053 Compare October 8, 2025 07:44
paulslaby and others added 6 commits October 8, 2025 09:46
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
@kzantow
Copy link
Contributor

kzantow commented Oct 14, 2025

Hey @paulslaby, I pushed a few changes to:

  • populate the metadata dependencies
  • correct the direction of the dependency relationship
  • avoid duplication of name & version in metadata
  • other misc. cleanup

Would you be able to give this a try to make sure it's still giving you the results you are expecting?

Signed-off-by: Keith Zantow <kzantow@gmail.com>
@paulslaby
Copy link
Contributor Author

@kzantow Yes, that works. thank you for finishing it 🙇

Copy link
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for the contribution @paulslaby! 🎉

@kzantow kzantow merged commit e923db2 into anchore:main Oct 16, 2025
12 checks passed
spiffcs added a commit that referenced this pull request Oct 22, 2025
* main:
  chore(deps): update tools to latest versions (#4302)
  chore(deps): bump github.com/github/go-spdx/v2 from 2.3.3 to 2.3.4 (#4301)
  chore(deps): bump github/codeql-action from 4.30.8 to 4.30.9 (#4299)
  support universal (fat) mach-o binary files (#4278)
  chore(deps): bump sigstore/cosign-installer from 3.10.0 to 4.0.0 (#4296)
  chore(deps): bump anchore/sbom-action from 0.20.7 to 0.20.8 (#4297)
  convert posix path back to windows (#4285)
  Remove duplicate image source providers (#4289)
  chore(deps): bump anchore/sbom-action from 0.20.6 to 0.20.7 (#4293)
  feat: add option to fetch remote licenses for pnpm-lock.yaml files (#4286)
  Add PDM parser (#4234)
  chore(deps): update tools to latest versions (#4291)
  fix: panic during java archive maven resolution (#4290)
  Extract zip archive with multiple entries (#4283)
  chore: update to use old configuration on new cosign (#4287)
  chore(deps): update anchore dependencies (#4282)
  chore(deps): bump github.com/mholt/archives from 0.1.3 to 0.1.5 (#4280)
  add docs to configs (#4281)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

json-schema Changes the json schema

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pdm support

2 participants