Skip to content

Conversation

paulslaby
Copy link

Description

Adds parser for PDM dependency manager lockfile.

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have added unit tests that cover changed behavior
  • I have tested my code in common scenarios and confirmed there are no regressions
  • I have added comments to my code, particularly in hard-to-understand sections

I'll be honest, neither Go or Python is my primary language, but I am open to helping you. Please help me polish this PR so that it can eventually get merged into syft codebase.

Copy link
Contributor

@kzantow kzantow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks to be well on track already! I just thought I'd note you will want to add a function to create the cataloger and integrate it with the task selection, thanks for working on this!

Also make lint-fix can help with static analysis failures.

@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch 2 times, most recently from 24b5f4f to 4bb1476 Compare September 23, 2025 14:46
@paulslaby paulslaby requested a review from kzantow September 23, 2025 14:56
@paulslaby
Copy link
Author

paulslaby commented Sep 24, 2025

I see locally some failed unit tests, but it looks irrelevant to my changes.
One example:

#0 building with "orbstack" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 321B done
#1 DONE 0.0s

#2 [internal] load metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23
#2 ERROR: no match for platform in manifest: not found
------
> [internal] load metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23:
------
Dockerfile:1
--------------------
  1 | >>> FROM ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23 AS builder
  2 |     
  3 |     FROM scratch
--------------------
ERROR: failed to solve: ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23: failed to resolve source metadata for docker.io/library/ibmjava:8-jre@sha256:3588cd1cc9b8646fe03b3b15210e69b1b520f1321f8518b69c0e7013d702fd23: no match for platform in manifest: not found
--- FAIL: Test_JavaBinaryImage (2.15s)
   --- FAIL: Test_JavaBinaryImage/image-java-zulu-8 (0.68s)
       image_fixtures.go:241: Build docker image: name="stereoscope-fixture-image-java-zulu-8" tag="39fda6a633d8f50e1f4c3a145eb5305e5abca841d6d1763fd28c1e551d709a20"
       image_fixtures.go:250: 
               Error Trace:    /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:250
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:185
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:210
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:35
                                                       /Users/pavel/go/pkg/mod/github.com/anchore/stereoscope@v0.1.10/pkg/imagetest/image_fixtures.go:64
                                                       /Users/pavel/Work/syft/syft/pkg/cataloger/internal/pkgtest/test_generic_parser.go:191
                                                       /Users/pavel/Work/syft/syft/pkg/cataloger/binary/classifiers_java_test.go:61
               Error:          Received unexpected error:
                               exit status 1
               Test:           Test_JavaBinaryImage/image-java-zulu-8
               Messages:       could not build docker image (shell out)

It's hard to notice what is relevant :-/

@kzantow
Copy link
Contributor

kzantow commented Sep 24, 2025

@paulslaby you can safely ignore that test failure, as you noted I really don't think it is related to your changes, it seems like something I've probably caused and I don't think will be happening in CI. I'll try to get it sorted, but it shouldn't be a blocker for this PR.

TestAllNames is a legitimate failure, though, I think that's the only test issue in CI (and static analysis)

@github-actions github-actions bot added the json-schema Changes the json schema label Sep 28, 2025

This comment has been minimized.

@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch 2 times, most recently from fe2c6f3 to 7bf8766 Compare September 28, 2025 09:07
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby
Copy link
Author

@kzantow I see one more issue i think it should be handled by me:

--- FAIL: Test_OriginatorSupplier (0.00s)
    originator_supplier_test.go:13: metadata type PythonPdmLockEntry is not covered by a test
FAIL
coverage: 92.8% of statements
FAIL    github.com/anchore/syft/syft/format/internal/spdxutil/helpers   0.333s

but what coverage does it expect?

@kzantow
Copy link
Contributor

kzantow commented Sep 29, 2025

@paulslaby have a look at this test, you just need to add your newly created metadata:

func Test_OriginatorSupplier(t *testing.T) {

Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby paulslaby requested a review from kzantow September 29, 2025 14:59
@kzantow
Copy link
Contributor

kzantow commented Sep 29, 2025

@paulslaby it looks like there are some failing tests that are not related to your changes. I will have a look at this later today. Do you mind if I push to your branch?

@paulslaby
Copy link
Author

@paulslaby it looks like there are some failing tests that are not related to your changes. I will have a look at this later today. Do you mind if I push to your branch?

It's up to you :) wanna me rebase or something?

Signed-off-by: Keith Zantow <kzantow@gmail.com>
@github-actions github-actions bot added json-schema Changes the json schema and removed json-schema Changes the json schema labels Sep 30, 2025
@kzantow
Copy link
Contributor

kzantow commented Sep 30, 2025

Hey @paulslaby I think I fixed up the test issue and the static analysis problem (schema file wasn't matching up). I had a couple notes:

First, what is the purpose of the Index field? It is being set to a constant and I don't see anything in the sample file that looks like it would populate this, should it get removed? It is always easier to add things later rather than remove them.

Second, it looks like files on the filesystem are referenced in the PDM file. There is a FileOwner interface that metadata can implement to return these files and indicate that the particular package "owns" them, in other words, there's a claim that the package manager is provided them. Various things happen with this information, but one is that vulnerability results can be better if, for example, one was a binary that was also surfaced by a different cataloger... the PDM package "owning" it can convey this. I think it might be better to include file references with hash values instead of just the hashes, this way it could also capture the FileOwner information and the hashes would be more useful.

Last, it looks like the PDM files have dependency information in them. The catalogers return relationships with this information, users generally want this information to be surfaced, would you want to add this? Here's an example using utility functions.

@paulslaby
Copy link
Author

First, what is the purpose of the Index field? It is being set to a constant and I don't see anything in the sample file that looks like it would populate this, should it get removed?

In the beginning I was thinking that this might be the way how syft manages URL dependencies. Now I am not sure, if we should parse package name from URL. Do you have a pattern for that already established?

I will have a look iat the rest. Thank you for your feedback.

@paulslaby
Copy link
Author

@kzantow I implemented relationship parsing. but those can be conditional based on python version (or platform). You probably know, python lockfiles are not for one python version, but for constrained python version. Those conditions are currently ignored. Do you have a different opinion?

kzantow and others added 5 commits October 6, 2025 14:22
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Keith Zantow <kzantow@gmail.com>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch from 58f278c to eb0c58a Compare October 6, 2025 12:22
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@kzantow
Copy link
Contributor

kzantow commented Oct 6, 2025

@paulslaby generally speaking, the philosophy of syft is to surface as much information as we find. In the case of conditional things, if there is a possibility that a dependency could be included, we generally would include this by default. For example: NPM currently includes devDependencies and Java currently includes test dependencies. I think the correct default behavior is to include the dependencies, regardless of python version constraints. I don't think this necessarily needs to be configurable as part of this PR, but we could an option to exclude things at a later time. Am I understanding the question correctly?

@paulslaby
Copy link
Author

@kzantow Yes, you understand it correctly. That's the way it working now.

Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
@paulslaby paulslaby force-pushed the pbuchart/pdm-parsing branch from bde780c to 6abc053 Compare October 8, 2025 07:44
Signed-off-by: Pavel Buchart <pavel@buchart.cz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
json-schema Changes the json schema
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pdm support
2 participants