[P2] security: optional signature verification for MLnode PoC endpoints by x0152 · Pull Request #537 · gonka-ai/gonka

x0152 · 2026-01-09T14:57:19Z

MLnode API endpoints are currently unauthenticated, so an attacker who can reach the mlnode port can hijack PoC by sending their own callback_url.

Adds request signing using the existing signerAccount key. The Network Node signs requests with an X-Signature header, and mlnode verifies them using the SIGNER_PUBKEY environment variable. For a standard deployment where mlnode is on the same machine, it works automatically, since SIGNER_PUBKEY defaults to ACCOUNT_PUBKEY.

For remote mlnode setups, you need to set SIGNER_PUBKEY manually. If SIGNER_PUBKEY is not set, verification is skipped for backward compatibility.

Get public key on network node:

source config.env
inferenced keys show $KEY_NAME --pubkey --keyring-backend $KEYRING_BACKEND | jq -r '.key'

Start mlnode with signature verification:

export SIGNER_PUBKEY=<public_key_from_above>
docker-compose -f docker-compose.mlnode.yml up -d

akup

This patch is incomplete, since MLNodeClient also has the following functions:

// Training operations
	StartTraining(ctx context.Context, taskId uint64, participant string, nodeId string, masterNodeAddr string, rank int, worldSize int) error
	GetTrainingStatus(ctx context.Context) error

	// Node state operations
	Stop(ctx context.Context, num *int) error
	NodeState(ctx context.Context) (*StateResponse, error)

	// PoC operations
	GetPowStatus(ctx context.Context, num *int) (*PowStatusResponse, error)
	InitGenerate(ctx context.Context, dto InitDto, num *int) error
	InitValidate(ctx context.Context, dto InitDto, num *int) error
	ValidateBatch(ctx context.Context, batch ProofBatch, num *int) error

	// Inference operations
	InferenceHealth(ctx context.Context) (bool, error)
	InferenceUp(ctx context.Context, model string, args []string) error

	// GPU operations
	GetGPUDevices(ctx context.Context) (*GPUDevicesResponse, error)
	GetGPUDriver(ctx context.Context) (*DriverInfo, error)

	// Model management operations
	CheckModelStatus(ctx context.Context, model Model) (*ModelStatusResponse, error)
	DownloadModel(ctx context.Context, model Model) (*DownloadStartResponse, error)
	DeleteModel(ctx context.Context, model Model) (*DeleteResponse, error)
	ListModels(ctx context.Context) (*ModelListResponse, error)
	GetDiskSpace(ctx context.Context) (*DiskSpaceInfo, error)

Stop and InferenceUp are also critical to protect, since they change node state.
The others are get commands, but it would be more complete to sign all MLNode requests.

x0152 · 2026-01-09T21:24:28Z

The current PR covers PoC as the most critical case: direct reward theft. First it would be good to understand whether this direction makes sense. I should update the PR description to clarify the scope.

akup · 2026-01-14T14:59:20Z

The current PR covers PoC as the most critical case: direct reward theft. First it would be good to understand whether this direction makes sense. I should update the PR description to clarify the scope.

I'm just pointing out that this PR cannot be merged before it is cleaned up, because merging something half-ready is bad practice.

tcharchian · 2026-03-21T00:53:06Z

Hey @x0152 @akup! It would be great if you could sync on the next steps for this pull request and make the needed decisions together. If you can move it forward on your own, it could be included in v0.2.12. But overall, this is a nice-to-have rather than something critical.

akup · 2026-03-23T06:10:10Z

I think it is a useful PR, but it should be aligned with #717.
And we should merge it with the new MLNode and #417.

The artifacts that are sent to the callback URL should also be signed: if the URL is not protected, an attacker can send bad artifacts, and the node will fail PoC validation.

I think we should first update the main repo with the new MLNode PoC APIs (PoC now lives entirely in the https://github.com/gonka-ai/vllm repo). Then we can compare the approaches with Jf/token auth, and finally take into account the websocket work that should come next.

x0152 · 2026-03-25T15:37:36Z

@akup If the PoC API update is coming, let's postpone #537 and #417 until after that. #717 can go in this upgrade: it's optional, shouldn't be affected by the PoC API changes, and enables cloud-hosted MLNode setups.

#537 and #417, for security and stability, should go on top of the new API once it is validated on mainnet, to avoid regressions.

What do you think about that?

akup · 2026-03-26T11:50:05Z

@x0152 Yes, #717 can go in this release, but I want to align on the vision that the best option of jf/token and request signatures goes to main.
And honestly, I haven't had time to dig into it yet.

But it seems that it really is better to align the signatures in the other PRs with the jf/tokens that are present in #717.

x0152 · 2026-03-27T15:49:40Z

Agreed, let's merge #717 in this release and postpone #537/#417 to the next one.

akup · 2026-04-01T11:11:29Z

@x0152 I've added a small PR that cleans up the mlnode. Take a look: #994

I think this one can be merged next.

patimen · 2026-04-29T00:26:48Z

@akup, @x0152 - Do we want to push on this one? You seemed ready to push it forward.

akup · 2026-04-29T06:12:25Z

I think that communication between devshard (or dapi) and mlnode should be certified, because there are always attacks on open ports, and this hurts adoption and discourages newcomers.
I think it could be scheduled for 0.2.13.

x0152 · 2026-04-29T09:00:59Z

Agreed - let's include this in v0.2.13

tcharchian · 2026-05-21T22:45:25Z

Hi @x0152, are you ready to include this PR in the next upgrade?

x0152 · 2026-06-10T11:25:52Z

#1329

security: optional signature verification for MLnode PoC endpoints

d8d6d33

x0152 mentioned this pull request Jan 9, 2026

Update docker-compose.mlnode.yml and docker-compose.yml #538

Closed

akup requested changes Jan 9, 2026

IgnatovFedor reviewed Jan 14, 2026

Comment thread mlnode/packages/pow/src/pow/service/auth.py Outdated

Comment thread mlnode/packages/pow/src/pow/service/routes.py Outdated

x0152 marked this pull request as draft January 14, 2026 15:00

tcharchian linked an issue Jan 15, 2026 that may be closed by this pull request

[P2] Security MerkleTree Proofs; Merge participant validation till block0; Need to add signature check at recording #330

Open

minor fixes

e0a0cf9

x0152 marked this pull request as ready for review January 15, 2026 15:55

tcharchian added this to Triage Feb 9, 2026

github-project-automation Bot moved this to New in Triage Feb 9, 2026

tcharchian moved this from New to Needs triage in Triage Feb 9, 2026

tcharchian requested a review from DimaOrekhovPS February 11, 2026 01:08

tcharchian removed this from Triage Feb 11, 2026

tcharchian added this to Upgrade v0.2.11 Feb 11, 2026

github-project-automation Bot moved this to Todo in Upgrade v0.2.11 Feb 11, 2026

tcharchian added this to the v0.2.11 milestone Feb 11, 2026

IgnatovFedor changed the base branch from main to upgrade-v0.2.11 February 23, 2026 16:40

IgnatovFedor modified the milestones: v0.2.11, v0.2.12 Feb 24, 2026

IgnatovFedor removed this from Upgrade v0.2.11 Feb 24, 2026

IgnatovFedor added this to Upgrade v0.2.12 Feb 24, 2026

github-project-automation Bot moved this to Todo in Upgrade v0.2.12 Feb 24, 2026

tcharchian moved this from Todo to Needs reviewer in Upgrade v0.2.12 Feb 28, 2026

tcharchian changed the title ~~security: optional signature verification for MLnode PoC endpoints~~ [P2] security: optional signature verification for MLnode PoC endpoints Mar 21, 2026

tcharchian added the Priority: Low label Mar 21, 2026

tcharchian assigned akup and x0152 Mar 21, 2026

tcharchian moved this from Needs reviewer to Waiting on the author in Upgrade v0.2.12 Mar 21, 2026

tcharchian removed the request for review from DimaOrekhovPS March 21, 2026 00:50

akup mentioned this pull request Mar 23, 2026

[P2] Jf/token auth v2 #717

Open

IgnatovFedor changed the base branch from upgrade-v0.2.11 to upgrade-v0.2.12 March 23, 2026 12:59

x0152 removed this from the v0.2.12 milestone Mar 27, 2026

akup mentioned this pull request Apr 1, 2026

[P2] Mlnode cleanup #994

Open

x0152 modified the milestone: v0.2.13 Apr 29, 2026

x0152 marked this pull request as draft April 29, 2026 09:16

tcharchian removed this from Upgrade v0.2.12 Apr 29, 2026

x0152 mentioned this pull request Jun 10, 2026

feat: optional mTLS between DAPI and ML nodes #1329

Open

x0152 closed this Jun 10, 2026

tcharchian modified the milestones: v0.2.14, v0.2.15 Jun 24, 2026
