Pull requests: kubeflow/trainer
Pull requests list
feat: inject PET_* envs into init containers via envInjection config
do-not-merge/work-in-progress
size/XL
#3516
opened May 17, 2026 by
panpan0000
Contributor
Draft
feat: add validation for ReplicatedJobPatch name
size/M
#3515
opened May 15, 2026 by
pulkit-999
Contributor
fix(cache): use status code (not identity) to detect Service AlreadyExists
ok-to-test
size/M
#3507
opened May 12, 2026 by
1fanwang
chore(deps): bump arrow-flight from 55.2.0 to 58.3.0 in /pkg/data_cache
dependencies
Pull requests that update a dependency file
rust
Pull requests that update rust code
size/L
#3506
opened May 12, 2026 by
dependabot
Bot
chore(deps): bump arrow from 55.2.0 to 58.3.0 in /pkg/data_cache
dependencies
rust
size/L
#3505
opened May 12, 2026 by
dependabot
Bot
chore(deps): bump tonic from 0.12.3 to 0.14.6 in /pkg/data_cache
dependencies
rust
size/M
#3504
opened May 12, 2026 by
dependabot
Bot
chore(deps): bump the kubernetes group with 3 updates
dependencies
go
Pull requests that update Go code
size/M
#3494
opened May 12, 2026 by
dependabot
Bot
chore(deps): bump hickory-resolver from 0.24.4 to 0.26.1 in /pkg/data_cache
dependencies
rust
size/L
#3485
opened May 5, 2026 by
dependabot
Bot
fix(runtimes): propagate trainer environment variables to worker processes
size/L
#3454
opened Apr 25, 2026 by
AviralKaushal
chore(api): Remove duplicate TrainJob status patch
size/S
#3448
opened Apr 24, 2026 by
robert-bell
Contributor
1 task done
feat: Ship optional default Grafana dashboard via Helm
size/XL
#3445
opened Apr 21, 2026 by
sameerdattav
Contributor
feat(operator): add controller-level Prometheus metrics and ServiceMonitor
size/XL
#3433
opened Apr 16, 2026 by
1Ayush-Petwal
1 task
fix: apply clientConnection QPS/burst to the manager client
size/L
#3432
opened Apr 16, 2026 by
abhijeet-dhumal
Member
1 task done
feat(docs): KEP-2599: Decouple runtime lifecycle from TrainJobs to simplify updating runtimes
approved
do-not-merge/hold
lgtm
size/L
#3428
opened Apr 14, 2026 by
robert-bell
Contributor
chore(test): assert info state mutations and build errors in TestFlux- Fixes #3409
size/M
#3410
opened Apr 5, 2026 by
gojogourav
1 task
feat(operator): support multi-slice TPU by enabling trainer replicas > 1
size/M
#3408
opened Apr 3, 2026 by
krishdef7
Contributor
fix(statusserver): improve bearer token parsing and add helper tests
size/L
#3405
opened Mar 31, 2026 by
suchirkolli
feat: Add failure-aware debugging for Go E2E tests
size/M
#3394
opened Mar 27, 2026 by
Goku2099
Contributor
chore(api): rename CertManagement webhook fields to generic names
size/L
#3386
opened Mar 24, 2026 by
tariq-hasan
Member
1 task
fix(operator): Scope MPI ConfigMap and Secret watches to owned objects
size/M
#3377
opened Mar 23, 2026 by
beep-boopp
1 task
feat(test): Add integration tests for status server
size/L
#3373
opened Mar 22, 2026 by
digvijay-y
1 task
fix(runtimes): set MPI SSH auth secret volume default mode to 0640
size/L
#3368
opened Mar 19, 2026 by
harxhist
4 tasks done
chore(api): Use SSA to update TrainJob status
size/L
#3362
opened Mar 17, 2026 by
astefanutti
Contributor
1 task done
feat(operator): add generic EnforceRuntimeInfoPlugin interface
size/M
#3361
opened Mar 17, 2026 by
krishdef7
Contributor