Add --sriov-whitelist argument to allow selecting specific SR-IOV devices (instead of enabling all found devices) #1

Open
Trolldemorted wants to merge 1 commit into kernkonzept:master from Trolldemorted:sr-iov-whitelist
@Trolldemorted Trolldemorted commented Nov 2, 2025

As of now, CONFIG_L4IO_PCI_SRIOV enables the SR-IOV capability for every device found on the bus. This breaks the drivers of some devices and causes io to deadlock on others. To avoid such issues, this commit lets the user pass one or more --sriov-whitelist $vendor_id:$device_id options.

If no --sriov-whitelist options are specified, io continues to enable SR-IOV for all capable devices. This ensures backwards compatibility, but I would be very open to breaking backwards compatibility and only enabling SR-IOV for explicitly whitelisted devices.
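To illustrate the behaviour described above, here is a minimal sketch of how a `$vendor_id:$device_id` value could be parsed and matched. It is not the code from this commit: the names `Pci_id`, `parse_hex16`, `parse_pci_id` and `sriov_allowed` are hypothetical, the IDs are assumed to be hexadecimal (as printed by lspci), and the example IDs in the note below are only illustrative. It assumes C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical type: one parsed "--sriov-whitelist" entry.
struct Pci_id
{
  std::uint16_t vendor;
  std::uint16_t device;
};

// Parse one to four hex digits; reject empty, overlong, or non-hex input.
static std::optional<std::uint16_t> parse_hex16(std::string const &s)
{
  if (s.empty() || s.size() > 4)
    return std::nullopt;

  unsigned v = 0;
  for (char c : s)
    {
      unsigned d;
      if (c >= '0' && c <= '9')
        d = c - '0';
      else if (c >= 'a' && c <= 'f')
        d = c - 'a' + 10;
      else if (c >= 'A' && c <= 'F')
        d = c - 'A' + 10;
      else
        return std::nullopt;
      v = v * 16 + d;
    }
  return static_cast<std::uint16_t>(v);
}

// Parse "vendor:device" (assumed hex on both sides of the colon).
std::optional<Pci_id> parse_pci_id(std::string const &arg)
{
  auto colon = arg.find(':');
  if (colon == std::string::npos)
    return std::nullopt;

  auto vendor = parse_hex16(arg.substr(0, colon));
  auto device = parse_hex16(arg.substr(colon + 1));
  if (!vendor || !device)
    return std::nullopt;

  return Pci_id{*vendor, *device};
}

// Policy as described in this PR description: an empty whitelist keeps the
// old behaviour (SR-IOV for every capable device); otherwise only listed
// vendor/device pairs qualify.
bool sriov_allowed(std::vector<Pci_id> const &whitelist,
                   std::uint16_t vendor, std::uint16_t device)
{
  if (whitelist.empty())
    return true;

  for (auto const &e : whitelist)
    if (e.vendor == vendor && e.device == device)
      return true;

  return false;
}
```

With a whitelist containing only the illustrative pair `8086:1563`, `sriov_allowed` returns true for that pair and false for any other vendor/device combination, while an empty whitelist allows every device.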

Examples of broken devices

This integrated graphics controller deadlocks io
00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c) (prog-if 00 [VGA controller])
        Subsystem: Dell AlderLake-S GT1
        Flags: bus master, fast devsel, latency 0, IRQ 171, IOMMU group 2
        Memory at 6000000000 (64-bit, non-prefetchable) [size=16M]
        Memory at 4000000000 (64-bit, prefetchable) [size=256M]
        I/O ports at 3000 [size=64]
        Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
        Capabilities: [40] Vendor Specific Information: Len=0c <?>
        Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
        Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit-
        Capabilities: [d0] Power Management version 2
        Capabilities: [100] Process Address Space ID (PASID)
        Capabilities: [200] Address Translation Service (ATS)
        Capabilities: [300] Page Request Interface (PRI)
        Capabilities: [320] Single Root I/O Virtualization (SR-IOV)
        Kernel driver in use: i915
        Kernel modules: i915
This NIC cannot get its interface up
00:01.0 Ethernet controller: Intel Corporation Ethernet Controller X550 (rev 01)
        Subsystem: Super Micro Computer Inc Ethernet Controller X550
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 10
        Region 0: Memory at 80000000 (64-bit, prefetchable) [size=4M]
        Region 4: Memory at 80400000 (64-bit, prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
                Address: 0000000000000000  Data: 0000
                Masking: 00000000  Pending: 00000000
        Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00002000
        Capabilities: [a0] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W
                DevCtl: CorrErr- NonFatalErr- FatalErr+ UnsupReq-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM not supported
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x4
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [140 v1] Device Serial Number 00-00-c9-ff-ff-00-00-00
        Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration- 10BitTagReq- Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy- 10BitTagReq-
                IOVSta: Migration-
                Initial VFs: 64, Total VFs: 64, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 128, stride: 2, Device ID: 1565
                Supported Page Size: 00000553, System Page Size: 00000001
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [1a0 v1] Transaction Processing Hints
                No steering table available
        Capabilities: [1b0 v0] Extended Capability ID 0xfe
        Kernel driver in use: ixgbe
        Kernel modules: ixgbe

@Trolldemorted

Author

Do you have a config for an autoformatting tool like clang-tidy which ensures I don't violate the coding style guide?

@Trolldemorted Trolldemorted force-pushed the sr-iov-whitelist branch 2 times, most recently from 9bbc0ec to 89e2270 Compare November 2, 2025 15:49
@admlck

admlck commented Nov 2, 2025

Contributor

I've put at least a clang-format file online: https://l4re.org/devel/clang-format.html

@admlck

admlck commented Nov 6, 2025

Contributor

Thanks Benedikt, this is a tough one. Seems we definitely have to enable SR-IOV on request only. We were just wondering whether via cmdline is the way to go (like you did) or more in the config, or both.
For sure we must not enable it per default as it has some interesting effects.

Thanks for figuring this out!

@Trolldemorted

Author

> I've put at least a clang-format file online: https://l4re.org/devel/clang-format.html

I have tested your clang-format file, but unfortunately the diff caused by that is huge.

I am a big fan of automated formatting (to prevent discussions about style preferences on individual code locations) and strict formatting checks in CI pipelines (to not waste a human's time on checking whether something is up to standard), so if you establish that config file as the L4Re standard I will gladly use it.

> Thanks Benedikt, this is a tough one. Seems we definitely have to enable SR-IOV on request only. We were just wondering whether via cmdline is the way to go (like you did) or more in the config, or both.
> For sure we must not enable it per default as it has some interesting effects.

It is indeed just a small band-aid hotfix to enable the usage of the SR-IOV feature on stubborn hardware configurations. In the long-term future it would be great if io could receive SR-IOV activation requests at runtime (if that's possible), which would also solve the negative effects of the blanket activation.

@admlck

admlck commented Nov 10, 2025

Contributor

> > I've put at least a clang-format file online: https://l4re.org/devel/clang-format.html
>
> I have tested your clang-format file, but unfortunately the diff caused by that is huge.
>
> I am a big fan of automated formatting (to prevent discussions about style preferences on individual code locations) and strict formatting checks in CI pipelines (to not waste a human's time on checking whether something is up to standard), so if you establish that config file as the L4Re standard I will gladly use it.

This clang-format file implements the L4Re style as much as it can.
Since L4Re as a whole is not only code from "us", it is rather hard to apply the style automatically.

> > Thanks Benedikt, this is a tough one. Seems we definitely have to enable SR-IOV on request only. We were just wondering whether via cmdline is the way to go (like you did) or more in the config, or both.
> > For sure we must not enable it per default as it has some interesting effects.
>
> It is indeed just a small band-aid hotfix to enable the usage of the SR-IOV feature on stubborn hardware configurations. In the long-term future it would be great if io could receive SR-IOV activation requests at runtime (if that's possible), which would also solve the negative effects of the blanket activation.

As a stop-gap, your proposed way is good. Please call it "--enable-sriov" and only enable devices when specified, i.e., no auto-enablement if no such option is given.

Thanks.

As of now, CONFIG_L4IO_PCI_SRIOV enables the SR-IOV capability for
every device found on the bus. This breaks drivers of certain
devices and causes io to deadlock when it encounters certain
devices. To avoid such issues, this commit requires the user to
specify a "--enable-sriov $vendor_id:$device_id" argument.

Change-Id: Ie80bcb87155a3435531e3c5f82b448bfafe4f655
Signed-off-by: Benedikt Radtke <benediktradtke@gmail.com>
@Trolldemorted

Author

> This clang-format file implements the L4Re style as much as it can.

If the mountain will not go to Mahomet, let Mahomet go to the mountain :P

I had never heard of clang-format's Git integration before, but it sounds super useful and I will check it out if I make another non-trivial contribution.

> Since L4Re as a whole is not only code from "us", it is rather hard to apply the style automatically.

Fair point, automatic checks/conversion should exclude third party code.

> As a stop-gap, your proposed way is good. Please call it "--enable-sriov" and only enable devices when specified, i.e., no auto-enablement if no such option is given.

Done, and I have amended my commit message to reflect the changes!
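For comparison with the earlier whitelist behaviour, the opt-in rule requested in the review can be sketched as follows. This is only an illustration under assumptions, not the code from the amended commit: `Sriov_id` and `sriov_enabled_for` are hypothetical names, and the list is assumed to hold the already parsed `--enable-sriov $vendor_id:$device_id` values.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical type: one parsed "--enable-sriov" entry.
struct Sriov_id
{
  std::uint16_t vendor;
  std::uint16_t device;
};

// Opt-in policy: SR-IOV is enabled only for devices that were named
// explicitly. An empty list therefore enables nothing.
bool sriov_enabled_for(std::vector<Sriov_id> const &enabled,
                       std::uint16_t vendor, std::uint16_t device)
{
  for (auto const &e : enabled)
    if (e.vendor == vendor && e.device == device)
      return true;

  return false;
}
```

The only difference from a whitelist with a fallback is the empty case: without any `--enable-sriov` option, no device matches and SR-IOV stays off everywhere.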

@Trolldemorted

Author

Should I also open a PR for your documentation at https://github.com/L4Re/l4re.org, or should this wait until a final design has emerged? Is there a public repo for the content behind https://l4re.org/doc/io.html?
