Tags: andikleen/mcelog

v210

mcelog: Add new model numbers for Nova Lake

Desktop and mobile variants for Nova Lake

Signed-off-by: Tony Luck <tony.luck@intel.com>

v209

mcelog: Improve cache-error-trigger script

Two issues:

1) The script would attempt to take all CPUs offline for an L3 cache
error on a single socket system.
2) Many users don't want any CPUs taken offline because of the reduced
system performance.

Make the default to just log the affected CPUs. But make it simple to
enable offline for users that still want that.

If offline is enabled, sanity check AFFECTED_CPUS does not refer to
all online CPUs.

Signed-off-by: Tony Luck <tony.luck@intel.com>
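
The sanity check described in v209 can be illustrated with a short sketch. This is not the trigger script itself (which is a shell script); it is a Python illustration that assumes AFFECTED_CPUS arrives as a list of CPU numbers and that the online set uses the kernel's range-list format found in /sys/devices/system/cpu/online (for example "0-3,8"). The function names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a kernel-style CPU range list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def covers_all_online(affected, online):
    """Return True if every online CPU is in `affected`.

    Offlining should be refused in that case, because it would leave
    no CPU running (the single-socket L3 error case from the commit).
    """
    return set(online) <= set(affected)


def read_online_cpus(path="/sys/devices/system/cpu/online"):
    """Read the set of online CPUs from sysfs (Linux only)."""
    with open(path) as f:
        return parse_cpu_list(f.read())
```

A caller would log the affected CPUs in every case and attempt the offline only when offlining is enabled and `covers_all_online(...)` returns False.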

v208

mcelog: New model number for Wildcat Lake

New client CPU.

Signed-off-by: Tony Luck <tony.luck@intel.com>

v207

mcelog: Add model-specific decoding for Diamond Rapids

The model-specific decoding for Diamond Rapids differs a lot from
that of earlier generations. Add the new model-specific decoding for
Diamond Rapids.

Details of error codes published in chapter 17 of the September
2025 edition of the Intel(R) Architecture Instruction Set Extensions
Programming Reference.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

v206

mcelog: Don't print "arstate" when PCC=1

The "action required" state is only meaningful when PCC==0.

Signed-off-by: Tony Luck <tony.luck@intel.com>
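
As a rough illustration of the v206 rule, the sketch below gates the "action required" state on the PCC bit of an MCi_STATUS value. The bit positions (PCC = 57, S = 56, AR = 55) follow the Intel machine-check architecture; the function name and return shape are hypothetical and do not reflect mcelog's actual output format.

```python
MCI_STATUS_PCC = 1 << 57  # processor context corrupt
MCI_STATUS_S = 1 << 56    # signaled
MCI_STATUS_AR = 1 << 55   # action required


def ar_state(status):
    """Return (signaled, action_required), or None when PCC is set.

    With PCC=1 the S/AR bits carry no useful meaning, so a decoder
    should print nothing for them.
    """
    if status & MCI_STATUS_PCC:
        return None
    return (bool(status & MCI_STATUS_S), bool(status & MCI_STATUS_AR))
```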

v205

mcelog: Add a --binary option for reading records saved to pstore

The Linux kernel can be configured to save fatal error records to
persistent storage with the pstore file system. These are a raw
copy of "struct mce".

Add an option to skip the ioctl() calls that determine the record
size so that mcelog will decode a binary file given as argument.

Signed-off-by: Tony Luck <tony.luck@intel.com>
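
To show what decoding a raw file of "struct mce" records involves, here is a minimal sketch. It is not mcelog's implementation. It assumes the caller supplies the record size (the value mcelog normally obtains via ioctl()), that records are little-endian, and that each record begins with seven 64-bit fields in the order status, misc, addr, mcgstatus, ip, tsc, time. That field order is an assumption taken from the kernel's uapi header and should be checked against the kernel that wrote the file.

```python
import struct

# Assumed leading fields of struct mce; verify against your kernel headers.
LEADING_FIELDS = ("status", "misc", "addr", "mcgstatus", "ip", "tsc", "time")
LEADING_FORMAT = "<7Q"  # seven little-endian unsigned 64-bit values


def iter_records(data, record_size):
    """Yield a dict of the leading fields for each complete record in `data`.

    `record_size` is the full size of one record in bytes; any trailing
    partial record is ignored.
    """
    lead = struct.calcsize(LEADING_FORMAT)
    if record_size < lead:
        raise ValueError("record_size is smaller than the leading fields")
    for offset in range(0, len(data) - record_size + 1, record_size):
        values = struct.unpack_from(LEADING_FORMAT, data, offset)
        yield dict(zip(LEADING_FIELDS, values))
```

Passing the record size explicitly mirrors the reason the option exists: a plain file cannot answer the ioctl() queries that a character device can.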

v204

Enable offline retries by default

Signed-off-by: Andi Kleen <andi@firstfloor.org>

v203

Add ability to retry failed page offlines with an exponential backoff

A page which fails to get offlined may become offlinable in the future,
depending on memory usage patterns. Under the circumstances that the page
continues to experience CEs, retrying the page offlining operation would
make sense.

This patch adds memory-ce-offline-retry, a mcelog.conf knob to turn on or
off the ability to retry offlining a page that continues to cross the
CE threshold. However, each successive retry will have an exponentially
higher threshold so as not to overrun the system with retries.
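
The retry policy can be sketched as follows. This is an illustration only: the commit says the threshold grows exponentially but does not state the factor, so the doubling used here is an assumption, as are the class and method names. The sketch also uses a plain error count, which may be simpler than the thresholds mcelog actually applies.

```python
class OfflineRetry:
    """Track correctable errors (CEs) for one page and retry offlining."""

    def __init__(self, base_threshold, factor=2):
        self.base = base_threshold
        self.factor = factor      # assumed growth factor per failed attempt
        self.failures = 0         # number of failed offline attempts so far
        self.count = 0            # CEs seen since the last attempt

    def threshold(self):
        """CEs required before the next offline attempt."""
        return self.base * self.factor ** self.failures

    def record_ce(self, try_offline):
        """Count one CE; call try_offline() once the threshold is crossed.

        Returns True when the page was offlined. On failure the next
        threshold is raised, so retries become progressively rarer.
        """
        self.count += 1
        if self.count < self.threshold():
            return False
        self.count = 0
        if try_offline():
            return True
        self.failures += 1
        return False
```

With a base threshold of 2 and two failed attempts, offline attempts occur on the 2nd, 6th, and 14th CE, which shows how quickly the retries thin out.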

v202

mcelog: Wire up model-specific decoding for Clearwater Forest

The model-specific decoding for Clearwater Forest is the same as
Granite Rapids'. Wire up the model-specific decoding of Granite Rapids
for Clearwater Forest.

Tested-by: Yi Lai <yi1.lai@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>

v201

Merge pull request #124 from meow-watermelon/add_listen_backlog_opt

add listen backlog config for mcelog server