Main Page

From EdacWiki
Revision as of 16:16, 4 March 2006 by TimSmall

EDAC Wiki

This is a wiki for the Linux EDAC project.

What is it?

EDAC stands for "Error Detection and Correction". The Linux EDAC project comprises a set of Linux kernel modules that make use of the error detection facilities of computer hardware. Hardware that detects the following errors is currently supported:

  • System RAM errors (this is the original, and most mature, part of the project). Many computers support RAM EDAC, especially those with chipsets aimed at high-reliability applications, but RAM with extra storage capacity ("ECC RAM") is needed for these facilities to operate.
  • PCI bus transfer errors. The majority of PCI bridges and peripherals support such error detection.
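
To illustrate why ECC RAM needs extra storage capacity, here is a toy sketch of a Hamming(7,4) code, which stores 3 check bits alongside every 4 data bits so that any single flipped bit can be located and corrected. This is only an illustration of the principle: real ECC memory commonly stores 72 bits for each 64 bits of data and uses a stronger code, and the function names below are made up for this example.

```python
def hamming74_encode(nibble):
    """Encode a 4-bit value into a 7-bit codeword (list of bits, positions 1..7)."""
    data = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0 is unused so indices match positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]   # check bit covering positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # check bit covering positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # check bit covering positions 4,5,6,7
    return code[1:]


def hamming74_syndrome(word):
    """Return 0 for a valid codeword, else the 1-based position of a single flipped bit."""
    syndrome = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= position
    return syndrome


def hamming74_correct(word):
    """Correct at most one flipped bit; return (data nibble, corrected position or 0)."""
    word = list(word)
    position = hamming74_syndrome(word)
    if position:
        word[position - 1] ^= 1         # flip the faulty bit back
    nibble = word[2] | (word[4] << 1) | (word[5] << 2) | (word[6] << 3)
    return nibble, position
```

Flipping any one bit of `hamming74_encode(n)` and passing the result to `hamming74_correct` recovers `n` together with the position of the flipped bit. Without the 3 extra bits there would be nothing to check the data against, which is why ordinary non-ECC RAM cannot support these facilities.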

Status

The EDAC code is expected to be included in Linux kernel version 2.6.16.

History

The EDAC project was renamed from "bluesmoke" prior to submission to the mainline Linux kernel. The bluesmoke code was created by Thayne Harbaugh. The Linux-ECC project was EDAC's predecessor and its major inspiration. Developed by Dan Hollis and others, the Linux-ECC project is no longer maintained.

Supported Hardware

System Main Memory EDAC

Supported Memory Controllers

| Manufacturer | Model    | EDAC Driver | Chipset Documentation | Controller Capabilities             | Status                   |
|--------------|----------|-------------|-----------------------|-------------------------------------|--------------------------|
| AMD          | Opteron  | k8          |                       | EDAC, Error Scrub, Background Scrub | Supported (Linux 2.6.16) |
| AMD          | Athlon64 | k8          |                       | EDAC, Error Scrub, Background Scrub | Supported (Linux 2.6.16) |
| AMD          | AthlonFX | k8          |                       | EDAC, Error Scrub, Background Scrub | Supported (Linux 2.6.16) |
| AMD          | 760      |             |                       |                                     | Supported (Linux 2.6.16) |
| AMD          | 762      |             |                       |                                     | Supported (Linux 2.6.16) |
| AMD          | 768      |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7500    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7501    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7505    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7520    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7525    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | 82875p   |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | e7210    |             |                       |                                     | Supported (Linux 2.6.16) |
| Intel        | 82860    |             |                       |                                     | Supported (Linux 2.6.16) |
| Radisys      | 82600    | r82600      |                       | EDAC, Error Scrub                   | Supported (Linux 2.6.16) |
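
Once a driver for one of these controllers is loaded, the EDAC core reports error counts through sysfs. The following is a minimal sketch of reading them, assuming the counters appear as `ce_count` (corrected errors) and `ue_count` (uncorrected errors) under `/sys/devices/system/edac/mc/mcN/`. The exact layout may differ between kernel versions, so the root directory is a parameter, and the function name is made up for this example.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {"mc0": {"ce_count": int or None, "ue_count": int or None}, ...}.

    The default path is an assumption about the sysfs layout; an empty dict
    is returned if the directory does not exist (e.g. no EDAC driver loaded).
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for controller in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((controller / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None      # counter missing or unreadable
        counts[controller.name] = entry
    return counts
```

Calling `read_edac_counts()` on a machine without a loaded EDAC driver simply returns an empty dictionary. A steadily rising `ce_count` is usually the first sign that a memory module deserves attention.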

PCI Error Reporting

Parity error reporting facilities are part of the PCI specification, and the majority of add-in cards (and chips that can be used in either add-in or on-motherboard designs) support PCI parity error detection and reporting. "Fake" PCI devices that are not physically connected by a PCI bus (for example, some ATA host adaptors built into a motherboard chipset) typically do not include this functionality.
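
As an illustration of what this reporting looks like at the register level, the sketch below decodes the error bits of the 16-bit PCI status register, which sits at offset 0x06 of a device's configuration space. The bit masks follow the standard PCI status register layout; the function name and the idea of passing in raw configuration bytes are choices made for this example, not the way the EDAC code itself is structured.

```python
PCI_STATUS_OFFSET = 0x06

# Error-related bits of the PCI status register.
PCI_STATUS_ERROR_BITS = {
    0x0100: "master data parity error",
    0x0800: "signaled target abort",
    0x1000: "received target abort",
    0x2000: "received master abort",
    0x4000: "signaled system error",
    0x8000: "detected parity error",
}


def decode_pci_status(config_space):
    """Return the names of the error bits set in a device's status register.

    config_space is the raw configuration space as bytes (at least 8 bytes).
    """
    if len(config_space) < PCI_STATUS_OFFSET + 2:
        raise ValueError("configuration space too short to hold the status register")
    raw = config_space[PCI_STATUS_OFFSET:PCI_STATUS_OFFSET + 2]
    status = int.from_bytes(raw, "little")   # PCI registers are little-endian
    return [name for mask, name in sorted(PCI_STATUS_ERROR_BITS.items())
            if status & mask]
```

On Linux the raw bytes can typically be read from a device's `config` file under `/sys/bus/pci/devices/`. A device that does not implement parity checking will never set the parity bits, which is the behaviour described above for chipset-internal devices.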

Error detection overhead

Polling the error status registers of all PCI devices can be time-consuming, especially on machines with many devices. You may wish to slow the error polling rate, or disable polling altogether, on such systems.
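
A rough way to reason about this trade-off is to estimate the fraction of time spent polling as (number of devices × cost of one status read) ÷ polling interval. The sketch below does that arithmetic. The function and parameter names are hypothetical, and the per-read cost has to be measured on the machine in question; no figure is assumed here.

```python
def polling_overhead(n_devices, read_cost_us, poll_interval_ms):
    """Estimate the fraction of time spent polling PCI status registers.

    n_devices        -- number of PCI devices polled per pass
    read_cost_us     -- measured cost of one status read, in microseconds
    poll_interval_ms -- time between polling passes, in milliseconds
    """
    if poll_interval_ms <= 0:
        raise ValueError("poll_interval_ms must be positive")
    busy_us = n_devices * read_cost_us          # time spent in one polling pass
    return busy_us / (poll_interval_ms * 1000.0)
```

Because the estimate is inversely proportional to the interval, doubling the polling interval halves the overhead, at the cost of errors being noticed later.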

Faulty Hardware

Some PCI devices (or particular revisions of those devices) have broken PCI parity detection and report false positives. You can check (and add to) the list of broken devices on the PCIDevicesWithBrokenParityDetection page.

Related Articles

An overview of EDAC technologies on Wikipedia [1]

The original Linux ECC project (Dan Hollis et al) - [2]

How to use this site

A wiki is a collaborative site where anyone can contribute and share:

  • Edit any page by pressing Edit at the top or the bottom of the page
  • Create a link to another page with joined capitalized words (like WikiSandBox) or with [[quoted words in brackets]]
  • Search for page titles or text within pages using the search box at the top of any page
  • See HelpForBeginners to get started, or HelpContents for all help pages.

To learn more about what a WikiWikiWeb is, read about MoinMoin:WhyWikiWorks and the MoinMoin:WikiNature. Also, consult the MoinMoin:WikiWikiWebFaq.

This wiki is powered by MediaWiki.