= MemTest-86 v3.4 =
= 3 Sep, 2007 =
= Chris Brady =
====================
Table of Contents
=================
1) Introduction
2) Licensing
3) Installation
4) Serial Port Console
5) Online Commands
6) Memory Sizing
7) Error Display
8) Trouble-shooting Memory Errors
9) Execution Time
10) Memory Testing Philosophy
11) Memtest86 Test Algorithms
12) Individual Test Descriptions
13) Problem Reporting - Contact Information
14) Known Problems
15) Planned Features List
16) Change Log
17) Acknowledgments
1) Introduction
===============
Memtest86 is a thorough, stand-alone memory test for Intel/AMD x86 architecture
systems. BIOS-based memory tests are only a quick check and often miss
failures that are detected by Memtest86.
For updates go to the Memtest86 web page:
http://www.memtest86.com
2) Licensing
============
Memtest86 is released under the terms of the GNU General Public License (GPL).
Other than the provisions of the GPL there are no restrictions for use, private
or commercial. See: http://www.gnu.org/licenses/gpl.html for details.
3) Linux Installation
=====================
Memtest86 is a stand-alone program and can be loaded from either a disk
partition or from a floppy disk.
To build Memtest86:
1) Review the Makefile and adjust options as needed.
2) Type "make"
This creates a file named "memtest.bin", which is a bootable image. This
image file may be copied to a floppy disk or loaded from a hard disk
partition via Lilo or Grub.
To create a Memtest86 boot disk:
1) Insert a blank, write-enabled floppy disk.
2) As root, type "make install"
To boot from a disk partition via Grub:
1) Copy the image file to a permanent location (e.g. /boot/memtest.bin).
2) Add an entry in the Grub config file (/boot/grub/menu.lst) to boot
memtest86. Only the title and kernel fields need to be specified.
The following is a sample Grub entry for booting memtest86:
    title Memtest86
    kernel (hd0,0)/memtest.bin
To boot from a disk partition via Lilo:
1) Copy the image file to a permanent location (e.g. /boot/memtest.bin).
2) Add an entry in the Lilo config file (usually /etc/lilo.conf) to boot
memtest86. Only the image and label fields need to be specified.
The following is a sample Lilo entry for booting memtest86:
    image = /boot/memtest.bin
    label = memtest86
3) As root, type "lilo"
If you encounter build problems, a binary image has been included (precomp.bin).
To create a boot disk with this pre-built image do the following:
1) Insert a blank, write-enabled floppy disk.
2) Type "make install-precomp"
4) Serial Console
=================
Memtest86 can be used on PCs equipped with a serial port for the console.
By default serial port console support is not enabled since it slows
down testing. To enable it, change the SERIAL_CONSOLE_DEFAULT define in
config.h from a zero to a one. The serial console baud rate may also
be set in config.h with the SERIAL_BAUD_RATE define. The other serial
port settings are no parity, 8 data bits, 1 stop bit. All of the features
used by memtest86 are accessible via the serial console. However, the
screen is sometimes garbled when the online commands are used.
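
As a sketch of the change described above, the relevant lines in config.h
might look like the following once serial console support is turned on. The
define names come from the text; the baud rate of 9600 is only an assumed
example value, so check the value in your own copy of config.h.

```c
/* config.h (excerpt, illustrative values only) */
#define SERIAL_CONSOLE_DEFAULT 1     /* 0 = disabled (default), 1 = enabled */
#define SERIAL_BAUD_RATE       9600  /* assumed example value */
```

Since these are compile-time defines, the image has to be rebuilt with "make"
before the new settings take effect.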
5) Online Commands
==================
Memtest86 has a limited number of online commands. Online commands
provide control over caching, test selection, address range and error
scrolling. A help bar is displayed at the bottom of the screen listing
the available online commands.
Command   Description

ESC       Exits the test and does a warm restart via the BIOS.

c         Enters the test configuration menu.
          Menu options are:
              1) Cache mode
              2) Test selection
              3) Address Range
              4) Memory Sizing
              5) Error Summary
              6) Error Report Mode
              7) ECC Mode
              8) Restart
              9) Adv. Options

SP        Set scroll lock (stops scrolling of error messages).
          Note: Testing is stalled when the scroll lock is
          set and the scroll region is full.

CR        Clear scroll lock (enables error message scrolling).
6) Memory Sizing
================
The BIOS in modern PCs will often reserve several sections of memory for
its own use and also to communicate information to the operating system
(e.g. ACPI tables). It is just as important to test these reserved memory
blocks as it is to test the remainder of memory. For proper operation all of
memory needs to function properly regardless of what the eventual use is.
For this reason Memtest86 has been designed to test as much memory as
possible.
However, safely and reliably detecting all of the available memory has been
problematic. Versions of Memtest86 prior to v2.9 would probe to find where
memory is. This works for the vast majority of motherboards but is not 100%
reliable. Sometimes the memory size is incorrect and, worse, probing the wrong
places can in some cases cause the test to hang or crash.
Starting in version 2.9 alternative methods are available for determining the
memory size. By default the test attempts to get the memory size from the
BIOS using the "e820" method. With "e820" the BIOS provides a table of memory
segments and identifies what they will be used for. By default Memtest86
will test all of the RAM marked as available and also the area reserved for
the ACPI tables. This is safe since the test does not use the ACPI tables
and the "e820" specifications state that this memory may be reused after the
tables have been copied. Although this is a safe default, some memory will
not be tested.
Two additional options are available through online configuration options.
The first option (BIOS-All) also uses the "e820" method to obtain a memory
map. However, when this option is selected all of the reserved memory
segments are tested, regardless of what their intended use is. The only
exception is memory segments that begin above 3 GB. Testing has shown that
these segments are typically not safe to test. The BIOS-All option is more
thorough but could be unstable with some motherboards.
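
The selection rules for the default and BIOS-All modes can be sketched in C
as follows. This is only an illustration and not the code Memtest86 itself
uses: the struct layout, the helper name segment_is_tested and the constant
LIMIT_3GB are hypothetical. The numeric region types are the standard e820
values (1 = available RAM, 2 = reserved, 3 = ACPI reclaimable, 4 = ACPI NVS).

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard e820 region types. */
enum { E820_RAM = 1, E820_RESERVED = 2, E820_ACPI = 3, E820_NVS = 4 };

/* One entry of the BIOS-provided memory map (hypothetical layout). */
struct e820_entry {
    uint64_t addr;   /* start address of the segment */
    uint64_t size;   /* length of the segment in bytes */
    uint32_t type;   /* one of the E820_* values above */
};

#define LIMIT_3GB 0xC0000000ULL

/* Hypothetical helper: would this segment be tested?
 * bios_all == false models the default mode, true models BIOS-All. */
static bool segment_is_tested(const struct e820_entry *e, bool bios_all)
{
    if (e->type == E820_RAM || e->type == E820_ACPI)
        return true;               /* available RAM and the ACPI table area */
    if (!bios_all)
        return false;              /* default mode skips other reserved areas */
    return e->addr <= LIMIT_3GB;   /* BIOS-All skips segments that begin above 3 GB */
}
```

With this sketch, a reserved segment at 0xF0000 is skipped in the default
mode but tested in BIOS-All mode, while a reserved segment starting at
0xFEC00000 is skipped in both.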
The second option for memory sizing is the traditional "Probe" method.
This is a very thorough but not entirely safe method. In the majority of
cases the BIOS-All and Probe methods will return the same memory map.
For older BIOSes that do not support the "e820" method there are two
additional methods (e801 and e88) for getting the memory size from the
BIOS. These methods only provide the amount of extended memory that is
available, not a memory table. When the e801 and e88 methods are used
the BIOS-All option will not be available.
The MemMap field on the display shows what memory size method is in use.
Also the RsvdMem field shows how much memory is reserved and is not being
tested.
7) Error Information
======================
Memtest86 has three options for reporting errors. The default is an error
summary that displays the most relevant error information. The second option
is reporting of individual errors. The third option is BadRAM Patterns mode,
in which patterns are created for use with the Linux BadRAM feature. This
slick feature allows Linux to avoid bad memory pages. Details about the
BadRAM feature can be found at:
http://home.zonnet.nl/vanrein/badram
The error summary mode displays the following information:
Error Confidence Value:
A value that indicates the validity of the errors being reported with
larger values indicating greater validity. There is a high probability
that all errors reported are valid regardless of this value. However,
when this value exceeds 100 it is nearly impossible that the reported
errors will be invalid.
Lowest Error Address:
The lowest address where an error has been reported.
Highest Error Address:
The highest address where an error has been reported.
Bits in Error Mask:
A mask of all bits that have been in error (hexadecimal).
Bits in Error:
Total bits in error for all error instances and the min, max and average
bits in error of each individual occurrence.
Max Contiguous Errors:
The maximum number of contiguous addresses with errors.
ECC Correctable Errors:
The number of errors that have been corrected by ECC hardware.
Errors per DIMM slot:
Error counts are reported for each memory module installed in the system.
Use the "Show DMI Memory Info" runtime option for detailed memory module
information.
Test Errors:
On the right hand side of the screen the number of errors for each test
is displayed.
For individual errors the following information is displayed when a memory
error is detected. An error message is only displayed for errors with a
different address or failing bit pattern. All displayed values are in
hexadecimal.
Tst:              Test number
Failing Address:  Failing memory address
Good:             Expected data pattern
Bad:              Failing data pattern
Err-Bits:         Exclusive or of good and bad data (this shows the
                  position of the failing bit(s))
Count:            Number of consecutive errors with the same address
                  and failing bits
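
As a small illustration of the Err-Bits field, the following C sketch
computes the exclusive or of the expected and failing data and counts how
many bits differ. The function names err_bits and count_bits are
hypothetical and chosen only for this example; a 32-bit data word is
assumed.

```c
#include <stdint.h>

/* XOR of expected and actual data: each set bit marks a failing bit position. */
static uint32_t err_bits(uint32_t good, uint32_t bad)
{
    return good ^ bad;
}

/* Count the set bits in a 32-bit word, i.e. the number of failing bits. */
static int count_bits(uint32_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

For example, with Good = ffffffff and Bad = ffffffef the Err-Bits value is
00000010, which means that a single bit (bit 4) failed.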
In BadRAM Patterns mode, lines are printed in the form badram=F1,M1,F2,M2.
In each F/M pair, the F represents a fault address, and the corresponding M
is a bitmask for that address. These patterns state that faults have
occurred in addresses that equal F on all "1" bits in M. Such a pattern may
capture more errors than actually exist, but at least all the errors are
captured. These patterns have been designed to capture regular patterns of
errors caused by the hardware structure in a terse syntax.
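
The matching rule for a single F/M pair can be written as a one-line check.
The sketch below is only an illustration of the rule as stated above; the
function name badram_matches and the use of 64-bit addresses are
assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* An address is covered by the pair (f, m) when it agrees with f
 * on every bit position where m has a 1. Bits where m is 0 are ignored. */
static bool badram_matches(uint64_t addr, uint64_t f, uint64_t m)
{
    return ((addr ^ f) & m) == 0;
}
```

For example, the pair F = 0x1234a000, M = 0xfffff000 covers every address
from 0x1234a000 to 0x1234afff. This also shows why a pattern can capture
more addresses than actually failed: each 0 bit in M doubles the number of
addresses that match.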
The BadRAM patterns are "grown" incrementally rather than "designed" from an
overview of all errors. The number of pairs is constrained to five for a
number of practical reasons. As a result, handcrafting patterns from the
output in address printing mode may, in exceptional cases, yield better
results.
9) Execution Time
==================
The time required for a complete pass of Memtest86 will vary greatly
depending on CPU speed, memory speed and memory size. Memtest86 executes
indefinitely. The pass counter increments each time that all of the
selected tests have been run. Generally a single pass is sufficient to
catch all but the most obscure errors. However, for complete confidence
when intermittent errors are suspected, testing for a longer period is advised.
10) Memory Testing Philosophy
=============================
There are many good approaches for testing memory. However, many tests
simply throw some patterns at memory without much thought or knowledge
of memory architecture or how errors can best be detected. This
works fine for hard memory failures but does little to find intermittent
errors. BIOS-based memory tests are useless for finding intermittent
memory errors.
Memory chips consist of a large array of tightly packed memory cells,
one for each bit of data. The vast majority of the intermittent failures
are a result of interaction between these memory cells. Often writing a
memory cell can cause one of the adjacent cells to be written with the
same data. An effective memory test attempts to test for this
condition. Therefore, an ideal strategy for testing memory would be
the following:
1) write a cell with a zero
2) write all of the adjacent cells with a one, one or more times
3) check that the first cell still has a zero
It should be obvious that this strategy requires an exact knowledge
of how the memory cells are laid out on the chip. In addition there is a
never-ending number of possible chip layouts for different chip types
and manufacturers, making this strategy impractical. However, there
are testing algorithms that can approximate this ideal strategy.
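
To make the three steps of the ideal strategy concrete, here is a minimal C
sketch that applies them to a small simulated grid of cells. It is purely
illustrative: the grid dimensions, the function name cell_holds_zero and the
assumption that "adjacent" means the four cells directly above, below, left
and right are all hypothetical. A real test has no such view of the physical
chip layout, which is exactly the difficulty described above.

```c
#include <stdbool.h>

#define ROWS 4
#define COLS 4

/* Apply the ideal strategy to cell (r, c) of a simulated cell array.
 * Returns true if the cell still holds zero after its neighbours
 * have been written with one "repeats" times. */
static bool cell_holds_zero(unsigned char cells[ROWS][COLS],
                            int r, int c, int repeats)
{
    static const int dr[] = { -1, 1, 0, 0 };
    static const int dc[] = { 0, 0, -1, 1 };

    cells[r][c] = 0;                          /* 1) write the cell with a zero */

    for (int n = 0; n < repeats; n++) {       /* 2) write the adjacent cells */
        for (int i = 0; i < 4; i++) {         /*    with a one, repeatedly   */
            int nr = r + dr[i];
            int nc = c + dc[i];
            if (nr >= 0 && nr < ROWS && nc >= 0 && nc < COLS)
                cells[nr][nc] = 1;
        }
    }

    return cells[r][c] == 0;                  /* 3) check the cell is still zero */
}
```

On an ordinary array in working memory this check always succeeds. The point
of the sketch is only to show the order of the writes and the final read.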
16) Change Log
==============
Enhancements in v2.3
A progress meter was added to replace the spinner and dots.
Measurement and reporting of memory and cache performance
was added.
Support for creating BadRAM patterns was added.
All of the test routines were rewritten in assembler to
improve both test performance and speed.
The screen layout was reworked to hopefully be more readable.
An error summary option was added to the online commands.
Enhancements in v2.2
Added two new address tests
Added an on-line command for setting test address range
Optimized test code for faster execution (-O3, -funroll-loops and
-fomit-frame-pointer)
Added an elapsed time counter.
Adjusted menu options for better consistency
Enhancements in v2.1
Fixed a bug in the CPU detection that caused the test to
hang or crash with some 486 and Cyrix CPUs
Added CPU detection for Cyrix CPUs
Extended and improved CPU detection for Intel and AMD CPUs
Added a compile time option (BIOS_MEMSZ) for obtaining the last
memory address from the BIOS. This should fix problems with memory
sizing on certain motherboards. This option is not enabled by default.
It may be enabled by default in a future release.
Enhancements in v2.0
Added new Modulo-20 test algorithm.
Added a 32 bit shifting pattern to the moving inversions algorithm.
Created test sections to specify algorithm, pattern and caching.
Improved test progress indicators.
Created popup menus for configuration.
Added menu for test selection.
Added CPU and cache identification.
Added a "bail out" feature to quit the current test when it does not
fit the test selection parameters.
Re-arranged the screen layout and colors.
Created local include files for I/O and serial interface definitions
rather than using the sometimes incompatible system include files.
Broke up the "C" source code into four separate source modules.
Enhancements in v1.5
Some additional changes were made to fix obscure memory sizing
problems.
The 4 bit wide data pattern was increased to 8 bits since 8 bit
wide memory chips are becoming more common.
A new test algorithm was added to improve detection of data
pattern sensitive errors.
Enhancements in v1.4
Changes to the memory sizing code to avoid problems with some
motherboards where memtest would find more memory than actually
exists.
Added support for a console serial port. (thanks to Doug Sisk)
On-line commands are now available for configuring Memtest86 on
the fly (see On-line Commands).
Enhancements in v1.3
Scrolling of memory errors is now provided. Previously, only one screen
of error information was displayed.
Memtest86 can now be booted from any disk via lilo.
Testing of up to 4 GB of memory has been fixed and is now enabled by default.
This capability was clearly broken in v1.2a and should work correctly
now but has not been fully tested (4 GB PCs are a bit rare).
The maximum memory size supported by the motherboard is now being
calculated correctly. In previous versions there were cases where not
all of memory would be tested and the maximum memory size supported
was incorrect.
For some types of failures the good and bad values were reported to be
the same with an Xor value of 0. This has been fixed by retaining the data
read from memory and not re-reading the bad data in the error reporting
routine.
APM (advanced power management) is now disabled by Memtest86. This
keeps the screen from blanking while the test is running.
Problems with enabling & disabling cache on some motherboards have been
corrected.
17) Acknowledgments
===================
Memtest86 was developed by Chris Brady with the resources and assistance
listed below:
- The initial versions of the source files bootsect.S, setup.S, head.S and
build.c are from the Linux 1.2.1 kernel and have been heavily modified.
- Doug Sisk provided code to support a console connected via a serial port.
- Code to create BadRAM patterns was provided by Rick van Rein.
- Tests 5 and 8 are based on Robert Redelmeier's burnBX test.
- Screen buffer code was provided by Jani Averbach.
- Eric Biederman provided all of the feature content for version 3.0
plus many bugfixes and significant code cleanup.
- Major enhancements to hardware detection and reporting in version 3.2,
3.3 and 3.4 provided by Samuel Demeulemeester (from Memtest86+ v1.11, v1.60
and v1.70).