Early Unit End

From FaHWiki


What is an Early Unit End?

Early Unit End (EUE) messages appear in fahlog.txt when a WU reaches a final state from which processing (folding) cannot continue. The client then attempts to upload the partially completed WU and downloads another. Strictly speaking, the term EUE is a noun, although it is commonly used as a verb ("to EUE", "EUE'd").

Folding@home Core Shutdown: EARLY_UNIT_END
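To check a log for this message, a minimal Python sketch might look like the following. It assumes the log is plain text and that the shutdown line contains the literal text shown above. The function name `find_eue_lines` is hypothetical and not part of any F@h tool.

```python
def find_eue_lines(log_text):
    """Return (line_number, line) pairs for lines reporting an Early Unit End."""
    marker = "EARLY_UNIT_END"  # literal text from the core shutdown message
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if marker in line:
            hits.append((number, line.strip()))
    return hits


# Illustrative log text: the first line is a made-up placeholder,
# only the shutdown line matches the message quoted above.
sample = "\n".join([
    "(hypothetical earlier log line)",
    "Folding@home Core Shutdown: EARLY_UNIT_END",
])
print(find_eue_lines(sample))
```

In practice, `log_text` would be the contents of your own fahlog.txt, read with Python's built-in `open`.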

Why do WUs EUE?

WUs enter this state for many reasons: computational errors caused by CPU overclocking, poor system cooling, or memory errors, or problems with the WU itself. Frequent, random EUEs are likely caused by the computer, but a few classes of projects have higher EUE counts caused by the WU data. In these projects, EUEs are expected more often and are not cause for alarm. The WUs from these projects are not broken, and valuable data is still sent back to the F@h servers.

As a general rule, an occasional EUE is expected and is nothing to worry about.

It can be difficult to determine the cause of a specific EUE, although some EUE types are more commonly associated with one particular cause.

Machine Instability

Machine instability can be caused by overclocking, inadequate cooling, or poor-quality hardware. For an individual machine, an occasional EUE is expected. Machines that EUE with some regularity across various projects (known as WU trashing) are cause for concern; in this case, machine instability is most likely to blame. You should stop folding and perform hardware testing and verification.

Hardware testing should include traditional RAM tests and burn-in tests. Be sure to include the StressCPU program found in the 3rd party forum at folding-community.org.

Unstable WUs

Occasionally, a late-beta project (an -advmethods WU) is issued that will EUE regularly (e.g., P23xx during February 2006) due to inherent instability. These WUs are not broken; they simply process differently and finish before reaching the estimated 100% completion point. They are valuable for determining the assembly parameters of the folding process. Partial credit is awarded depending on how far processing got.

A very important part of protein simulation is the atomic motion related to the temperature of the protein. Thermal motions are described as random, but to run any simulation you must assume the sample starts with a particular set of random velocities. Sometimes the assumed velocities are inconsistent with reality and lead to an impossible simulation. Unfortunately, this is not known until the simulation reaches the impossible condition. Such a WU will issue an EUE message on any hardware. The server will reassign the WU to make sure the failure was not caused by hardware, and after a certain number of EUEs the server will discard the WU. The Pande Group tries to keep this percentage small, but a certain percentage of WUs will always do this, and it is nothing to worry about.

If your EUEs all come from the same project(s) and you find multiple similar reports on folding-community.org, the Pande Group may have been unable to keep the percentage low. This is much more common during beta testing, so if you are unsure of your hardware, try removing the -advmethods setting to avoid downloading late-stage beta work units.
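The rule of thumb described so far (EUEs spread across various projects point to the machine, EUEs confined to one project point to the WUs) can be sketched roughly in Python. This is only an illustration: the function name `likely_cause`, the `min_events` threshold, and the project numbers in the example are all assumptions, and a real diagnosis still requires hardware testing and checking the forum.

```python
def likely_cause(eue_projects, min_events=3):
    """Guess the likely source of EUEs from the project number of each EUE.

    eue_projects: one project number per observed EUE.
    min_events: assumed minimum number of EUEs before drawing any conclusion,
                since an occasional EUE is expected and harmless.
    """
    if len(eue_projects) < min_events:
        return "inconclusive"
    if len(set(eue_projects)) == 1:
        # Every EUE came from the same project: the WUs may be unstable.
        return "suspect the project"
    # EUEs across several projects: machine instability is more likely.
    return "suspect the machine"


print(likely_cause([1001, 1001, 1001]))   # hypothetical project numbers
print(likely_cause([1001, 2002, 3003]))
print(likely_cause([1001]))
```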

In very rare cases, a series of WUs is released that is completely unstable. These WUs will EUE instantly. The beta team usually catches all of these WUs well before public release, so "normal" users should never see WUs that produce instant EUEs.

Types of EUEs

As noted above, there are several reasons for a WU to EUE. All true EUEs result in the core exiting with a core status of 0x72 (114 decimal). Other abnormal exits, characterised by different core statuses, are not EUEs but symptoms of another problem.
A list of EUE types is available here.
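As a small illustration, the check for a true EUE can be written in Python. This sketch assumes the two numbers quoted for the core status are the same value in hexadecimal (72) and decimal (114); the names `EUE_CORE_STATUS` and `is_true_eue` are hypothetical.

```python
# Assumption: "72" is hexadecimal and "114" is the same value in decimal.
EUE_CORE_STATUS = 0x72


def is_true_eue(core_status):
    """Return True only if the core exited with the EUE status."""
    return core_status == EUE_CORE_STATUS


print(hex(114))            # '0x72'
print(is_true_eue(0x72))   # True
print(is_true_eue(0x1))    # False: a different abnormal exit, not an EUE
```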

What happens to my points?

All true EUEs give partial credit for the results sent back (see above).
For example, if a WU EUEs at 50%, you will normally receive 50% credit, assuming the data up to that point is valid. In relatively rare cases, the results are badly corrupted but can still be returned. In those cases the points are estimated, and the estimates are generally very low compared to the proportional points normally awarded.
Other core exits (notably 0x0 and 0x1) do not return results and give no credit, since the client immediately deletes the WU regardless of how far processing has progressed.
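The proportional case can be worked through with a short Python sketch. The function name and the 200-point WU value are made up for illustration, and the calculation covers only the normal case of valid data; it does not model the low estimates given for corrupted results.

```python
def proportional_credit(full_points, fraction_completed):
    """Credit for a true EUE with valid data, proportional to progress."""
    if not 0.0 <= fraction_completed <= 1.0:
        raise ValueError("fraction_completed must be between 0 and 1")
    return full_points * fraction_completed


# A hypothetical 200-point WU that EUEs at 50% completion.
print(proportional_credit(200, 0.5))  # 100.0
```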

Reporting EUEs

In most cases, it is not necessary to report EUEs in the folding-community forum (see related link below), because true EUE information is reported back to the project servers (with the exceptions noted above). Please do post if you cannot determine the cause of the EUE (hardware or WU) from the suggestions above. Please also post about WUs from newly released projects, to alert the Pande Group to potential problems with specific WUs or project series.

When reporting EUEs, it is helpful to report:

  • Project number
  • Run, Clone & Generation numbers
  • The section of your fahlog.txt that shows the EUE message
  • Machine specs, especially any overclocked settings
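If you report EUEs often, the items above can be gathered into one block of text. The following Python sketch is only a convenience; the function name, field labels, and example values are hypothetical, and the forum does not require any particular layout.

```python
def format_eue_report(project, run, clone, gen, log_excerpt, machine_specs):
    """Assemble the recommended report items into one plain-text post."""
    lines = [
        f"Project: {project}",
        f"Run {run}, Clone {clone}, Gen {gen}",
        f"Machine: {machine_specs}",
        "fahlog.txt excerpt:",
        log_excerpt.strip(),
    ]
    return "\n".join(lines)


# All values below are placeholders for illustration.
report = format_eue_report(
    project=1001, run=2, clone=3, gen=4,
    log_excerpt="Folding@home Core Shutdown: EARLY_UNIT_END\n",
    machine_specs="(CPU, RAM, and any overclocked settings)",
)
print(report)
```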

Folding-community: See here for an example (bottom of page)

Related Links
