Engineering the Windows Experience Index in Windows 7


The Windows Experience Index (WEI) measures the capability of your computer's hardware and software configuration and expresses this measurement as a number that is called a "base score." A higher base score generally means that your computer will run better and faster than a computer that has a lower base score, especially when the computer performs more advanced and resource-intensive tasks. This base score rating will help you more confidently buy additional hardware, programs, and software that are matched to your computer's base score. 

More Information

The WEI was introduced in Windows Vista to provide one way to measure the relative performance of key hardware components across computers. Like any index or benchmark, it is best used as a relative measure and should not be used to compare one measure with another. Unlike many other measures, the WEI measures only the relative capability of components. The WEI runs for only a short time and does not measure the interactions of components under a software load. Instead, it measures characteristics of your hardware. Therefore, it does not and cannot measure how a system will perform under your own usage scenarios. The WEI does not measure the performance of a system but merely the relative hardware capabilities when the system is running Windows 7.

Take caution in trying to generalize an "absolute" WEI as necessary for a given person. We each have different tolerances or expectations for how a computer should perform, and the same WEI might mean very different things to different people. 

The overall WEI is defined as the lowest of the five top-level WEI subscores, where each subscore is computed by using a set of rules and a suite of system assessment tests. The five areas that are scored in Windows 7 are the same areas that were scored in Windows Vista. They are as follows:
  • Processor
  • Memory (RAM)
  • Graphics (general desktop work)
  • Gaming graphics (typically 3D)
  • Primary hard disk
Although the scoring areas are the same, the ranges have changed. In Windows Vista, the WEI scores ranged from 1.0 to 5.9. In Windows 7, the range was extended upward to 7.9. The scoring rules for devices have also changed from Windows Vista to reflect experience and feedback that resulted from comparing closely rated devices with different qualities of actual use (that is, to make the rating more indicative of actual use). The score has changed (compared with Windows Vista) for one or more components in some systems, and this tuning is responsible for the change. We describe this tuning here.
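
The "lowest subscore" rule can be illustrated with a minimal Python sketch. The dictionary keys and the function name base_score are hypothetical names chosen for illustration; only the five scoring areas, the lowest-subscore rule, and the 1.0 to 7.9 range come from the text above. This is not how Windows itself computes or stores the scores.

```python
# Windows 7 score range as stated in the text (Windows Vista topped out at 5.9).
WEI_MIN, WEI_MAX = 1.0, 7.9


def base_score(subscores):
    """Return the overall WEI: the lowest of the five top-level subscores.

    `subscores` maps a scoring area (hypothetical key names) to its subscore.
    """
    for name, value in subscores.items():
        if not WEI_MIN <= value <= WEI_MAX:
            raise ValueError(
                f"{name} subscore {value} is outside {WEI_MIN}-{WEI_MAX}"
            )
    return min(subscores.values())


# Illustrative values only; they are not measurements from a real system.
example = {
    "processor": 6.8,
    "memory": 7.1,
    "graphics": 5.9,
    "gaming_graphics": 5.9,
    "primary_hard_disk": 2.9,
}

print(base_score(example))  # 2.9
```

In this example, a single capped disk subscore determines the base score even though every other component scores well, which is why one low component can make the overall number look surprising.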

For a given score range, you can use some general guidelines to help you understand the experiences that a particular computer can be expected to deliver well, relatively speaking. These Windows Vista-era general guidelines for systems in the 1.0, 2.0, 3.0, 4.0, and 5.0 ranges still apply to Windows 7. But, as noted earlier, Windows 7 has added levels 6.0 and 7.0. This means that 7.9 is the maximum score possible. These new levels were designed to capture the rather significant improvements that we are seeing in key technologies as they enter the mainstream. These technologies include solid-state disks, multicore processors, and higher-end graphics adapters. Additionally, the amount of memory in a system is a determining factor.

For these new levels, there are simple guidelines to consider. As an example for gaming users, we expect systems that have gaming graphics scores in the 6.0 to 6.9 range to support DX10 graphics and to deliver good frame rates at typical screen resolutions (for example, 40 to 50 frames per second at 1280x1024). In the range of 7.0 to 7.9, we would expect higher frame rates at even higher screen resolutions. Obviously, the specifics of each game have much to do with this, and the WEI scores are also meant to help game developers decide how best to scale their experience on a given system. Graphics is an area in which there is both the widest variety of scores easily available in hardware and the widest breadth of expectations. The extremes to which CAD, HD video, photography, and gaming users push graphics compared with the average business user or consumer (doing many of these same things as an avocation instead of as a vocation) are significant.

Of course, the addition of new levels does not explain why a Windows Vista system or component that used to score 4.0 or higher is now obtaining a score of 2.9. In most cases, large score drops will be caused by the addition of some new disk tests in Windows 7, because that is where we have seen both interesting real-world learning and significant changes in the hardware landscape.

With respect to disk scores, we were able to capture thousands of detailed traces that cover periods of time during which the computer’s current user indicated that an application, or Windows, was experiencing severe responsiveness problems. In analyzing these traces, we saw a connection to disk I/O, and we frequently found typical 4KB disk reads to take longer than expected. We found them to take much, much longer, in fact (10x to 30x). Instead of taking tens of milliseconds to complete, we would frequently find sequences for which individual disk reads took many hundreds of milliseconds to finish. When sequences of these accumulate, higher-level application responsiveness can suffer significantly.
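
The kind of trace analysis described above can be sketched roughly in Python. This is only an illustration, not the tooling that was used: the function names, the 20 ms baseline (standing in for "tens of milliseconds"), and the 10x factor (the low end of the 10x to 30x range) are all assumptions, and the trace values are made up.

```python
def slow_reads(latencies_ms, expected_ms=20.0, factor=10.0):
    """Return the indexes of reads that took at least factor * expected_ms.

    expected_ms and factor are assumed values, not thresholds used by Windows.
    """
    threshold = expected_ms * factor
    return [i for i, ms in enumerate(latencies_ms) if ms >= threshold]


def longest_slow_run(latencies_ms, expected_ms=20.0, factor=10.0):
    """Return the length of the longest run of consecutive slow reads."""
    threshold = expected_ms * factor
    longest = current = 0
    for ms in latencies_ms:
        # Extend the current run on a slow read; otherwise reset it.
        current = current + 1 if ms >= threshold else 0
        longest = max(longest, current)
    return longest


# Hypothetical 4KB read completion times in milliseconds.
trace = [12.0, 18.0, 250.0, 400.0, 610.0, 15.0, 220.0]

print(slow_reads(trace))        # [2, 3, 4, 6]
print(longest_slow_run(trace))  # 3
```

Tracking consecutive runs, and not only the count of slow reads, reflects the observation that responsiveness suffers most when sequences of slow reads accumulate.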

After we recognized the problem, we synthesized many of the I/O sequences and undertook a large study on many, many disk drives. This includes solid-state drives. Although we did find a good number of drives to be excellent, we also found many to have significant challenges under this kind of load. Based on telemetry, this is rather common. In particular, we found the first generation of solid-state drives to be generally challenged when they encounter these frequently seen client I/O sequences.

An example problematic sequence consists of a series of sequential and random I/Os intermixed with one or more flushes. During these sequences, many of the random writes complete in unrealistically short periods of time (say, 500 microseconds). Very short I/O completion times indicate caching. The actual work of moving the bits to spinning media, or to flash cells, is postponed. After a period of returning success very quickly, a backlog of deferred work is built up. What happens next differs from drive to drive. Some drives continue to consistently respond to reads as expected, regardless of the earlier issued and postponed writes/flushes. This yields good performance and no perceived problems for the person who is using the computer. On some drives, however, reads are frequently held off for very lengthy periods as the drives apparently try to clear their backlog of work. This results in a perceived "blocking" state or almost a "locked system." To validate this, on some systems, we replaced poor-performing disks with known good disks and observed significantly improved performance. In some cases, updating the drive’s firmware was sufficient to very noticeably improve responsiveness.

To reflect this real-world learning, in Windows 7, we have capped scores for drives that seem to exhibit the problematic behavior (during the scoring) and are using our feedback system to send information back to us so that we can better evaluate these results. Scores of 1.9, 2.0, 2.9, and 3.0 for the system disk are possible because of our current capping rules. Internally, we feel confident in the disk assessment and in these caps based on the data that we have observed to this point. Of course, we expect to learn from data that is coming from broader usage and from feedback and conversations that we have with drive manufacturers.

For those who are obtaining low disk scores but are otherwise satisfied with the performance, we are not recommending any action. (Of course, the WEI is not a tool for recommending hardware changes of any kind.) It is possible that the sequence of I/Os that are being issued for your common workload and applications is not encountering the issues that we are noting. As we said, the WEI is a metric, but only you can apply that metric to your computing needs.

Levels 6 and 7 were added to recognize the improved experiences one might have with newer hardware, especially SSDs, graphics adapters, and multicore processors. With respect to SSDs, the focus of the newer tests is on random I/O rates and their avoidance of the long latency issues that were noted earlier. As a note, the tests do not specifically check whether the underlying storage device is an SSD. We run them regardless of the device type, and any device capable of sustaining very high random I/O rates will score well.

For graphics adapters, both DX9 and DX10 assessments can be run now. In Windows Vista, the tests were specific to DX9. To obtain scores in the 6 or 7 range, a graphics adapter must obtain very good performance scores, the adapter must support DX10, and the driver must be at least a WDDM 1.1 driver. For WDDM 1.0 drivers, only the DX9 assessments will be run. Therefore, the overall score is capped at 5.9. For multicore processors, both single-threaded and multithreaded scenarios are run. With levels 6 and 7, we intend to indicate that these systems will rarely be CPU bound for typical use and are very suitable for demanding processing tasks and for multitasking. As examples, we expect that many quad-core processors will be able to score in the high 6 to low 7 range, and we expect eight-core systems to be able to approach 7.9. This scoring has accounted for the very latest microprocessors available.
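
The graphics eligibility rules above can be summarized in a short Python sketch. The function names and the raw_score input are hypothetical; the actual assessments and their scoring formulas are not described here, so the sketch models only the stated conditions (DX10 support and a WDDM 1.1 or later driver are required to score above 5.9).

```python
def graphics_score_cap(supports_dx10, wddm_version):
    """Return the highest graphics score allowed under the stated rules.

    wddm_version is a (major, minor) tuple, for example (1, 0) or (1, 1).
    """
    if supports_dx10 and wddm_version >= (1, 1):
        return 7.9  # levels 6 and 7 are reachable
    return 5.9      # only DX9 assessments apply, so the score is capped


def capped_graphics_score(raw_score, supports_dx10, wddm_version):
    """Apply the cap to a hypothetical uncapped assessment result."""
    return min(raw_score, graphics_score_cap(supports_dx10, wddm_version))


# The same hypothetical result under different driver and adapter conditions.
print(capped_graphics_score(6.7, True, (1, 1)))   # 6.7
print(capped_graphics_score(6.7, True, (1, 0)))   # 5.9
print(capped_graphics_score(6.7, False, (1, 1)))  # 5.9
```

Under these assumptions, updating a WDDM 1.0 driver to WDDM 1.1 on a DX10-capable adapter is the only change that lifts the cap; a fast adapter alone is not enough.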
