
The impact of computer components on performance. Tips for upgrading a personal computer. Cleaning and freeing up RAM to speed up processes

You can download the presentation for the lecture.

Simplified processor model

Additional Information:

The scheme is loosely modeled on the von Neumann architecture, which is based on the following principles:

  1. The principle of binary coding
  2. The principle of program control
  3. The principle of memory homogeneity
  4. The principle of memory addressability
  5. The principle of sequential program execution
  6. The principle of conditional jumps

To make it easier to understand how a modern computing system works, we should consider it as it developed, so I start with the simplest scheme that comes to mind. In essence, this is a simplified model. Inside the processor we have a control unit, an arithmetic logic unit, and system registers; the system bus allows data exchange between the processor, memory and peripherals. The control unit receives instructions, decodes them, controls the arithmetic logic unit, and transfers data between processor registers, memory, and peripheral devices.

Simplified processor model

  • Control Unit (CU)
  • Arithmetic Logic Unit (ALU)
  • System registers
  • System bus (Front Side Bus, FSB)
  • Memory
  • Peripherals

Control Unit (CU):

  • decodes instructions coming from the computer's memory;
  • controls the ALU;
  • transfers data between CPU registers, memory, and peripheral devices.

Arithmetic Logic Unit (ALU):

  • performs arithmetic and logical operations on data held in the system registers.

System registers:

  • small storage areas inside the CPU used to hold intermediate data being processed by the processor.

System bus:

  • used to transfer data between the CPU and memory, and between the CPU and peripherals.

The arithmetic logic unit consists of electronic components that perform operations on the system registers. System registers are small storage areas inside the central processor used to hold intermediate results of the processor's work. The system bus is used to transfer data between the central processing unit and memory, as well as between the central processing unit and peripherals.
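
To make the fetch-decode-execute cycle described above more concrete, here is a minimal sketch of a toy von Neumann machine in C. The opcodes, instruction encoding and register count are invented purely for illustration and do not correspond to any real processor.

```c
#include <stdio.h>

/* Hypothetical opcodes for a toy machine (not a real ISA). */
enum { OP_LOAD = 0, OP_ADD = 1, OP_STORE = 2, OP_HALT = 3 };

int main(void) {
    /* Program at addresses 0-4, data at addresses 5-7 (made-up encoding:
       bits 12-15 opcode, bits 8-11 register, bits 0-7 address). */
    int memory[16] = { 0x0005, 0x0106, 0x1001, 0x2207, 0x3000, 2, 3, 0 };
    int regs[4] = {0};          /* "system registers" */
    int pc = 0;                 /* program counter inside the control unit */

    for (;;) {
        int instr = memory[pc++];        /* fetch */
        int op   = (instr >> 12) & 0xF;  /* decode: opcode */
        int reg  = (instr >> 8)  & 0xF;  /* decode: register number */
        int addr =  instr        & 0xFF; /* decode: memory address */

        if (op == OP_LOAD)       regs[reg] = memory[addr];              /* memory -> register */
        else if (op == OP_ADD)   regs[2] = regs[reg] + regs[addr & 0xF];/* ALU works on registers */
        else if (op == OP_STORE) memory[addr] = regs[reg];              /* register -> memory */
        else break;                                                     /* HALT */
    }
    printf("result stored in memory[7] = %d\n", memory[7]);  /* prints 5 */
    return 0;
}
```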

The high performance of the MP (microprocessor) is one of the key factors in the competitive struggle of processor manufacturers.

The performance of a processor is directly related to the amount of work (the number of calculations) it can perform per unit of time.

Very roughly:

Performance = Number of instructions / Time
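
As a very rough illustration of the "instructions per unit of time" idea, the sketch below times a simple loop with the standard C clock() function and reports an approximate operations-per-second figure. It counts loop iterations rather than real machine instructions, so treat the output only as an illustration of the formula, not as a proper benchmark.

```c
#include <stdio.h>
#include <time.h>

int main(void) {
    const long iterations = 100000000L;   /* 1e8 simple additions */
    volatile long sum = 0;                /* volatile so the loop is not optimized away */

    clock_t start = clock();
    for (long i = 0; i < iterations; i++)
        sum += i;
    clock_t end = clock();

    double seconds = (double)(end - start) / CLOCKS_PER_SEC;
    /* Performance = number of (pseudo-)instructions / time */
    printf("%.0f loop iterations per second (%.3f s total)\n",
           iterations / seconds, seconds);
    return 0;
}
```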

We will look at the performance of processors based on the IA-32 and IA-32e (IA-32 with EM64T) architectures.

Factors affecting processor performance:

  • CPU clock speed.
  • The amount of addressable memory and the speed of access to external memory.
  • Instruction execution speed and the completeness of the instruction set.
  • Use of internal memory (registers).
  • Pipeline quality.
  • Prefetch quality.
  • Superscalar execution.
  • The presence of vector instructions.
  • Multiple cores.

What is performance? It is difficult to give an unambiguous definition. You can formally tie it to a processor: how many instructions a particular processor can execute per unit of time. But it is easier to give a comparative definition: take two processors, and the one that executes a given set of instructions faster is the more productive. Very roughly, then, performance is the number of instructions divided by the execution time. Here we will mainly study the microprocessor architectures that Intel releases, that is, the IA-32 architectures now called Intel 64. These architectures, on the one hand, support the old instructions from the IA-32 set, and on the other hand include EM64T, an extension that allows the use of 64-bit addresses (i.e. the addressing of large amounts of memory) and adds some useful features such as an increased number of system registers and an increased number of vector registers.

What factors affect performance? Let's list everything that comes to mind:

  • The speed of instruction execution, the completeness of the basic instruction set.
  • Use of internal register memory.
  • Pipeline quality.
  • Branch prediction quality.
  • Prefetch quality.
  • Superscalar execution.
  • Vectorization, the use of vector instructions.
  • Parallelization and multiple cores.

Clock frequency

The processor consists of components that are triggered at different moments, and it contains a timer that keeps them synchronized by sending out periodic pulses. The frequency of these pulses is called the processor's clock frequency.

Addressable memory size

Clock frequency.

Since the processor contains many different electronic components that work independently, a timer sends out a synchronization pulse so that they know when to start working and when to finish and wait. The frequency with which this pulse is sent is the processor's clock frequency. Some devices manage to perform two operations per pulse; nevertheless, the processor's work is tied to this pulse, and we can say that if we increase its frequency, all of these circuits will work harder and sit idle less.
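
A short back-of-the-envelope sketch of what clock frequency means in time units: the duration of one cycle is simply 1/f, so the same 100-cycle stall costs less wall-clock time on a higher-clocked part. The frequencies below are arbitrary examples.

```c
#include <stdio.h>

int main(void) {
    /* Example frequencies in GHz (arbitrary values for illustration). */
    double freqs_ghz[] = {1.0, 2.0, 3.5};
    for (int i = 0; i < 3; i++) {
        double cycle_ns = 1.0 / freqs_ghz[i];   /* period in nanoseconds */
        printf("%.1f GHz: one cycle = %.3f ns, a 100-cycle stall = %.1f ns\n",
               freqs_ghz[i], cycle_ns, 100.0 * cycle_ns);
    }
    return 0;
}
```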

The amount of memory addressed and the speed of memory access.

Memory size: there must be enough memory for our program and our data. EM64T technology allows a huge amount of memory to be addressed, so at the moment a shortage of addressable memory is not an issue.

Since developers generally do not have the ability to influence these factors, I only mention them.

Execution speed and instruction set

Performance depends on how well the instructions are implemented, how completely the basic set of instructions covers all possible tasks.

CISC (complex instruction set computing), RISC (reduced instruction set computing)

Modern Intel® processors are a hybrid of CISC and RISC processors, converting CISC instructions into a simpler set of RISC instructions before execution.

The speed of execution of instructions and the completeness of the basic set of instructions.

When architects design processors, they constantly work to improve performance. One of their tasks is to collect statistics and determine which instructions or sequences of instructions are key from a performance standpoint. Trying to improve performance, architects make the hottest instructions faster, and for some sequences of instructions they introduce a special instruction that replaces the sequence and works more efficiently. The characteristics of instructions change from architecture to architecture, and new instructions appear to achieve better performance. In other words, from architecture to architecture the basic instruction set is constantly being improved and expanded. But if you do not specify which architectures your program will run on, your application will use a certain default set of instructions that all recent microprocessors support. That is, we can achieve the best performance only if we clearly specify the microprocessor on which the task will be executed.

Using registers and random access memory

The access time to registers is the smallest, so the number of available registers affects the performance of the microprocessor.

Register spilling: when there are not enough registers, heavy traffic arises between the registers and the application stack.

As processor performance increased, a problem arose: the speed of access to external memory became lower than the speed of calculation.

There are two characteristics for describing memory properties:

  • The latency is the number of processor cycles required to transfer a unit of data from memory.
  • Bandwidth is the number of data items that can be sent to the processor from memory in one cycle.

There are two potential strategies for improving performance: reducing the response time, or requesting the needed memory in advance.

Using registers and RAM.

Registers are the fastest memory elements; they are located directly on the core, and access to them is almost instantaneous. If your program does some calculations, you would like all intermediate data to be stored in registers, but that is clearly impossible. One possible performance problem is register spilling: when you look at the assembly code under a performance analyzer, you see heavy traffic between the stack and the registers, with registers constantly being paged out to the stack and back. The question is how to optimize the code so that the hottest addresses, the hottest intermediate data, sit exactly in the system registers.
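
Register spilling is easiest to observe by compiling a function that keeps more values live at once than the machine has registers and inspecting the generated assembly (for example with `gcc -O2 -S`): the compiler is forced to park some intermediates on the stack. Below is a contrived sketch written only to provoke the effect; the function and its values have no other meaning.

```c
#include <stdio.h>

/* Many values stay live across the loop, so on a register-poor target
   (e.g. 32-bit x86 with 8 general-purpose registers) the compiler must
   spill some of them to the stack.  Compile with "gcc -O2 -S" and look
   for mov instructions to/from the stack pointer. */
long many_live_values(const long *p, long n) {
    long a = 1, b = 2, c = 3, d = 4, e = 5, f = 6, g = 7, h = 8;
    long i2 = 9, j = 10, k = 11, m = 12;
    for (long i = 0; i < n; i++) {
        long x = p[i];
        a += x; b ^= x; c += a; d ^= b; e += c; f ^= d;
        g += e; h ^= f; i2 += g; j ^= h; k += i2; m ^= j;
    }
    return a + b + c + d + e + f + g + h + i2 + j + k + m;
}

int main(void) {
    long data[1000];
    for (long i = 0; i < 1000; i++) data[i] = i;
    printf("%ld\n", many_live_values(data, 1000));
    return 0;
}
```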

The next tier of memory is ordinary RAM. As processor performance has grown, it has become clear that the bottleneck is access to RAM. Getting to RAM takes a hundred, or even two hundred, processor cycles. That is, having requested some memory cell in RAM, we wait up to two hundred clock cycles while the processor sits idle.

There are two characteristics that describe the properties of memory: the response time (latency), that is, the number of processor cycles required to transfer a unit of data from memory, and the throughput (bandwidth), that is, how many data items can be sent to the processor from memory in one cycle. Faced with the fact that memory access is our bottleneck, we can attack the problem in two ways: either by decreasing the response time, or by making preemptive requests for the memory we will need. That is, at a moment when we are not yet interested in the value of some variable, but we know that we will need it soon, we request it in advance.
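
The difference between latency-bound and bandwidth-bound access is easy to feel in code: a dependent pointer chase exposes the full memory response time at every step, while a sequential sum lets the hardware stream data ahead of the loop. A minimal sketch, with array size and the pseudo-random chain constants chosen arbitrarily:

```c
#include <stdio.h>
#include <time.h>

#define N (1 << 22)   /* 4M elements, larger than typical caches */

static long data[N];
static long next_idx[N];

int main(void) {
    /* Sequential access: limited mostly by memory bandwidth. */
    for (long i = 0; i < N; i++) data[i] = i;

    clock_t t0 = clock();
    volatile long sum = 0;
    for (long i = 0; i < N; i++) sum += data[i];
    double seq = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Dependent, pseudo-random chain: each load must wait for the
       previous one to finish, so latency is exposed at every step. */
    for (long i = 0; i < N; i++)
        next_idx[i] = (long)(((unsigned long long)i * 1234567ULL + 89ULL) % N);

    t0 = clock();
    long idx = 0;
    for (long i = 0; i < N; i++) idx = next_idx[idx];
    double chase = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("sequential: %.3f s, dependent chain: %.3f s (idx=%ld)\n", seq, chase, idx);
    return 0;
}
```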

Caching

The cache memory serves to reduce the time of data access.

To do this, blocks of RAM are mapped to a faster cache.

If the memory address is in the cache, a "hit" occurs and the data retrieval speed increases significantly.

Otherwise, a cache miss occurs.

In this case, a block of RAM is read into the cache in one or more bus cycles, called filling a cache line.

The following types of cache memory can be distinguished:

  • fully associative cache (each block can be mapped to any location in the cache)
  • direct-mapped cache (each block can be mapped to exactly one location)
  • hybrid options (sectored cache, set-associative cache)

Set-associative access: the low-order bits of the address determine the cache set into which a given block of memory can be mapped, but that set can hold only a few lines of main memory, and the choice among them is made associatively.
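
As an illustration of how the low-order bits of an address select a cache set, here is a small sketch that splits a 64-bit address into offset, set index and tag for a hypothetical cache with 64-byte lines and 64 sets (the geometry is made up for the example; real caches differ). In a direct-mapped cache each set holds one line, so only one tag has to be compared; in an N-way set-associative cache the tag must be compared against N candidates in that set.

```c
#include <stdio.h>
#include <stdint.h>

#define LINE_SIZE 64    /* bytes per cache line (typical for IA-32/Intel 64) */
#define NUM_SETS  64    /* hypothetical number of sets */

int main(void) {
    uint64_t addr = 0x00007ffe12345678ULL;   /* an arbitrary example address */

    uint64_t offset = addr % LINE_SIZE;               /* byte inside the line          */
    uint64_t set    = (addr / LINE_SIZE) % NUM_SETS;  /* which set the line maps to    */
    uint64_t tag    = addr / (LINE_SIZE * NUM_SETS);  /* identifies the block in a set */

    printf("address 0x%llx -> tag 0x%llx, set %llu, offset %llu\n",
           (unsigned long long)addr, (unsigned long long)tag,
           (unsigned long long)set, (unsigned long long)offset);
    return 0;
}
```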

The quality of cache usage is key to performance.

Additional Information: on modern IA32 systems, the cache line size is 64 bytes.

The reduction in access time was achieved by introducing cache memory. Cache is buffer memory located between RAM and the microprocessor. It is implemented on the core, so access to it is much faster than to ordinary memory, but it is much more expensive, so when developing a microarchitecture one has to find an exact balance between price and performance. If you look at the descriptions of processors offered for sale, you will see that the amount of cache memory at each level is always listed, and this figure seriously affects the price of the product. Cache memory is organized so that ordinary memory is mapped into it in blocks. When you request an address in RAM, the hardware checks whether this address is already mapped into the cache. If it is, you save time on the memory access: you read the data from fast memory, and the response time is significantly reduced. If the address is not in the cache, ordinary memory must be accessed so that the needed address, together with the block that contains it, is mapped into the cache.

There are different implementations of cache memory. There is fully associative cache memory, where each block can be mapped to any location in the cache. There is direct-mapped memory, where each block can be mapped to exactly one place. And there are various hybrid options, for example a set-associative cache. What is the difference? The difference is in the time and complexity of checking whether the desired address is present in the cache. Say we need a specific address. In the case of fully associative memory we need to check the entire cache to make sure the address is not there. In the case of direct mapping, we only need to check one cell. In the case of hybrid variants, for example a set-associative cache, we need to check, say, four or eight cells. So the task of determining whether an address is in the cache also matters. The quality of cache use is an important condition for performance: if we manage to write a program so that the data we are about to work with is in the cache as often as possible, that program will run much faster.
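
A classic, easily reproducible demonstration of "writing the program so that the data is in the cache" is the order in which a two-dimensional array is traversed. In C, arrays are stored row by row, so walking along rows touches consecutive bytes of the same cache line, while walking down columns jumps a whole row ahead on every access. A minimal sketch (the matrix size is chosen arbitrarily):

```c
#include <stdio.h>
#include <time.h>

#define N 4096
static int a[N][N];   /* ~64 MB, far larger than the caches */

int main(void) {
    volatile long sum = 0;
    clock_t t0, t1;

    t0 = clock();                 /* row-major: cache-friendly */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    t1 = clock();
    printf("row-major:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    t0 = clock();                 /* column-major: a cache miss on almost every access */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += a[i][j];
    t1 = clock();
    printf("column-major: %.3f s (sum=%ld)\n", (double)(t1 - t0) / CLOCKS_PER_SEC, (long)sum);
    return 0;
}
```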

Typical response times when accessing the cache memory for Nehalem i7:

  • L1 - latency 4
  • L2 - latency 11
  • L3 - latency 38

Response time for RAM: more than 100 cycles

A proactive memory access mechanism is implemented with hardware prefetching.

There is a special set of instructions that allows the processor to load memory located at a specific address into the cache (software prefetching).
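
With GCC and Clang, software prefetch instructions are exposed through the __builtin_prefetch() intrinsic (Intel's headers offer the equivalent _mm_prefetch). Below is a minimal sketch that requests data a fixed number of elements ahead of the current position; the prefetch distance of 64 elements is only a guess and would need tuning on real hardware.

```c
#include <stdio.h>
#include <stdlib.h>

#define N        (1 << 20)
#define DISTANCE 64          /* how far ahead to prefetch; needs tuning */

int main(void) {
    double *a = malloc(N * sizeof *a);
    if (!a) return 1;
    for (int i = 0; i < N; i++) a[i] = (double)i;

    double sum = 0.0;
    for (int i = 0; i < N; i++) {
        if (i + DISTANCE < N)
            /* args: address, 0 = prefetch for read, 1 = low temporal locality */
            __builtin_prefetch(&a[i + DISTANCE], 0, 1);
        sum += a[i];
    }
    printf("sum = %f\n", sum);
    free(a);
    return 0;
}
```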

Take our latest Nehalem processor, the Core i7, as an example.

Here we have not just a cache but a hierarchical cache. For a long time it was two-level; the modern Nehalem design has three levels: a very small, very fast first-level cache, a somewhat larger second-level cache, and a fairly large third-level cache. The system is built so that if an address is in the first-level cache, it is automatically present in the second and third levels as well. This is a hierarchical system. For the first-level cache the latency is 4 cycles, for the second 11, for the third 38, and the response time of RAM is more than 100 processor cycles.
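
Using the Nehalem latencies quoted above, one can estimate an average memory access time for assumed hit rates at each level. The hit rates in the sketch are invented purely for the sake of the arithmetic, not measured values.

```c
#include <stdio.h>

int main(void) {
    /* Latencies in cycles, from the figures quoted for the Nehalem i7. */
    double l1 = 4, l2 = 11, l3 = 38, ram = 100;
    /* Hypothetical fractions of accesses serviced at each level. */
    double h1 = 0.90, h2 = 0.06, h3 = 0.03, hram = 0.01;

    double amat = h1 * l1 + h2 * l2 + h3 * l3 + hram * ram;
    printf("average access time = %.2f cycles\n", amat);   /* 6.40 for these numbers */
    return 0;
}
```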

In any production of goods, one of the main goals pursued by the company's management is to obtain a result. The only question is how much effort and how many resources will be required to achieve that goal. To measure the efficiency of an enterprise, the concept of "labor productivity" was introduced as an indicator of the productivity of personnel. The work that one person can do per unit of time is conventionally called "output".

It is very important for each enterprise to get a high result and at the same time spend as little resources as possible on production (this includes payments for electricity, rent, etc.).

The most important challenge in any enterprise that manufactures goods or provides services is to increase productivity. At the same time, there are a number of measures that are taken to reduce the amount of costs required for the workflow. Thus, over the period of development of the enterprise, labor productivity may change.

As a rule, several groups of factors are identified that can influence the change, namely the growth, of production indicators. First of all, there is the economic and geographical factor, which includes the availability of free labor, water, electricity and building materials, as well as the distance to communications, the terrain, and so on. No less important is the acceleration of scientific and technological progress, which contributes to the introduction of new generations of modern technology and the use of advanced technologies and automated systems. It can also be assumed that labor productivity depends on the factor of structural changes, meaning a change in the share of components and purchased semi-finished products, as well as in the structure of production and the share of certain types of products.

The social (human) moment still remains of great importance, because it is the concern for social benefits that underlies the increase in labor productivity. This includes: concern about the physical health of a person, the level of his intellectual development, professionalism, etc.

Factors of growth in labor productivity are the most important component of the entire work process, because it is they that affect the rate of development of any enterprise and, accordingly, contribute to an increase in profits.

It is also worth noting the organizational moment, which determines the level of production and labor management. It includes the improvement of the organization of enterprise management, the improvement of personnel, material and technical training.

When talking about productivity, it is impossible to ignore the intensity of labor. This concept reflects the amount of mental and physical energy expended by an employee over a given period of working time.

It is very important to determine the optimal intensity for a given workflow, because excessive activity can lead to inevitable losses in productivity. As a rule, this occurs as a result of human overwork, the occurrence of occupational diseases, injuries, etc.

It is worth noting that the main indicators were identified that determine the intensity of labor. First of all, it is the workload of a person with work activities. This allows you to determine the intensity of the work process and, accordingly, the feasibility of costs. At the same time, it is customary to calculate the pace of work, that is, the frequency of actions relative to a unit of time. Taking these factors into account, the enterprise, as a rule, has certain standards, based on the indicators of which, a production work plan is established.

Factors of labor productivity are the subject of close attention of workers in science and practice, since they act as the primary causes that determine its level and dynamics. The factors investigated in the analysis can be classified according to different criteria. The most detailed classification is presented in table 1.

Table 1. Classification of factors affecting labor productivity

  • By nature: natural and climatic; socio-economic; production and economic.
  • By degree of impact on the result: main; secondary.
  • In relation to the object of research: internal; external.
  • By dependence on the team: objective; subjective.
  • By prevalence: general; specific.
  • By time of action: constant; variable.
  • By the nature of the action: extensive; intensive.
  • By the properties of the reflected phenomena: quantitative; qualitative.
  • By composition: simple; complex.
  • By level of subordination (hierarchy): first order; second order; etc.
  • By measurability of the impact: measurable; unmeasurable.

By their nature, factors are subdivided into natural and climatic, socio-economic and production-economic.

Natural and climatic factors have a big influence on the results of activities in agriculture, mining, forestry and other industries. Taking their influence into account allows a more accurate assessment of the results of the work of business entities. Socio-economic factors include the living conditions of workers, the organization of cultural, sports and recreational work at the enterprise, the general level of culture and education of the personnel, and so on. They contribute to fuller use of the enterprise's production resources and increase the efficiency of its work. Production and economic factors determine the completeness and efficiency of the use of the enterprise's production resources and the final results of its activities. According to the degree of influence on the results of economic activity, factors are divided into main and secondary. The main factors are those that have a decisive impact on the performance indicator. Secondary factors are those that do not have a decisive impact on the results of economic activity in the current environment. It should be noted that the same factor, depending on the circumstances, can be either primary or secondary. The ability to single out the main, determining factors from the variety of factors ensures the correctness of conclusions drawn from the analysis results.

In relation to the object of research, factors are classified into internal and external, i.e. dependent and not dependent on the activities of this enterprise. The main focus of the analysis should be on the study of internal factors that can be influenced by the enterprise.

At the same time, in many cases, with developed industrial ties and relations, the performance of each enterprise is largely influenced by the activities of other enterprises, for example, the uniformity and timeliness of the supply of raw materials, materials, their quality, cost, market conditions, inflationary processes, etc. These factors are external. They do not characterize the efforts of a given team, but their study makes it possible to more accurately determine the degree of influence of internal causes and thereby more fully reveal the internal reserves of production.

For a correct assessment of the activities of enterprises, factors must be further subdivided into objective and subjective. Objective factors, such as a natural disaster, do not depend on the will and desire of people. Unlike objective, subjective reasons depend on the activities of legal entities and individuals.

According to the degree of prevalence, factors are divided into general and specific. The general factors include factors that operate in all sectors of the economy. Specific are those that operate in a particular branch of the economy or enterprise. This division of factors makes it possible to more fully take into account the characteristics of individual enterprises, branches of production and more accurately assess their activities.

According to the duration of the impact on the results of activities, there are constant and variable factors. Constant factors influence the studied phenomenon continuously throughout the entire time. The influence of variable factors manifests itself periodically, for example, the development of new technology, new types of products, new production technology, etc.

Of great importance for assessing the activities of enterprises is the division of factors by the nature of their action into intensive and extensive. Extensive factors include factors that are associated with a quantitative rather than qualitative increase in the effective indicator, for example, an increase in the volume of production by expanding the cultivated area, increasing the number of animals, the number of workers, etc. Intensive factors characterize the degree of effort, labor intensity in the production process, for example, an increase in crop yields, livestock productivity, and the level of labor productivity.

If the analysis aims to measure the influence of each factor on the results of economic activity, then they are divided into quantitative and qualitative, simple and complex, measurable and unmeasured.

Factors that express the quantitative certainty of phenomena (the number of workers, equipment, raw materials, etc.) are considered quantitative. Qualitative factors determine the internal qualities, signs and characteristics of the objects under study (labor productivity, product quality, soil fertility, etc.).

Most of the studied factors are complex in composition and consist of several elements. However, there are also factors that cannot be decomposed into component parts. Depending on their composition, factors are divided into complex and simple (elemental). An example of a complex factor is labor productivity, and of a simple one the number of working days in the reporting period.

As already indicated, some factors have a direct impact on the performance indicator, others an indirect one. According to the level of subordination (hierarchy), factors of the first, second, third and further levels are distinguished. First-level factors are those that directly affect the performance indicator. Factors that determine the performance indicator indirectly, through first-level factors, are called second-level factors, and so on. For example, relative to gross output, the first-level factors are the average annual number of workers and the average annual output of one worker. The number of days worked by one worker and the average daily output are second-level factors. Third-level factors include the length of the working day and the average hourly output.

The basis of any business is the rational and efficient use of available resources, including labor. It is quite logical that management seeks to increase the volume of production without additional costs for hiring employees. Experts identify several factors that can improve performance:

  • Management style (the main task of a leader is to motivate staff and create an organizational culture that values activity and hard work).
  • Investments in technical innovations (buying new equipment that meets the needs of the time can significantly reduce the time spent by each employee).
  • Trainings and seminars for advanced training (knowledge of the specifics of production allows personnel to participate in improving the production process).

People often ask me what determines the speed of a computer. I decided to write an article on this topic in order to examine in detail the factors that affect system performance. After all, understanding this topic makes it possible to speed up your PC.

Hardware

Computer speed directly depends on the configuration of your PC. The quality of the following components affects PC performance.

HDD

The operating system accesses the hard disk thousands of times, and the OS itself lives on the hard drive. The larger its capacity, spindle speed and cache, the faster the system works. The amount of free space on drive C (where Windows is usually installed) also matters: if less than 10% of the total is free, the OS slows down. Check whether there are unnecessary files and programs, and it is advisable to defragment the hard drive once a month. Note that an SSD is much faster than an HDD.

RAM

The amount of RAM is one of the most important factors affecting computer speed. Temporary memory stores intermediate data, machine code and instructions. In short, the more the better. RAM can be tested with the memtest86 utility.

CPU

The computer's brain is as important as RAM. Pay attention to the clock speed, the cache size and the number of cores. Overall, the speed of the computer depends on the clock speed and cache, while the number of cores provides multitasking.

Cooling system

High component temperatures have a negative effect on computer speed, and overheating can damage your PC. Therefore the cooling system plays an important role.

Video memory

Motherboard

Software

Computer speed also depends on the installed software and the OS. For example, if you have a weak PC, an older, less resource-hungry OS such as Windows XP will suit you better. Any OS will work for medium to high configurations.

Many startup programs load the system, resulting in freezes and slowdowns. For high performance, close unnecessary applications, keep your PC clean and tidy, and keep it up to date.

Malicious software slows Windows down. Use antivirus software and run deep scans regularly. Good luck.

The most basic parameter that affects the speed of a computer is its hardware. How the PC works depends on what hardware is installed in it.

CPU

It can be called the heart of the computer. Many people are simply sure that the main parameter affecting the speed of a PC is the clock frequency, and this is correct, but not the whole story.

Of course, the number of GHz is important, but the rest of the processor also plays a role. Without going into too much detail: the higher the frequency and the more cores, the faster your computer.

RAM

Again, the more gigabytes of this memory, the better. RAM (random access memory) is temporary memory where program data is written for quick access. However, after the PC shuts down it is all erased; that is, this memory is volatile.

And here there are some nuances. In pursuit of capacity, many people install a bunch of memory modules from different manufacturers and with different parameters, and fail to get the desired effect. For the performance gain to be maximal, you need to install modules with identical characteristics.

This memory also has a clock speed, and the higher it is, the better.

Video adapter

It can be discrete or integrated. The integrated one sits on the motherboard, and its specifications are very modest: enough only for ordinary office work.

If you plan to play modern games or use programs that process graphics, you need a discrete graphics card. It will raise the performance of your PC. This is a separate board that is inserted into a special connector on the motherboard.

Motherboard

It is the largest board in the case. The performance of the entire computer directly depends on it, since all the other components are located on it or connected to it.

HDD

This is the storage device where we keep all our files, installed games and programs. There are two types: HDD and SSD. The latter work much faster, consume less power and are silent. The former have their own parameters that affect PC performance: rotation speed and capacity. And again, the higher they are, the better.

Power Supply

It must supply energy to all the components of the PC in sufficient volume, otherwise the performance will decrease significantly.

Program parameters

Also, the speed of the computer is affected by:

  • The state of the installed operating system.
  • The OS version.

The installed OS and software should be correctly configured and free of viruses; then performance will be excellent.

Of course, from time to time you need to reinstall the system and all the software so that the computer runs faster. You also need to keep track of software versions, because old versions can run slowly due to the bugs they contain. It is worth using utilities that clean the system of junk and improve its performance.

Debunking Myths About GPU Performance | Defining the concept of performance

If you are a car enthusiast, you have probably argued with your friends more than once about the capabilities of two sports cars. One of the cars may have more horsepower, a higher top speed, less weight and better handling. But very often the argument comes down to comparing Nürburgring lap times and always ends with someone in the group spoiling all the fun by reminding everyone that none of the participants can afford the cars in question anyway.

A similar analogy can be drawn with expensive video cards. We have an average frame rate, frame-time fluctuations, noise from the cooling system, and a price that in some cases can be double the cost of a modern game console. And to make the analogy more convincing, some modern video cards use aluminum and magnesium alloys, almost like racing cars. Alas, there are some differences: despite all attempts to impress a girl with a new GPU, rest assured that she likes sports cars more.

What is the video card's equivalent of a lap time? What factor separates winners from losers at equal cost? It is clearly not the average frame rate alone, as evidenced by frame-time fluctuations, tearing, stuttering and fans humming like a jet engine. In addition, there are other specifications: texture fill rate, compute performance, memory bandwidth. How important are these figures? Will you have to play with headphones because of unbearable fan noise? How should overclocking potential be factored into the evaluation of a graphics adapter?

Before delving into the myths about modern graphics cards, you first need to understand what exactly is performance.

Performance is a set of metrics, not a single metric

Discussions about GPU performance often boil down to the generic concept of frame rate, or FPS. In practice, the concept of video card performance includes many more parameters than just the rate at which frames are rendered. It is easier to consider them as a complex rather than as a single value. The complex has four main aspects: speed (frame rate, frame latency and input lag), picture quality (resolution and image quality), silence (acoustic efficiency, taking into account power consumption and cooler design) and, of course, affordability relative to cost.

There are other factors that affect the value of a video card: for example, the games bundled with it, or the exclusive technologies used by a particular manufacturer. We will look at them only briefly, since in reality the value of CUDA, Mantle and ShadowPlay support depends heavily on the needs of the individual user.

The chart above illustrates the position of the GeForce GTX 690 with respect to the factors we have described. In the standard configuration, the graphics accelerator in the test system (described in a separate section) reaches 71.5 FPS in the Unigine Valley 1.0 benchmark in ExtremeHD mode. At the same time, the card generates perceptible but not disturbing noise at 42.5 dB(A). If you are ready to put up with noise at 45.5 dB(A), you can safely overclock the chip until it reaches a stable 81.5 FPS in the same mode. Lowering the resolution or the level of anti-aliasing (which affects quality) results in a significant increase in frame rate, with the remaining factors unchanged (including the already high $1000 price tag).

In order to provide a more controlled testing process, it is necessary to determine the benchmark for the performance of the video card.


MSI Afterburner and EVGA PrecisionX are free utilities that allow manual fan speed control and, as a result, control over noise emission.

For today's article, we defined performance as the number of frames per second that a video card can output at a selected resolution within a specific application (and under the following conditions):

  • The quality settings are set to their maximum values (usually Ultra or Extreme).
  • The resolution is set to a constant level (usually 1920x1080, 2560x1440, 3840x2160, or 5760x1080 pixels in a three-monitor configuration).
  • Drivers are left at the manufacturer's default parameters (both in general and for the specific application).
  • The video card works in a closed case at a 40 dB(A) noise level measured 90 cm from the enclosure (ideally tested against a reference platform that is updated annually).
  • The video card operates at an ambient temperature of 20 °C and a pressure of one atmosphere (this is important, since it directly affects the onset of thermal throttling).
  • The core and memory operate at temperatures below the thermal throttling point, so that the core frequency and temperature under load remain stable or vary within a very narrow range while a constant noise level of 40 dB(A) (and, accordingly, fan speed) is maintained.
  • The 95th-percentile frame-time fluctuations do not exceed 8 ms, which is half of the frame time on a standard display with a 60 Hz refresh rate (see the sketch after this list).
  • The card runs at or near 100% GPU load (this is important to demonstrate that there are no bottlenecks in the platform; if there are, GPU load drops below 100% and the test results become meaningless).
  • The average FPS and frame-time fluctuations are obtained from at least three runs per measurement, with each run lasting at least one minute and individual samples deviating no more than 5% from the average (ideally we want to try several cards, especially if significant discrepancies between products from the same manufacturer are suspected).
  • The frame rate of a single card is measured using Fraps or built-in counters; FCAT is used for multiple cards in an SLI/CrossFire configuration.
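
The two statistics used in these conditions, the average FPS and the 95th-percentile frame time, can be computed from a log of per-frame times such as the one Fraps produces. A minimal sketch with made-up frame times in milliseconds (a real run would have thousands of samples):

```c
#include <stdio.h>
#include <stdlib.h>

static int cmp(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void) {
    /* Invented per-frame times in milliseconds, for illustration only. */
    double ft[] = {16.1, 16.9, 15.8, 17.2, 16.4, 24.9, 16.0, 16.6, 15.9, 16.3};
    int n = (int)(sizeof ft / sizeof ft[0]);

    double total = 0;
    for (int i = 0; i < n; i++) total += ft[i];
    double mean_ft = total / n;
    double avg_fps = 1000.0 * n / total;      /* average FPS = frames / seconds */

    qsort(ft, n, sizeof ft[0], cmp);
    double p95 = ft[(int)(0.95 * (n - 1))];   /* 95th-percentile frame time */

    printf("average FPS: %.1f, 95th-percentile frame time: %.1f ms (%.1f ms above the mean)\n",
           avg_fps, p95, p95 - mean_ft);
    return 0;
}
```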

As you can imagine, the performance reference level depends on both the application and the resolution. But it is defined in such a way that it allows the tests to be repeated and verified independently. In this sense, this approach is really scientific. In fact, we are interested in having manufacturers and enthusiasts repeat the tests and inform us of any discrepancies. This is the only way to ensure the integrity of our work.

This definition of performance does not take into account overclocking or the spread in behavior of a particular GPU across different video cards. Luckily, we noticed this problem in only a few cases. Modern thermal-throttling mechanisms are designed to extract the maximum frame rate in most scenarios, so graphics cards perform very close to their maximum capabilities, and that limit is often reached before overclocking provides a real speed advantage.

In this article we will make extensive use of the Unigine Valley 1.0 benchmark. It takes advantage of several DirectX 11 features and allows easily reproducible measurements. In addition, it does not rely on physics (and, as a result, on the CPU) in the way that 3DMark does (at least in its overall and combined tests).

What are we going to do?

We have already settled on a definition of video card performance. Next, we'll look at the methodology, V-sync, noise and performance adjusted for video card noise, and the amount of video memory actually needed for games. In the second part we will look at anti-aliasing techniques, display effects, various PCI Express lane configurations, and the value of your investment in a graphics card.

It's time to get familiar with the test configuration. In the context of this article, special attention should be paid to this section, as it contains important information about the tests themselves.

Debunking Myths About GPU Performance | How do we test

Two systems, two goals

We carried out all tests on two different test benches. One bench is equipped with an older Intel Core i7-950 processor and the other with a modern Intel Core i7-4770K chip.

Test System 1

  • Case: Corsair Obsidian Series 800D
  • CPU: Intel Core i7-950 (Bloomfield), overclocked to 3.6 GHz, Hyper-Threading and power saving off
  • CPU cooler: CoolIT Systems ACO-R120 ALC, Tuniq TX-4 TIM, Scythe GentleTyphoon 1850 RPM fan
  • Motherboard: Asus Rampage III Formula, Intel LGA 1366, Intel X58 chipset, BIOS 903
  • Network: Cisco-Linksys WMP600N (Ralink RT286)
  • RAM: Corsair CMX6GX3M3A1600C9, 3 x 2 GB, 1600 MT/s, CL 9
  • Storage: Samsung 840 Pro SSD, 256 GB, SATA 6 Gb/s
  • Video cards:
  • Sound card: Asus Xonar Essence STX
  • Power supply: Corsair AX850, 850 W

System software and drivers

  • Operating system: Windows 7 Enterprise x64, Aero off (see note below); Windows 8.1 Pro x64 (reference only)
  • DirectX: DirectX 11
  • Video drivers: AMD Catalyst 13.11 Beta 9.5; Nvidia GeForce 331.82 WHQL

Test System 2

  • Case: Cooler Master HAF XB, desktop/test bench hybrid form factor
  • CPU: Intel Core i7-4770K (Haswell), overclocked to 4.6 GHz, Hyper-Threading and power saving off
  • CPU cooler: Xigmatek Aegir SD128264, Xigmatek TIM, Xigmatek 120 mm fan
  • Motherboard: ASRock Z87 Extreme6/ac, Intel LGA 1150, Intel Z87 chipset, BIOS 2.20
  • Network: mini-PCIe Wi-Fi 802.11ac card
  • RAM: G.Skill F3-2133C9D-8GAB, 2 x 4 GB, 2133 MT/s, CL 9
  • Storage: Samsung 840 Pro SSD, 128 GB, SATA 6 Gb/s
  • Video cards: AMD Radeon R9 290X 4 GB (press sample); Nvidia GeForce GTX 690 4 GB (retail sample); Nvidia GeForce GTX Titan 6 GB (press sample)
  • Sound card: integrated Realtek ALC1150
  • Power supply: Cooler Master V1000, 1000 W

System software and drivers

  • Operating system: Windows 8.1 Pro x64
  • DirectX: DirectX 11
  • Video drivers: AMD Catalyst 13.11 Beta 9.5; Nvidia GeForce 332.21 WHQL

We need the first test system to get repeatable results in realistic environments. Therefore, we put together a relatively old but still powerful system based on the LGA 1366 platform in a large, full-size tower case.

The second test system should meet more specific requirements:

  • PCIe 3.0 support with limited lanes (Haswell CPU for LGA 1150 only offers 16 lanes)
  • No PLX bridge
  • Support for three cards in CrossFire in x8 / x4 / x4 configuration or two in SLI in x8 / x8

ASRock sent us a Z87 Extreme6/ac motherboard that fits these needs. We previously tested this model (only without the Wi-Fi module) in the article "Test of five motherboards on the Z87 chipset that cost less than $220", in which it received our Smart Buy award. The sample that came to our laboratory turned out to be easy to set up, and we overclocked our Intel Core i7-4770K to 4.6 GHz.

The board's UEFI allows you to configure the PCI Express transfer rate for each slot, so the first, second and third generations of PCIe can be tested on the same motherboard. The results of these tests will be published in the second part of this article.

Cooler Master provided the case and power supply for the second test system. The unusual HAF XB case, which also received a Smart Buy award in our Cooler Master HAF XB review, provides the space needed for free access to the components. The case has many vents, so the components inside can get quite noisy if the cooling system is not properly matched. However, this model boasts good air circulation, especially if all optional fans are installed.

The V1000 modular power supply allows you to install three high-performance graphics cards in the case while maintaining a neat appearance of the cabling.

Compare test system # 1 with system # 2

It's amazing how similar these systems are in terms of performance, despite the difference in architecture and platform focus. Here is their comparison in 3DMark Fire Strike.

As you can see, the performance of both systems in the graphics tests is essentially equal, even though the second system is equipped with faster memory (DDR3-2133 versus DDR3-1800, although Nehalem has a triple-channel memory architecture while Haswell is dual-channel). Only in the CPU tests does the Intel Core i7-4770K demonstrate its advantage.

The main advantage of the second system is more headroom for overclocking. The air-cooled Intel Core i7-4770K was able to maintain a stable 4.6 GHz, while the Intel Core i7-950 could not exceed 4 GHz even with water cooling.

It is also worth noting that the first test system runs Windows 7 x64 instead of Windows 8.1. There are three reasons for this:

  • First, the Windows Desktop Window Manager (Windows Aero, dwm.exe) uses a significant amount of video memory. At 2160p, Windows 7 takes over 200 MB and Windows 8.1 about 300 MB, in addition to the 123 MB reserved by Windows. In Windows 8.1 it is impossible to disable this without significant side effects, but in Windows 7 the problem is solved by switching to the basic theme. 400 MB is 20% of the card's 2 GB of total video memory.
  • When the basic (simplified) theme is enabled, memory consumption in Windows 7 stabilizes: it always takes 99 MB at 1080p and 123 MB at 2160p with the GeForce GTX 690. This allows for maximum test repeatability. For comparison, Aero takes about 200 MB, plus or minus 40 MB.
  • With the Nvidia 331.82 WHQL driver there is a bug when Windows Aero is enabled at 2160p. It appears only when Aero is enabled on a display where the 4K image is driven as two tiles, and it manifests itself as reduced GPU load during testing (jumping in the 60-80% range instead of 100%), costing up to 15% of performance. We have already notified Nvidia of our find.

It is impossible to show ghosting and tearing effects on regular screenshots and game videos. Therefore, we used a high-speed video camera to capture the real image on the screen.

The temperature inside the case is measured by the Samsung 840 Pro's built-in temperature sensor. The ambient temperature is 20-22 °C. Background noise for all acoustic tests was 33.7 dB(A) +/- 0.5 dB(A).

Test configuration

Games:

  • The Elder Scrolls V: Skyrim - version 1.9.32.0.8, THG's own 25-second test, HWiNFO64
  • Hitman: Absolution - version 1.0.447.0, built-in benchmark, HWiNFO64
  • Total War: Rome II - patch 7, built-in "Forest" benchmark, HWiNFO64
  • BioShock Infinite - patch 11, version 1.0.1593882, built-in benchmark, HWiNFO64

Synthetic tests:

  • Unigine Valley - version 1.0, ExtremeHD preset, HWiNFO64
  • 3DMark Fire Strike - version 1.1

Many tools can be used to measure video memory consumption. We opted for HWiNFO64, which has received high marks from the enthusiast community. The same results can be obtained with MSI Afterburner, EVGA Precision X or RivaTuner Statistics Server.

Debunking Myths About GPU Performance | Whether or not to enable V-Sync is the question

When evaluating video cards, the first parameter people want to compare is performance. By how much do modern top-end solutions outpace previous products? The worldwide web is replete with test data from thousands of online resources that try to answer this question.

So let's start by looking at performance and factors to consider if you really want to know how fast a particular graphics card is.

Myth: frame rate is an indicator of the level of graphics performance.

Let's start with a factor that our readers probably already know, but many still have the wrong idea about it. Common sense dictates that a frame rate of 30 FPS or higher is considered suitable for the game. Some people think that even lower values ​​are fine for normal gameplay, while others insist that even 30 FPS is too little.

However, in these arguments it is not always obvious that FPS is just a rate behind which some complex matters lie. First, in films the frame rate is constant, while in games it changes and is therefore expressed as an average. Frame-rate fluctuations are a byproduct of the effort the graphics card needs to process the scene, and the frame rate changes as the content on the screen changes.

It's simple: the quality of the gaming experience is more important than a high average frame rate. Frame delivery stability is another extremely important factor. Imagine driving on a highway at a constant 100 km/h versus the same journey at an average of 100 km/h with a lot of accelerating and braking. You will arrive at the same time, but the impressions of the trip will differ greatly.

So let's set aside the question "What level of performance is sufficient?" for now. We'll come back to it after we discuss other important topics.

Introducing V-sync

Myths: It is not necessary to have a frame rate higher than 30 FPS, since the human eye does not see the difference. Values above 60 FPS on a monitor with a 60 Hz refresh rate are pointless, since the picture is already displayed only 60 times per second. V-sync must always be turned on. V-sync must always be turned off.

How are rendered frames actually displayed? Almost all LCD monitors refresh the image on the screen a fixed number of times per second, usually 60, although there are models capable of refreshing the picture at 120 or 144 Hz. This parameter is called the refresh rate and is measured in hertz.

The discrepancy between the variable frame rate of the graphics card and the fixed refresh rate of the monitor can be a problem. When the frame rate is higher than the refresh rate, multiple frames can be displayed in a single scan, resulting in an artifact called screen tearing. In the image above, the colored bars emphasize the individual frames from the video card, which are displayed when ready. This can be very annoying, especially in action-packed first-person shooters.

The image below shows another artifact that often appears on the screen but is difficult to capture. Since this artifact is tied to the operation of the display, it is not visible in screenshots, but it is clearly visible to the naked eye. To catch it, you need a high-speed video camera. The FCAT utility that we used to capture a frame in Battlefield 4 shows tearing, but not ghosting.

Screen tearing is evident in both images from BioShock Infinite. However, on the Sharp 60Hz panel, it is much more pronounced than on the Asus 120Hz monitor, since the VG236HE's screen refresh rate is twice as high. This artifact is the clearest indication that vertical sync, or V-sync, is not enabled in the game.

The second problem in the BioShock image is the ghosting effect, clearly visible at the bottom of the left image. This artifact is caused by the delay in updating the image on the screen. In short, individual pixels do not change color quickly enough, and this produces this kind of afterglow. In the game itself the effect appears much more pronounced than in the image. The Sharp panel on the left has a gray-to-gray response time of 8 ms and appears blurry during fast movement.

Let's get back to tearing. The vertical sync mentioned above is a rather old solution to the problem. It consists in synchronizing the rate at which the video card delivers frames with the refresh rate of the monitor. Since multiple frames no longer arrive within a single refresh, there is no tearing either. But if the frame rate drops below 60 FPS (or below your panel's refresh rate) at maximum graphics settings in your favorite game, the effective frame rate will jump between whole fractions of the refresh rate, as shown below. This is another artifact, known as stuttering.
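
The "jumping between whole fractions" behaviour follows directly from the fact that with V-sync a finished frame has to wait for the next refresh. A small sketch computing the effective frame rate on a 60 Hz panel for a few hypothetical render times:

```c
#include <stdio.h>
#include <math.h>

int main(void) {
    const double refresh_hz = 60.0;
    const double interval_ms = 1000.0 / refresh_hz;   /* ~16.7 ms per refresh */
    double render_ms[] = {10.0, 18.0, 25.0, 40.0};    /* hypothetical render times */

    for (int i = 0; i < 4; i++) {
        /* With V-sync the frame is shown on the next refresh boundary,
           so the effective rate snaps to refresh_hz / ceil(render / interval). */
        double shown_every = ceil(render_ms[i] / interval_ms) * interval_ms;
        printf("render %.0f ms -> effective %.0f FPS\n",
               render_ms[i], 1000.0 / shown_every);   /* 60, 30, 30, 20 FPS */
    }
    return 0;
}
```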

One of the oldest controversies on the Internet is about vertical sync. Someone insists that the technology always needs to be turned on, someone is sure that it always needs to be turned off, and someone chooses the settings depending on the specific game.

So, should V-sync be enabled or not?

Let's say you are in the majority and are using a regular display with a 60Hz refresh rate:

  • If you are playing first-person shooters, and/or you have problems with perceived input lag, and/or your system cannot always maintain at least 60 FPS in the game, and/or you are testing a video card, then vertical sync should be turned off.
  • If none of the above applies to you and you see noticeable screen tearing, then vertical sync should be turned on.
  • If you're unsure, it's best to leave V-sync off.

If you are using a 120/144 Hz gaming display (and if you own one, you probably bought it for its high refresh rate):

  • V-sync should only be turned on in older games where the frame rate stays above 120 FPS and you constantly see screen tearing.

Please note that in some cases the frame-rate drop caused by V-sync does not appear: such applications support triple buffering, although this solution is not very common. Also, in some games (for example, The Elder Scrolls V: Skyrim) V-sync is enabled by default, and forcing it off by modifying configuration files leads to problems with the game engine. In such cases it is best to leave V-sync enabled.

G-Sync, FreeSync and the future

Fortunately, even on the weakest computers input lag will not exceed 200 ms. Therefore, your own reaction time has the greatest influence on your results in the game.

However, as differences in input latency grow, so does their impact on gameplay. Imagine a professional gamer whose reaction time is comparable to that of the best pilots, i.e. 150 ms. A 50 ms input lag means that person will respond 30% more slowly than their opponent (about three frames behind on a display with a 60 Hz refresh rate). At a professional level, this is a very noticeable difference.

For mere mortals (including our editors, who scored 200 ms in a visual reaction test) and for those who prefer Civilization V to Counter-Strike 1.6, things are a little different: you can probably ignore input lag altogether.

Other things being equal, here are some factors that can worsen input latency:

  • Playing on an HDTV (especially with Game Mode off) or on an LCD with video processing that cannot be disabled. An ordered list of input-lag measurements for various displays can be found in the DisplayLag database.
  • Playing on LCDs with IPS panels, which have higher response times (typically 5-7 ms G2G), instead of TN+Film panels (1-2 ms G2G) or CRT displays (the fastest available).
  • Playing on displays with low refresh rates. New gaming displays support 120 Hz or 144 Hz.
  • Playing at low frame rates (30 FPS is one frame every 33 ms; 144 FPS is one frame every 7 ms).
  • Using a USB mouse with a low polling rate. The cycle time at 125 Hz is about 8 ms, which gives an average input lag of about 4 ms. By contrast, the polling rate of a gaming mouse can reach 1000 Hz, with an average input delay of 0.5 ms.
  • Using a low-quality keyboard (typical keyboard input lag is 16 ms, but on cheaper models it can be even higher).
  • Enabling V-sync, especially in combination with triple buffering. (There is a myth that Direct3D does not support triple buffering; in fact, Direct3D does allow multiple back buffers, but few games use this option.) If you are technically inclined, you can read Microsoft's documentation on the subject.
  • Playing with a long pre-render queue. The default queue in Direct3D is three frames, or 48 ms at 60 Hz. This value can be increased up to 20 frames for more "smoothness" or decreased to one frame for better responsiveness, at the cost of larger frame-time fluctuations and, in some cases, an overall loss in FPS. There is no true zero setting: zero simply resets the value to the default of three frames. If you are technically inclined, you can read Microsoft's documentation on the subject.
  • High latency of the Internet connection. While this is not strictly part of input latency, it has a similar noticeable effect.
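
To get a feel for how the contributions listed above stack up, here is a toy addition of typical figures taken from the ranges mentioned; every number is an assumption for illustration, not a measurement.

```c
#include <stdio.h>

int main(void) {
    /* Rough, assumed contributions in milliseconds. */
    double mouse_poll   = 4.0;    /* average lag of a 125 Hz USB mouse  */
    double render_queue = 48.0;   /* default 3 pre-rendered frames at 60 Hz */
    double frame_time   = 16.7;   /* one frame at 60 FPS */
    double panel        = 7.0;    /* slower IPS panel response (G2G) */

    double total = mouse_poll + render_queue + frame_time + panel;
    printf("estimated input lag: %.1f ms\n", total);   /* ~76 ms in this toy example */
    return 0;
}
```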

Factors that do not affect input latency:

  • Using a keyboard with a PS/2 or USB connector (see the additional page in our review "Five Mechanical-Switch Keyboards: Only The Best For Your Hands").
  • Using a wired or wireless network connection (check your router's ping if you don't believe it; it should not exceed 1 ms).
  • Using SLI or CrossFire. The longer render queues required by these technologies are offset by higher throughput.

Conclusion: input lag is only important for fast games and really plays a significant role at a professional level.

It's not just display technology and graphics card that affect input lag. Hardware, hardware settings, display, display settings and application settings all contribute to this indicator.

Debunking Myths About GPU Performance | Video memory myths

Video memory is responsible for resolution and quality settings, but does not increase speed

Manufacturers often use video memory as a marketing tool. Since gamers have been convinced that more is better, we often see entry-level graphics cards with significantly more RAM than they actually need. But enthusiasts know that the most important thing is balance across all PC components.

Broadly speaking, video memory belongs to the discrete GPU and the tasks it handles, and is separate from the system memory installed in the motherboard. Graphics cards use several memory technologies, the most popular of which are DDR3 and GDDR5 SDRAM.

Myth: graphics cards with 2GB of memory are faster than models with 1GB.

Not surprisingly, manufacturers equip inexpensive GPUs with more memory (and take higher margins), as many people believe that more memory adds speed. Let's take a look at this issue. The amount of video memory on a video card does not affect its performance unless you choose game settings that use up all of the available memory.

But why do you need additional video memory then? To answer this question, you need to find out what it is used for. The list is simplified but useful:

  • Drawing textures.
  • Frame buffer support.
  • Depth buffer support ("Z Buffer").
  • Support for other resources that are required to render the frame (shadow maps, etc.).

Of course, the size of the textures loaded into memory depends on the game and the detail settings. For example, Skyrim's high-resolution texture pack includes 3 GB of textures. Most games dynamically load and unload textures as needed, and not all textures need to be in video memory at once; but the textures to be rendered in a given scene must be in memory.

A frame buffer is used to store the image as it is rendered, before or while it is sent to the screen. The required amount of video memory thus depends on the output resolution (a 1920x1080 image at 32 bits per pixel "weighs" about 8.3 MB, while a 4K image at 3840x2160 and 32 bits per pixel already weighs about 33.2 MB) and on the number of buffers (at least two, rarely three or more).

Special anti-aliasing modes (FSAA, MSAA, CSAA, CFAA, but not FXAA or MLAA) effectively increase the number of pixels that need to be rendered and proportionally increase the total amount of video memory required. Render-based anti-aliasing has a particularly large effect on memory consumption, which grows with the sample count (2x, 4x, 8x, etc.). Additional buffers also consume video memory.
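
As a minimal sketch of the arithmetic in the last two paragraphs, here is a rough frame buffer estimate. It assumes 32 bits per pixel, sizes in decimal megabytes, and memory scaling linearly with the MSAA sample count; real allocations also include depth buffers, driver overhead and other resources, so treat these as lower bounds:

```python
# Back-of-the-envelope frame buffer size estimate.
# Assumptions: 4 bytes (32 bits) per pixel, decimal megabytes, and MSAA
# multiplying buffer memory by its sample count. Depth buffers, driver
# overhead and other resources are not counted here.

def framebuffer_mb(width: int, height: int, bytes_per_pixel: int = 4,
                   buffers: int = 1, msaa_samples: int = 1) -> float:
    return width * height * bytes_per_pixel * buffers * msaa_samples / 1_000_000

print(framebuffer_mb(1920, 1080))                             # ~8.3 MB, one 1080p buffer
print(framebuffer_mb(3840, 2160))                             # ~33.2 MB, one 4K buffer
print(framebuffer_mb(1920, 1080, buffers=2))                  # ~16.6 MB, double-buffered 1080p
print(framebuffer_mb(3840, 2160, buffers=2, msaa_samples=4))  # ~265 MB, double-buffered 4K with 4xMSAA
```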

Thus, a video card with a large amount of graphics memory allows:

  1. Play at higher resolutions.
  2. Play at higher texture quality settings.
  3. Play at higher anti-aliasing levels.

Now let's bust the next myth.

Myth: You need 1, 2, 3, 4, or 6GB of VRAM to play at (insert your display's native resolution).

The most important factor to consider when choosing the amount of video memory is the resolution at which you will be playing. Naturally, higher resolutions require more memory. The second important factor is the use of the anti-aliasing technologies mentioned above. Other graphics settings matter far less for the amount of memory required.

Before we get to the actual measurements, a warning. There is a special class of high-end graphics cards with two GPUs (AMD Radeon HD 6990 and Radeon HD 7990, as well as Nvidia GeForce GTX 590 and GeForce GTX 690) that advertise their total amount of memory. But because a two-GPU configuration essentially duplicates the data, the effective amount of memory is cut in half. For example, a GeForce GTX 690 with 4 GB behaves like two 2 GB cards in SLI. Likewise, when you add a second card to a CrossFire or SLI configuration, the video memory of the array does not double; each card still works only with its own memory.

We ran these tests on Windows 7 x64 with the Aero theme disabled. If you use Aero (or Windows 8/8.1, where desktop composition is always on), add roughly 300 MB to these figures.

As the latest Steam hardware survey shows, most gamers (about half) use video cards with 1 GB of video memory, about 20% have 2 GB models, and a small share of users (less than 2%) run graphics adapters with 3 GB of video memory or more.

We tested Skyrim with the official high-resolution texture pack. As you can see, 1GB of memory is barely enough to play at 1080p without anti-aliasing, or with MLAA/FXAA only. 2GB lets you run the game at 1920x1080 with maximum detail, and at 2160p with reduced anti-aliasing. To enable maximum settings with 8xMSAA, even 2 GB is not enough.

The Bethesda Creation Engine is unique in this benchmark suite: it is not always limited by GPU speed and is often limited by the platform's capabilities. But in these tests, we saw for the first time how Skyrim at maximum settings runs into the limit of the graphics adapter's video memory.

It is also worth noting that enabling FXAA does not consume additional memory, so it makes a good compromise when using MSAA is not possible.

Debunking Myths About GPU Performance | More video memory measurements

The Glacier 2 graphics engine from Io Interactive, on which Hitman: Absolution is based, is very memory-hungry and in our tests is second only to the Warscape engine from Creative Assembly (Total War: Rome II) at maximum detail settings.

In Hitman: Absolution, a video card with 1GB of video memory is not enough to play at ultra-quality settings at 1080p. A 2GB model lets you enable 4xAA at 1080p or play without MSAA at 2160p.

Enabling 8xMSAA at 1080p requires 3 GB of video memory, and 8xMSAA at 2160p can only be pulled off by a card no weaker than a GeForce GTX Titan with its 6 GB of memory.

Here, FXAA activation does not use additional memory either.

Note: the new Unigine Valley 1.0 test does not natively support MLAA/FXAA, so the memory consumption results with MLAA/FXAA were obtained by forcing them through CCC/NVCP.

The data shows that the Valley benchmark runs comfortably on a card with 2GB of memory at 1080p (at least in terms of video memory). You can even use a 1GB card with 4xMSAA active, although not every game would manage that. At 2160p, the benchmark still fits on a 2GB card without anti-aliasing or post-processing effects; the 2GB threshold is reached once 4xMSAA is activated.

Ultra HD with 8xMSAA requires up to 3GB of video memory. This means that at such settings the benchmark can only be completed on a GeForce GTX Titan or on one of AMD's 4GB Hawaii-based models.

Total War: Rome II uses Creative Assembly's updated Warscape engine. It does not currently support SLI (though it does support CrossFire), and it does not support any form of MSAA. Of all the anti-aliasing options, only AMD's MLAA can be used, a post-processing technique similar to SMAA and FXAA.

An interesting feature of this engine is its ability to lower image quality based on the available video memory, so the game can maintain an acceptable frame rate with minimal user involvement. But the lack of SLI support kills the game on Nvidia graphics cards at 3840x2160. At least for now, this game is best played on an AMD card if you choose 4K.

Without MLAA, the game's built-in "forest" benchmark at the Extreme preset uses 1848 MB of video memory. The GeForce GTX 690's 2GB limit is exceeded when MLAA is activated at 2160p. At 1920x1080, memory usage stays in the 1400 MB range.

Note that an AMD technology (MLAA) runs on Nvidia hardware here. Since FXAA and MLAA are post-processing techniques, there is technically no reason why they could not work on another manufacturer's hardware. Either Creative Assembly is quietly switching to FXAA (despite what the config file says), or AMD's marketers did not take this fact into account.

To play Total War: Rome II at 1080p on Extreme graphics settings, you need a 2GB graphics card, and to play smoothly at 2160p you need a CrossFire array of cards with more than 3GB each. If your card only has 1GB of VRAM, you can still play the new Total War, but only at 1080p and at lower quality settings.

What happens when the video memory is completely full? In short, data spills over into system memory across the PCI Express bus. In practice, this means performance drops sharply, especially when textures have to be fetched back. You will not want to face this: constant stuttering makes the game almost unplayable.
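
To get a feel for why this spillover hurts so much, here is a rough, hypothetical comparison; the bandwidth figures are ballpark values for a mid-range GDDR5 card and a PCI Express 3.0 x16 link, not measurements from this article:

```python
# Rough comparison of how long it takes to move 1 GB of texture data.
# Ballpark bandwidth assumptions (not measured in this article):
#   - local GDDR5 on a ~256-bit bus: ~190 GB/s
#   - PCI Express 3.0 x16 link:      ~16 GB/s

GDDR5_GBPS = 190.0      # assumed local video memory bandwidth
PCIE3_X16_GBPS = 16.0   # assumed bus bandwidth to system memory

def transfer_ms(gigabytes: float, bandwidth_gbps: float) -> float:
    return gigabytes / bandwidth_gbps * 1000

print(f"1 GB from local VRAM:   ~{transfer_ms(1, GDDR5_GBPS):.1f} ms")
print(f"1 GB over PCIe 3.0 x16: ~{transfer_ms(1, PCIE3_X16_GBPS):.1f} ms")

# ~5 ms versus ~63 ms: several frames' worth of time at 60 Hz, which is why
# spilling textures into system memory causes such visible stutter.
```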

So how much video memory do you need?

If you have a video card with 1 GB of video memory and a 1080p monitor, you don't need to think about an upgrade at the moment. However, a 2GB card will allow higher anti-aliasing settings in most games, so consider it the minimum starting point if you want to enjoy modern games at 1920x1080.

If you plan on using 1440p, 1600p, 2160p, or multi-monitor configurations, it is better to consider models with more than 2GB of memory, especially if you want to enable MSAA. Better still, consider a 3 GB model (or several cards with more than 3 GB of memory each in SLI/CrossFire).

Of course, as we said, it's important to strike a balance. A weak GPU backed by 4 GB of GDDR5 (instead of 2 GB) is unlikely to let you play at high resolutions just because it has more memory. That is why our video card reviews test several games at several resolutions and several detail settings: before making any recommendations, we need to identify all the possible weak points.

Debunking Myths About GPU Performance | Thermal management in modern video cards

Modern AMD and Nvidia graphics cards use protective mechanisms that raise fan speed and, ultimately, lower clock speeds and voltages if the chip overheats. These mechanisms are not there to keep your system stable (especially when overclocked); they are designed to protect the hardware from damage. That is why cards pushed to overly aggressive settings often fail to run properly and need their settings reset.

There is a lot of debate about the maximum safe temperature for a GPU. Higher temperature targets, as long as the hardware tolerates them, are actually preferable, because they allow more heat to be dissipated (the greater the difference with the ambient temperature, the more heat can be transferred). From a purely technical standpoint, AMD's frustration with the reaction to the Hawaii GPU's thermal ceiling is therefore understandable. That said, there are no long-term studies yet on the viability of such temperature targets, so based on personal experience with device stability we would rather rely on the manufacturer's specifications.

On the other hand, it is well known that silicon transistors perform better at lower temperatures. This is the main reason overclockers use liquid nitrogen coolers to maximize chip cooling. In general, lower temperatures help provide more headroom for overclocking.

The most power-hungry graphics cards in the world are the Radeon HD 7990 (375 W TDP) and the GeForce GTX 690 (300 W TDP); both models carry two GPUs. Single-GPU cards consume much less power, although the Radeon R9 290 series approaches the 300-watt mark. Either way, that is a lot of heat to dissipate.

These values are listed in the cooling-system specifications, so we won't dwell on them today. We are more interested in what happens when a modern GPU is put under load; a simplified sketch of this control loop follows the list below.

  1. You are running an intense task like a 3D game or Bitcoin mining.
  2. The clock frequency of the video card is increased to the nominal or boost values. The card starts to heat up due to increased current consumption.
  3. The fan speed gradually increases up to the point indicated in the firmware. As a rule, growth stops when the noise level reaches 50 dB (A).
  4. If the programmed fan speed is not enough to keep the GPU temperature below a certain level, the clock speed begins to decrease until the temperature drops to the specified threshold.
  5. The card should then run stably within a relatively narrow frequency and temperature range until the load is removed.
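
Here is the simplified sketch of this control loop promised above. All thresholds are invented for illustration; real firmware also manages voltage and uses vendor-specific targets and hysteresis:

```python
# Toy model of the fan/throttle behaviour described in the list above.
# All threshold values are invented for illustration only.

TEMP_TARGET_C = 80     # assumed throttle temperature
FAN_MAX_PCT = 60       # assumed fan cap (where noise would hit the ~50 dB(A) limit from step 3)
CLOCK_MIN_MHZ = 700    # assumed lowest clock the card will fall back to

def regulate(temp_c: float, fan_pct: float, clock_mhz: float):
    """One control iteration: raise the fan first, drop the clock only if that is not enough."""
    if temp_c > TEMP_TARGET_C:
        if fan_pct < FAN_MAX_PCT:
            fan_pct = min(FAN_MAX_PCT, fan_pct + 5)          # step 3: spin the fan up
        else:
            clock_mhz = max(CLOCK_MIN_MHZ, clock_mhz - 13)   # step 4: throttle the clock
    return fan_pct, clock_mhz

# Example: the card is above target with the fan already capped, so the clock drops.
print(regulate(temp_c=85, fan_pct=60, clock_mhz=1000))   # (60, 987)
```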

It is not hard to see that the moment at which thermal throttling kicks in depends on many factors, including the type of load, airflow inside the case, the ambient temperature, and even the atmospheric pressure. This is why different video cards start throttling at different moments. The throttling trigger point can be used to establish a reference performance level, and if we fix the fan speed (and, with it, the noise level) manually, we can create a noise-normalized measurement point. What is the point of all this? Let's find out.

Debunking Myths About GPU Performance | Testing performance at a constant noise level of 40 dB (A)

Why 40 dB (A)?

First, notice the A in parentheses. It stands for "A-weighted": sound pressure levels are corrected along a curve that approximates the sensitivity of the human ear to noise at different frequencies.

Forty decibels is considered the average background noise level in a normally quiet room. In recording studios this value is around 30 dB, while 50 dB corresponds to a quiet street or a conversation between two people in a room. Zero is the threshold of human hearing, although it is very rare to hear sounds in the 0-5 dB range if you are over five years old. The decibel scale is logarithmic, not linear: as a rule of thumb, every 10 dB increase is perceived as roughly a doubling of loudness, so 50 dB sounds about twice as loud as 40 dB, which in turn sounds about twice as loud as 30 dB.
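
As a small illustration of how the logarithmic scale translates into ratios, here is a sketch using two common rules of thumb: sound pressure grows tenfold every 20 dB, and perceived loudness roughly doubles every 10 dB (approximations, not measurements):

```python
# How differences on the decibel scale translate into ratios.
# Rules of thumb used here (approximations, not measurements):
#   - sound pressure ratio     = 10 ** (delta_dB / 20)
#   - perceived loudness ratio ~  2 ** (delta_dB / 10)

def pressure_ratio(delta_db: float) -> float:
    return 10 ** (delta_db / 20)

def loudness_ratio(delta_db: float) -> float:
    return 2 ** (delta_db / 10)

for delta in (10, 20):
    print(f"+{delta} dB: ~{pressure_ratio(delta):.1f}x sound pressure, "
          f"~{loudness_ratio(delta):.0f}x perceived loudness")

# +10 dB: ~3.2x sound pressure, ~2x perceived loudness  (50 dB vs. 40 dB)
# +20 dB: ~10.0x sound pressure, ~4x perceived loudness (50 dB vs. 30 dB)
```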

A PC operating at 40 dB (A) should blend into the background noise of a house or apartment; as a rule, it should not be audible at all.

Curious fact: in the quietest room in the world, the background noise level is -9 dB. Spending even less than an hour there in the dark can trigger hallucinations due to sensory deprivation (the restriction of sensory input).

How do you keep the noise level at a constant 40 dB (A)?

Several factors shape the acoustic profile of a video card, one of which is fan speed. Not all fans produce the same amount of noise at the same rotational speed, but any given fan should produce a consistent noise level at a constant rotational speed.

So, by measuring the noise level directly with an SPL meter at a distance of 90 cm, we manually adjusted the fan profile so that the sound pressure did not exceed 40 dB (A).

Video card | Fan setting, % | Fan speed, rpm | dB (A) ±0.5
Radeon R9 290X | 41 | 2160 | 40
GeForce GTX 690 | 61 | 2160 | 40
GeForce GTX Titan | 65 | 2780 | 40

The Radeon R9 290X and the GeForce GTX 690 both reach 40 dB (A) at the same 2160 rpm, although at very different fan settings (41% versus 61%). The GeForce GTX Titan, on the other hand, uses a different acoustic profile, reaching 40 dB (A) at a higher rotational speed of 2780 rpm; at the same time, its fan setting (65%) is close to that of the GeForce GTX 690 (61%).

Fan profiles differ noticeably between cards and presets. Overclocked cards can be very noisy under load: we measured up to 47 dB (A). On a typical workload, the quietest card was the GeForce GTX Titan (38.3 dB (A)) and the loudest was the GeForce GTX 690 (42.5 dB (A)).

Debunking Myths About GPU Performance | Can overclocking hurt performance at 40 dB (A)?

Myth: Overclocking always gives you a performance boost.

By fixing a specific fan profile and letting the cards drop their clock rates to a stable level, we get interesting and repeatable benchmark results.


Video card | Env. temperature, °C | Fan setting, % | Fan speed, rpm | dB (A) ±0.5 | GPU1 clock, MHz | GPU2 clock, MHz | Memory clock, MHz | FPS
Radeon R9 290X | 30 | 41 | 2160 | 40 | 870-890 | n/a | 1250 | 55.5
Radeon R9 290X, overclocked | 28 | 41 | 2160 | 40 | 831-895 | n/a | 1375 | 55.5
GeForce GTX 690 | 42 | 61 | 2160 | 40 | 967-1006 | 1032 | 1503 | 73.1
GeForce GTX 690, overclocked | 43 | 61 | 2160 | 40 | 575-1150 | 1124 | 1801 | 71.6
GeForce GTX Titan | 30 | 65 | 2780 | 40 | 915-941 | n/a | 1503 | 62

The Radeon R9 290X falls behind in the more standard benchmarks.

Also curious is the sharper rise in ambient case temperature with the GeForce GTX 690 (12-14 °C). This is down to its axial fan, located in the center of the card, which blows part of the hot air back into the enclosure and limits the thermal headroom. We would expect a similar picture in most conventional cases. So you have to decide for yourself whether to accept higher noise for better performance (or vice versa), based on your own preferences.

Now that we have taken a detailed look at vertical sync, input lag, video memory, and testing under a fixed acoustic profile, we can get back to work on the second part of this article, which will cover PCIe transfer rates, screen sizes, a closer look at vendor-exclusive technologies, and a price analysis.


