Shader processors in a video card. Choosing the right video card

Modern graphics processors contain many functional units. Their number and characteristics determine the final rendering speed, which in turn affects how comfortable a game is to play. By comparing the number of these units in different video chips, you can roughly estimate how fast a particular GPU is. Video chips have many characteristics; in this section we look at the most important ones.

Clock frequency of the video chip

The GPU operating frequency is measured in megahertz, i.e., millions of cycles per second. This characteristic directly affects the performance of the video chip: the higher it is, the more work the GPU can do per unit of time, for example, the more vertices and pixels it can process. A real-life example: the video chip installed on the Radeon HD 6670 board runs at 840 MHz, while exactly the same chip in the Radeon HD 6570 runs at 650 MHz. All the main performance characteristics differ accordingly. However, the operating frequency is not the only thing that determines performance; it is also strongly influenced by the graphics architecture itself: the design and number of execution units, their characteristics, and so on.

In some cases, the clock frequency of individual GPU units differs from the frequency of the rest of the chip. Different parts of the GPU run at different frequencies to increase efficiency, since some units can run at higher frequencies while others cannot. Most NVIDIA GeForce video cards are equipped with such GPUs. A recent example is the video chip in the GTX 580, most of which runs at 772 MHz while the universal computing units of the chip run at double that frequency, 1544 MHz.

Fill rate

The fill rate shows how fast the video chip can draw pixels. There are two types of fill rate: pixel fill rate and texture fill rate (texel rate). The pixel fill rate shows the speed at which pixels are drawn on the screen and depends on the operating frequency and the number of ROP units (raster operation and blending units). The texture fill rate is the speed at which texture data is fetched, which depends on the operating frequency and the number of texture units.

For example, the peak pixel fill rate of the GeForce GTX 560 Ti is 822 (chip frequency in MHz) × 32 (number of ROP units) = 26304 megapixels per second, and the texture fill rate is 822 × 64 (number of texture units) = 52608 megatexels per second. Simply put, the higher the first number, the faster the video card can draw finished pixels, and the higher the second number, the faster it can fetch texture data.
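The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch of the peak (theoretical) figures; the function name `fill_rate` is made up for illustration, and real cards rarely sustain these numbers.

```python
def fill_rate(clock_mhz: int, units: int) -> int:
    """Peak fill rate in millions of pixels (or texels) per second.

    clock_mhz: chip frequency in MHz
    units:     number of ROP units (pixel rate) or texture units (texel rate)
    """
    return clock_mhz * units


# GeForce GTX 560 Ti figures quoted in the text
pixel_rate = fill_rate(822, 32)  # 32 ROP units
texel_rate = fill_rate(822, 64)  # 64 texture units

print(pixel_rate, "megapixels/s")
print(texel_rate, "megatexels/s")
```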

Although the importance of "pure" fill rate has recently decreased noticeably, giving way to computation speed, these parameters are still very important, especially for games with simple geometry and relatively simple pixel and vertex calculations. Both parameters remain important for modern games, but they have to be balanced. That is why the number of ROP units in current video chips is usually smaller than the number of texture units.

Number of computing (shader) units or processors

These units are perhaps the main parts of the video chip. They execute special programs known as shaders. Earlier, pixel shaders were executed by pixel shader units and vertex shaders by vertex units, but at some point graphics architectures were unified, and these universal computing units began to handle all kinds of calculations: vertex, pixel, geometry and even general-purpose computation.

The unified architecture was first used in the video chip of the Microsoft Xbox 360 game console; this graphics processor was developed by ATI (later acquired by AMD). In video chips for personal computers, unified shader units first appeared in the NVIDIA GeForce 8800. Since then, all new video chips have been based on a unified architecture, which has universal code for different shader programs (vertex, pixel, geometry, etc.), and the unified processors can execute any of these programs.

By the number of computing units and their frequency, you can compare the mathematical performance of different video cards. Most games are now limited by the performance of pixel shaders, so the number of these units is very important. For example, if one video card model is based on a GPU with 384 computing processors, and another model from the same line has a GPU with 192 computing units, then at the same frequency the second one will process any type of shader twice as slowly, and in general will be correspondingly less productive.

However, it is impossible to draw unambiguous conclusions about performance from the number of computing units alone; you also have to take into account the clock frequency and the different architecture of units from different generations and chip manufacturers. These numbers alone can only be used to compare chips within one line from one manufacturer: AMD or NVIDIA. In other cases, you need to look at performance tests in the games or applications that interest you.

Texture Units (TMU)

These GPU units work together with the computing processors; they fetch and filter texture data and other data needed to build the scene and to perform general-purpose computations. The number of texture units in a video chip determines texture performance, that is, the speed at which texels are fetched from textures.

Although more emphasis has recently been placed on mathematical calculations, and some textures are being replaced with procedural ones, the load on TMUs is still high, since in addition to the main textures, fetches must also be made from normal maps and displacement maps, as well as from off-screen render target buffers.

Since many games are limited by, among other things, the performance of the texture units, we can say that the number of TMUs and the correspondingly high texture performance are also among the most important parameters of video chips. This parameter especially affects rendering speed with anisotropic filtering, which requires additional texture fetches, as well as with complex soft-shadow algorithms and newer algorithms such as Screen Space Ambient Occlusion.

Raster operation units (ROP)

Raster operation units write the pixels computed by the video card into buffers and blend them. As we have already noted, the performance of ROP units affects the fill rate, which has been one of the main characteristics of video cards of all times. And although its importance has also decreased somewhat recently, there are still cases where application performance depends on the speed and number of ROP units. Most often this is explained by the active use of post-processing filters and by anti-aliasing enabled at high game settings.

Once again, note that modern video chips cannot be assessed solely by the number of different units and their frequency. Each GPU series uses a new architecture in which the execution units differ greatly from the old ones, and the ratio between the numbers of different units may differ as well. Thus, AMD's ROP units in some solutions can do more work per clock cycle than the units in NVIDIA solutions, and vice versa. The same applies to the capabilities of TMU texture units: they differ between GPU generations and between manufacturers, and this must be taken into account when comparing.

Geometry units

Until recently, the number of geometry processing units was not particularly important. One unit per GPU was enough for most tasks, since geometry in games was fairly simple and performance was mainly limited by mathematical calculations. The importance of parallel geometry processing and the number of corresponding units increased dramatically with the introduction of geometry tessellation support in DirectX 11. NVIDIA was the first to parallelize the processing of geometric data, when several such units appeared in its GF1xx family chips. AMD later released a similar solution (only in the top solutions of the Radeon HD 6900 line based on Cayman chips).

Within the framework of this material, we will not go into details; they can be found in the basic materials on our website dedicated to DirectX 11 graphics processors. What matters here is that the number of geometry processing units has a significant impact on overall performance in the newest games that use tessellation, such as Metro 2033, HAWX 2 and Crysis 2 (with the latest patches). When choosing a modern gaming video card, it is important to pay attention to geometry performance.

Video memory capacity

On-board memory is used by video chips to store the data they need: textures, vertices, buffer data, etc. It would seem that the more of it, the better. But it is not that simple: judging the power of a video card by the amount of video memory is the most common mistake! Inexperienced users most often overestimate the importance of video memory and still use it to compare different video card models. This is understandable: this parameter is listed among the first in the specifications of ready-made systems, and it is printed in large type on video card boxes. So it seems to an uninformed buyer that if there is twice as much memory, the solution must be twice as fast. The reality differs from this myth: memory comes in different types with different characteristics, and performance grows only up to a certain amount of memory and then simply stops growing.

In every game, at given settings and in given game scenes, there is a certain amount of video memory that is enough for all the data. Even if 4 GB of video memory is installed, there is nothing in it that would speed up rendering; speed will be limited by the execution units discussed above, and there will simply be enough memory. That is why in many cases a video card with 1.5 GB of video memory runs at the same speed as a card with 3 GB (other things being equal).

Situations where more memory leads to a visible increase in performance do exist: these are very demanding games, especially at very high resolutions and maximum quality settings. But such cases are not common. The amount of memory should be taken into account, without forgetting that above a certain amount performance simply will not increase any further. Memory chips have more important parameters, such as memory bus width and operating frequency. This topic is so extensive that choosing the amount of video memory is discussed in more detail in the final part of our material.

Memory bus width

The memory bus width is the most important characteristic affecting memory bandwidth. A wider bus allows more information to be transferred between video memory and the GPU per unit of time, which has a positive effect on performance in most cases. Theoretically, a 256-bit bus can transfer twice as much data per clock cycle as a 128-bit bus. In practice, the difference in rendering speed does not reach a factor of two, but it comes very close in many cases that are limited by video memory bandwidth.

Modern gaming video cards use different bus widths, from 64 to 384 bits (previously there were chips with a 512-bit bus), depending on the price range and the release date of the specific GPU model. The cheapest low-end video cards most often use 64 and, less often, 128 bits; mid-range cards use 128 to 256 bits; and video cards from the upper price range use buses 256 to 384 bits wide. The bus width can no longer grow, purely because of physical limits: the GPU die is too small to route a bus wider than 512 bits, and it is too expensive. Therefore, memory bandwidth is now being increased by using new types of memory (see below).

Video memory frequency

Another parameter that affects memory bandwidth is the memory clock frequency. Increased memory bandwidth often has a direct impact on the performance of the video card in 3D applications. The memory bus frequency on current video cards ranges from 533 (1066, when doubled) MHz to 1375 (5500, when quadrupled) MHz, so it can differ more than fivefold! And since memory bandwidth depends on both the memory frequency and the bus width, memory with a 256-bit bus running at 800 (3200) MHz will have greater bandwidth than memory running at 1000 (4000) MHz with a 128-bit bus.
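The comparison above can be checked numerically. The sketch below assumes the usual peak-bandwidth formula (bus width in bytes multiplied by the effective transfer rate) and counts 1 GB as 10^9 bytes; the function name `memory_bandwidth_gbs` is hypothetical.

```python
def memory_bandwidth_gbs(bus_width_bits: int, effective_mhz: float) -> float:
    """Peak memory bandwidth in GB/s (assuming 1 GB = 10**9 bytes).

    bus_width_bits: memory bus width in bits
    effective_mhz:  effective (already doubled/quadrupled) frequency in MHz
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * effective_mhz * 1e6 / 1e9


# The two configurations compared in the text
wide_bus = memory_bandwidth_gbs(256, 3200)    # 256-bit bus, 800 (3200) MHz
narrow_bus = memory_bandwidth_gbs(128, 4000)  # 128-bit bus, 1000 (4000) MHz

print(f"256-bit @ 3200 MHz: {wide_bus:.1f} GB/s")
print(f"128-bit @ 4000 MHz: {narrow_bus:.1f} GB/s")
```

The wider bus wins despite the lower frequency, because doubling the width outweighs the 25% frequency advantage of the narrower configuration.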

Particular attention should be paid to memory bus width, type and frequency when buying relatively inexpensive video cards, many of which come with only 128-bit or even 64-bit interfaces, which negatively affects their performance. We do not recommend buying a video card with a 64-bit video memory bus for a gaming PC at all. It is better to choose at least a mid-range card with a 128-bit or 192-bit bus.

Memory types

Several different types of memory are installed on current video cards. The old SDR memory with a single transfer rate is no longer found anywhere, but the current DDR and GDDR types differ significantly in their characteristics. Different types of DDR and GDDR can transfer two or four times more data at the same clock frequency per unit of time, so the operating frequency is often quoted doubled or quadrupled, that is, multiplied by 2 or 4. So if a frequency of 1400 MHz is specified for DDR memory, this memory runs at a physical frequency of 700 MHz, and the figure quoted is the so-called "effective" frequency: the one at which SDR memory would have to run to provide the same bandwidth. The same applies to GDDR5, except that the frequency is quadrupled.
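The relationship between physical and "effective" frequency can be sketched as a simple lookup. The multipliers follow the text (single rate for SDR, double for DDR, quadruple for GDDR5); the table and function names are made up for illustration.

```python
# Transfers per clock cycle, as described in the text
DATA_RATE_MULTIPLIER = {
    "SDR": 1,    # single data rate
    "DDR": 2,    # double data rate
    "GDDR5": 4,  # quadrupled effective frequency
}


def effective_mhz(physical_mhz: int, memory_type: str) -> int:
    """Effective frequency: the rate SDR memory would need for equal bandwidth."""
    return physical_mhz * DATA_RATE_MULTIPLIER[memory_type]


print(effective_mhz(700, "DDR"))     # DDR example from the text
print(effective_mhz(1375, "GDDR5"))  # fastest GDDR5 figure quoted earlier
```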

The main advantage of the new memory types is the ability to run at higher clock frequencies and, accordingly, greater bandwidth compared with previous technologies. This is achieved at the cost of increased latency, which, however, is not so important for video cards. The first board to use DDR2 memory was the NVIDIA GeForce FX 5800 Ultra. Since then, graphics memory technology has advanced significantly; the GDDR3 standard was developed, which is close to the DDR2 specifications with some changes made specifically for video cards.

GDDR3 is memory designed specifically for video cards, using the same technologies as DDR2 but with improved power consumption and thermal characteristics, which made it possible to create chips that run at higher clock frequencies. Although the standard was developed at ATI, the first video card to use it was the second modification of the NVIDIA GeForce FX 5700 Ultra, and the next one was the GeForce 6800 Ultra.

GDDR4 is a further development of "graphics" memory, running almost twice as fast as GDDR3. The main advantages of GDDR4 over GDDR3 that matter to users are, once again, higher operating frequencies and reduced power consumption. Technically, GDDR4 memory is not very different from GDDR3; it is a further development of the same ideas. The first video cards with GDDR4 chips on board were the ATI Radeon X1950 XTX, and NVIDIA never released products based on this type of memory. The advantage of the new memory chips over GDDR3 is that the power consumption of the modules can be about a third lower. This is achieved through a lower voltage rating for GDDR4.

However, GDDR4 did not become widespread even in AMD solutions. Starting with the RV7x0 family of GPUs, video card memory controllers support a new type of memory, GDDR5, which runs at an effective quadrupled frequency of up to 5.5 GHz and higher (frequencies up to 7 GHz are theoretically possible), giving bandwidth of up to 176 GB/s with a 256-bit interface. While increasing memory bandwidth with GDDR3/GDDR4 required a 512-bit bus, switching to GDDR5 made it possible to double performance with smaller die sizes and lower power consumption.

The most current types of video memory are GDDR3 and GDDR5. They differ from DDR in some details and also work with double or quadruple data transfer. These memory types use special technologies that allow the operating frequency to be increased. Thus, GDDR2 memory usually runs at higher frequencies than DDR, GDDR3 at even higher frequencies, and GDDR5 provides the highest frequency and bandwidth at the moment. However, inexpensive models are still fitted with "non-graphics" DDR3 memory running at a significantly lower frequency, so you need to choose a video card more carefully.

Basic video card components:

  • outputs;
  • interfaces;
  • cooling system;
  • graphics processor;
  • video memory.

Graphics technologies:

  • glossary;
  • GPU architecture: functions
    vertex / pixel units, shaders, fill rate, texture / raster units, pipelines;
  • GPU architecture: technology
    process technology, graphics processor frequency, local video memory (capacity, bus, type, frequency), solutions with multiple video cards;
  • visual functions
    DirectX, high dynamic range (HDR), full-screen anti-aliasing, texture filtering, high-resolution textures.

Glossary of basic graphic terms

Refresh Rate

Just like in a movie theater or on TV, your computer simulates motion on the monitor by displaying a sequence of frames. The refresh rate of the monitor indicates how many times per second the picture on the screen is updated. For example, a frequency of 75 Hz corresponds to 75 updates per second.

If the computer renders frames faster than the monitor can display them, problems may appear in games. For example, if the computer renders 100 frames per second and the monitor refresh rate is 75 Hz, then because of the mismatch the monitor may display only part of a frame during its refresh period. The result is visual artifacts.

As a solution, you can enable V-Sync (vertical synchronization). It limits the number of frames output by the computer to the monitor refresh rate, which prevents the artifacts. If you enable V-Sync, the number of frames rendered in the game will never exceed the refresh rate. So at 75 Hz the computer will output no more than 75 frames per second.
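The effect of V-Sync on the output frame rate can be expressed as a simple cap. This is a simplified model that ignores buffering details; `displayed_fps` is a hypothetical helper name.

```python
def displayed_fps(rendered_fps: float, refresh_hz: float, vsync: bool) -> float:
    """Frames per second sent to the monitor.

    With V-Sync on, output is capped at the monitor refresh rate;
    with V-Sync off, the computer outputs frames as fast as it renders them.
    """
    return min(rendered_fps, refresh_hz) if vsync else rendered_fps


# Example from the text: 100 rendered frames per second on a 75 Hz monitor
print(displayed_fps(100, 75, vsync=True))
print(displayed_fps(100, 75, vsync=False))
```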

Pixel

The word "pixel" stands for "picture element". It is a tiny dot on the display that can light up in a particular color (in most cases, the shade is produced by combining three basic colors: red, green and blue). If the screen resolution is 1024 × 768, you can see a matrix 1024 pixels wide and 768 pixels high. Together, the pixels form the image. The picture on the screen is updated 60 to 120 times per second, depending on the type of display and the data output by the video card. CRT monitors refresh the display line by line, while flat-panel LCD monitors can refresh each pixel individually.
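To get a feel for the numbers, the size of one frame at this resolution can be estimated as follows. The figure of one byte per color channel is an assumption made for the sketch, not something stated in the text.

```python
WIDTH, HEIGHT = 1024, 768  # resolution from the text
CHANNELS = 3               # red, green, blue
BYTES_PER_CHANNEL = 1      # assumption: 8 bits per color channel

pixel_count = WIDTH * HEIGHT
frame_bytes = pixel_count * CHANNELS * BYTES_PER_CHANNEL

print(pixel_count, "pixels per frame")
print(frame_bytes, "bytes per frame")
```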

Vertex

All objects in a 3D scene are made up of vertices. A vertex is a point in three-dimensional space with coordinates X, Y and Z. Several vertices can be grouped into a polygon: most often a triangle, but more complex shapes are possible. A texture is then applied to the polygon, which makes the object look realistic. The 3D cube shown in the illustration consists of eight vertices. More complex objects have curved surfaces, which actually consist of a very large number of vertices.
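A cube like the one described can be represented as a list of eight vertices plus triangles that index into it. The sketch below builds a unit cube; the layout and the helper `cube_triangles` are illustrative choices, and the winding order of the triangles is not made consistent.

```python
from itertools import product

# Eight corners of a unit cube: every combination of 0/1 for X, Y and Z
vertices = [tuple(map(float, p)) for p in product((0, 1), repeat=3)]


def cube_triangles():
    """Split each of the six faces into two triangles (vertex indices)."""
    triangles = []
    for axis in range(3):        # face is perpendicular to this axis
        for value in (0, 1):     # near or far side along that axis
            others = [a for a in range(3) if a != axis]
            quad = []
            # Walk around the face so consecutive corners share an edge
            for u, v in ((0, 0), (0, 1), (1, 1), (1, 0)):
                corner = [0, 0, 0]
                corner[axis] = value
                corner[others[0]] = u
                corner[others[1]] = v
                quad.append(vertices.index(tuple(map(float, corner))))
            triangles.append((quad[0], quad[1], quad[2]))
            triangles.append((quad[0], quad[2], quad[3]))
    return triangles


triangles = cube_triangles()
print(len(vertices), "vertices,", len(triangles), "triangles")
```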

Texture

A texture is simply a 2D image of arbitrary size that is mapped onto a 3D object to simulate its surface. For example, our 3D cube consists of eight vertices. Before the texture is applied, it looks like a simple box. Once the texture is applied, the box appears painted.

Shader

Pixel shader programs allow the video card to produce impressive effects, such as the water in Elder Scrolls: Oblivion.

Today there are two types of shaders: vertex and pixel. Vertex shader programs can modify or transform 3D objects. Pixel shader programs change pixel colors based on some data. Imagine a light source in a 3D scene that makes illuminated objects glow brighter and at the same time casts shadows on other objects. All this is achieved by changing the color information of the pixels.

Pixel shaders are used to create complex effects in your favorite games. For example, shader code can make the pixels surrounding a 3D sword glow brighter. Another shader can process all the vertices of a complex 3D object and simulate an explosion. Game developers increasingly rely on complex shader programs to create realistic graphics. Almost every modern game with rich graphics uses shaders.

The Microsoft DirectX 10 Application Programming Interface (API) introduces a third type of shader, the geometry shader. With it, objects can be broken up, modified and even destroyed, depending on the desired result. The third type of shader is programmed in exactly the same way as the first two, but its role is different.
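The idea of a pixel shader changing a pixel's color depending on a light source can be illustrated in plain Python. This is a conceptual sketch using simple diffuse (Lambert) lighting chosen for illustration, not real shader code; it assumes `normal` and `light_dir` are unit vectors.

```python
def shade_pixel(base_color, normal, light_dir):
    """Scale an (R, G, B) color by how directly the surface faces the light.

    normal:    unit vector perpendicular to the surface at this pixel
    light_dir: unit vector pointing from the surface towards the light
    """
    # Cosine of the angle between the two vectors, clamped at zero
    intensity = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return tuple(round(c * intensity) for c in base_color)


lit = shade_pixel((200, 100, 50), (0, 0, 1), (0, 0, 1))      # facing the light
shadow = shade_pixel((200, 100, 50), (0, 0, 1), (0, 0, -1))  # facing away
print(lit, shadow)
```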

Fill Rate

Quite often you can find the fill rate value on a video card box. In principle, the fill rate indicates the speed at which the graphics processor can output pixels. For older video cards you may come across a triangle fill rate. Today there are two types of fill rate: pixel fill rate and texture fill rate. As already mentioned, the pixel fill rate corresponds to the pixel output speed. It is calculated as the number of raster operation units (ROPs) multiplied by the clock frequency.

ATi and nVidia calculate the texture fill rate differently. nVidia obtains it by multiplying the number of pixel pipelines by the clock frequency, while ATi multiplies the number of texture units by the clock frequency. In principle, both methods are correct, since nVidia uses one texture unit per pixel shader unit (that is, one per pixel pipeline).

With these definitions in mind, let us move on and discuss the most important functions of the graphics processor, what they do and why they matter so much.

Graphics processor architecture: functions

The realism of 3D graphics depends heavily on the performance of the video card. The more pixel shader units the processor contains and the higher the frequency, the more effects can be applied to the 3D scene to improve its visual impact.

The graphics processor contains many different functional units. The number of certain components can be used to estimate how powerful the graphics processor is. Before we go further, let us look at the most important functional units.

Vertex processors (vertex shader units)

Like pixel shader units, vertex processors execute shader code, in this case code that operates on vertices. Since a larger vertex budget allows more complex 3D objects to be created, the performance of vertex processors is very important in 3D scenes with complex objects or a large number of them. However, vertex shader units still do not have as obvious an impact on performance as pixel processors do.

Pixel processors (pixel shader units)

A pixel processor is a component of a graphics chip dedicated to processing pixel shader programs. These processors perform calculations that concern only pixels. Since pixels contain color information, pixel shaders make it possible to achieve impressive graphic effects. For example, most of the water effects that you see in games are created using pixel shaders. The number of pixel processors is often used to compare the pixel performance of video cards. If one card has eight pixel shader units and another has 16, it is logical to assume that the video card with 16 units will process complex pixel programs faster. The clock frequency should also be taken into account, but today doubling the number of pixel processors is more energy-efficient than doubling the frequency of the graphics chip.

Unified shaders

Unified shaders have not yet arrived on PCs, but the upcoming DirectX 10 standard relies on just such an architecture. That is, the code structure of vertex, geometry and pixel programs will be the same, although the shaders will perform different work. The new specification can be seen in the Xbox 360, whose graphics processor was specially developed by ATi for Microsoft. It will be interesting to see what potential the new DirectX 10 brings.

Texture Mapping Unit (TMU)

Textures have to be fetched and filtered. This work is done by texture mapping units, which work in conjunction with the pixel and vertex shader units. The job of the TMU is to apply texture operations to pixels. The number of texture units in a graphics processor is often used to compare the texture performance of video cards. It is entirely reasonable to assume that a video card with more TMUs will provide higher texture performance.

Raster operation units (ROP)

Raster processors are responsible for writing pixel data into memory. The speed at which this operation is performed is the fill rate. In the early days of 3D graphics, ROPs and fill rate were very important characteristics of video cards. Today, the work of the ROPs is still important, but the performance of the video card is no longer limited by these units as it once was. Therefore, the performance (and number) of ROPs is rarely used to evaluate the speed of a video card.

Pipelines

Conveyors

Pipelines are used to describe the architecture of video cards and provide general information about the performance of the graphics processor.

"Pipeline" cannot be considered a strict technical term. The graphics processor uses different pipelines that perform different functions. Historically, a pipeline meant a pixel processor connected to its own texture mapping unit (TMU). For example, the Radeon 9700 video card uses eight pixel processors, each connected to its own TMU, so the card is considered to have eight pipelines.

Modern processors, however, are difficult to describe by the number of pipelines. Compared with previous designs, the new processors have a modular, fragmented structure. ATi can be considered an innovator in this area: with the X1000 line of video cards it switched to a modular structure, which allowed performance gains through internal optimization. Some processor units are used more heavily than others, and to increase the performance of the graphics processor, ATi tried to find a compromise between the number of units needed and the die area (which cannot be increased much). In this architecture, the term "pixel pipeline" has lost its meaning, as pixel processors are no longer connected to their own TMUs. For example, the ATi Radeon X1600 graphics processor has 12 pixel shader units and only four TMU texture mapping units. So it cannot be said that the architecture of this processor has 12 pixel pipelines, just as it cannot be said that there are only four. However, by tradition, pixel pipelines are still mentioned.

With these caveats in mind, the number of pixel pipelines in a graphics processor is often used to compare video cards (except for the ATi X1x00 line). For example, if you take video cards with 24 and 16 pipelines, it is reasonable to assume that the card with 24 pipelines will be faster.

Graphics processor architecture: technology

Process technology

This term refers to the size of one element (transistor) of the chip and the precision of the manufacturing process. Improved process technology makes it possible to produce smaller elements. For example, the 0.18-micron process produces larger elements than the 0.13-micron process, so it is less efficient. Smaller transistors operate at lower voltages. In turn, a decrease in voltage leads to a decrease in thermal resistance, which reduces the amount of heat generated. Improved process technology also reduces the distance between the functional blocks of the chip, so data transfer takes less time. Shorter distances, lower voltages and other improvements allow higher clock frequencies to be achieved.

Understanding is somewhat complicated by the fact that today both micrometers (µm) and nanometers (nm) are used to describe process technology. In reality, everything is quite simple: 1 nanometer equals 0.001 micrometers, so the 0.09-µm and 90-nm processes are the same thing. As already noted, a smaller process allows higher clock frequencies. For example, if you compare video cards with 0.18 µm and 0.09 µm (90 nm) chips, it is reasonable to expect a higher frequency from the 90 nm card.
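The unit conversion is easy to verify. A minimal sketch, with `nm_to_um` as a made-up helper name:

```python
def nm_to_um(nanometers: float) -> float:
    """Convert nanometers to micrometers (1 nm = 0.001 µm)."""
    return nanometers / 1000


# The 90 nm process expressed in micrometers, as in the text
print(nm_to_um(90), "µm")
```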

Clock frequency of the graphics processor

The clock frequency of a graphics processor is measured in megahertz (MHz), or millions of cycles per second.

Clock frequency strongly affects GPU performance. The higher it is, the more work the processor can do per second. As a first example, take the nVidia GeForce 6600 and 6600 GT video cards: the 6600 GT GPU runs at 500 MHz, while the regular 6600 runs at 400 MHz. Since the two processors are technically identical, the 25% higher clock frequency of the 6600 GT results in higher performance.

But clock frequency is not everything. Architecture also has a strong impact on performance. As a second example, take the GeForce 6600 GT and GeForce 6800 GT video cards. The 6600 GT GPU runs at 500 MHz, while the 6800 GT runs at only 350 MHz. Now recall that the 6800 GT has 16 pixel pipelines, while the 6600 GT has only eight. So a 6800 GT with 16 pipelines at 350 MHz gives roughly the same performance as a processor with eight pipelines and twice the clock frequency (700 MHz). With that said, clock frequency can certainly be used to compare performance.
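The comparison above can be checked with a rough calculation. The sketch below assumes the simplest possible model, one pixel per pipeline per clock cycle, and ignores memory bandwidth and every other architectural difference. The function name is ours; the pipeline counts and frequencies are the ones quoted in the text, taking the 6600 GT as an eight-pipeline part.

```python
def peak_pixel_rate(pipelines: int, clock_mhz: int) -> int:
    """Theoretical peak in millions of pixels per second,
    assuming one pixel per pipeline per clock cycle."""
    return pipelines * clock_mhz


cards = {
    "GeForce 6600 GT (8 pipelines, 500 MHz)": peak_pixel_rate(8, 500),
    "GeForce 6800 GT (16 pipelines, 350 MHz)": peak_pixel_rate(16, 350),
    "8 pipelines at 700 MHz (hypothetical)": peak_pixel_rate(8, 700),
}

for name, rate in cards.items():
    print(f"{name}: {rate} Mpixels/s")
```

Under this model the 6800 GT and the hypothetical eight-pipeline chip at 700 MHz come out equal, which is the point the paragraph makes.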

Local video memory

Video card memory has a strong impact on performance. However, different memory parameters affect it in different ways.

Video memory size

Video memory size can probably be called the most overrated video card parameter. Inexperienced buyers often use memory size to compare cards, but in reality it has little impact on performance compared with parameters such as memory bus frequency and interface (bus width).

In most cases a card with 128 MB of video memory performs almost the same as a card with 256 MB. Of course, there are situations in which more memory improves performance, but remember that more memory does not automatically mean higher speed in games.

Where extra memory is useful is in games with high-resolution textures. Game developers ship several texture sets with a game, and the more memory the video card has, the higher the resolution of the textures that can be loaded. High-resolution textures give greater clarity and detail in the game. So it is entirely reasonable to choose a card with more memory if all other criteria are equal. Note once again that memory bus width and frequency affect performance much more than the amount of physical memory on the card.

Memory bus width

Memory bus width is one of the most important aspects of memory performance. Current buses range from 64 to 256 bits wide, and in some cases even 512 bits. The wider the memory bus, the more data it can transfer per clock cycle, and this directly affects performance. For example, at equal frequencies a 128-bit bus can theoretically transfer twice as much data per clock cycle as a 64-bit bus, and a 256-bit bus twice as much again.

Higher bus bandwidth (expressed in bits or bytes per second; 1 byte = 8 bits) gives higher memory performance. That is why the memory bus matters much more than memory size. At equal frequencies, a 64-bit memory bus delivers only 25% of the throughput of a 256-bit bus!

Consider this example. A video card with 128 MB of video memory and a 256-bit bus gives much higher memory performance than a 512 MB model with a 64-bit bus. Note that for some cards in the ATi X1x00 line, manufacturers quote the specifications of the internal memory bus, whereas it is the external bus that matters. For example, the X1600 has an internal ring bus 256 bits wide but an external bus only 128 bits wide, and in practice the memory bus performs as a 128-bit bus.

Memory types

Memory can be divided into two main categories: SDR (single data rate) and DDR (double data rate), in which data is transferred twice per clock cycle. Single-rate SDR technology is now obsolete. Because DDR memory transfers data twice as fast as SDR, it is important to remember that video cards with DDR memory are often specified with the doubled frequency rather than the physical one. For example, if DDR memory is listed at 1000 MHz, that is the effective frequency at which ordinary SDR memory would have to run to give the same bandwidth. The physical frequency is actually 500 MHz.

For this reason many people are surprised when their video card's memory is listed at 1200 MHz DDR but utilities report 600 MHz. You simply have to get used to it. DDR2 and GDDR3/GDDR4 memory work on the same principle, that is, with double data transfer. The differences between DDR, DDR2, GDDR3 and GDDR4 memory lie in manufacturing technology and various details. DDR2 can run at higher frequencies than DDR, and DDR3 can run at even higher frequencies than DDR2.
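The relationship between the advertised and the physical frequency is a simple division. A minimal sketch follows, using a helper name of our own and assuming exactly two transfers per clock cycle, as described above for DDR-type memory.

```python
def physical_clock_mhz(effective_mhz: float, transfers_per_clock: int = 2) -> float:
    """Convert an advertised (effective) DDR frequency to the physical clock.

    transfers_per_clock is 2 for double data rate memory and 1 for SDR.
    """
    return effective_mhz / transfers_per_clock


for advertised in (1000, 1200):
    print(f"{advertised} MHz DDR -> {physical_clock_mhz(advertised):.0f} MHz physical")
```

This is why a utility that reads the physical clock shows half of the number printed on the box.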

Memory bus frequency

Like a processor, memory (or, more precisely, the memory bus) operates at a certain clock frequency, measured in megahertz. Here, too, higher clock frequencies directly affect memory performance, and memory bus frequency is one of the parameters used to compare video cards. For example, if all other characteristics (memory bus width and so on) are the same, it is logical to assume that a video card with 700-MHz memory is faster than one with 500-MHz memory.

Again, clock frequency is not everything. 700-MHz memory on a 64-bit bus will be slower than 400-MHz memory on a 128-bit bus. The performance of 400-MHz memory on a 128-bit bus roughly matches that of 800-MHz memory on a 64-bit bus. Remember also that GPU and memory frequencies are entirely separate parameters and usually differ.
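These comparisons follow from multiplying bus width by frequency. The sketch below uses a helper of our own and treats the quoted frequencies as effective transfer rates; it gives peak theoretical figures only, not measured performance.

```python
def memory_bandwidth_mb_s(bus_width_bits: int, effective_mhz: int) -> float:
    """Peak bandwidth in MB/s: bytes per transfer times millions of transfers per second."""
    return bus_width_bits / 8 * effective_mhz


configs = [(64, 700), (128, 400), (64, 800)]
for width, freq in configs:
    print(f"{width}-bit bus at {freq} MHz: {memory_bandwidth_mb_s(width, freq):.0f} MB/s")

# At equal frequency, a 64-bit bus delivers a quarter of a 256-bit bus.
ratio = memory_bandwidth_mb_s(64, 500) / memory_bandwidth_mb_s(256, 500)
print(f"64-bit vs 256-bit at equal frequency: {ratio:.0%}")
```

The 700-MHz, 64-bit configuration comes out below the 400-MHz, 128-bit one, while 800 MHz on 64 bits matches it.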

Video card interface

All data transferred between the video card and the processor passes through the video card interface. Today three types of interface are used for video cards: PCI, AGP and PCI Express. They differ in bandwidth and other characteristics. Clearly, the higher the bandwidth, the faster the exchange. However, only the most modern cards can use very high bandwidth, and even then only partially. At some point interface speed stopped being a bottleneck, and today it is simply sufficient.

The slowest bus for which video cards have been produced is PCI (Peripheral Components Interconnect), leaving older history aside. PCI really did limit the performance of video cards, so they moved to the AGP (Accelerated Graphics Port) interface. However, even the AGP 1.0 and 2x specifications limited performance. When the standard raised speed to the AGP 4x level, we began to approach the practical limit of the bandwidth that video cards can use. The AGP 8x specification doubled bandwidth once more compared with AGP 4x (to 2.16 GB/s), but brought no noticeable gain in graphics performance.

The newest and fastest bus is PCI Express. New graphics cards usually use the PCI Express x16 interface, which combines 16 PCI Express lanes for a total bandwidth of 4 GB/s (in one direction). That is about twice the bandwidth of AGP 8x. The PCI Express bus provides this bandwidth in both directions (to and from the video card). However, the speed of AGP 8x was already sufficient, so we have not yet seen a case where moving to PCI Express gave a performance gain over AGP 8x (with other hardware parameters equal). For example, the AGP version of the GeForce 6800 Ultra performs identically to the PCI Express version.
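The 4 GB/s figure can be reproduced from the lane count. The sketch below assumes first-generation PCI Express at 250 MB/s per lane in each direction (a figure not given in the text) and takes the 2.16 GB/s quoted above as the AGP 8x bandwidth; the names are ours.

```python
PCIE_GEN1_LANE_MB_S = 250   # assumption: PCI Express 1.x, per lane, per direction
AGP_8X_MB_S = 2160          # figure quoted in the text (2.16 GB/s)


def pcie_bandwidth_mb_s(lanes: int) -> int:
    """One-way peak bandwidth of a PCI Express 1.x link in MB/s."""
    return lanes * PCIE_GEN1_LANE_MB_S


x16 = pcie_bandwidth_mb_s(16)
print(f"PCI Express x16: {x16} MB/s one way")
print(f"Ratio to AGP 8x: {x16 / AGP_8X_MB_S:.2f}")
```

The ratio comes out a little under two, so "twice the bandwidth" is a rounded figure.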

Today it is best to buy a card with a PCI Express interface, which will stay on the market for several more years. The most powerful cards are no longer released with the AGP 8x interface, and PCI Express solutions are generally easier to find than their AGP counterparts and cost less.

Multi-card solutions

Using several video cards to increase graphics performance is not a new idea. In the early days of 3D graphics, 3dfx entered the market with two video cards working in parallel. With the demise of 3dfx, the technology of running several consumer video cards together was forgotten, although ATi had been producing similar systems for professional simulators since the release of the Radeon 9700. A couple of years ago the technology returned to the market with nVidia SLI and, a little later, ATi CrossFire.

Using several video cards together provides enough performance to run a game at high quality settings and high resolution. But choosing such a solution is not so simple.

To begin with, multi-card solutions consume a lot of power, so the power supply must be powerful enough. All that heat has to be removed from the video cards, so pay attention to the PC case and its cooling so that the system does not overheat.

Also remember that SLI/CrossFire requires a suitable motherboard (for one technology or the other), which usually costs more than standard models. An nVidia SLI configuration works only on certain nForce4 boards, and ATi CrossFire cards work only on motherboards with a CrossFire chipset or on certain Intel models. To complicate matters, some CrossFire configurations require one of the cards to be a special CrossFire Edition. After the release of CrossFire, ATi allowed certain video card models to work together over the PCI Express bus, and with new driver releases the number of possible combinations grows. Even so, hardware CrossFire with the appropriate CrossFire Edition card gives higher performance. But CrossFire Edition cards also cost more than regular models. At present you can enable software CrossFire mode (without a CrossFire Edition card) on Radeon X1300, X1600 and X1800 GTO video cards.

Consider other factors too. Although two graphics cards working together give a performance gain, it is far from double. Yet you pay twice as much. Most often the performance gain is 20-60%, and in some cases, because of the extra computational overhead of coordination, there is no gain at all. For this reason multi-card configurations rarely make sense with cheap models, since a more expensive video card will usually outperform a pair of cheap ones. In general, for most consumers an SLI/CrossFire solution makes no sense. But if you want to enable all quality options or play at extreme resolutions, for example 2560 × 1600, where more than 4 million pixels must be rendered per frame, you cannot do without two or four paired video cards.
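Two of the figures above are easy to verify. The sketch below uses helper names of our own and a deliberately crude cost model (two cards cost exactly twice as much as one); it computes the pixel count at 2560 × 1600 and the performance obtained per unit of money for the quoted 20-60% gain.

```python
def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels the GPU must produce for one frame."""
    return width * height


def performance_per_cost(gain_percent: float, cards: int = 2) -> float:
    """Performance per unit of money relative to a single card (1.0),
    assuming every card costs the same."""
    return (1 + gain_percent / 100) / cards


print(pixels_per_frame(2560, 1600))          # a little over 4 million
for gain in (20, 60):
    print(f"{gain}% gain with two cards: {performance_per_cost(gain):.2f}x per unit cost")
```

Even at the top of the quoted range, a pair of cards returns less performance per unit of money than a single card, which is why the text recommends such setups only for extreme settings.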

Visual features

In addition to purely hardware specifications, different generations and models of GPUs can differ in their feature sets. For example, it is often said that cards of the ATi Radeon X800 XT generation are compatible with Shader Model 2.0b (SM), while the nVidia GeForce 6800 Ultra is compatible with SM 3.0, although their hardware specifications are close (16 pipelines). As a result, many consumers choose one solution or the other without even knowing what the difference means.

Microsoft DirectX and Shader Model versions

These names come up in arguments more often than any others, but few people know what they really mean. To understand them, let's start with the history of graphics APIs. DirectX and OpenGL are graphics APIs, that is, Application Programming Interfaces: open code standards available to everyone.

Before graphics APIs appeared, each GPU manufacturer used its own mechanism for communicating with games. Developers had to write separate code for every GPU they wanted to support. It was a very expensive and inefficient approach. To solve the problem, APIs for 3D graphics were developed so that developers could write code for a specific API rather than for a particular video card. Compatibility problems then fell on the shoulders of the video card manufacturers, who had to guarantee that their drivers were compatible with the API.

The one remaining complication is that two different APIs are in use today: Microsoft DirectX and OpenGL, where GL stands for Graphics Library. Because the DirectX API is more popular in games today, we will concentrate on it. It has also had the stronger influence on the development of games.

DirectX is a Microsoft creation. In fact, DirectX comprises several APIs, only one of which is used for 3D graphics. DirectX includes APIs for sound, music, input devices and so on. The Direct3D API is responsible for 3D graphics in DirectX. When video cards are discussed, this is the API that is meant, so in this context the terms DirectX and Direct3D are interchangeable.

DirectX is updated periodically as graphics technology advances and game developers introduce new programming methods. As the popularity of DirectX grew quickly, GPU manufacturers began to tailor new products to DirectX capabilities. For this reason video cards are often tied to hardware support for a particular generation of DirectX (DirectX 8, 9.0 or 9.0c).

To complicate matters, parts of the Direct3D API can change over time without a change of DirectX generation. For example, the DirectX 9.0 specification includes Pixel Shader 2.0 support, while the DirectX 9.0c update adds Pixel Shader 3.0. So although cards may belong to the DirectX 9 class, they can support different feature sets. For example, the Radeon 9700 supports Shader Model 2.0 and the Radeon X1800 supports Shader Model 3.0, although both cards belong to the DirectX 9 generation.

Remember that when creating new games, developers take the owners of older machines and video cards into account, since ignoring this segment of users would lower sales. For this reason games contain several code paths. A DirectX 9-class game will most likely have a DirectX 8 path and even a DirectX 7 path for compatibility. If an old path is chosen, some visual effects that are available on new video cards disappear from the game. But at least you can play on old hardware.
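The idea of several code paths can be sketched as a simple fallback. Everything in the example below is hypothetical (the function, its parameters and the integer class numbers are ours, not part of DirectX); it only illustrates picking the newest path that the installed card can run.

```python
def choose_code_path(card_directx_class: int, game_paths=(9, 8, 7)):
    """Return the newest code path the card can run, or None if none fits.

    card_directx_class: DirectX generation the card supports in hardware.
    game_paths: generations for which the game ships a rendering path.
    """
    for path in sorted(game_paths, reverse=True):
        if path <= card_directx_class:
            return path
    return None


for card in (9, 8, 7, 6):
    print(f"DirectX {card}-class card -> code path: {choose_code_path(card)}")
```

A DirectX 8-class card falls back to the DirectX 8 path and loses the effects that exist only in the DirectX 9 path, while a card older than every shipped path cannot run the game at all.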

Most new games require the latest version of DirectX to be installed, even if the video card belongs to an earlier generation. That is, a new game that will use the DirectX 8 path still requires the latest version of DirectX 9 to be installed on a DirectX 8-class video card.

What are the differences between the versions of the Direct3D API in DirectX? The early versions of DirectX (3, 5, 6 and 7) were relatively simple in terms of Direct3D capabilities. Developers could choose visual effects from a list and then check how they worked in the game. The next major step in graphics programming was DirectX 8. It introduced the ability to program the video card using shaders, so for the first time developers had the freedom to program effects the way they wanted. DirectX 8 supported Pixel Shader versions 1.0 to 1.3 and Vertex Shader 1.0. DirectX 8.1, an updated version of DirectX 8, added Pixel Shader 1.4 and Vertex Shader 1.1.

In DirectX 9 you can create even more complex shader programs. DirectX 9 supports Pixel Shader 2.0 and Vertex Shader 2.0. DirectX 9.0c, an updated version of DirectX 9, added the Pixel Shader 3.0 specification.
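The version history above can be collected into a small lookup table. The sketch below records only the highest Pixel Shader version named in the text for each DirectX release; the dictionary and function names are ours.

```python
# Highest Pixel Shader version per DirectX release, as listed in the text.
MAX_PIXEL_SHADER = {
    "8.0": (1, 3),
    "8.1": (1, 4),
    "9.0": (2, 0),
    "9.0c": (3, 0),
}


def supports_pixel_shader(directx_version: str, required: tuple) -> bool:
    """True if the given DirectX release covers the required Pixel Shader version."""
    return MAX_PIXEL_SHADER[directx_version] >= required


print(supports_pixel_shader("9.0", (3, 0)))   # DirectX 9.0 stops at Pixel Shader 2.0
print(supports_pixel_shader("9.0c", (3, 0)))
```

Versions are stored as (major, minor) tuples so that they compare correctly without string parsing.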

DirectX 10, the upcoming version of the API, will accompany the new version of Windows, Vista. DirectX 10 cannot be installed on Windows XP.

HDR lighting and OpenEXR HDR

HDR stands for “High Dynamic Range”. A game with HDR lighting can produce a much more realistic picture than a game without it, but not all video cards support HDR lighting.

Before DirectX 9-class video cards appeared, GPUs were seriously limited in the precision of their lighting calculations. Until then, lighting could be calculated with only 256 (8-bit) internal levels.

When DirectX 9-class video cards appeared, they gained the ability to produce lighting with high precision: a full 24 bits, or 16.7 million levels.

With 16.7 million levels, and once DirectX 9/Shader Model 2.0-class video cards became fast enough, HDR lighting became possible on computers. It is a fairly complex technology that is best seen in motion. Put simply, HDR lighting increases contrast (dark shades look darker, light shades lighter) while at the same time increasing the amount of lighting detail in dark and light areas. A game with HDR lighting looks more vivid and realistic than one without it.
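The level counts quoted above are powers of two. A one-line check (the function name is ours):

```python
def lighting_levels(bits: int) -> int:
    """Number of distinct values that can be represented with the given bit depth."""
    return 2 ** bits


for bits in (8, 24):
    print(f"{bits} bits -> {lighting_levels(bits):,} levels")
```

Eight bits give 256 levels, and 24 bits give 16,777,216, which the text rounds to 16.7 million.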

GPUs that comply with the latest Pixel Shader 3.0 specification can calculate lighting with higher 32-bit precision, as well as perform floating-point blending. Thus, SM 3.0-class video cards can support a special HDR lighting method, OpenEXR, developed specifically for the film industry.

Some games that support HDR lighting only through the OpenEXR method will not run with HDR lighting on Shader Model 2.0 video cards. However, games that do not rely on the OpenEXR method will work on any DirectX 9 video card. For example, Oblivion uses the OpenEXR HDR method and allows HDR lighting to be enabled only on the newest video cards that support the Shader Model 3.0 specification, such as the nVidia GeForce 6800 or ATi Radeon X1800. Games that use the Half-Life 2 3D engine, including Counter-Strike: Source and the upcoming Half-Life 2: Aftermath, allow HDR rendering to be enabled on older DirectX 9 video cards that support only Pixel Shader 2.0. Examples are the GeForce 5 line or the ATi Radeon 9500.

Finally, note that all forms of HDR rendering require serious computing power and can bring even the most powerful GPUs to their knees. If you want to play the newest games with HDR lighting, you cannot do without high-performance graphics.

Full-screen anti-aliasing

Full-screen anti-aliasing (abbreviated AA) removes the characteristic “stair steps” at the edges of polygons. Bear in mind, though, that full-screen anti-aliasing consumes a lot of computing resources, which lowers the frame rate.

Anti-aliasing depends heavily on video memory performance, so a fast video card with fast memory can perform full-screen anti-aliasing with less of a performance hit than an inexpensive video card. Anti-aliasing can be enabled in different modes. For example, 4x anti-aliasing gives a better picture than 2x anti-aliasing, but it is a bigger hit to performance. Roughly speaking, 2x anti-aliasing doubles the horizontal and vertical resolution, and 4x mode quadruples it.

Texture filtering

Textures are applied to all 3D objects in a game, and the greater the angle at which a surface is viewed, the more distorted the texture looks. To remove this effect, GPUs use texture filtering.

The first filtering method was called bilinear and produced characteristic bands that were not very pleasant to the eye. The situation improved with the introduction of trilinear filtering. Both options run on modern video cards with practically no performance penalty.

Today the best way to filter textures is anisotropic filtering (AF). Like full-screen anti-aliasing, anisotropic filtering can be enabled at different levels. For example, 8x AF gives higher filtering quality than 4x AF. Like full-screen anti-aliasing, anisotropic filtering requires a certain amount of processing power, which grows with the AF level.

High-resolution textures

All 3D games are created with specific requirements in mind, and one of them determines the texture memory the game will need. All the necessary textures must fit into video card memory during play, otherwise performance drops sharply, since fetching a texture from system RAM causes a considerable delay, not to mention the page file on the hard drive. So if a game developer counts on 128 MB of video memory as the minimum requirement, the set of active textures must not exceed 128 MB at any time.

Modern games have several texture sets, so a game will run without problems on older cards with less video memory as well as on new cards with more. For example, a game may contain three texture sets: for 128 MB, 256 MB and 512 MB. Very few games support 512 MB of video memory today, but they are still the most objective reason to buy a video card with that much memory. Although extra memory has little effect on performance, you will get better visual quality if the game supports the corresponding texture set.
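How a game might choose among its texture sets can be sketched as follows. The function and the selection rule are hypothetical (real engines use their own logic); the three set sizes are the ones from the example above.

```python
def pick_texture_set(video_memory_mb: int, available_sets_mb=(128, 256, 512)):
    """Return the largest texture set that fits in video memory, or None."""
    fitting = [size for size in available_sets_mb if size <= video_memory_mb]
    return max(fitting) if fitting else None


for memory in (64, 128, 256, 384, 512):
    print(f"{memory} MB card -> texture set: {pick_texture_set(memory)}")
```

A card with 384 MB still gets the 256 MB set, and a card below the minimum requirement gets none, which matches the rule that the active textures must always fit in video memory.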

What do you need to know about video cards?

In the first part of our video card guide for beginners we looked at the key components: interfaces, outputs, cooling system, graphics processor and video memory. In the second part we will talk about the functions and technologies of video cards.

Basic video card components:

  • outputs;
  • interfaces;
  • cooling system;
  • graphics processor;
  • video memory.

Part 2 (this article): graphics technologies:

  • glossary;
  • GPU architecture: functions
    vertex/pixel units, shaders, fill rate, texture/raster units, pipelines;
  • GPU architecture: technology
    manufacturing process, GPU frequency, local video memory (size, bus, type, frequency), multi-card solutions;
  • visual features
    DirectX, high dynamic range (HDR), full-screen anti-aliasing, texture filtering, high-resolution textures.

Glossary of basic graphic terms

Refresh Rate

Just as in a movie theater or on TV, your computer simulates motion on the monitor by displaying a sequence of frames. The refresh rate of the monitor indicates how many times per second the picture is updated on the screen. For example, a frequency of 75 Hz corresponds to 75 updates per second.

If the computer processes frames faster than the monitor can display them, problems can appear in games. For example, if the computer renders 100 frames per second and the monitor refresh rate is 75 Hz, then because of the overlap the monitor can display only part of a frame during its refresh period. The result is visual artifacts.

As a solution, you can enable V-Sync (vertical synchronization). It limits the number of frames output by the computer to the monitor refresh rate, which prevents the artifacts. With V-Sync enabled, the number of frames rendered in a game never exceeds the refresh rate. So at 75 Hz the computer will output no more than 75 frames per second.
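The effect of V-Sync on the frame rate can be expressed as a cap. This is a simplified sketch with names of our own: it models only the upper limit described above and ignores buffering details that can lower the frame rate further on real hardware.

```python
def displayed_fps(rendered_fps: float, refresh_hz: float, vsync: bool = True) -> float:
    """Frames per second actually output, capped at the refresh rate when V-Sync is on."""
    return min(rendered_fps, refresh_hz) if vsync else rendered_fps


print(displayed_fps(100, 75))               # capped at the refresh rate
print(displayed_fps(60, 75))                # already below the cap
print(displayed_fps(100, 75, vsync=False))  # uncapped, tearing possible
```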

Pixel

The word "pixel" stands for "picture element". It is a tiny dot on the display that can glow in a particular color (in most cases, shades are produced by combining three basic colors: red, green and blue). If the screen resolution is set to 1024×768, you can see a grid 1024 pixels wide and 768 pixels high. Together, the pixels make up the image. The picture on the screen is updated 60 to 120 times per second, depending on the type of display and the data output by the video card. CRT monitors update the display line by line, while flat-panel LCD monitors can update each pixel individually.

Vertex

All objects in a 3D scene are made up of vertices. A vertex is a point in three-dimensional space with X, Y and Z coordinates. Several vertices can be grouped into a polygon: most often a triangle, though more complex shapes are possible. A texture is then applied to the polygon, which makes the object look realistic. The 3D cube shown in the illustration consists of eight vertices. More complex objects have curved surfaces, which actually consist of a very large number of vertices.
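As an illustration of vertices and polygons, the sketch below builds the eight vertices of a cube and splits one of its faces into two triangles. The representation (plain tuples and index triples) and all names are our own choice; real graphics APIs use their own vertex formats.

```python
from itertools import product


def cube_vertices(size: float = 1.0):
    """Eight corner points (x, y, z) of an axis-aligned cube with one corner at the origin."""
    return [(x, y, z) for x, y, z in product((0.0, size), repeat=3)]


vertices = cube_vertices()

# The face lying in the z = 0 plane, described as two triangles.
# Each triangle is a triple of indices into the vertex list.
bottom_face_triangles = [(0, 2, 6), (0, 6, 4)]

print(len(vertices), "vertices")
for tri in bottom_face_triangles:
    print([vertices[i] for i in tri])
```

All six faces handled this way give 12 triangles, which is how even a simple box is usually passed to the GPU.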

Texture

A texture is simply a 2D image of arbitrary size that is laid over a 3D object to simulate its surface. For example, our 3D cube consists of eight vertices. Before a texture is applied it looks like a simple box. But once the texture is applied, the box becomes colored.


Shader

Pixel shader programs allow the video card to produce impressive effects, such as this water in Elder Scrolls: Oblivion.

Today there are two types of shaders: vertex and pixel. Vertex shader programs can change or transform 3D objects. Pixel shader programs change the colors of pixels based on some data. Imagine a light source in a 3D scene that makes the illuminated objects glow brighter while at the same time casting shadows on other objects. All of this is done by changing the color information of the pixels.

Pixel shaders are used to create complex effects in your favorite games. For example, shader code can make the pixels surrounding a 3D sword glow brighter. Another shader can process all the vertices of a complex 3D object and simulate an explosion. Game developers increasingly turn to complex shader programs to create realistic graphics. Practically every modern game with rich graphics uses shaders.
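The idea of a pixel shader, a small program run for every pixel that changes its color, can be imitated on the CPU. The sketch below is only an analogy with names of our own: real shaders are written in a shading language and executed by the GPU, and the brightening rule here is an arbitrary example.

```python
def brighten(pixel, intensity):
    """Scale an (R, G, B) pixel by a light intensity, clamping each channel to 255."""
    return tuple(min(255, round(channel * intensity)) for channel in pixel)


def apply_pixel_shader(image, shader, *args):
    """Run the shader function once for every pixel of a 2D image (list of rows)."""
    return [[shader(pixel, *args) for pixel in row] for row in image]


image = [[(100, 150, 200), (10, 20, 30)]]
print(apply_pixel_shader(image, brighten, 1.5))
```

The GPU does the same kind of work in parallel across many pixel processors, which is why their number matters for performance.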

With the release of the next Microsoft DirectX 10 Application Programming Interface (API), a third type of shader will appear, called the geometry shader. With it, objects can be broken apart, modified and even destroyed, depending on the desired result. The third type of shader can be programmed in exactly the same way as the first two, but its role will be different.

Fill Rate

Quite often the fill rate is printed on the box of a video card. In principle, fill rate indicates how fast the GPU can output pixels. For older video cards you might encounter a triangle fill rate. Today there are two types of fill rate: pixel fill rate and texture fill rate. As already said, pixel fill rate corresponds to the rate at which pixels are output. It is calculated as the number of raster operation units (ROPs) multiplied by the clock frequency.

ATi and nVidia calculate texture fill rate differently. nVidia holds that the rate is obtained by multiplying the number of pixel pipelines by the clock frequency, while ATi multiplies the number of texture units by the clock frequency. In principle both methods are correct, since nVidia uses one texture unit per pixel shader unit (that is, one per pixel pipeline).
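Both fill rates are simple products. In the sketch below the function names are ours and the card (16 ROPs, 16 texture units, 16 pixel pipelines, 400 MHz) is hypothetical, chosen only to show that the two vendors' methods agree when there is one texture unit per pipeline.

```python
def pixel_fill_rate(rops: int, clock_mhz: int) -> int:
    """Pixel fill rate in millions of pixels per second."""
    return rops * clock_mhz


def texture_fill_rate(units: int, clock_mhz: int) -> int:
    """Texture fill rate in millions of texels per second.

    'units' is the number of texture units (ATi's method) or the number
    of pixel pipelines (nVidia's method).
    """
    return units * clock_mhz


# Hypothetical card, not a real product.
rops, tmus, pipelines, clock = 16, 16, 16, 400

print(pixel_fill_rate(rops, clock), "Mpixels/s")
print(texture_fill_rate(tmus, clock), "Mtexels/s by texture units")
print(texture_fill_rate(pipelines, clock), "Mtexels/s by pixel pipelines")
```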

With these definitions in mind, let us move on and discuss the most important functions of the GPU, what they do and why they matter.

Graphics processor architecture: functions

The realism of 3D graphics depends heavily on the performance of the video card. The more pixel shader units the processor contains and the higher its frequency, the more effects can be applied to a 3D scene to improve its visual quality.

The GPU contains many different functional units. The number of certain components lets you estimate how powerful the GPU is. Before going further, let us look at the most important functional units.

Vertex processors (vertex shader units)

Like pixel shader units, vertex processors execute shader program code, in this case code that works on vertices. Since a larger vertex budget allows more complex 3D objects to be created, vertex processor performance is very important in 3D scenes with complex objects or a large number of them. However, vertex shader units still do not affect performance as obviously as pixel processors do.

Pixel processors (pixel shader units)

A pixel processor is a component of the graphics chip dedicated to processing pixel shader programs. These processors perform calculations that concern only pixels. Since pixels contain color information, pixel shaders make it possible to achieve striking graphical effects. For example, most of the water effects that you see in games are created with pixel shaders. The number of pixel processors is usually used to compare the pixel performance of video cards. If one card has eight pixel shader units and another has 16, it is logical to assume that the card with 16 units will process complex pixel programs faster. Clock frequency should also be taken into account, but today doubling the number of pixel processors is more energy-efficient than doubling the frequency of the graphics chip.

Unified shaders

Unified shaders have not yet arrived in the PC world, but the upcoming DirectX 10 standard relies on just such an architecture. That is, the code structure of vertex, geometry and pixel programs will be the same, although the shaders will perform different work. The new specification can be seen in the Xbox 360, whose GPU was specially developed by ATi for Microsoft. It will be interesting to see what potential the new DirectX 10 brings.

Texture Mapping Unit (TMU)

Textures have to be fetched and filtered. This work is done by texture mapping units, which work together with the pixel and vertex shader units. The job of the TMU is to apply texture operations to pixels. The number of texture units in a GPU is often used to compare the texture performance of video cards. It is entirely reasonable to assume that a video card with more TMUs will give higher texture performance.

Raster Operator Units (ROP)

Raster operation units are responsible for writing pixel data into memory. The speed of this operation is the fill rate. In the early days of 3D accelerators, ROPs and fill rate were very important characteristics of video cards. Today ROPs still matter, but video card performance is no longer limited by these units as it once was. Therefore the performance (and number) of ROPs is rarely used to judge the speed of a video card.

Pipelines

Pipelines are used to describe the architecture of video cards and give a general idea of the performance of the GPU.

A pipeline cannot be considered a strict technical term. A GPU uses different pipelines that perform different functions. Historically, a pipeline meant a pixel processor connected to its own texture mapping unit (TMU). For example, the Radeon 9700 video card uses eight pixel processors, each connected to its own TMU, so the card is considered to have eight pipelines.

It’s difficult to describe today’s processors by the number of conveyors. Similar to the latest designs, the new processors have a modular, fragmented structure. ATi can be considered an innovator in this area, as with the X1000 line of video cards it switched to a modular structure, which allowed for productivity gains through internal optimization. Some processor blocks are more complex than others, and to increase the productivity of the graphics processor, ATI tried to find a compromise between the number of required blocks and the flatness of the crystal (it cannot be increased much). In this architecture, the term “pixel conveyor” has already lost its meaning, as pixel processors are no longer connected to the power blocks of the TMU. For example, the ATi Radeon X1600 graphics processor has 12 pixel shader units and several TMU texture mapping units. It is impossible to say that in the architecture of this processor there are 12 pixel pipelines, as they say, there are all of them. However, behind tradition, pixel pipelines are still a mystery.

Having said that, the number of pixel pipelines in a graphics processor is often used to level up video cards (for example, the ATi X1x00 line). For example, if you take video cards with 24 and 16 conveyors, then it is reasonable to assume that the card with 24 conveyors will be faster.

Graphics processor architecture: technology

Manufacturing process

This term refers to the size of one element (transistor) of the chip and the precision of the manufacturing process. Improved manufacturing processes make it possible to produce smaller elements. For example, a 0.18-micron process produces larger elements than a 0.13-micron process, so it is less efficient. Smaller transistors operate at lower voltages. In turn, a lower voltage reduces the amount of heat that is generated. An improved manufacturing process also shortens the distance between the functional units of the chip, so data transfer takes less time. Shorter distances, lower voltages and other improvements allow higher clock frequencies to be achieved.

Matters are complicated somewhat by the fact that both micrometers (µm) and nanometers (nm) are used today to designate the manufacturing process. In reality, everything is quite simple: 1 nanometer equals 0.001 micrometers, so the 0.09-µm and 90-nm manufacturing processes are the same thing. As already stated, a smaller manufacturing process allows higher clock frequencies to be reached. For example, if you compare video cards with 0.18-µm and 0.09-µm (90-nm) chips, it is reasonable to expect a higher frequency from the 90-nm card.
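The unit conversion described above can be written out as a small sketch. The function name is chosen here for illustration only; the process sizes are the ones mentioned in the text.

```python
def microns_to_nanometers(microns: float) -> float:
    """Convert a process size given in micrometers to nanometers (1 um = 1000 nm)."""
    return microns * 1000.0


# Process sizes mentioned in the text, expressed in both units.
for node_um in (0.18, 0.13, 0.09):
    print(f"{node_um} um = {microns_to_nanometers(node_um):.0f} nm")
```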

Clock frequency of the graphics processor

The clock frequency of a graphics processor is measured in megahertz (MHz), or millions of cycles per second.

Clock frequency greatly affects the performance of the graphics processor. The higher it is, the more work the processor can do per second. As a first example, take the nVidia GeForce 6600 and 6600 GT video cards: the 6600 GT graphics processor runs at 500 MHz, while the regular 6600 runs at 400 MHz. Because the two processors are technically identical, the 25% higher clock frequency of the 6600 GT results in higher performance.

But clock frequency is not everything. Keep in mind that architecture has a significant impact on performance. As a second example, take the GeForce 6600 GT and GeForce 6800 GT video cards. The 6600 GT graphics processor runs at 500 MHz, while the 6800 GT runs at only 350 MHz. Now remember that the 6800 GT has 16 pixel pipelines, while the 6600 GT has only eight. Therefore, a 6800 GT with 16 pipelines at 350 MHz gives approximately the same performance as a processor with eight pipelines and double the clock frequency (700 MHz). With that said, clock frequency can certainly be used to compare performance.
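The comparison above can be sketched as simple arithmetic. This is a rough estimate only, assuming that throughput scales with pipelines multiplied by clock frequency; the function name is illustrative, and real performance depends on many other factors.

```python
def relative_pixel_throughput(pipelines: int, clock_mhz: float) -> float:
    """Rough estimate: pipelines x clock, in millions of pixel operations per second."""
    return pipelines * clock_mhz


# Figures from the text: 6600 GT (8 pipelines, 500 MHz), 6800 GT (16 pipelines, 350 MHz).
gt6600 = relative_pixel_throughput(8, 500)
gt6800 = relative_pixel_throughput(16, 350)

# Clock an 8-pipeline processor would need to match the 6800 GT.
equivalent_clock_mhz = gt6800 / 8

print(gt6600, gt6800, equivalent_clock_mhz)
```

Under this simplification the 6800 GT comes out ahead despite its lower clock, which is the point the paragraph makes.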

Local video memory

Video card memory has a significant impact on performance. However, different memory parameters affect it in different ways.

Video memory size

Video memory size can, without exaggeration, be called the most overrated parameter of a video card. Inexperienced buyers often use the amount of video memory to compare cards, but in reality it has little impact on performance compared with parameters such as memory bus frequency and interface (bus width).

In most cases, a card with 128 MB of video memory will perform almost the same as a card with 256 MB. Of course, there are situations in which more memory increases performance, but remember that more memory does not automatically lead to higher speed in games.

Where more memory is useful is in games with high-resolution textures. Game developers ship several sets of textures with a game, and the more memory there is on the video card, the higher the resolution of the textures that can be loaded. High-resolution textures give greater clarity and detail in the picture. Therefore, it is entirely reasonable to choose a card with a large amount of memory if all other criteria are equal. Remember once again that the width of the memory bus and its frequency have a much stronger impact on performance than the amount of physical memory on the card.

Memory bus width

Memory bus width is one of the most important aspects of memory performance. Current buses range in width from 64 to 256 bits, and in some cases reach 512 bits. The wider the memory bus, the more information it can transfer per clock cycle, and this has a direct impact on performance. For example, if we take two buses with equal frequencies, then in theory a 128-bit bus transfers twice as much data per clock cycle as a 64-bit one, and a 256-bit bus transfers twice as much again.

Higher bus bandwidth (expressed in bits or bytes per second, 1 byte = 8 bits) gives higher memory performance. This is why the memory bus matters much more than memory size. At equal frequencies, a 64-bit memory bus runs at only 25% of the speed of a 256-bit bus!

Take the following example. A video card with 128 MB of video memory and a 256-bit bus gives much higher memory performance than a 512 MB model with a 64-bit bus. It is important to note that for some cards from the ATi X1x00 line, manufacturers quote the specifications of the internal memory bus, whereas it is the parameters of the external bus that matter. For example, in the X1600 the internal ring bus is 256 bits wide, while the external one is only 128 bits wide. In reality, the memory bus works with 128-bit performance.
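The relationship between bus width and bandwidth can be illustrated with a short sketch. It assumes the usual simplified formula (bytes per transfer multiplied by the effective transfer rate); the function name and the 1000 MHz example frequency are hypothetical and are not taken from any particular card.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, effective_clock_mhz: float) -> float:
    """Peak bandwidth in GB/s: (bus width in bytes) x (effective transfers per second)."""
    bytes_per_transfer = bus_width_bits / 8  # 1 byte = 8 bits
    return bytes_per_transfer * effective_clock_mhz * 1e6 / 1e9


# Same (hypothetical) effective frequency, different bus widths.
for width in (64, 128, 256):
    print(f"{width}-bit bus: {memory_bandwidth_gb_s(width, 1000):.1f} GB/s")

# At equal frequency a 64-bit bus delivers a quarter of what a 256-bit bus delivers.
ratio = memory_bandwidth_gb_s(64, 1000) / memory_bandwidth_gb_s(256, 1000)
```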

Memory types

Memory can be divided into two main categories: SDR (single data rate) and DDR (double data rate), in which twice as much data is transferred per clock cycle. Today, single-rate SDR technology is obsolete. Because DDR memory transfers data twice as fast as SDR, it is important to remember that for video cards with DDR memory the doubled frequency is often quoted, not the physical one. For example, if DDR memory is specified at 1000 MHz, this is the effective frequency at which ordinary SDR memory would have to run to give the same bandwidth. In reality, the physical frequency is 500 MHz.

For this reason, many people are surprised when a frequency of 1200 MHz DDR is quoted for the memory of their video card while utilities report 600 MHz. You will just have to get used to it. DDR2 and GDDR3/GDDR4 memory work on the same principle, that is, with double data transfer. The difference between DDR, DDR2, GDDR3 and GDDR4 memory lies in the manufacturing technology and some details. DDR2 can run at higher frequencies than DDR memory, and GDDR3 at even higher frequencies than DDR2.
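The relation between the quoted (effective) and physical DDR frequency can be sketched as follows. The function name is illustrative; the factor of two is the double data rate described above.

```python
def physical_clock_mhz(effective_mhz: float, transfers_per_clock: int = 2) -> float:
    """Physical memory clock for a quoted effective frequency (DDR: 2 transfers per clock)."""
    return effective_mhz / transfers_per_clock


# The two examples from the text.
print(physical_clock_mhz(1000))  # memory quoted as 1000 MHz DDR
print(physical_clock_mhz(1200))  # memory quoted as 1200 MHz DDR
```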

Memory bus frequency

Like a processor, memory (or, more precisely, the memory bus) runs at a certain clock frequency, measured in megahertz. Here, an increase in clock frequency directly affects memory performance. The memory bus frequency is one of the parameters used to compare the performance of video cards. For example, if all other characteristics (memory bus width and so on) are the same, it is entirely logical to assume that a video card with 700 MHz memory works faster than one with 500 MHz memory.

Once again, clock frequency is not everything. 700 MHz memory on a 64-bit bus will be slower than 400 MHz memory on a 128-bit bus. The performance of 400 MHz memory on a 128-bit bus is approximately the same as that of 800 MHz memory on a 64-bit bus. It should also be noted that the frequencies of the graphics processor and of the memory are completely separate parameters, and they usually differ.
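The trade-off between memory frequency and bus width can be checked with a few lines of arithmetic. This is a simplified sketch that treats bandwidth as frequency multiplied by bus width; both helper names are hypothetical.

```python
def relative_bandwidth(clock_mhz: float, bus_bits: int) -> float:
    """Relative bandwidth figure: clock x bus width (Mbit/s per transfer per clock)."""
    return clock_mhz * bus_bits


def clock_for_equal_bandwidth(clock_mhz: float, bus_bits: int, target_bus_bits: int) -> float:
    """Clock that a bus of target_bus_bits needs to match clock_mhz on a bus of bus_bits."""
    return clock_mhz * bus_bits / target_bus_bits


# 700 MHz on 64 bits is slower than 400 MHz on 128 bits.
slower = relative_bandwidth(700, 64) < relative_bandwidth(400, 128)

# A 64-bit bus needs 800 MHz to match 400 MHz on a 128-bit bus.
needed_mhz = clock_for_equal_bandwidth(400, 128, 64)
print(slower, needed_mhz)
```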

Video card interface

All data transferred between the video card and the processor passes through the video card interface. Today three types of interface are used for video cards: PCI, AGP and PCI Express. They differ in throughput and other characteristics. Clearly, the higher the throughput, the faster the exchange. However, only the most modern cards can make use of high throughput, and even then only partially. At some point, interface speed stopped being a bottleneck; today it is simply sufficient.

The slowest bus for which video cards have been produced is PCI (Peripheral Components Interconnect), leaving older history aside. PCI really did hold back the performance of video cards, so manufacturers switched to the AGP (Accelerated Graphics Port) interface. However, even the AGP 1.0 and 2x specifications limited performance. When the standard raised the speed to AGP 4x, it began to approach the practical limit of the throughput that video cards can use. The AGP 8x specification doubled the throughput once again compared with AGP 4x (to 2.16 GB/s), but it no longer gave a noticeable increase in graphics performance.

The newest and fastest bus is PCI Express. New graphics cards use the PCI Express x16 interface, which combines 16 PCI Express lanes for a total throughput of 4 GB/s (in one direction). This is twice the throughput of AGP 8x. The PCI Express bus provides this bandwidth in both directions (data transfer to and from the video card). However, since AGP 8x was already fast enough, we have not yet seen a situation in which the move to PCI Express gave a performance increase over AGP 8x (other hardware parameters being equal). For example, the AGP version of the GeForce 6800 Ultra performs identically to the 6800 Ultra for PCI Express.
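The 4 GB/s figure for PCI Express x16 follows from the per-lane rate. The sketch below assumes first-generation PCI Express at 250 MB/s per lane in each direction and an approximate AGP 8x peak of 2.1 GB/s; both constants are assumptions added for illustration and are not stated in the text above.

```python
PCIE1_LANE_MB_S = 250   # assumption: PCI Express 1.x, per lane, per direction
AGP8X_GB_S = 2.1        # assumption: approximate peak throughput of AGP 8x


def pcie_throughput_gb_s(lanes: int) -> float:
    """One-way throughput of a PCI Express 1.x link with the given number of lanes."""
    return lanes * PCIE1_LANE_MB_S / 1000


x16 = pcie_throughput_gb_s(16)
print(f"PCI Express x16: {x16} GB/s one way, about {x16 / AGP8X_GB_S:.1f}x AGP 8x")
```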

Today it is best to buy a card with a PCI Express interface, since it will remain on the market for a long time yet. The most powerful cards are no longer released with the AGP 8x interface, and PCI Express solutions are usually easier to find than their AGP counterparts and are cheaper.

Multi-card solutions

Using several video cards to increase graphics performance is not a new idea. In the early days of 3D graphics, 3dfx entered the market with two video cards running in parallel. After the demise of 3dfx, the technology of running several consumer video cards together was forgotten, although ATi had been producing similar systems for professional simulators since the release of the Radeon 9700. A couple of years ago the technology returned to the market with nVidia SLI and, a little later, ATi CrossFire.

Using several graphics cards together provides enough performance to run a game at high quality settings and high resolution. But choosing such a solution is not so simple.

To begin with, solutions based on several video cards consume a great deal of power, so the power supply must be powerful enough. All that heat has to be removed from the video cards, so the PC case needs proper cooling to keep the system from overheating.

In addition, remember that SLI/CrossFire requires a suitable motherboard (for one technology or the other), which usually costs more than standard models. The nVidia SLI configuration will only work on certain nForce4 boards, and ATi CrossFire cards will only work on motherboards with a CrossFire chipset or on some Intel models. The situation is complicated by the fact that some CrossFire configurations require one of the cards to be a special one: a CrossFire Edition. After the release of CrossFire, ATi allowed certain models of video cards to work together over the PCI Express bus, and with the release of new driver versions the number of possible combinations grows. However, hardware CrossFire with the appropriate CrossFire Edition card gives higher performance. But CrossFire Edition cards are also more expensive than regular models. Currently, you can enable software CrossFire mode (without a CrossFire Edition card) on the Radeon X1300, X1600 and X1800 GTO.

Other factors should be considered as well. Although two graphics cards working together give an increase in performance, it is far from double, while you pay twice as much. Most often the increase in performance is 20-60%. In some cases, because of the extra overhead of coordinating the cards, there is no gain at all. For this reason, a multi-card configuration is unlikely to pay off with cheap models, since a more expensive video card usually outperforms a pair of cheap ones. In general, for most people an SLI/CrossFire solution makes no sense. But if you want to enable all the quality options or play at extreme resolutions, for example 2560x1600, where more than 4 million pixels must be rendered per frame, then you cannot do without two or more paired cards.
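The figures in this paragraph can be checked with a short sketch. It assumes, for illustration only, that two cards cost exactly twice as much as one; the function names are hypothetical.

```python
def pixels_per_frame(width: int, height: int) -> int:
    """Number of pixels the card has to render for one frame."""
    return width * height


def performance_per_cost(gain_percent: float, cards: int = 2) -> float:
    """Performance per unit of money relative to one card (1.0 = same value as one card)."""
    return (1 + gain_percent / 100) / cards


print(pixels_per_frame(2560, 1600))  # a little over 4 million pixels

# Typical gains of 20-60% with two cards at double the price.
for gain in (20, 60):
    print(f"+{gain}% performance: {performance_per_cost(gain):.2f} of single-card value")
```

Even at the top of the 20-60% range, each unit of money buys less performance than with a single card, which is why the text recommends multi-card setups only for extreme resolutions and settings.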

Visual features

In addition to purely hardware specifications, different generations and models of graphics processors may differ in feature set. For example, it is often said that cards of the ATi Radeon X800 XT generation are compatible with Shader Model 2.0b (SM), while the nVidia GeForce 6800 Ultra is compatible with SM 3.0, although their hardware specifications are close (16 pipelines). As a result, many buyers choose one solution or the other without knowing what the difference means. So let us talk about visual features and what they mean for the end user.

These names (DirectX and Shader Model versions) are used in arguments all the time, but few people know what they really mean. To start, let's take a look at the history of graphics APIs. DirectX and OpenGL are graphics APIs, that is, Application Programming Interfaces: open code standards available to everyone.

Before the advent of graphics APIs, each manufacturer of graphics processors used its own mechanism for communicating with games. Developers had to write separate code for each graphics processor they wanted to support. This was a very expensive and inefficient approach. To solve the problem, APIs for 3D graphics were developed so that developers wrote code for a specific API, not for one video card or another. As a result, compatibility problems fell on the shoulders of video card manufacturers, who had to guarantee that their drivers would be compatible with the API.

The only complication is that two different APIs are in use today, namely Microsoft DirectX and OpenGL, where GL stands for Graphics Library. The DirectX API is more popular in games today, so we will focus on it. This standard has also had the stronger influence on the development of games.

DirectX is a Microsoft creation. In fact, DirectX includes several APIs, only one of which is used for 3D graphics. DirectX includes APIs for sound, music, input devices and so on. The Direct3D API is responsible for 3D graphics in DirectX. When people talk about video cards, this is the API they mean, so in this respect the terms DirectX and Direct3D are interchangeable.

DirectX is updated periodically as graphics technologies advance and game developers introduce new methods of game programming. As the popularity of DirectX grew quickly, manufacturers of graphics processors began to tailor new products to DirectX capabilities. For this reason, video cards are often tied to hardware support for a particular generation of DirectX (DirectX 8, 9.0 or 9.0c).

The situation is complicated by the fact that parts of the Direct3D API can change over time without a change of DirectX generation. For example, the DirectX 9.0 specification includes Pixel Shader 2.0 support, while the DirectX 9.0c update also includes Pixel Shader 3.0. Thus, even though cards belong to the DirectX 9 class, they can support different sets of features. For example, the Radeon 9700 supports Shader Model 2.0 and the Radeon X1800 supports Shader Model 3.0, although both cards belong to the DirectX 9 generation.

Remember that when new games are created, developers take into account the owners of old machines and video cards, because if they ignore this segment of users, sales will be lower. For this reason, games contain several code paths. A DirectX 9-class game will most likely have a DirectX 8 path and even a DirectX 7 path for compatibility. If an old path is chosen, some visual effects that are available on new video cards will disappear from the game. But at least you can play even on old hardware.
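The idea of several code paths can be illustrated with a minimal sketch. It is not how any particular game engine works; the function name, the tuple of available paths and the use of plain integers for DirectX classes are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def select_code_path(card_directx_class: int,
                     available_paths: Sequence[int] = (9, 8, 7)) -> Optional[int]:
    """Pick the newest rendering path that the video card's DirectX class can run.

    Returns None when the card is older than every path the game ships with.
    """
    usable = [path for path in available_paths if path <= card_directx_class]
    return max(usable) if usable else None


# A DirectX 9-class game with fallback paths for older cards.
for card in (9, 8, 7, 6):
    print(f"DirectX {card}-class card -> path {select_code_path(card)}")
```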

Many new games require the latest version of DirectX to be installed, even if your video card belongs to a previous generation. That is, a new game that will use the DirectX 8 path still requires the latest version of DirectX 9 to be installed on a system with a DirectX 8-class video card.

What are the differences between the versions of the Direct3D API in DirectX? Early versions of DirectX (3, 5, 6 and 7) were relatively simple in terms of Direct3D API capabilities. Developers could select visual effects from a list and then check how they worked in the game. The next important step in graphics programming was DirectX 8. It made it possible to program the video card using shaders, so for the first time developers gained the freedom to program effects the way they needed. DirectX 8 supported Pixel Shader versions 1.0 to 1.3 and Vertex Shader 1.0. DirectX 8.1, an updated version of DirectX 8, added Pixel Shader 1.4 and Vertex Shader 1.1.

In DirectX 9 you can create even more complex shader programs. DirectX 9 supports Pixel Shader 2.0 and Vertex Shader 2.0. DirectX 9.0c, an updated version of DirectX 9, included the Pixel Shader 3.0 specification.

DirectX 10, the latest version of the API, accompanies the new version of Windows, Vista. DirectX 10 cannot be installed on Windows XP.

HDR stands for "High Dynamic Range". A game with HDR lighting can give a much more realistic picture than a game without it, but not all video cards support HDR lighting.

Before the advent of DirectX 9-class video cards, graphics processors were seriously limited in the precision of lighting calculations. Until then, lighting could only be calculated with 256 (8-bit) internal levels.

When DirectX 9-class video cards appeared, they gained the ability to calculate lighting with high precision: a full 24 bits, or 16.7 million levels.
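The level counts quoted above follow directly from the bit depth, as this small sketch shows (the function name is illustrative):

```python
def lighting_levels(bits: int) -> int:
    """Number of distinct values that can be represented with the given bit depth."""
    return 2 ** bits


print(lighting_levels(8))   # 8-bit precision
print(lighting_levels(24))  # 24-bit precision, about 16.7 million
```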

With 16.7 million levels, and after the next step up in the performance of DirectX 9 / Shader Model 2.0-class video cards, HDR lighting became possible on computers. This is a fairly complex technology, and it needs to be seen in motion. In simple terms, HDR lighting increases contrast (dark shades look darker, light shades look lighter) while at the same time increasing the amount of lighting detail in dark and light areas. A game with HDR lighting looks more vivid and realistic than one without it.

Graphics processors that meet the latest Pixel Shader 3.0 specification can calculate lighting with higher 32-bit precision and can also perform floating-point blending. Thus, SM 3.0-class video cards can support the special OpenEXR HDR lighting method, which was designed specifically for the film industry.

Some games that support HDR lighting only through the OpenEXR method will not run with HDR lighting on Shader Model 2.0 video cards. However, games that do not rely on the OpenEXR method work on any DirectX 9 video card. For example, Oblivion uses the OpenEXR HDR method and allows HDR lighting to be enabled only on the newest video cards that support the Shader Model 3.0 specification, such as the nVidia GeForce 6800 or ATi Radeon X1800. Games that use the Half-Life 2 3D engine, including Counter-Strike: Source and the upcoming Half-Life 2: Aftermath, allow HDR rendering to be enabled on older DirectX 9 video cards that support only Pixel Shader 2.0. Examples include the GeForce 5 line or the ATi Radeon 9500.

Finally, be aware that all forms of HDR rendering require serious computing power and can bring even the most powerful graphics processors to their knees. If you want to play the newest games with HDR lighting, you cannot do without high-performance graphics.

Full-screen anti-aliasing (abbreviated as AA) removes the characteristic "staircase" effect on the edges of polygons. However, it should be noted that full-screen anti-aliasing requires a lot of computing resources, which leads to a drop in frame rate.

Anti-aliasing depends heavily on the performance of video memory, so a fast video card with fast memory can perform full-screen anti-aliasing with less impact on performance than an inexpensive video card. Anti-aliasing can be enabled in different modes. For example, 4x anti-aliasing gives a better picture than 2x anti-aliasing, but it is a bigger hit to performance. While 2x anti-aliasing doubles the horizontal and vertical resolution, 4x mode quadruples it.

Textures are applied to all 3D objects in a game, and the greater the angle at which a surface is viewed, the more distorted the texture on it appears. To remove this effect, graphics processors use texture filtering.

The first filtering method was called bilinear, and it produced characteristic bands that were not very pleasing to the eye. The situation improved with the introduction of trilinear filtering. Both options run on modern video cards with practically no loss of performance.
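Bilinear filtering blends the four texels nearest to the sample point, weighted by distance. The following is a minimal software sketch of that idea on a single-channel texture stored as a list of rows; graphics processors do this in dedicated hardware, and the function name, clamping behaviour and test texture here are assumptions for illustration.

```python
import math


def bilinear_sample(texture, x: float, y: float) -> float:
    """Sample a 2D single-channel texture at fractional texel coordinates (x, y)."""
    height, width = len(texture), len(texture[0])
    # Clamp coordinates to the valid texel range.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    # Interpolate horizontally on the two rows, then vertically between them.
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


checker = [[0.0, 1.0],
           [1.0, 0.0]]
print(bilinear_sample(checker, 0.5, 0.5))  # midpoint blends all four texels equally
```

Trilinear filtering extends the same idea by additionally blending between two mipmap levels, which this sketch does not cover.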

Today, the best way to filter textures is anisotropic filtering (AF). Like full-screen anti-aliasing, anisotropic filtering can be enabled at different levels. For example, 8x AF gives higher filtering quality than 4x AF. Like full-screen anti-aliasing, anisotropic filtering requires a certain amount of processing power, which grows with the AF level.

All 3D games are created with specific requirements in mind, and one of them is the amount of texture memory that the game needs. All the required textures must fit into the memory of the video card while the game is running, otherwise performance drops greatly, because fetching a texture from system RAM incurs a considerable delay, not to mention the swap file on the hard drive. Therefore, if a game developer counts on 128 MB of video memory as a minimum requirement, the set of active textures must not exceed 128 MB at any time.

Current games have several texture sets, so a game will run without problems on older cards with less video memory as well as on new cards with more video memory. For example, a game may contain three sets of textures: for 128 MB, 256 MB and 512 MB. Very few games support 512 MB of video memory today, but they are still the most objective reason to buy a video card with that amount of memory. Although extra memory has little or no effect on performance, you will get better visual quality if the game supports the corresponding set of textures.
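Choosing a texture set by video memory size can be sketched as below. The set sizes are the ones given as an example in the text; the function name and the selection rule (largest set that fits) are assumptions for illustration, not the logic of any specific game.

```python
from typing import Optional, Sequence

TEXTURE_SETS_MB = (128, 256, 512)  # example set sizes from the text


def pick_texture_set(video_memory_mb: int,
                     sets_mb: Sequence[int] = TEXTURE_SETS_MB) -> Optional[int]:
    """Return the largest texture set that fits in video memory, or None if none fits."""
    fitting = [size for size in sets_mb if size <= video_memory_mb]
    return max(fitting) if fitting else None


for memory in (64, 128, 256, 320, 640):
    print(f"{memory} MB card -> texture set: {pick_texture_set(memory)}")
```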

Unified shader units combine the two types of units described above: they can execute both vertex and pixel programs (as well as geometry programs, which appeared in DirectX 10). Unification of shader units means that the code of the different shader programs (vertex, pixel and geometry) is universal, and the unified processors can execute any of these programs. Accordingly, in the new architectures the numbers of pixel, vertex and geometry shader units merge into a single number: the number of universal processors.

Texture units (TMU)

These units work together with shader processors of all types; they fetch and filter the texture data needed to build the scene. The number of texture units in a video chip determines texturing performance, that is, the speed of sampling from textures. Although most calculations are now carried out by shader units, the load on TMUs is still quite high, and given the emphasis that some applications place on texturing performance, the number of TMUs and the correspondingly high texturing performance remain among the most important parameters of video chips. This parameter particularly affects speed when trilinear and anisotropic filtering are used, since they require additional texture samples.

Raster operation units (ROP)

Raster operation units write the pixels computed by the video card into buffers and blend them. As already noted, the performance of ROP units affects the fill rate, which is one of the main characteristics of video cards. Although its importance has decreased somewhat in recent times, there are still cases in which application performance depends heavily on the speed and number of ROP units. Most often this is due to heavy use of post-processing filters and to anti-aliasing being enabled at high image settings.

Video memory size

Video chips use their own memory to store the data they need: textures, vertices, buffers and so on. It would seem that the more of it there is, the better. But it is not so simple: judging the power of a video card by the amount of video memory is the most common mistake! Inexperienced users most often overestimate the importance of memory size and use it to compare different models of video cards. This is understandable: since the parameter is listed among the first in all sources, they assume that if it is twice as large, the speed of the product must be twice as high. The reality differs from this myth: performance grows with memory size up to a certain point and after that simply stops growing.

Every application needs a certain amount of video memory that is enough for all its data, and even if you install 4 GB, there is no reason for rendering to speed up, because speed will be limited by the execution units. That is why in almost all cases a video card with 320 MB of video memory will run at the same speed as a card with 640 MB (other things being equal). Situations in which more memory leads to a visible increase in performance do exist: these are very demanding applications at high resolutions and with maximum settings. But such cases are rare, so memory size should of course be taken into account, without forgetting that beyond a certain amount performance simply does not grow; there are more important parameters, such as the width of the memory bus and its operating frequency.