Or rather, the NV3X...
If your comment happens to get deleted, just check this: #1940
-
Cvd #2569 Yeah, this:
Rewriting shaders behind an application's back in a way that changes the output under non-controlled circumstances is absolutely, positively wrong and indefensible.
Rewriting a shader so that it does exactly the same thing, but in a more efficient way, is generally acceptable compiler optimization, but there is a range of defensibility from completely generic instruction scheduling that helps almost everyone, to exact shader comparisons that only help one specific application. Full shader comparisons are morally grungy, but not deeply evil.
The significant issue that clouds current ATI / Nvidia comparisons is fragment shader precision. Nvidia can work at 12 bit integer, 16 bit float, and 32 bit float. ATI works only at 24 bit float. There isn't actually a mode where they can be exactly compared. DX9 and ARB_fragment_program assume 32 bit float operation, and ATI just converts everything to 24 bit. For just about any given set of operations, the Nvidia card operating at 16 bit float will be faster than the ATI, while the Nvidia operating at 32 bit float will be slower. When DOOM runs the NV30 specific fragment shader, it is faster than the ATI, while if they both run the ARB2 shader, the ATI is faster.
When the output goes to a normal 32 bit framebuffer, as all current tests do, it is possible for Nvidia to analyze data flow from textures, constants, and attributes, and change many 32 bit operations to 16 or even 12 bit operations with absolutely no loss of quality or functionality. This is completely acceptable, and will benefit all applications, but will almost certainly induce hard to find bugs in the shader compiler. You can really go overboard with this -- if you wanted every last possible precision savings, you would need to examine texture dimensions and track vertex buffer data ranges for each shader binding. That would be a really poor architectural decision, but benchmark pressure pushes vendors to such lengths if they avoid outright cheating. If really aggressive compiler optimizations are implemented, I hope they include a hint or pragma for "debug mode" that skips all the optimizations.
John Carmack
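Not from Carmack's post, just a quick sketch of my own in Python/NumPy (fp24's 16-bit mantissa is assumed, since NumPy has no fp24 type) showing what these precisions mean in numbers, and why dropping a value to 16-bit float is practically invisible once it lands in an 8-bit-per-channel framebuffer:

import numpy as np

# Relative precision (machine epsilon) of the formats Carmack mentions.
# NV3X fp16 and fp32 exist as NumPy dtypes; ATI's fp24 does not, so its
# epsilon is computed by hand from its (assumed) 16-bit mantissa.
print("fp16 epsilon:", np.finfo(np.float16).eps)   # ~9.8e-4
print("fp24 epsilon:", 2.0 ** -16)                 # ~1.5e-5
print("fp32 epsilon:", np.finfo(np.float32).eps)   # ~1.2e-7

# An 8-bit framebuffer channel only resolves steps of 1/255 (~3.9e-3), far
# coarser than a single fp16 rounding error, so storing one value at fp16
# instead of fp32 shifts the final byte by at most one code value.
rng = np.random.default_rng(0)
color32 = rng.random(100_000, dtype=np.float32)           # "exact" fp32 result in [0, 1)
color16 = color32.astype(np.float16).astype(np.float32)   # same value rounded to fp16

def to_byte(c):
    # Quantize a [0, 1] channel to an 8-bit framebuffer value.
    return np.clip(np.rint(c * 255.0), 0, 255).astype(np.uint8)

diff = np.abs(to_byte(color32).astype(int) - to_byte(color16).astype(int))
print("max 8-bit difference:", diff.max())   # at most 1, i.e. one LSB

Of course Carmack is talking about the driver proving, via data-flow analysis, which operations can be lowered with literally zero difference; the blind cast above only shows the scale of error involved.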
That's the key point: you can't properly compare an ATI card with an NV3X, because they can't run at the same precision. One side can do 24-bit, the other 16 or 32. I've told Borg plenty of times that of course the FX will be slower at 32-bit, but all he could come up with was that either the FX is slow, or it's cheating because it only runs at 16-bit. I'm curious whether he'll now go up against Carmack too :)