• Abu85
    #219
    This is nV's take on the AA problem. Clear Sky uses it.
    A deferred renderer is just incompatible with current hardware-assisted antialiasing, unfortunately (Hargreaves and Harris 2004). Thus, antialiasing becomes solely the responsibility of the application and the shader; we cannot rely on the GPU alone. Because aliasing itself arises from the mismatched frequencies of the source signal and of the destination discrete representation, a good approximation of an antialiasing filter is just a low-pass filter, which is simple blurring. This is a zero-cost operation in the console world, where any TV display works like a low-pass filter anyway. In the PC world, we need an alternative. Our solution was to trade some signal frequency at the discontinuities for smoothness, and to leave other parts of the image intact. This was performed in a way similar to the edge-detection filters used in nonphotorealistic applications: We detect discontinuities in both depth and normal direction by taking 8+1 samples of depth and finding how depth at the current pixel differs from the ideal line passed through opposite corner points. The normals were used to fix issues such as a wall perpendicular to the floor, where the depth forms a perfect line (or will be similar at all samples) but an aliased edge exists. The normals were processed in a similar cross-filter manner, and the dot product between normals was used to determine the presence of an edge.

    The two detectors were then multiplied to produce a single value indicating how much the current pixel "looks like an edge." This value was used to offset four bilinear texture lookups into the composited (near-final) back buffer. The result was automatic weighting of samples with a very strong edge-detection policy that seamlessly handles edge and alpha-test/texkill aliasing without blurring other parts of the image.



    struct v2p
    {
        float4 tc0 : TEXCOORD0; // Center
        float4 tc1 : TEXCOORD1; // Left Top
        float4 tc2 : TEXCOORD2; // Right Bottom
        float4 tc3 : TEXCOORD3; // Right Top
        float4 tc4 : TEXCOORD4; // Left Bottom
        float4 tc5 : TEXCOORD5; // Left / Right
        float4 tc6 : TEXCOORD6; // Top / Bottom
    };

    /////////////////////////////////////////////////////////////////////
    uniform sampler2D s_image;    // composited (near-final) back buffer
    uniform sampler2D s_position; // G-buffer position (depth in .z)
    uniform sampler2D s_normal;   // G-buffer normals
    uniform half4 e_barrier;      // x=norm(~.8f), y=depth(~.5f)
    uniform half4 e_weights;      // x=norm, y=depth
    uniform half4 e_kernel;       // x = maximum smoothing kernel scale

    /////////////////////////////////////////////////////////////////////
    half4 main(v2p I) : COLOR
    {
        // Normal discontinuity filter: compare the center normal
        // against the four diagonal neighbors
        half3 nc = (half3)tex2D(s_normal, I.tc0.xy);
        half4 nd;
        nd.x = dot(nc, (half3)tex2D(s_normal, I.tc1.xy));
        nd.y = dot(nc, (half3)tex2D(s_normal, I.tc2.xy));
        nd.z = dot(nc, (half3)tex2D(s_normal, I.tc3.xy));
        nd.w = dot(nc, (half3)tex2D(s_normal, I.tc4.xy));
        nd -= e_barrier.x;
        nd = step(0, nd); // 1 where normals are similar (no edge)
        half ne = saturate(dot(nd, e_weights.x));

        // Opposite coords: each pair is packed so the reversed swizzle
        // yields the mirrored sample position
        float4 tc5r = I.tc5.wzyx;
        float4 tc6r = I.tc6.wzyx;

        // Depth filter: compute gradiental difference:
        // (c-sample1)+(c-sample1_opposite)
        half4 dc = tex2D(s_position, I.tc0.xy);
        half4 dd;
        dd.x = (half)tex2D(s_position, I.tc1.xy).z +
               (half)tex2D(s_position, I.tc2.xy).z;
        dd.y = (half)tex2D(s_position, I.tc3.xy).z +
               (half)tex2D(s_position, I.tc4.xy).z;
        dd.z = (half)tex2D(s_position, I.tc5.xy).z +
               (half)tex2D(s_position, tc5r.xy).z;
        dd.w = (half)tex2D(s_position, I.tc6.xy).z +
               (half)tex2D(s_position, tc6r.xy).z;
        dd = abs(2 * dc.z - dd) - e_barrier.y;
        dd = step(dd, 0); // 1 where depth lies on the ideal line (no edge)
        half de = saturate(dot(dd, e_weights.y));

        // Weight
        half w = (1 - de * ne) * e_kernel.x; // 0 = no aa, 1 = full aa

        // Smoothed color
        // (a-c)*w + c = a*w + c*(1-w)
        float2 offset = I.tc0.xy * (1 - w);
        half4 s0 = tex2D(s_image, offset + I.tc1.xy * w);
        half4 s1 = tex2D(s_image, offset + I.tc2.xy * w);
        half4 s2 = tex2D(s_image, offset + I.tc3.xy * w);
        half4 s3 = tex2D(s_image, offset + I.tc4.xy * w);
        return (s0 + s1 + s2 + s3) / 4.h;
    }
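
    The pixel shader above assumes the vertex stage has already filled v2p with the center and neighbor coordinates. A minimal sketch of that setup follows; the one-pixel offset, the screen_res uniform, and the pre-transformed full-screen quad are assumptions, not part of the original listing:

    /////////////////////////////////////////////////////////////////////
    // Hypothetical vertex-stage setup (not from the original listing)
    uniform float2 screen_res; // render-target size in pixels (assumed)

    void main_vs(float4 pos : POSITION,
                 float2 uv  : TEXCOORD0,
                 out float4 hpos : POSITION,
                 out v2p O)
    {
        hpos = pos;                  // assume a pre-transformed quad
        float2 d = 1.0 / screen_res; // one-pixel step in UV space
        O.tc0 = float4(uv, 0, 0);                          // Center
        O.tc1 = float4(uv + float2(-d.x, -d.y), 0, 0);     // Left Top
        O.tc2 = float4(uv + float2( d.x,  d.y), 0, 0);     // Right Bottom
        O.tc3 = float4(uv + float2( d.x, -d.y), 0, 0);     // Right Top
        O.tc4 = float4(uv + float2(-d.x,  d.y), 0, 0);     // Left Bottom
        // Pack each opposite pair as (u1, v1, v2, u2) so that the
        // .wzyx swizzle in the pixel shader yields the mirrored coord
        O.tc5 = float4(uv.x - d.x, uv.y, uv.y, uv.x + d.x); // Left / Right
        O.tc6 = float4(uv.x, uv.y - d.y, uv.y + d.y, uv.x); // Top / Bottom
    }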

    There is one side note to this approach: the parameters/delimiters tweaked for one resolution do not necessarily work well for another; even worse, they often do not work at all. That is because the lower the resolution, the more source signal is lost during discretization, and blurring becomes a worse approximation of an antialiasing filter. Visually, you get more and more false positives, and the picture becomes more blurred than necessary. However, lowering the blur radius according to resolution works fine.
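
    One way to read that last remark in code: derive the sample offset from the current resolution instead of hard-coding a one-pixel step, so the kernel shrinks below a reference resolution. This is only a sketch of the idea; the reference height and the linear falloff are assumptions:

    // Sketch only: scale the filter radius with resolution
    static const float ref_height = 1024.0;          // assumed reference
    float  k = saturate(screen_res.y / ref_height);  // < 1 below the reference
    float2 d = k / screen_res; // sub-pixel sample step at low resolutions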


    Interesting stuff. In any case, to reassure anyone worried about performance: with a GeForce 280GTX SLI setup the Edge Detect algorithm is already fast ... according to the developers, of course.

    I don't understand why we have to mess around with hybrid solutions like this. The DX10.1 Multisample Access mechanism can do exactly the same thing without any loss of speed. Now nV has found a solution that doesn't require DX10.1, but the speed penalty is enormous (roughly one tenth of the original, non-AA speed).
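
    For reference, the DX10.1 feature mentioned above is per-sample access to multisampled render targets from a shader (Shader Model 4.1). A minimal HLSL sketch of such a custom resolve; the resource name and the 4x sample count are illustrative, not from the post:

    Texture2DMS<float4, 4> g_msaa_color; // illustrative name

    float4 resolve_ps(float4 pos : SV_Position) : SV_Target
    {
        int2 coord = int2(pos.xy);
        float4 sum = 0;
        [unroll]
        for (int s = 0; s < 4; ++s)
            sum += g_msaa_color.Load(coord, s); // read one subsample
        return sum * 0.25; // custom resolve in the shader
    }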