TIMER| InitVfs: 53.0498 ms TIMER| InitScripting: 6.52279 ms TIMER| CONFIG_Init: 9.5219 ms r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 
0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: DRM version: 2.7.0, Name: ATI RV530, ID: 0x71c5, GB: 1, Z: 2 r300: GART size: 509 MB, VRAM size: 256 MB r300: AA compression: NO, Z compression: NO, HiZ: NO TIMER| RunHardwareDetection: 3.60617 ms TIMER| InitRenderer: 57.93 ms r300: Initial fragment program FRAG DCL IN[0], GENERIC[0], LINEAR DCL OUT[0], COLOR DCL SAMP[0] 0: TEX OUT[0], IN[0], SAMP[0], 2D 1: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV output[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: src0.xyz = temp[1], src0.w = temp[1] MAD color[0].xyz, src0.xyz, 
src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[0].xy__, 2D[0]; 2: src0.xyz = temp[1], src0.w = temp[1] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], temp[0].xy__, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 GAME STARTED, ALL INIT COMPLETE TIMER| ps_console: 3.26276 ms TIMER| ps_lang_hotkeys: 1.01716 ms TIMER| common/setup.xml: 1.9457 ms TIMER| common/styles.xml: 535.54 us TIMER| common/sprite1.xml: 3.934 ms TIMER| common/icon_sprites.xml: 6.96251 ms TIMER| session/sprites.xml: 7.10715 ms TIMER| session/styles.xml: 396.906 us TIMER| session/session.xml: 71.001 ms TIMER| common/global.xml: 1.46694 ms r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], 
input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict 
resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 
0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG DCL IN[0], GENERIC[0], LINEAR DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, 
src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex 
Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: 
Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0] 0: MOV OUT[0], CONST[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 
'pair translate' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..3] 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, 
const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: 
MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: MOV_SAT 
output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..2] DCL CONST[4..7] DCL TEMP[0] IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: DP4 TEMP[0].x, CONST[0], IN[0] 1: DP4 TEMP[0].y, CONST[1], IN[0] 2: DP4 TEMP[0].z, CONST[2], IN[0] 3: MOV TEMP[0].w, IMM[0].xxxx 4: DP4 
OUT[0].x, CONST[4], TEMP[0] 5: DP4 OUT[0].y, CONST[5], TEMP[0] 6: DP4 OUT[0].z, CONST[6], TEMP[0] 7: DP4 OUT[0].w, CONST[7], TEMP[0] 8: MOV OUT[1], IN[1] 9: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 
0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 
temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800003 dst: 0t op: VE_ADD src0: 0x017fe000 reg: 0t swiz: U/ U/ U/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00202001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00402001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00802001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 9: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 
0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV_SAT output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 
MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 
'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 
reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] 0: TEX OUT[0], IN[0], SAMP[0], 2D 1: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX_SAT output[0], input[0], 2D[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon 
Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[0].xy__, 2D[0]; 2: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], temp[0].xy__, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 
3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], 
input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 
1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], 
input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 
0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0] 0: TXP TEMP[0], IN[0], SAMP[0], 2D 1: MOV OUT[0].xyz, TEMP[0] 2: MOV OUT[0].w, CONST[0] 3: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, temp[0]; 2: MOV output[0].w, const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, temp[0]; 2: MOV output[0].w, const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, 
temp[0]; 2: MOV output[0].w, const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, temp[0]; 2: MOV output[0].w, const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, temp[0]; 2: MOV_SAT output[0].w, const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, temp[0]; 2: MOV_SAT output[0].w, const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, temp[0]; 2: MOV_SAT output[0].w, const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, temp[0].xyz_; 2: MOV_SAT output[0].w, const[0].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, temp[0].xyz_; 2: MOV_SAT output[0].w, const[0].___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, temp[0].xyz_; 2: MOV_SAT output[0].w, const[0].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, temp[0].xyz_; 2: MOV_SAT output[0].w, const[0].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 1: src0.xyz = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: src0.w = const[0] MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].xyz, input[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 
'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].xyz, temp[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL CONST[0..3] DCL CONST[5..10] DCL TEMP[0] IMM FLT32 { 0.5000, 0.0000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MUL TEMP[0], IN[1], IMM[0].xxxx 5: MUL OUT[1], TEMP[0], CONST[5] 6: MOV OUT[2], IN[2] 7: DP4 OUT[3].x, CONST[6], IN[0] 8: DP4 OUT[3].y, CONST[7], IN[0] 9: DP4 OUT[3].z, CONST[8], IN[0] 10: DP4 OUT[3].w, CONST[9], IN[0] 11: MAD OUT[4], IN[0].xzzz, CONST[10].xxxx, CONST[10].yyyy 12: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: 
DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; 
Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, 
const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[0]; 13: MOV output[5], temp[0]; CONST[11] = { 0.5000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[0]; 13: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02202 
dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 7: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 9: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000142 reg: 10c swiz: X/ X/ X/ X src2: 0x00492142 reg: 10c swiz: Y/ Y/ Y/ Y 12: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 13: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 14 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[2] DCL SAMP[3] DCL CONST[1] DCL TEMP[0..3] IMM FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 
0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: TEX TEMP[1], IN[2], SAMP[2], SHADOW2D 2: MUL TEMP[2].xyz, IN[0], IMM[0].xxxx 3: MAD_SAT TEMP[1].xyz, TEMP[2], TEMP[1], CONST[1] 4: MUL TEMP[0].xyz, TEMP[0], TEMP[1] 5: TEX TEMP[3].w, IN[3], SAMP[3], 2D 6: MUL TEMP[0].xyz, TEMP[0], TEMP[3].wwww 7: MOV OUT[0].xyz, TEMP[0] 8: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2DSHADOW[2]; 2: MUL temp[2].xyz, input[0], const[2].xxxx; 3: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 4: MUL temp[0].xyz, temp[0], temp[1]; 5: TEX temp[3].w, input[3], 2D[3]; 6: MUL temp[0].xyz, temp[0], temp[3].wwww; 7: MOV output[0].xyz, temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2DSHADOW[2]; 2: MUL temp[2].xyz, input[0], const[2].xxxx; 3: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 4: MUL temp[0].xyz, temp[0], temp[1]; 5: TEX temp[3].w, input[3], 2D[3]; 6: MUL temp[0].xyz, temp[0], temp[3].wwww; 7: MOV output[0].xyz, temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2DSHADOW[2]; 2: MUL temp[2].xyz, input[0], const[2].xxxx; 3: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 4: MUL temp[0].xyz, temp[0], temp[1]; 5: TEX temp[3].w, input[3], 2D[3]; 6: MUL temp[0].xyz, temp[0], temp[3].wwww; 7: MOV output[0].xyz, temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2DSHADOW[2]; 2: MUL temp[2].xyz, input[0], const[2].xxxx; 3: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 4: MUL temp[0].xyz, temp[0], temp[1]; 5: TEX temp[3].w, input[3], 2D[3]; 6: MUL temp[0].xyz, temp[0], temp[3].wwww; 7: MOV output[0].xyz, temp[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[1], input[2], 2DSHADOW[2]; 2: MUL 
temp[2].xyz, input[0], const[2].xxxx; 3: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 4: MUL temp[0].xyz, temp[0], temp[1]; 5: TEX temp[3].w, input[3], 2D[3]; 6: MUL temp[0].xyz, temp[0], temp[3].wwww; 7: MOV_SAT output[0].xyz, temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[4], input[2], 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].zzzz; 3: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 4: CMP temp[1], temp[5].www1, none.0001, none.1111; 5: MUL temp[2].xyz, input[0], const[2].xxxx; 6: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 7: MUL temp[0].xyz, temp[0], temp[1]; 8: TEX temp[3].w, input[3], 2D[3]; 9: MUL temp[0].xyz, temp[0], temp[3].wwww; 10: MOV_SAT output[0].xyz, temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: TEX temp[4], input[2], 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].zzzz; 3: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 4: CMP temp[1], temp[5].www1, none.0001, none.1111; 5: MUL temp[2].xyz, input[0], const[2].xxxx; 6: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 7: MUL temp[0].xyz, temp[0], temp[1]; 8: TEX temp[3].w, input[3], 2D[3]; 9: MUL temp[0].xyz, temp[0], temp[3].wwww; 10: MOV_SAT output[0].xyz, temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[2].xyz, input[0].xyz_, const[2].xxx_; 6: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 7: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 8: TEX temp[3].w, input[3].xy__, 2D[3]; 9: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 10: MOV_SAT output[0].xyz, temp[0].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX 
temp[4].x, input[2].xy__, 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[2].xyz, input[0].xyz_, const[2].xxx_; 6: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 7: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 8: TEX temp[3].w, input[3].xy__, 2D[3]; 9: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 10: MOV_SAT output[0].xyz, temp[0].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[2].xyz, input[0].xyz_, const[2].xxx_; 6: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 7: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 8: TEX temp[3].w, input[3].xy__, 2D[3]; 9: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 10: MOV_SAT output[0].xyz, temp[0].xyz_; CONST[2] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[2].xyz, input[0].xyz_, const[2].xxx_; 6: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 7: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 8: TEX temp[3].w, input[3].xy__, 2D[3]; 9: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 10: MOV_SAT output[0].xyz, temp[0].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 2: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 3: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 4: 
src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 5: src0.xyz = input[0], src1.xyz = const[2] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 6: src0.xyz = temp[2], src1.xyz = temp[1], src2.xyz = const[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 7: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 8: TEX temp[3].w, input[3].xy__, 2D[3]; 9: src0.xyz = temp[0], src0.w = temp[3] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: TEX temp[3].w, input[3].xy__, 2D[3]; 4: src0.xyz = input[0], src1.xyz = const[2], src2.xyz = input[2] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[5].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[2], src1.xyz = temp[1], src2.xyz = const[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[0], src0.w = temp[3] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1].xyz, temp[1].xy__, 2D[0]; 2: TEX temp[4].x, temp[2].xy__, 2DSHADOW[2]; 3: TEX temp[3].w, temp[3].xy__, 2D[3]; 4: src0.xyz = temp[0], src1.xyz = const[2], src2.xyz = temp[2] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[0].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 6: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[2], src1.xyz = temp[0], src2.xyz = const[1] 
MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[1], src0.w = temp[3] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe403f403: src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00107804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x00240800:Addr0: 0t, Addr1: 2c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0a000:MAD dest:0 alp_A_src:2 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 
ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100002:Addr0: 2t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 10 Instructions ~ 6 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 0 Presub Operations ~ 5 
Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL IN[3] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL OUT[5], GENERIC[3] DCL CONST[0..3] DCL CONST[5..10] DCL TEMP[0] IMM FLT32 { 0.5000, 0.0000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MUL TEMP[0], IN[1], IMM[0].xxxx 5: MUL OUT[1], TEMP[0], CONST[5] 6: MOV OUT[2], IN[2] 7: MOV OUT[3], IN[3] 8: DP4 OUT[4].x, CONST[6], IN[0] 9: DP4 OUT[4].y, CONST[7], IN[0] 10: DP4 OUT[4].z, CONST[8], IN[0] 11: DP4 OUT[4].w, CONST[9], IN[0] 12: MAD OUT[5], IN[0].xzzz, CONST[10].xxxx, CONST[10].yyyy 13: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, 
const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex 
Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[1]; 14: MOV output[6], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[0]; 14: MOV output[6], temp[0]; CONST[11] = { 0.5000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead 
constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: MOV output[3], input[3]; 8: DP4 output[4].x, const[6], input[0]; 9: DP4 output[4].y, const[7], input[0]; 10: DP4 output[4].z, const[8], input[0]; 11: DP4 output[4].w, const[9], input[0]; 12: MAD output[5], input[0].xzzz, const[10].xxxx, const[10].yyyy; 13: MOV output[0], temp[0]; 14: MOV output[6], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 7: op: 0x00f06203 dst: 3o op: VE_ADD src0: 0x00d10061 reg: 3i swiz: X/ Y/ Z/ W src1: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 src2: 0x01248061 reg: 3i swiz: 0/ 0/ 0/ 0 8: op: 
0x00108201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 9: op: 0x00208201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00408201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00808201 dst: 4o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 12: op: 0x00f0a204 dst: 5o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000142 reg: 10c swiz: X/ X/ X/ X src2: 0x00492142 reg: 10c swiz: Y/ Y/ Y/ Y 13: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 14: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 15 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL IN[4], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[1] DCL TEMP[0..3] IMM FLT32 { 1.0000, 2.0000, 0.0000, 0.0000} 0: TEX TEMP[0].w, IN[2], SAMP[1], 2D 1: SUB OUT[0].w, IMM[0].xxxx, TEMP[0].wwww 2: TEX TEMP[1], IN[1], SAMP[0], 2D 3: TEX TEMP[2], IN[3], SAMP[2], SHADOW2D 4: MUL TEMP[3].xyz, IN[0], IMM[0].yyyy 5: MAD_SAT TEMP[2].xyz, TEMP[3], TEMP[2], CONST[1] 6: MUL TEMP[1].xyz, 
TEMP[1], TEMP[2] 7: TEX TEMP[0].w, IN[4], SAMP[3], 2D 8: MUL TEMP[1].xyz, TEMP[1], TEMP[0].wwww 9: MOV OUT[0].xyz, TEMP[1] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[3], 2DSHADOW[2]; 4: MUL temp[3].xyz, input[0], const[2].yyyy; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[4], 2D[3]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MOV output[0].xyz, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[3], 2DSHADOW[2]; 4: MUL temp[3].xyz, input[0], const[2].yyyy; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[4], 2D[3]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MOV output[0].xyz, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[3], 2DSHADOW[2]; 4: MUL temp[3].xyz, input[0], const[2].yyyy; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[4], 2D[3]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MOV output[0].xyz, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[3], 2DSHADOW[2]; 4: MUL temp[3].xyz, input[0], const[2].yyyy; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[4], 2D[3]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MOV output[0].xyz, temp[1]; 
Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB_SAT output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[2], input[3], 2DSHADOW[2]; 4: MUL temp[3].xyz, input[0], const[2].yyyy; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[4], 2D[3]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MOV_SAT output[0].xyz, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: SUB_SAT output[0].w, const[2].xxxx, temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[4], input[3], 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[2], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[2].yyyy; 8: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 9: MUL temp[1].xyz, temp[1], temp[2]; 10: TEX temp[0].w, input[4], 2D[3]; 11: MUL temp[1].xyz, temp[1], temp[0].wwww; 12: MOV_SAT output[0].xyz, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0].w, input[2], 2D[1]; 1: ADD_SAT output[0].w, const[2].xxxx, -temp[0].wwww; 2: TEX temp[1], input[1], 2D[0]; 3: TEX temp[4], input[3], 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[2], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[2].yyyy; 8: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 9: MUL temp[1].xyz, temp[1], temp[2]; 10: TEX temp[0].w, input[4], 2D[3]; 11: MUL temp[1].xyz, temp[1], temp[0].wwww; 12: MOV_SAT output[0].xyz, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0].w, input[2].xy__, 2D[1]; 1: ADD_SAT output[0].w, const[2].___x, -temp[0].___w; 2: TEX temp[1].xyz, input[1].xy__, 2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].___z; 5: ADD 
temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[2].yyy_; 8: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 9: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 10: TEX temp[0].w, input[4].xy__, 2D[3]; 11: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 12: MOV_SAT output[0].xyz, temp[1].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0].w, input[2].xy__, 2D[1]; 1: ADD_SAT output[0].w, none.___1, -temp[0].___w; 2: TEX temp[1].xyz, input[1].xy__, 2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[2].yyy_; 8: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 9: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 10: TEX temp[0].w, input[4].xy__, 2D[3]; 11: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 12: MOV_SAT output[0].xyz, temp[1].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].w, input[2].xy__, 2D[1]; 1: ADD_SAT output[0].w, none.___1, -temp[0].___w; 2: TEX temp[1].xyz, input[1].xy__, 2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[2].yyy_; 8: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 9: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 10: TEX temp[0].w, input[4].xy__, 2D[3]; 11: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 12: MOV_SAT output[0].xyz, temp[1].xyz_; CONST[2] = { 1.0000 2.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0].w, input[2].xy__, 2D[1]; 1: ADD_SAT output[0].w, none.___1, -temp[0].___w; 2: TEX temp[1].xyz, input[1].xy__, 
2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: MOV_SAT temp[5].w, input[3].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[2].yyy_; 8: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 9: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 10: TEX temp[0].w, input[4].xy__, 2D[3]; 11: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 12: MOV_SAT output[0].xyz, temp[1].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].w, input[2].xy__, 2D[1]; 1: src0.w = temp[0] MAD_SAT color[0].w, src0.1, src0.1, -src0.w 2: TEX temp[1].xyz, input[1].xy__, 2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: src0.xyz = input[3] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 7: src0.xyz = input[0], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 8: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 10: TEX temp[0].w, input[4].xy__, 2D[3]; 11: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].w, input[2].xy__, 2D[1]; 2: TEX temp[1].xyz, input[1].xy__, 2D[0]; 3: TEX temp[4].x, input[3].xy__, 2DSHADOW[2]; 4: src0.xyz = input[0], src0.w = temp[0], src1.xyz = const[2] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 MAD_SAT color[0].w, src0.1, src0.1, -src0.w 5: src0.xyz = input[3] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 6: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 7: src0.w = temp[5] CMP temp[2].xyz, 
src0.111, src0.000, src0.www 8: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 10: BEGIN_TEX; 11: TEX temp[0].w, input[4].xy__, 2D[3]; 12: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 13: src0.xyz = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[2].w, temp[2].xy__, 2D[1]; 2: TEX temp[1].xyz, temp[1].xy__, 2D[0]; 3: TEX temp[6].x, temp[3].xy__, 2DSHADOW[2]; 4: src0.xyz = temp[0], src0.w = temp[2], src1.xyz = const[2] MAD temp[5].xyz, src0.xyz, src1.yyy, src0.000 MAD_SAT color[0].w, src0.1, src0.1, -src0.w 5: src0.xyz = temp[3] MAD_SAT temp[0].w, src0.z, src0.1, src0.0 6: src0.xyz = temp[6], src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 7: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 8: src0.xyz = temp[5], src1.xyz = temp[0], src2.xyz = const[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 10: BEGIN_TEX; 11: TEX temp[2].w, temp[4].xy__, 2D[3]; 12: src0.xyz = temp[1], src0.w = temp[2] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 13: src0.xyz = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe402f402: src: 2 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe406f403: src: 3 R/G/A/A dst: 6 R/G/B/A 
3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00143805:OUT TEX_WAIT wmask: RGB omask: A 1:RGB_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00c18000:MAD dest:0 alp_A_src:0 1 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x58490050:MAD dest:5 rgb_C_src:0 0/0/0 0 alp_C_src:0 A 1 4 0:CMN_INST 0x00104004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c08000:MAD dest:0 alp_A_src:0 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100005:Addr0: 5t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 
R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe402f404: src: 4 R/G/A/A dst: 2 R/G/B/A 3:TEX_DXDY: 0x00000000 10 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 11 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 12 Instructions ~ 6 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 4 Texture Instructions ~ 0 Presub Operations ~ 7 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL CONST[0..3] DCL CONST[5..10] DCL TEMP[0] IMM FLT32 { 
0.5000, 0.0000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MUL TEMP[0], IN[1], IMM[0].xxxx 5: MUL OUT[1], TEMP[0], CONST[5] 6: MOV OUT[2], IN[2] 7: DP4 OUT[3].x, CONST[6], IN[0] 8: DP4 OUT[3].y, CONST[7], IN[0] 9: DP4 OUT[3].z, CONST[8], IN[0] 10: DP4 OUT[3].w, CONST[9], IN[0] 11: MAD OUT[4], IN[0].xzzz, CONST[10].xxxx, CONST[10].yyyy 12: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], 
input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD 
output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: MUL temp[0], input[1], const[11].xxxx; 5: MUL output[1], temp[0], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[1]; 13: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[0]; 13: MOV output[5], temp[0]; CONST[11] = { 0.5000 0.0000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MUL temp[1], input[1], const[11].xxxx; 5: MUL output[1], temp[1], const[5]; 6: MOV output[2], input[2]; 7: DP4 output[3].x, const[6], input[0]; 8: DP4 output[3].y, const[7], input[0]; 9: DP4 output[3].z, const[8], input[0]; 10: DP4 output[3].w, const[9], input[0]; 11: MAD output[4], input[0].xzzz, const[10].xxxx, const[10].yyyy; 12: MOV output[0], temp[0]; 13: MOV output[5], temp[0]; Final 
vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x01248162 reg: 11c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x012480a2 reg: 5c swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 7: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 9: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000142 
reg: 10c swiz: X/ X/ X/ X src2: 0x00492142 reg: 10c swiz: Y/ Y/ Y/ Y 12: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 13: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 14 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[2] DCL SAMP[3] DCL CONST[1..2] DCL TEMP[0..3] IMM FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MOV OUT[0].w, TEMP[0] 2: TEX TEMP[1], IN[2], SAMP[2], SHADOW2D 3: MUL TEMP[2].xyz, IN[0], IMM[0].xxxx 4: MAD_SAT TEMP[1].xyz, TEMP[2], TEMP[1], CONST[1] 5: MUL TEMP[0].xyz, TEMP[0], TEMP[1] 6: TEX TEMP[3].w, IN[3], SAMP[3], 2D 7: MUL TEMP[0].xyz, TEMP[0], TEMP[3].wwww 8: MUL OUT[0].xyz, TEMP[0], CONST[2] 9: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: TEX temp[1], input[2], 2DSHADOW[2]; 3: MUL temp[2].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 5: MUL temp[0].xyz, temp[0], temp[1]; 6: TEX temp[3].w, input[3], 2D[3]; 7: MUL temp[0].xyz, temp[0], temp[3].wwww; 8: MUL output[0].xyz, temp[0], const[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: TEX temp[1], input[2], 2DSHADOW[2]; 3: MUL temp[2].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 5: MUL temp[0].xyz, temp[0], temp[1]; 6: TEX temp[3].w, 
input[3], 2D[3]; 7: MUL temp[0].xyz, temp[0], temp[3].wwww; 8: MUL output[0].xyz, temp[0], const[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: TEX temp[1], input[2], 2DSHADOW[2]; 3: MUL temp[2].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 5: MUL temp[0].xyz, temp[0], temp[1]; 6: TEX temp[3].w, input[3], 2D[3]; 7: MUL temp[0].xyz, temp[0], temp[3].wwww; 8: MUL output[0].xyz, temp[0], const[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: TEX temp[1], input[2], 2DSHADOW[2]; 3: MUL temp[2].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 5: MUL temp[0].xyz, temp[0], temp[1]; 6: TEX temp[3].w, input[3], 2D[3]; 7: MUL temp[0].xyz, temp[0], temp[3].wwww; 8: MUL output[0].xyz, temp[0], const[2]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: TEX temp[1], input[2], 2DSHADOW[2]; 3: MUL temp[2].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 5: MUL temp[0].xyz, temp[0], temp[1]; 6: TEX temp[3].w, input[3], 2D[3]; 7: MUL temp[0].xyz, temp[0], temp[3].wwww; 8: MUL_SAT output[0].xyz, temp[0], const[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[2]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP temp[1], temp[5].www1, none.0001, none.1111; 6: MUL temp[2].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 8: MUL temp[0].xyz, temp[0], temp[1]; 9: TEX temp[3].w, input[3], 2D[3]; 10: MUL temp[0].xyz, temp[0], temp[3].wwww; 11: MUL_SAT output[0].xyz, temp[0], const[2]; Fragment Program: after 'native rewrite' # Radeon 
Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[2]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP temp[1], temp[5].www1, none.0001, none.1111; 6: MUL temp[2].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[1].xyz, temp[2], temp[1], const[1]; 8: MUL temp[0].xyz, temp[0], temp[1]; 9: TEX temp[3].w, input[3], 2D[3]; 10: MUL temp[0].xyz, temp[0], temp[3].wwww; 11: MUL_SAT output[0].xyz, temp[0], const[2]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[2].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 8: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 9: TEX temp[3].w, input[3].xy__, 2D[3]; 10: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 11: MUL_SAT output[0].xyz, temp[0].xyz_, const[2].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[2].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 8: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 9: TEX temp[3].w, input[3].xy__, 2D[3]; 10: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 11: MUL_SAT output[0].xyz, temp[0].xyz_, const[2].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: MOV_SAT 
temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[2].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 8: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 9: TEX temp[3].w, input[3].xy__, 2D[3]; 10: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 11: MUL_SAT output[0].xyz, temp[0].xyz_, const[2].xyz_; CONST[3] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[2].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[1].xyz, temp[2].xyz_, temp[1].xyz_, const[1].xyz_; 8: MUL temp[0].xyz, temp[0].xyz_, temp[1].xyz_; 9: TEX temp[3].w, input[3].xy__, 2D[3]; 10: MUL temp[0].xyz, temp[0].xyz_, temp[3].www_; 11: MUL_SAT output[0].xyz, temp[0].xyz_, const[2].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: src0.w = temp[0] MAD_SAT color[0].w, src0.w, src0.1, src0.0 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 4: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 5: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 6: src0.xyz = input[0], src1.xyz = const[3] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 7: src0.xyz = temp[2], src1.xyz = temp[1], src2.xyz = const[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[3].w, input[3].xy__, 2D[3]; 10: src0.xyz = temp[0], src0.w = temp[3] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[0], src1.xyz = 
const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[2]; 3: TEX temp[3].w, input[3].xy__, 2D[3]; 4: src0.xyz = input[0], src0.w = temp[0], src1.xyz = const[3] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 5: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 6: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 7: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 8: src0.xyz = temp[2], src1.xyz = temp[1], src2.xyz = const[1] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 10: src0.xyz = temp[0], src0.w = temp[3] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[0], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], temp[1].xy__, 2D[0]; 2: TEX temp[5].x, temp[2].xy__, 2DSHADOW[2]; 3: TEX temp[4].w, temp[3].xy__, 2D[3]; 4: src0.xyz = temp[0], src0.w = temp[1], src1.xyz = const[3] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 5: src0.xyz = temp[2] MAD_SAT temp[0].w, src0.z, src0.1, src0.0 6: src0.xyz = temp[5], src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 7: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 8: src0.xyz = temp[3], src1.xyz = temp[0], src2.xyz = const[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 10: src0.xyz = temp[1], src0.w = temp[4] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 
0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f402: src: 2 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f403: src: 3 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00143805:OUT TEX_WAIT wmask: RGB omask: A 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490030:MAD dest:3 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00104004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c08000:MAD dest:0 alp_A_src:0 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 5 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100003:Addr0: 3t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040801:Addr0: 1t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 11 Instructions ~ 6 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control 
Instructions ~ 3 Texture Instructions ~ 0 Presub Operations ~ 6 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..4] DCL CONST[6..11] DCL TEMP[0] IMM FLT32 { 0.0000, 0.5000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: DP3 TEMP[0], IN[1], -CONST[4] 5: MAX TEMP[0], IMM[0].xxxx, TEMP[0] 6: MUL TEMP[0], TEMP[0], IMM[0].yyyy 7: MUL OUT[1], TEMP[0], CONST[6] 8: MOV OUT[2], IN[2] 9: DP4 OUT[3].x, CONST[7], IN[0] 10: DP4 OUT[3].y, CONST[8], IN[0] 11: DP4 OUT[3].z, CONST[9], IN[0] 12: DP4 OUT[3].w, CONST[10], IN[0] 13: MAD OUT[4], IN[0].xzzz, CONST[11].xxxx, CONST[11].yyyy 14: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 
10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL 
output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, 
const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; CONST[12] = { 0.0000 0.5000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src1: 0x0f110082 
reg: 4c swiz: -X/-Y/-Z/ 0 src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02007 dst: 1t op: VE_MAXIMUM src0: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src1: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00492182 reg: 12c swiz: Y/ Y/ Y/ Y src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 7: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 8: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 9: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 12: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y 14: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 16 
Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL CONST[1..2] DCL TEMP[0..3] IMM FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MOV TEMP[1].xyz, TEMP[0] 2: TEX TEMP[2], IN[2], SAMP[1], SHADOW2D 3: MUL TEMP[3].xyz, IN[0], IMM[0].xxxx 4: MAD_SAT TEMP[2].xyz, TEMP[3], TEMP[2], CONST[1] 5: MUL TEMP[1].xyz, TEMP[1], TEMP[2] 6: TEX TEMP[0].w, IN[3], SAMP[2], 2D 7: MUL TEMP[1].xyz, TEMP[1], TEMP[0].wwww 8: MUL OUT[0].xyz, TEMP[1], CONST[2] 9: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], 
temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP temp[2], temp[5].www1, none.0001, none.1111; 6: MUL temp[3].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 8: MUL temp[1].xyz, temp[1], temp[2]; 9: TEX temp[0].w, input[3], 2D[2]; 10: MUL temp[1].xyz, temp[1], temp[0].wwww; 11: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP temp[2], temp[5].www1, none.0001, none.1111; 6: MUL temp[3].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 8: MUL temp[1].xyz, temp[1], temp[2]; 9: TEX temp[0].w, input[3], 2D[2]; 10: MUL temp[1].xyz, temp[1], temp[0].wwww; 11: 
MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: MOV temp[1].xyz, temp[0].xyz_; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 8: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 11: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; CONST[3] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 
0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 3: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 4: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 5: src0.xyz = input[0], src1.xyz = const[3] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 6: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: TEX temp[0].w, input[3].xy__, 2D[2]; 4: src0.xyz = input[0], src1.xyz = const[3], src2.xyz = input[2] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[5].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT 
temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1].xyz, temp[1].xy__, 2D[0]; 2: TEX temp[4].x, temp[2].xy__, 2DSHADOW[1]; 3: TEX temp[1].w, temp[3].xy__, 2D[2]; 4: src0.xyz = temp[0], src1.xyz = const[3], src2.xyz = temp[2] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[0].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 6: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[2], src1.xyz = temp[0], src2.xyz = const[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[0], src0.w = temp[1] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[0], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f403: src: 3 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00107804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x00240c00:Addr0: 0t, Addr1: 3c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0a000:MAD dest:0 alp_A_src:2 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100002:Addr0: 2t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 10 Instructions ~ 6 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 0 Presub Operations ~ 5 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..2] DCL CONST[4..14] DCL TEMP[0..2] IMM FLT32 { 1.0000, 0.0000, 0.5000, 0.0000} 0: DP4 TEMP[0].x, CONST[0], IN[0] 1: DP4 TEMP[0].y, CONST[1], IN[0] 2: DP4 TEMP[0].z, CONST[2], IN[0] 3: MOV TEMP[0].w, IMM[0].xxxx 4: DP3 TEMP[1].x, CONST[0], IN[1] 5: DP3 TEMP[1].y, CONST[1], IN[1] 6: DP3 TEMP[1].z, CONST[2], IN[1] 7: DP4 OUT[0].x, CONST[4], TEMP[0] 8: DP4 OUT[0].y, CONST[5], TEMP[0] 9: DP4 OUT[0].z, CONST[6], TEMP[0] 10: DP4 OUT[0].w, CONST[7], TEMP[0] 11: DP3 TEMP[2], TEMP[1], -CONST[8] 12: MAX TEMP[2], IMM[0].yyyy, TEMP[2] 13: MUL TEMP[2], TEMP[2], IMM[0].zzzz 14: MUL OUT[1], TEMP[2], CONST[9] 15: MOV OUT[2], IN[2] 16: DP4 OUT[3].x, CONST[10], TEMP[0] 17: DP4 OUT[3].y, CONST[11], TEMP[0] 18: DP4 OUT[3].z, CONST[12], TEMP[0] 19: DP4 OUT[3].w, CONST[13], TEMP[0] 20: MAD OUT[4], TEMP[0].xzzz, CONST[14].xxxx, CONST[14].yyyy 21: END Vertex Program: before 
compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 
temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], 
input[0]; 3: MOV temp[0].w, const[15].___x; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], none.0000, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: 
MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], none.0000, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[2].x, const[4], temp[0]; 8: DP4 temp[2].y, const[5], temp[0]; 9: DP4 temp[2].z, const[6], temp[0]; 10: DP4 temp[2].w, const[7], temp[0]; 11: DP4 temp[1], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[1], none.0000, temp[1]; 13: MUL temp[1], temp[1], const[15].zzzz; 14: MUL output[1], temp[1], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[2]; 22: MOV output[5], temp[2]; CONST[15] = { 1.0000 0.0000 0.5000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], 
input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[2].x, const[4], temp[0]; 8: DP4 temp[2].y, const[5], temp[0]; 9: DP4 temp[2].z, const[6], temp[0]; 10: DP4 temp[2].w, const[7], temp[0]; 11: DP4 temp[1], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[1], none.0000, temp[1]; 13: MUL temp[1], temp[1], const[15].zzzz; 14: MUL output[1], temp[1], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[2]; 22: MOV output[5], temp[2]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800003 dst: 0t op: VE_ADD src0: 0x017fe000 reg: 0t swiz: U/ U/ U/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110002 reg: 0c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00202001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110022 reg: 1c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00402001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110042 reg: 2c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 
0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 7: op: 0x00104001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00204001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 9: op: 0x00404001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 10: op: 0x00804001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 11: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x0f110102 reg: 8c swiz: -X/-Y/-Z/ 0 src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 12: op: 0x00f02007 dst: 1t op: VE_MAXIMUM src0: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src1: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x009241e2 reg: 15c swiz: Z/ Z/ Z/ Z src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 14: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x01248122 reg: 9c swiz: 0/ 0/ 0/ 0 15: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 16: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10162 reg: 11c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 18: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10182 reg: 12c swiz: X/ Y/ Z/ 
W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 19: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d101a2 reg: 13c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920000 reg: 0t swiz: X/ Z/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x004921c2 reg: 14c swiz: Y/ Y/ Y/ Y 21: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 22: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 23 Instructions ~ 0 Flow Control Instructions ~ 3 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL CONST[1..2] DCL TEMP[0..3] IMM FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MOV TEMP[1].xyz, TEMP[0] 2: TEX TEMP[2], IN[2], SAMP[1], SHADOW2D 3: MUL TEMP[3].xyz, IN[0], IMM[0].xxxx 4: MAD_SAT TEMP[2].xyz, TEMP[3], TEMP[2], CONST[1] 5: MUL TEMP[1].xyz, TEMP[1], TEMP[2] 6: TEX TEMP[0].w, IN[3], SAMP[2], 2D 7: MUL TEMP[1].xyz, TEMP[1], TEMP[0].wwww 8: MUL OUT[0].xyz, TEMP[1], CONST[2] 9: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: 
MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[2], input[2], 2DSHADOW[1]; 3: MUL temp[3].xyz, input[0], const[3].xxxx; 4: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 5: MUL temp[1].xyz, temp[1], temp[2]; 6: TEX temp[0].w, input[3], 2D[2]; 7: MUL temp[1].xyz, temp[1], temp[0].wwww; 8: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP 
temp[2], temp[5].www1, none.0001, none.1111; 6: MUL temp[3].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 8: MUL temp[1].xyz, temp[1], temp[2]; 9: TEX temp[0].w, input[3], 2D[2]; 10: MUL temp[1].xyz, temp[1], temp[0].wwww; 11: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV temp[1].xyz, temp[0]; 2: TEX temp[4], input[2], 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].zzzz; 4: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 5: CMP temp[2], temp[5].www1, none.0001, none.1111; 6: MUL temp[3].xyz, input[0], const[3].xxxx; 7: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 8: MUL temp[1].xyz, temp[1], temp[2]; 9: TEX temp[0].w, input[3], 2D[2]; 10: MUL temp[1].xyz, temp[1], temp[0].wwww; 11: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: MOV temp[1].xyz, temp[0].xyz_; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 8: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 11: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 
2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; CONST[3] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: MOV_SAT temp[5].w, input[2].___z; 3: ADD temp[5].w, -temp[5].___w, temp[4].___x; 4: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 5: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 6: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 7: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 10: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].xyz, input[1].xy__, 2D[0]; 1: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 2: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 3: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 4: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 5: src0.xyz = input[0], src1.xyz = const[3] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 6: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 7: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, 
src0.xyz, src1.xyz, src0.000 8: TEX temp[0].w, input[3].xy__, 2D[2]; 9: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: TEX temp[0].w, input[3].xy__, 2D[2]; 4: src0.xyz = input[0], src1.xyz = const[3], src2.xyz = input[2] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[5].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1].xyz, temp[1].xy__, 2D[0]; 2: TEX temp[4].x, temp[2].xy__, 2DSHADOW[1]; 3: TEX temp[1].w, temp[3].xy__, 2D[2]; 4: src0.xyz = temp[0], src1.xyz = const[3], src2.xyz = temp[2] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT temp[0].w, src2.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 6: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[2], src1.xyz = temp[0], src2.xyz = const[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 9: src0.xyz = temp[0], src0.w = temp[1] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 10: src0.xyz = temp[0], src1.xyz = const[2] MAD_SAT color[0].xyz, 
src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f403: src: 3 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00107804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x00240c00:Addr0: 0t, Addr1: 3c, Addr2: 2t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0a000:MAD dest:0 alp_A_src:2 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490020:MAD dest:2 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100002:Addr0: 2t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 
0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 9 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 10 Instructions ~ 6 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 0 Presub Operations ~ 5 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..4] DCL CONST[6..11] DCL TEMP[0] IMM FLT32 { 
0.0000, 0.5000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: DP3 TEMP[0], IN[1], -CONST[4] 5: MAX TEMP[0], IMM[0].xxxx, TEMP[0] 6: MUL TEMP[0], TEMP[0], IMM[0].yyyy 7: MUL OUT[1], TEMP[0], CONST[6] 8: MOV OUT[2], IN[2] 9: DP4 OUT[3].x, CONST[7], IN[0] 10: DP4 OUT[3].y, CONST[8], IN[0] 11: DP4 OUT[3].z, CONST[9], IN[0] 12: DP4 OUT[3].w, CONST[10], IN[0] 13: MAD OUT[4], IN[0].xzzz, CONST[11].xxxx, CONST[11].yyyy 14: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, 
const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; 
Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], 
input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; CONST[12] = { 0.0000 0.5000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src1: 0x0f110082 reg: 4c swiz: -X/-Y/-Z/ 0 src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02007 dst: 1t op: VE_MAXIMUM src0: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src1: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00492182 reg: 
12c swiz: Y/ Y/ Y/ Y src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 7: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 8: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 9: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 12: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y 14: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 16 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL 
SAMP[1] DCL SAMP[2] DCL CONST[0] DCL CONST[2..3] DCL TEMP[0..3] IMM FLT32 { 1.0000, 2.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: LRP TEMP[1].xyz, CONST[0], IMM[0].xxxx, TEMP[0].wwww 2: MUL TEMP[2].xyz, TEMP[0], TEMP[1] 3: TEX TEMP[1], IN[2], SAMP[1], SHADOW2D 4: MUL TEMP[3].xyz, IN[0], IMM[0].yyyy 5: MAD_SAT TEMP[1].xyz, TEMP[3], TEMP[1], CONST[2] 6: MUL TEMP[2].xyz, TEMP[2], TEMP[1] 7: TEX TEMP[0].w, IN[3], SAMP[2], 2D 8: MUL TEMP[2].xyz, TEMP[2], TEMP[0].wwww 9: MUL OUT[0].xyz, TEMP[2], CONST[3] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; 
Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[4], input[2], 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[1], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[4].yyyy; 8: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 9: MUL temp[2].xyz, temp[2], temp[1]; 10: TEX temp[0].w, input[3], 2D[2]; 11: MUL temp[2].xyz, temp[2], temp[0].wwww; 12: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: ADD temp[1].xyz, const[4].xxxx, -temp[0].wwww; 2: MAD temp[1].xyz, const[0], temp[1], temp[0].wwww; 3: MUL temp[2].xyz, temp[0], temp[1]; 4: TEX temp[4], input[2], 2DSHADOW[1]; 5: MOV_SAT temp[5].w, input[2].zzzz; 6: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 7: CMP 
temp[1], temp[5].www1, none.0001, none.1111; 8: MUL temp[3].xyz, input[0], const[4].yyyy; 9: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 10: MUL temp[2].xyz, temp[2], temp[1]; 11: TEX temp[0].w, input[3], 2D[2]; 12: MUL temp[2].xyz, temp[2], temp[0].wwww; 13: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: ADD temp[1].xyz, const[4].xxx_, -temp[0].www_; 2: MAD temp[1].xyz, const[0].xyz_, temp[1].xyz_, temp[0].www_; 3: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 4: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 5: MOV_SAT temp[5].w, input[2].___z; 6: ADD temp[5].w, -temp[5].___w, temp[4].___x; 7: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 8: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 9: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 10: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 11: TEX temp[0].w, input[3].xy__, 2D[2]; 12: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 13: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, 
temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; CONST[4] = { 1.0000 2.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: src0.xyz = const[0], src0.w = temp[0], srcp.w = (1 - src0) MAD temp[1].xyz, src0.xyz, srcp.www, src0.www 2: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 7: src0.xyz = input[0], src1.xyz = const[4] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 8: src0.xyz = temp[3], src1.xyz = temp[1], src2.xyz = const[2] MAD_SAT 
temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: src0.xyz = temp[2], src0.w = temp[0] MAD temp[2].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[2], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: src0.xyz = const[0], src0.w = temp[0], src1.xyz = input[2], srcp.w = (1 - src0) MAD temp[1].xyz, src0.xyz, srcp.www, src0.www MAD_SAT temp[5].w, src1.z, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[5], src1.xyz = temp[1], src2.xyz = temp[4] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, -src0.w, src0.1, src2.x 5: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 6: src0.xyz = input[0], src1.xyz = const[4] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 7: src0.xyz = temp[3], src1.xyz = temp[1], src2.xyz = const[2] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: src0.xyz = temp[2], src0.w = temp[0] MAD temp[2].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[2], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], temp[1].xy__, 2D[0]; 2: TEX temp[4].x, temp[2].xy__, 2DSHADOW[1]; 3: src0.xyz = const[0], src0.w = temp[1], src1.xyz = temp[2], srcp.w = (1 - src0) MAD temp[2].xyz, src0.xyz, srcp.www, src0.www MAD_SAT temp[5].w, src1.z, src0.1, src0.0 4: src0.xyz = temp[1], src0.w = temp[5], src1.xyz = temp[2], src2.xyz = temp[4] MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, -src0.w, src0.1, src2.x 5: src0.w = temp[5] CMP temp[2].xyz, src0.111, 
src0.000, src0.www 6: src0.xyz = temp[0], src1.xyz = const[4] MAD temp[0].xyz, src0.xyz, src1.yyy, src0.000 7: src0.xyz = temp[0], src1.xyz = temp[2], src2.xyz = const[2] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[4], src1.xyz = temp[2] MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[1].w, temp[3].xy__, 2D[2]; 11: src0.xyz = temp[4], src0.w = temp[1] MAD temp[4].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[4], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00107804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000900:Addr0: 0c, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0xc8020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:3 3 RGB_INST: 0x006de220:rgb_A_src:0 R/G/B 0 rgb_B_src:3 A/A/A 0 targ: 0 4 ALPHA_INST:0x00c09050:MAD dest:5 alp_A_src:1 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x2036c020:MAD dest:2 rgb_C_src:0 A/A/A 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x00400801:Addr0: 1t, Addr1: 2t, Addr2: 4t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c2c050:MAD dest:5 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x04490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:2 R 0 4 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c028:CMP dest:2 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10200800:Addr0: 0t, Addr1: 2t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222020:MAD dest:2 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000804:Addr0: 4t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f403: src: 3 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 9 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040c04:Addr0: 4t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 11 Instructions ~ 8 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 1 Presub Operations ~ 6 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..2] DCL CONST[4..14] DCL TEMP[0..2] IMM FLT32 { 1.0000, 0.0000, 0.5000, 0.0000} 0: DP4 TEMP[0].x, CONST[0], IN[0] 1: DP4 TEMP[0].y, CONST[1], IN[0] 2: DP4 TEMP[0].z, CONST[2], IN[0] 3: MOV TEMP[0].w, IMM[0].xxxx 4: DP3 TEMP[1].x, CONST[0], IN[1] 5: DP3 TEMP[1].y, CONST[1], IN[1] 6: DP3 TEMP[1].z, CONST[2], IN[1] 7: DP4 OUT[0].x, CONST[4], TEMP[0] 8: DP4 OUT[0].y, CONST[5], TEMP[0] 9: DP4 OUT[0].z, CONST[6], TEMP[0] 10: DP4 OUT[0].w, CONST[7], TEMP[0] 11: DP3 TEMP[2], TEMP[1], -CONST[8] 12: MAX TEMP[2], IMM[0].yyyy, TEMP[2] 13: MUL TEMP[2], TEMP[2], IMM[0].zzzz 14: MUL OUT[1], TEMP[2], CONST[9] 15: MOV OUT[2], IN[2] 16: DP4 OUT[3].x, CONST[10], TEMP[0] 17: DP4 OUT[3].y, CONST[11], TEMP[0] 18: DP4 OUT[3].z, CONST[12], TEMP[0] 19: DP4 OUT[3].w, CONST[13], TEMP[0] 20: MAD OUT[4], TEMP[0].xzzz, CONST[14].xxxx, CONST[14].yyyy 21: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 
temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP3 temp[1].x, const[0], input[1]; 5: DP3 temp[1].y, const[1], input[1]; 6: DP3 temp[1].z, const[2], input[1]; 7: DP4 temp[3].x, 
const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP3 temp[2], temp[1], -const[8]; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].xxxx; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, const[15].___x; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 
temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], const[15].yyyy, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], none.0000, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[3].x, const[4], temp[0]; 8: DP4 temp[3].y, const[5], 
temp[0]; 9: DP4 temp[3].z, const[6], temp[0]; 10: DP4 temp[3].w, const[7], temp[0]; 11: DP4 temp[2], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[2], none.0000, temp[2]; 13: MUL temp[2], temp[2], const[15].zzzz; 14: MUL output[1], temp[2], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[3]; 22: MOV output[5], temp[3]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[2].x, const[4], temp[0]; 8: DP4 temp[2].y, const[5], temp[0]; 9: DP4 temp[2].z, const[6], temp[0]; 10: DP4 temp[2].w, const[7], temp[0]; 11: DP4 temp[1], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[1], none.0000, temp[1]; 13: MUL temp[1], temp[1], const[15].zzzz; 14: MUL output[1], temp[1], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[2]; 22: MOV output[5], temp[2]; CONST[15] = { 1.0000 0.0000 0.5000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[0].xyz0, input[1].xyz0; 5: DP4 temp[1].y, const[1].xyz0, input[1].xyz0; 6: DP4 temp[1].z, const[2].xyz0, input[1].xyz0; 7: DP4 temp[2].x, const[4], temp[0]; 8: DP4 
temp[2].y, const[5], temp[0]; 9: DP4 temp[2].z, const[6], temp[0]; 10: DP4 temp[2].w, const[7], temp[0]; 11: DP4 temp[1], temp[1].xyz0, const[8].-x-y-z0; 12: MAX temp[1], none.0000, temp[1]; 13: MUL temp[1], temp[1], const[15].zzzz; 14: MUL output[1], temp[1], const[9]; 15: MOV output[2], input[2]; 16: DP4 output[3].x, const[10], temp[0]; 17: DP4 output[3].y, const[11], temp[0]; 18: DP4 output[3].z, const[12], temp[0]; 19: DP4 output[3].w, const[13], temp[0]; 20: MAD output[4], temp[0].xzzz, const[14].xxxx, const[14].yyyy; 21: MOV output[0], temp[2]; 22: MOV output[5], temp[2]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800003 dst: 0t op: VE_ADD src0: 0x017fe000 reg: 0t swiz: U/ U/ U/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110002 reg: 0c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00202001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110022 reg: 1c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 6: op: 0x00402001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110042 reg: 2c swiz: X/ Y/ Z/ 0 src1: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 7: op: 0x00104001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00204001 dst: 2t 
op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 9: op: 0x00404001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 10: op: 0x00804001 dst: 2t op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 11: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110020 reg: 1t swiz: X/ Y/ Z/ 0 src1: 0x0f110102 reg: 8c swiz: -X/-Y/-Z/ 0 src2: 0x01248102 reg: 8c swiz: 0/ 0/ 0/ 0 12: op: 0x00f02007 dst: 1t op: VE_MAXIMUM src0: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src1: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 13: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x009241e2 reg: 15c swiz: Z/ Z/ Z/ Z src2: 0x012481e2 reg: 15c swiz: 0/ 0/ 0/ 0 14: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src2: 0x01248122 reg: 9c swiz: 0/ 0/ 0/ 0 15: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 16: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 17: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10162 reg: 11c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 18: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10182 reg: 12c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 19: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d101a2 reg: 13c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 
reg: 0t swiz: 0/ 0/ 0/ 0 20: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920000 reg: 0t swiz: X/ Z/ Z/ Z src1: 0x000001c2 reg: 14c swiz: X/ X/ X/ X src2: 0x004921c2 reg: 14c swiz: Y/ Y/ Y/ Y 21: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 22: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 23 Instructions ~ 0 Flow Control Instructions ~ 3 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL CONST[0] DCL CONST[2..3] DCL TEMP[0..3] IMM FLT32 { 1.0000, 2.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: LRP TEMP[1].xyz, CONST[0], IMM[0].xxxx, TEMP[0].wwww 2: MUL TEMP[2].xyz, TEMP[0], TEMP[1] 3: TEX TEMP[1], IN[2], SAMP[1], SHADOW2D 4: MUL TEMP[3].xyz, IN[0], IMM[0].yyyy 5: MAD_SAT TEMP[1].xyz, TEMP[3], TEMP[1], CONST[2] 6: MUL TEMP[2].xyz, TEMP[2], TEMP[1] 7: TEX TEMP[0].w, IN[3], SAMP[2], 2D 8: MUL TEMP[2].xyz, TEMP[2], TEMP[0].wwww 9: MUL OUT[0].xyz, TEMP[2], CONST[3] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'rewrite depth out' # 
Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL output[0].xyz, temp[2], const[3]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[1], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[4].yyyy; 5: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 6: MUL temp[2].xyz, temp[2], temp[1]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[2].xyz, temp[2], temp[0].wwww; 9: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX 
temp[0], input[1], 2D[0]; 1: LRP temp[1].xyz, const[0], const[4].xxxx, temp[0].wwww; 2: MUL temp[2].xyz, temp[0], temp[1]; 3: TEX temp[4], input[2], 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[1], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[4].yyyy; 8: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 9: MUL temp[2].xyz, temp[2], temp[1]; 10: TEX temp[0].w, input[3], 2D[2]; 11: MUL temp[2].xyz, temp[2], temp[0].wwww; 12: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: ADD temp[1].xyz, const[4].xxxx, -temp[0].wwww; 2: MAD temp[1].xyz, const[0], temp[1], temp[0].wwww; 3: MUL temp[2].xyz, temp[0], temp[1]; 4: TEX temp[4], input[2], 2DSHADOW[1]; 5: MOV_SAT temp[5].w, input[2].zzzz; 6: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 7: CMP temp[1], temp[5].www1, none.0001, none.1111; 8: MUL temp[3].xyz, input[0], const[4].yyyy; 9: MAD_SAT temp[1].xyz, temp[3], temp[1], const[2]; 10: MUL temp[2].xyz, temp[2], temp[1]; 11: TEX temp[0].w, input[3], 2D[2]; 12: MUL temp[2].xyz, temp[2], temp[0].wwww; 13: MUL_SAT output[0].xyz, temp[2], const[3]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: ADD temp[1].xyz, const[4].xxx_, -temp[0].www_; 2: MAD temp[1].xyz, const[0].xyz_, temp[1].xyz_, temp[0].www_; 3: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 4: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 5: MOV_SAT temp[5].w, input[2].___z; 6: ADD temp[5].w, -temp[5].___w, temp[4].___x; 7: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 8: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 9: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 10: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 11: TEX temp[0].w, input[3].xy__, 2D[2]; 12: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 13: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; 
Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; CONST[4] = { 1.0000 2.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MAD temp[1].xyz, const[0].xyz_, (1 - temp[0]).www_, temp[0].www_; 2: MUL temp[2].xyz, temp[0].xyz_, temp[1].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[1].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[4].yyy_; 8: MAD_SAT temp[1].xyz, temp[3].xyz_, temp[1].xyz_, 
const[2].xyz_; 9: MUL temp[2].xyz, temp[2].xyz_, temp[1].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[2].xyz, temp[2].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[2].xyz_, const[3].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: src0.xyz = const[0], src0.w = temp[0], srcp.w = (1 - src0) MAD temp[1].xyz, src0.xyz, srcp.www, src0.www 2: src0.xyz = temp[0], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 7: src0.xyz = input[0], src1.xyz = const[4] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 8: src0.xyz = temp[3], src1.xyz = temp[1], src2.xyz = const[2] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 9: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: src0.xyz = temp[2], src0.w = temp[0] MAD temp[2].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[2], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: src0.xyz = const[0], src0.w = temp[0], src1.xyz = input[2], srcp.w = (1 - src0) MAD temp[1].xyz, src0.xyz, srcp.www, src0.www MAD_SAT temp[5].w, src1.z, src0.1, src0.0 4: src0.xyz = temp[0], src0.w = temp[5], src1.xyz = temp[1], src2.xyz = temp[4] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, -src0.w, src0.1, src2.x 5: src0.w = temp[5] CMP temp[1].xyz, src0.111, src0.000, src0.www 6: src0.xyz = input[0], src1.xyz = const[4] MAD temp[3].xyz, src0.xyz, src1.yyy, src0.000 7: src0.xyz = temp[3], src1.xyz = temp[1], src2.xyz = 
const[2] MAD_SAT temp[1].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[2], src1.xyz = temp[1] MAD temp[2].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: src0.xyz = temp[2], src0.w = temp[0] MAD temp[2].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[2], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], temp[1].xy__, 2D[0]; 2: TEX temp[4].x, temp[2].xy__, 2DSHADOW[1]; 3: src0.xyz = const[0], src0.w = temp[1], src1.xyz = temp[2], srcp.w = (1 - src0) MAD temp[2].xyz, src0.xyz, srcp.www, src0.www MAD_SAT temp[5].w, src1.z, src0.1, src0.0 4: src0.xyz = temp[1], src0.w = temp[5], src1.xyz = temp[2], src2.xyz = temp[4] MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 MAD temp[5].w, -src0.w, src0.1, src2.x 5: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 6: src0.xyz = temp[0], src1.xyz = const[4] MAD temp[0].xyz, src0.xyz, src1.yyy, src0.000 7: src0.xyz = temp[0], src1.xyz = temp[2], src2.xyz = const[2] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[4], src1.xyz = temp[2] MAD temp[4].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[1].w, temp[3].xy__, 2D[2]; 11: src0.xyz = temp[4], src0.w = temp[1] MAD temp[4].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[4], src1.xyz = const[3] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe404f402: src: 2 R/G/A/A dst: 4 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00107804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x08000900:Addr0: 0c, Addr1: 2t, Addr2: 128t, 
srcp:0 2:ALPHA_ADDR 0xc8020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:3 3 RGB_INST: 0x006de220:rgb_A_src:0 R/G/B 0 rgb_B_src:3 A/A/A 0 targ: 0 4 ALPHA_INST:0x00c09050:MAD dest:5 alp_A_src:1 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x2036c020:MAD dest:2 rgb_C_src:0 A/A/A 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00007804:ALU TEX_WAIT wmask: ARGB omask: NONE 1:RGB_ADDR 0x00400801:Addr0: 1t, Addr1: 2t, Addr2: 4t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00c2c050:MAD dest:5 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x04490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:2 R 0 4 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c028:CMP dest:2 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08041000:Addr0: 0t, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0024a220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 G/G/G 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10200800:Addr0: 0t, Addr1: 2t, Addr2: 2c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222020:MAD dest:2 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB 
omask: NONE 1:RGB_ADDR 0x08000804:Addr0: 4t, Addr1: 2t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f403: src: 3 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 9 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020004:Addr0: 4t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040c04:Addr0: 4t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 11 Instructions ~ 8 Vector Instructions (RGB) ~ 2 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 1 Presub Operations ~ 6 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], 
input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV 
output[2], temp[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: 
Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] 0: TEX OUT[0], IN[0], SAMP[0], 2D 1: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX output[0], input[0], 2D[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX_SAT output[0], input[0], 2D[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[1], input[0], 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: MOV_SAT output[0], temp[1]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[1], input[0].xy__, 2D[0]; 1: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], input[0].xy__, 2D[0]; 2: src0.xyz = temp[1], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment 
Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], temp[0].xy__, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL OUT[2], GENERIC[0] DCL OUT[3], GENERIC[1] DCL OUT[4], GENERIC[2] DCL CONST[0..4] DCL CONST[6..11] DCL TEMP[0] IMM FLT32 { 0.0000, 0.5000, 0.0000, 0.0000} 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: DP3 TEMP[0], IN[1], -CONST[4] 5: MAX TEMP[0], IMM[0].xxxx, TEMP[0] 6: MUL TEMP[0], TEMP[0], IMM[0].yyyy 7: MUL OUT[1], TEMP[0], CONST[6] 8: MOV OUT[2], IN[2] 9: DP4 OUT[3].x, CONST[7], IN[0] 10: DP4 OUT[3].y, CONST[8], IN[0] 11: DP4 OUT[3].z, CONST[9], IN[0] 12: DP4 OUT[3].w, CONST[10], IN[0] 13: MAD OUT[4], IN[0].xzzz, CONST[11].xxxx, CONST[11].yyyy 14: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], 
const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP3 temp[0], input[1], -const[4]; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], 
const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], const[12].xxxx, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[1].x, const[0], input[0]; 1: DP4 temp[1].y, const[1], input[0]; 2: DP4 temp[1].z, const[2], input[0]; 3: 
DP4 temp[1].w, const[3], input[0]; 4: DP4 temp[0], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[0], none.0000, temp[0]; 6: MUL temp[0], temp[0], const[12].yyyy; 7: MUL output[1], temp[0], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[1]; 15: MOV output[5], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; CONST[12] = { 0.0000 0.5000 0.0000 0.0000 } Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: DP4 temp[1], input[1].xyz0, const[4].-x-y-z0; 5: MAX temp[1], none.0000, temp[1]; 6: MUL temp[1], temp[1], const[12].yyyy; 7: MUL output[1], temp[1], const[6]; 8: MOV output[2], input[2]; 9: DP4 output[3].x, const[7], input[0]; 10: DP4 output[3].y, const[8], input[0]; 11: DP4 output[3].z, const[9], input[0]; 12: DP4 output[3].w, const[10], input[0]; 13: MAD output[4], input[0].xzzz, const[11].xxxx, const[11].yyyy; 14: MOV output[0], temp[0]; 15: MOV output[5], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 
0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02001 dst: 1t op: VE_DOT_PRODUCT src0: 0x01110021 reg: 1i swiz: X/ Y/ Z/ 0 src1: 0x0f110082 reg: 4c swiz: -X/-Y/-Z/ 0 src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 5: op: 0x00f02007 dst: 1t op: VE_MAXIMUM src0: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src1: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 6: op: 0x00f02002 dst: 1t op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00492182 reg: 12c swiz: Y/ Y/ Y/ Y src2: 0x01248182 reg: 12c swiz: 0/ 0/ 0/ 0 7: op: 0x00f02202 dst: 1o op: VE_MULTIPLY src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src2: 0x012480c2 reg: 6c swiz: 0/ 0/ 0/ 0 8: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10041 reg: 2i swiz: X/ Y/ Z/ W src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 9: op: 0x00106201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 10: op: 0x00206201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 11: op: 0x00406201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i 
swiz: 0/ 0/ 0/ 0 12: op: 0x00806201 dst: 3o op: VE_DOT_PRODUCT src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 13: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00920001 reg: 0i swiz: X/ Z/ Z/ Z src1: 0x00000162 reg: 11c swiz: X/ X/ X/ X src2: 0x00492162 reg: 11c swiz: Y/ Y/ Y/ Y 14: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 15: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 16 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL IN[1], GENERIC[0], PERSPECTIVE DCL IN[2], GENERIC[1], PERSPECTIVE DCL IN[3], GENERIC[2], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL CONST[1..2] DCL TEMP[0..3] IMM FLT32 { 2.0000, 0.0000, 0.0000, 0.0000} 0: TEX TEMP[0], IN[1], SAMP[0], 2D 1: MOV OUT[0].w, TEMP[0] 2: MOV TEMP[1].xyz, TEMP[0] 3: TEX TEMP[2], IN[2], SAMP[1], SHADOW2D 4: MUL TEMP[3].xyz, IN[0], IMM[0].xxxx 5: MAD_SAT TEMP[2].xyz, TEMP[3], TEMP[2], CONST[1] 6: MUL TEMP[1].xyz, TEMP[1], TEMP[2] 7: TEX TEMP[0].w, IN[3], SAMP[2], 2D 8: MUL TEMP[1].xyz, TEMP[1], TEMP[0].wwww 9: MUL OUT[0].xyz, TEMP[1], CONST[2] 10: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[2], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[3].xxxx; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MUL 
output[0].xyz, temp[1], const[2]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[2], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[3].xxxx; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[2], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[3].xxxx; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[2], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[3].xxxx; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MUL output[0].xyz, temp[1], const[2]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[2], input[2], 2DSHADOW[1]; 4: MUL temp[3].xyz, input[0], const[3].xxxx; 5: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 6: MUL temp[1].xyz, temp[1], temp[2]; 7: TEX temp[0].w, input[3], 2D[2]; 8: MUL temp[1].xyz, temp[1], temp[0].wwww; 9: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: MOV 
temp[1].xyz, temp[0]; 3: TEX temp[4], input[2], 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[2], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[3].xxxx; 8: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 9: MUL temp[1].xyz, temp[1], temp[2]; 10: TEX temp[0].w, input[3], 2D[2]; 11: MUL temp[1].xyz, temp[1], temp[0].wwww; 12: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0], input[1], 2D[0]; 1: MOV_SAT output[0].w, temp[0]; 2: MOV temp[1].xyz, temp[0]; 3: TEX temp[4], input[2], 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].zzzz; 5: ADD temp[5].w, -temp[5].wwww, temp[4].xxxx; 6: CMP temp[2], temp[5].www1, none.0001, none.1111; 7: MUL temp[3].xyz, input[0], const[3].xxxx; 8: MAD_SAT temp[2].xyz, temp[3], temp[2], const[1]; 9: MUL temp[1].xyz, temp[1], temp[2]; 10: TEX temp[0].w, input[3], 2D[2]; 11: MUL temp[1].xyz, temp[1], temp[0].wwww; 12: MUL_SAT output[0].xyz, temp[1], const[2]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: MOV temp[1].xyz, temp[0].xyz_; 3: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 4: MOV_SAT temp[5].w, input[2].___z; 5: ADD temp[5].w, -temp[5].___w, temp[4].___x; 6: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 7: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 8: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 9: MUL temp[1].xyz, temp[1].xyz_, temp[2].xyz_; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 12: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 
5: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 8: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 11: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 8: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 11: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; CONST[3] = { 2.0000 0.0000 0.0000 0.0000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: MOV_SAT output[0].w, temp[0].___w; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: MOV_SAT temp[5].w, input[2].___z; 4: ADD temp[5].w, -temp[5].___w, temp[4].___x; 5: CMP temp[2].xyz, temp[5].www_, none.000_, none.111_; 6: MUL temp[3].xyz, input[0].xyz_, const[3].xxx_; 7: MAD_SAT temp[2].xyz, temp[3].xyz_, temp[2].xyz_, const[1].xyz_; 8: MUL temp[1].xyz, temp[0].xyz_, temp[2].xyz_; 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: MUL temp[1].xyz, temp[1].xyz_, temp[0].www_; 11: MUL_SAT output[0].xyz, temp[1].xyz_, const[2].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0], input[1].xy__, 2D[0]; 1: src0.w = temp[0] MAD_SAT color[0].w, src0.w, src0.1, src0.0 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 4: src0.xyz = temp[4], src0.w 
= temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 5: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 6: src0.xyz = input[0], src1.xyz = const[3] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 7: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 9: TEX temp[0].w, input[3].xy__, 2D[2]; 10: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 11: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0], input[1].xy__, 2D[0]; 2: TEX temp[4].x, input[2].xy__, 2DSHADOW[1]; 3: src0.xyz = input[0], src0.w = temp[0], src1.xyz = const[3] MAD temp[3].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 4: src0.xyz = input[2] MAD_SAT temp[5].w, src0.z, src0.1, src0.0 5: src0.xyz = temp[4], src0.w = temp[5] MAD temp[5].w, -src0.w, src0.1, src0.x 6: src0.w = temp[5] CMP temp[2].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[3], src1.xyz = temp[2], src2.xyz = const[1] MAD_SAT temp[2].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[0], src1.xyz = temp[2] MAD temp[1].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[0].w, input[3].xy__, 2D[2]; 11: src0.xyz = temp[1], src0.w = temp[0] MAD temp[1].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[1], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1], temp[1].xy__, 2D[0]; 2: TEX temp[5].x, temp[2].xy__, 2DSHADOW[1]; 3: src0.xyz = temp[0], src0.w = temp[1], src1.xyz = const[3] MAD temp[4].xyz, src0.xyz, src1.xxx, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 4: src0.xyz = temp[2] MAD_SAT temp[0].w, src0.z, src0.1, src0.0 5: src0.xyz = temp[5], 
src0.w = temp[0] MAD temp[0].w, -src0.w, src0.1, src0.x 6: src0.w = temp[0] CMP temp[0].xyz, src0.111, src0.000, src0.www 7: src0.xyz = temp[4], src1.xyz = temp[0], src2.xyz = const[1] MAD_SAT temp[0].xyz, src0.xyz, src1.xyz, src2.xyz 8: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[0].xyz, src0.xyz, src1.xyz, src0.000 9: BEGIN_TEX; 10: TEX temp[1].w, temp[3].xy__, 2D[2]; 11: src0.xyz = temp[0], src0.w = temp[1] MAD temp[0].xyz, src0.xyz, src0.www, src0.000 12: src0.xyz = temp[0], src1.xyz = const[2] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f401: src: 1 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00000807:TEX TEX_WAIT wmask: R omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe405f402: src: 2 R/G/A/A dst: 5 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00143805:OUT TEX_WAIT wmask: RGB omask: A 1:RGB_ADDR 0x08040c00:Addr0: 0t, Addr1: 3c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490040:MAD dest:4 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 3 0:CMN_INST 0x00104004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c08000:MAD dest:0 alp_A_src:0 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 4 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c2c000:MAD dest:0 alp_A_src:0 A 1 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x00000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x009206d8:rgb_A_src:0 1/1/1 0 rgb_B_src:0 0/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0036c008:CMP dest:0 rgb_C_src:0 A/A/A 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00083804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x10100004:Addr0: 4t, Addr1: 0t, Addr2: 1c, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 7 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000001:Addr0: 1t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe401f403: src: 3 R/G/A/A dst: 1 R/G/B/A 3:TEX_DXDY: 0x00000000 9 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 
0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08040800:Addr0: 0t, Addr1: 2c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 11 Instructions ~ 6 Vector Instructions (RGB) ~ 3 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 3 Texture Instructions ~ 0 Presub Operations ~ 6 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV 
output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], GENERIC[0], LINEAR DCL OUT[0], COLOR DCL SAMP[0] IMM FLT32 { 0.0000, 1.0000, 0.0000, 0.0000} 0: MOV OUT[0], IMM[0].xxxy 1: TEX OUT[0].xyz, IN[0], SAMP[0], 2D 2: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX output[0].xyz, input[0], 2D[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX output[0].xyz, input[0], 2D[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX output[0].xyz, input[0], 2D[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX output[0].xyz, input[0], 2D[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX temp[1], input[0], 2D[0]; 2: MOV output[0].xyz, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], temp[0].0001; 1: TEX temp[1], input[0], 2D[0]; 2: MOV output[0].xyz, temp[1]; Fragment 
Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0].w, temp[0].___1; 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: MOV output[0].xyz, temp[1].xyz_; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0].w, none.___1; 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: MOV output[0].xyz, temp[1].xyz_; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0].w, none.___1; 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: MOV output[0].xyz, temp[1].xyz_; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0].w, none.___1; 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: MOV output[0].xyz, temp[1].xyz_; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: MAD color[0].w, src0.1, src0.1, src0.0 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: src0.xyz = temp[1] MAD color[0].xyz, src0.xyz, src0.111, src0.000 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[1].xyz, input[0].xy__, 2D[0]; 2: src0.xyz = temp[1] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.1, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, temp[0].xy__, 2D[0]; 2: src0.xyz = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.1, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c18000:MAD dest:0 alp_A_src:0 1 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 
0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV 
output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: 
VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL IN[2] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL OUT[2], GENERIC[1] DCL OUT[3], GENERIC[2] DCL OUT[4], GENERIC[3] DCL OUT[5], GENERIC[10] DCL OUT[6], GENERIC[11] DCL OUT[7], GENERIC[12] DCL CONST[0..12] DCL TEMP[0..2] 0: MOV OUT[7].xyz, IN[0].xyzx 1: MOV OUT[5], IN[2].xxxx 2: ADD OUT[1], IN[1], CONST[0] 3: MUL TEMP[0], CONST[5], IN[0].xxxx 4: MAD TEMP[1], CONST[6], IN[0].yyyy, TEMP[0] 5: MAD TEMP[0], CONST[7], IN[0].zzzz, TEMP[1] 6: MAD TEMP[2], CONST[8], IN[0].wwww, TEMP[0] 7: MUL TEMP[0], CONST[5], IN[0].xxxx 8: MAD TEMP[1], CONST[6], IN[0].yyyy, TEMP[0] 9: MAD TEMP[0], CONST[7], IN[0].zzzz, TEMP[1] 10: MAD OUT[3], CONST[8], IN[0].wwww, TEMP[0] 11: MUL TEMP[0], CONST[1], IN[0].xxxx 12: MAD TEMP[1], CONST[2], IN[0].yyyy, TEMP[0] 13: MAD TEMP[0], CONST[3], IN[0].zzzz, TEMP[1] 14: MAD OUT[4], CONST[4], IN[0].wwww, TEMP[0] 15: MOV OUT[6], TEMP[2].wwww 16: MUL TEMP[0], CONST[9], IN[0].xxxx 17: MAD TEMP[1], CONST[10], IN[0].yyyy, TEMP[0] 18: MAD TEMP[0], CONST[11], IN[0].zzzz, TEMP[1] 19: MAD OUT[0], CONST[12], IN[0].wwww, TEMP[0] 20: MOV OUT[2], TEMP[2] 21: END Vertex Program: before compilation # Radeon Compiler Program 
0: MOV output[7].xyz, input[0].xyzx; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyzx; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV 
output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyzx; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyzx; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD 
temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyz_; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyz_; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, 
temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyz_; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[3], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[3]; 22: MOV output[8], temp[3]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyz_; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], 
input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[0], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[0]; 22: MOV output[8], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[7].xyz, input[0].xyz_; 1: MOV output[5], input[2].xxxx; 2: ADD output[1], input[1], const[0]; 3: MUL temp[0], const[5], input[0].xxxx; 4: MAD temp[1], const[6], input[0].yyyy, temp[0]; 5: MAD temp[0], const[7], input[0].zzzz, temp[1]; 6: MAD temp[2], const[8], input[0].wwww, temp[0]; 7: MUL temp[0], const[5], input[0].xxxx; 8: MAD temp[1], const[6], input[0].yyyy, temp[0]; 9: MAD temp[0], const[7], input[0].zzzz, temp[1]; 10: MAD output[3], const[8], input[0].wwww, temp[0]; 11: MUL temp[0], const[1], input[0].xxxx; 12: MAD temp[1], const[2], input[0].yyyy, temp[0]; 13: MAD temp[0], const[3], input[0].zzzz, temp[1]; 14: MAD output[4], const[4], input[0].wwww, temp[0]; 15: MOV output[6], temp[2].wwww; 16: MUL temp[0], const[9], input[0].xxxx; 17: MAD temp[1], const[10], input[0].yyyy, temp[0]; 18: MAD temp[0], const[11], input[0].zzzz, temp[1]; 19: MAD temp[0], const[12], input[0].wwww, temp[0]; 20: MOV output[2], temp[2]; 21: MOV output[0], temp[0]; 22: MOV output[8], temp[0]; Final vertex program code: 0: op: 0x0070e203 dst: 7o op: VE_ADD src0: 0x01d10001 reg: 0i swiz: X/ Y/ Z/ U src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00f0a203 dst: 5o op: VE_ADD src0: 0x00000041 reg: 2i swiz: X/ X/ X/ X src1: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 src2: 0x01248041 reg: 2i swiz: 0/ 0/ 0/ 0 2: op: 0x00f02203 dst: 1o op: VE_ADD 
src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 3: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 6: op: 0x00f04004 dst: 2t op: VE_MULTIPLY_ADD src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 7: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 8: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 9: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 10: op: 0x00f06204 dst: 3o op: VE_MULTIPLY_ADD src0: 0x00d10102 reg: 8c swiz: X/ Y/ Z/ W src1: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 11: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 12: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 13: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: 
X/ Y/ Z/ W 14: op: 0x00f08204 dst: 4o op: VE_MULTIPLY_ADD src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 15: op: 0x00f0c203 dst: 6o op: VE_ADD src0: 0x00db6040 reg: 2t swiz: W/ W/ W/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 16: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00d10122 reg: 9c swiz: X/ Y/ Z/ W src1: 0x00000001 reg: 0i swiz: X/ X/ X/ X src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 17: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00d10142 reg: 10c swiz: X/ Y/ Z/ W src1: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 18: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10162 reg: 11c swiz: X/ Y/ Z/ W src1: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src2: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W 19: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00d10182 reg: 12c swiz: X/ Y/ Z/ W src1: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 20: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10040 reg: 2t swiz: X/ Y/ Z/ W src1: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 src2: 0x01248040 reg: 2t swiz: 0/ 0/ 0/ 0 21: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 22: op: 0x00f10203 dst: 8o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 23 Instructions ~ 0 Flow Control Instructions ~ 3 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL IN[1], GENERIC[1], PERSPECTIVE DCL IN[2], GENERIC[2], PERSPECTIVE DCL IN[3], GENERIC[3], PERSPECTIVE DCL IN[4], GENERIC[10], PERSPECTIVE DCL IN[5], GENERIC[11], PERSPECTIVE DCL IN[6], 
GENERIC[12], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL SAMP[1] DCL SAMP[2] DCL SAMP[3] DCL CONST[0..7] DCL CONST[12..14] DCL TEMP[0..10] IMM FLT32 { -0.5000, 1.0000, 0.1500, 18.0000} IMM FLT32 { 0.8000, 0.0000, 0.5000, 0.3000} IMM FLT32 { 0.1500, 18.0000, 0.7000, 1.2000} 0: TEX TEMP[0].xyz, IN[0].xyyy, SAMP[3], 2D 1: ADD TEMP[1].xyz, TEMP[0].xzyy, IMM[0].xxxy 2: DP3 TEMP[0].x, TEMP[1].xyzz, TEMP[1].xyzz 3: RSQ TEMP[2].x, TEMP[0].xxxx 4: MUL TEMP[0].xyz, TEMP[1].xyzz, TEMP[2].xxxx 5: MOV TEMP[1].xyz, -CONST[14].xyzx 6: ADD TEMP[2].xyz, CONST[12].xyzz, -IN[6].xyzz 7: DP3 TEMP[3].x, TEMP[2].xyzz, TEMP[2].xyzz 8: RSQ TEMP[4].x, TEMP[3].xxxx 9: MUL TEMP[3].xyz, TEMP[2].xyzz, TEMP[4].xxxx 10: ADD TEMP[2].xyz, TEMP[1].xyzz, TEMP[3].xyzz 11: RCP TEMP[4].x, CONST[2].xxxx 12: MUL TEMP[5].x, IN[4].xxxx, TEMP[4].xxxx 13: MIN TEMP[4].x, TEMP[5].xxxx, IMM[0].yyyy 14: MUL TEMP[5].x, CONST[3].xxxx, TEMP[4].xxxx 15: DP3 TEMP[4].x, TEMP[0].xyzz, TEMP[3].xyzz 16: ADD TEMP[6].x, IMM[0].yyyy, -TEMP[4].xxxx 17: POW TEMP[4].x, TEMP[6].xxxx, IMM[1].xxxx 18: DP3 TEMP[6].x, TEMP[2].xyzz, TEMP[2].xyzz 19: RSQ TEMP[7].x, TEMP[6].xxxx 20: MUL TEMP[6].xyz, TEMP[2].xyzz, TEMP[7].xxxx 21: DP3 TEMP[2].x, TEMP[0].xyzz, TEMP[6].xyzz 22: MAX TEMP[6].x, IMM[1].yyyy, TEMP[2].xxxx 23: POW TEMP[2].x, TEMP[6].xxxx, CONST[7].xxxx 24: MUL TEMP[6].xyz, TEMP[2].xxxx, CONST[13].xyzz 25: MUL TEMP[2].xyz, TEMP[6].xyzz, CONST[6].xxxx 26: MUL TEMP[6].xyz, CONST[13].xyzz, CONST[1].xyzz 27: MUL TEMP[7].xy, CONST[5].xxxx, TEMP[0].xzzz 28: RCP TEMP[8].x, IN[5].xxxx 29: RCP TEMP[9].x, IN[1].wwww 30: MUL TEMP[10].xy, IN[1].xyyy, TEMP[9].xxxx 31: MAD TEMP[9].xy, IMM[1].zzzz, TEMP[10].xyyy, IMM[1].zzzz 32: MAD TEMP[10].xy, TEMP[7].xyyy, TEMP[8].xxxx, TEMP[9].xyyy 33: TEX TEMP[7].xyz, TEMP[10].xyyy, SAMP[2], 2D 34: ADD TEMP[8].x, IMM[0].yyyy, -CONST[0].xxxx 35: MUL TEMP[9].xyz, TEMP[7].xyzz, TEMP[8].xxxx 36: MAD TEMP[7].xyz, TEMP[6].xyzz, CONST[0].xxxx, TEMP[9].xyzz 37: ADD TEMP[6].xyz, TEMP[7].xyzz, TEMP[2].xyzz 38: 
DP3 TEMP[7].x, TEMP[0].xyzz, TEMP[1].xyzz 39: MAD TEMP[1].x, IMM[1].zzzz, TEMP[7].xxxx, IMM[1].zzzz 40: MUL TEMP[7].xyz, CONST[13].xyzz, CONST[4].xyzz 41: RCP TEMP[8].x, IN[2].wwww 42: MUL TEMP[9].xy, IN[2].xyyy, TEMP[8].xxxx 43: MAD TEMP[8].xy, IMM[1].zzzz, TEMP[9].xyyy, IMM[1].zzzz 44: MUL TEMP[9].x, IMM[1].xxxx, CONST[5].xxxx 45: MUL TEMP[10].xy, TEMP[9].xxxx, TEMP[0].xzzz 46: RCP TEMP[0].x, IN[5].xxxx 47: MUL TEMP[9].xy, TEMP[10].xyyy, TEMP[0].xxxx 48: ADD TEMP[0].xy, TEMP[8].xyyy, -TEMP[9].xyyy 49: TEX TEMP[8].xyz, TEMP[0].xyyy, SAMP[1], 2D 50: ADD TEMP[0].x, IMM[0].yyyy, -TEMP[5].xxxx 51: MUL TEMP[9].xyz, TEMP[8].xyzz, TEMP[0].xxxx 52: MAD TEMP[0].xyz, TEMP[7].xyzz, TEMP[5].xxxx, TEMP[9].xyzz 53: MUL TEMP[5].xyz, TEMP[1].xxxx, TEMP[0].xyzz 54: MAD TEMP[0].xyz, IMM[1].wwww, TEMP[2].xyzz, TEMP[5].xyzz 55: ADD TEMP[1].x, IMM[0].yyyy, -TEMP[4].xxxx 56: MUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[1].xxxx 57: MAD TEMP[0].xyz, TEMP[6].xyzz, TEMP[4].xxxx, TEMP[2].xyzz 58: TEX TEMP[1].w, IN[3].xyyy, SAMP[0], 2D 59: MUL OUT[0].xyz, TEMP[0].xyzx, TEMP[1].wwwx 60: MUL TEMP[0].x, IMM[2].xxxx, IN[4].xxxx 61: ADD TEMP[1].x, IMM[2].zzzz, -TEMP[3].yyyy 62: MAX TEMP[2].x, IMM[1].yyyy, TEMP[1].xxxx 63: MAD TEMP[1].x, IMM[2].yyyy, TEMP[2].xxxx, IMM[2].wwww 64: ADD TEMP[2].x, TEMP[1].xxxx, TEMP[4].xxxx 65: MUL OUT[0].w, TEMP[0].xxxx, TEMP[2].xxxx 66: END Fragment Program: before compilation # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, 
temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 
54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, 
input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TEX temp[0].xyz, 
input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, 
temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, 
const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, 
temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, 
const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL_SAT output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL_SAT output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, 
temp[1].xyzz; 3: RSQ temp[2].x, temp[0].xxxx; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, temp[3].xxxx; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: POW temp[4].x, temp[6].xxxx, const[16].xxxx; 18: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 19: RSQ temp[7].x, temp[6].xxxx; 20: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 21: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 22: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 23: POW temp[2].x, temp[6].xxxx, const[7].xxxx; 24: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 25: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 26: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 27: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 28: RCP temp[8].x, input[5].xxxx; 29: RCP temp[9].x, input[1].wwww; 30: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 31: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 32: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 33: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 34: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 35: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 36: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 37: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 38: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 39: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 40: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 41: RCP temp[8].x, input[2].wwww; 42: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 43: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 44: MUL temp[9].x, 
const[16].xxxx, const[5].xxxx; 45: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 46: RCP temp[0].x, input[5].xxxx; 47: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 48: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 49: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 50: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 51: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 52: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 53: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 54: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 55: ADD temp[1].x, const[15].yyyy, -temp[4].xxxx; 56: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 57: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 58: TEX temp[1].w, input[3].xyyy, 2D[0]; 59: MUL_SAT output[0].xyz, temp[0].xyzx, temp[1].wwwx; 60: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 61: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 62: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 63: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 64: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 65: MUL_SAT output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xyyy, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzyy, const[15].xxxy; 2: DP3 temp[0].x, temp[1].xyzz, temp[1].xyzz; 3: RSQ temp[2].x, |temp[0].xxxx|; 4: MUL temp[0].xyz, temp[1].xyzz, temp[2].xxxx; 5: MOV temp[1].xyz, -const[14].xyzx; 6: ADD temp[2].xyz, const[12].xyzz, -input[6].xyzz; 7: DP3 temp[3].x, temp[2].xyzz, temp[2].xyzz; 8: RSQ temp[4].x, |temp[3].xxxx|; 9: MUL temp[3].xyz, temp[2].xyzz, temp[4].xxxx; 10: ADD temp[2].xyz, temp[1].xyzz, temp[3].xyzz; 11: RCP temp[4].x, const[2].xxxx; 12: MUL temp[5].x, input[4].xxxx, temp[4].xxxx; 13: MIN temp[4].x, temp[5].xxxx, const[15].yyyy; 14: MUL temp[5].x, const[3].xxxx, temp[4].xxxx; 15: DP3 temp[4].x, temp[0].xyzz, temp[3].xyzz; 16: ADD temp[6].x, const[15].yyyy, -temp[4].xxxx; 17: LG2 temp[4].w, temp[6].xxxx; 18: MUL temp[4].w, temp[4].wwww, 
const[16].xxxx; 19: EX2 temp[4].x, temp[4].wwww; 20: DP3 temp[6].x, temp[2].xyzz, temp[2].xyzz; 21: RSQ temp[7].x, |temp[6].xxxx|; 22: MUL temp[6].xyz, temp[2].xyzz, temp[7].xxxx; 23: DP3 temp[2].x, temp[0].xyzz, temp[6].xyzz; 24: MAX temp[6].x, const[16].yyyy, temp[2].xxxx; 25: LG2 temp[2].w, temp[6].xxxx; 26: MUL temp[2].w, temp[2].wwww, const[7].xxxx; 27: EX2 temp[2].x, temp[2].wwww; 28: MUL temp[6].xyz, temp[2].xxxx, const[13].xyzz; 29: MUL temp[2].xyz, temp[6].xyzz, const[6].xxxx; 30: MUL temp[6].xyz, const[13].xyzz, const[1].xyzz; 31: MUL temp[7].xy, const[5].xxxx, temp[0].xzzz; 32: RCP temp[8].x, input[5].xxxx; 33: RCP temp[9].x, input[1].wwww; 34: MUL temp[10].xy, input[1].xyyy, temp[9].xxxx; 35: MAD temp[9].xy, const[16].zzzz, temp[10].xyyy, const[16].zzzz; 36: MAD temp[10].xy, temp[7].xyyy, temp[8].xxxx, temp[9].xyyy; 37: TEX temp[7].xyz, temp[10].xyyy, 2D[2]; 38: ADD temp[8].x, const[15].yyyy, -const[0].xxxx; 39: MUL temp[9].xyz, temp[7].xyzz, temp[8].xxxx; 40: MAD temp[7].xyz, temp[6].xyzz, const[0].xxxx, temp[9].xyzz; 41: ADD temp[6].xyz, temp[7].xyzz, temp[2].xyzz; 42: DP3 temp[7].x, temp[0].xyzz, temp[1].xyzz; 43: MAD temp[1].x, const[16].zzzz, temp[7].xxxx, const[16].zzzz; 44: MUL temp[7].xyz, const[13].xyzz, const[4].xyzz; 45: RCP temp[8].x, input[2].wwww; 46: MUL temp[9].xy, input[2].xyyy, temp[8].xxxx; 47: MAD temp[8].xy, const[16].zzzz, temp[9].xyyy, const[16].zzzz; 48: MUL temp[9].x, const[16].xxxx, const[5].xxxx; 49: MUL temp[10].xy, temp[9].xxxx, temp[0].xzzz; 50: RCP temp[0].x, input[5].xxxx; 51: MUL temp[9].xy, temp[10].xyyy, temp[0].xxxx; 52: ADD temp[0].xy, temp[8].xyyy, -temp[9].xyyy; 53: TEX temp[8].xyz, temp[0].xyyy, 2D[1]; 54: ADD temp[0].x, const[15].yyyy, -temp[5].xxxx; 55: MUL temp[9].xyz, temp[8].xyzz, temp[0].xxxx; 56: MAD temp[0].xyz, temp[7].xyzz, temp[5].xxxx, temp[9].xyzz; 57: MUL temp[5].xyz, temp[1].xxxx, temp[0].xyzz; 58: MAD temp[0].xyz, const[16].wwww, temp[2].xyzz, temp[5].xyzz; 59: ADD temp[1].x, const[15].yyyy, 
-temp[4].xxxx; 60: MUL temp[2].xyz, temp[0].xyzz, temp[1].xxxx; 61: MAD temp[0].xyz, temp[6].xyzz, temp[4].xxxx, temp[2].xyzz; 62: TEX temp[1].w, input[3].xyyy, 2D[0]; 63: MUL_SAT output[0].xyz, temp[0].xyzx, temp[1].wwwx; 64: MUL temp[0].x, const[17].xxxx, input[4].xxxx; 65: ADD temp[1].x, const[17].zzzz, -temp[3].yyyy; 66: MAX temp[2].x, const[16].yyyy, temp[1].xxxx; 67: MAD temp[1].x, const[17].yyyy, temp[2].xxxx, const[17].wwww; 68: ADD temp[2].x, temp[1].xxxx, temp[4].xxxx; 69: MUL_SAT output[0].w, temp[0].xxxx, temp[2].xxxx; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xy__, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzy_, const[15].xxx_; 2: DP3 temp[0].x, temp[1].xyz_, temp[1].xyz_; 3: RSQ temp[2].x, |temp[0].x___|; 4: MUL temp[0].xyz, temp[1].xyz_, temp[2].xxx_; 5: MOV temp[1].xyz, -const[14].xyz_; 6: ADD temp[2].xyz, const[12].xyz_, -input[6].xyz_; 7: DP3 temp[3].x, temp[2].xyz_, temp[2].xyz_; 8: RSQ temp[4].x, |temp[3].x___|; 9: MUL temp[3].xyz, temp[2].xyz_, temp[4].xxx_; 10: ADD temp[2].xyz, temp[1].xyz_, temp[3].xyz_; 11: RCP temp[4].x, const[2].x___; 12: MUL temp[5].x, input[4].x___, temp[4].x___; 13: MIN temp[4].x, temp[5].x___, const[15].y___; 14: MUL temp[5].x, const[3].x___, temp[4].x___; 15: DP3 temp[4].x, temp[0].xyz_, temp[3].xyz_; 16: ADD temp[6].x, const[15].y___, -temp[4].x___; 17: LG2 temp[4].w, temp[6].x___; 18: MUL temp[4].w, temp[4].___w, const[16].___x; 19: EX2 temp[4].x, temp[4].w___; 20: DP3 temp[6].x, temp[2].xyz_, temp[2].xyz_; 21: RSQ temp[7].x, |temp[6].x___|; 22: MUL temp[6].xyz, temp[2].xyz_, temp[7].xxx_; 23: DP3 temp[2].x, temp[0].xyz_, temp[6].xyz_; 24: MAX temp[6].x, const[16].y___, temp[2].x___; 25: LG2 temp[2].w, temp[6].x___; 26: MUL temp[2].w, temp[2].___w, const[7].___x; 27: EX2 temp[2].x, temp[2].w___; 28: MUL temp[6].xyz, temp[2].xxx_, const[13].xyz_; 29: MUL temp[2].xyz, temp[6].xyz_, const[6].xxx_; 30: MUL temp[6].xyz, const[13].xyz_, const[1].xyz_; 31: MUL temp[7].xy, 
const[5].xx__, temp[0].xz__; 32: RCP temp[8].x, input[5].x___; 33: RCP temp[9].x, input[1].w___; 34: MUL temp[10].xy, input[1].xy__, temp[9].xx__; 35: MAD temp[9].xy, const[16].zz__, temp[10].xy__, const[16].zz__; 36: MAD temp[10].xy, temp[7].xy__, temp[8].xx__, temp[9].xy__; 37: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 38: ADD temp[8].x, const[15].y___, -const[0].x___; 39: MUL temp[9].xyz, temp[7].xyz_, temp[8].xxx_; 40: MAD temp[7].xyz, temp[6].xyz_, const[0].xxx_, temp[9].xyz_; 41: ADD temp[6].xyz, temp[7].xyz_, temp[2].xyz_; 42: DP3 temp[7].x, temp[0].xyz_, temp[1].xyz_; 43: MAD temp[1].x, const[16].z___, temp[7].x___, const[16].z___; 44: MUL temp[7].xyz, const[13].xyz_, const[4].xyz_; 45: RCP temp[8].x, input[2].w___; 46: MUL temp[9].xy, input[2].xy__, temp[8].xx__; 47: MAD temp[8].xy, const[16].zz__, temp[9].xy__, const[16].zz__; 48: MUL temp[9].x, const[16].x___, const[5].x___; 49: MUL temp[10].xy, temp[9].xx__, temp[0].xz__; 50: RCP temp[0].x, input[5].x___; 51: MUL temp[9].xy, temp[10].xy__, temp[0].xx__; 52: ADD temp[0].xy, temp[8].xy__, -temp[9].xy__; 53: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 54: ADD temp[0].x, const[15].y___, -temp[5].x___; 55: MUL temp[9].xyz, temp[8].xyz_, temp[0].xxx_; 56: MAD temp[0].xyz, temp[7].xyz_, temp[5].xxx_, temp[9].xyz_; 57: MUL temp[5].xyz, temp[1].xxx_, temp[0].xyz_; 58: MAD temp[0].xyz, const[16].www_, temp[2].xyz_, temp[5].xyz_; 59: ADD temp[1].x, const[15].y___, -temp[4].x___; 60: MUL temp[2].xyz, temp[0].xyz_, temp[1].xxx_; 61: MAD temp[0].xyz, temp[6].xyz_, temp[4].xxx_, temp[2].xyz_; 62: TEX temp[1].w, input[3].xy__, 2D[0]; 63: MUL_SAT output[0].xyz, temp[0].xyz_, temp[1].www_; 64: MUL temp[0].x, const[17].x___, input[4].x___; 65: ADD temp[1].x, const[17].z___, -temp[3].y___; 66: MAX temp[2].x, const[16].y___, temp[1].x___; 67: MAD temp[1].x, const[17].y___, temp[2].x___, const[17].w___; 68: ADD temp[2].x, temp[1].x___, temp[4].x___; 69: MUL_SAT output[0].w, temp[0].___x, temp[2].___x; Fragment Program: after 
'dataflow optimize' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xy__, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzy_, none.-H-H-H_; 2: DP3 temp[0].x, temp[1].xyz_, temp[1].xyz_; 3: RSQ temp[2].x, |temp[0].x___|; 4: MUL temp[0].xyz, temp[1].xyz_, temp[2].xxx_; 5: DP3 temp[3].x, (const[12] - input[6]).xyz_, (const[12] - input[6]).xyz_; 6: RSQ temp[4].x, |temp[3].x___|; 7: MUL temp[3].xyz, (const[12] - input[6]).xyz_, temp[4].xxx_; 8: RCP temp[4].x, const[2].x___; 9: MUL temp[5].x, input[4].x___, temp[4].x___; 10: MIN temp[4].x, temp[5].x___, none.1___; 11: MUL temp[5].x, const[3].x___, temp[4].x___; 12: DP3 temp[4].x, temp[0].xyz_, temp[3].xyz_; 13: LG2 temp[4].w, (1 - temp[4]).x___; 14: MUL temp[4].w, temp[4].___w, const[16].___x; 15: EX2 temp[4].x, temp[4].w___; 16: DP3 temp[6].x, (temp[3] - const[14]).xyz_, (temp[3] - const[14]).xyz_; 17: RSQ temp[7].x, |temp[6].x___|; 18: MUL temp[6].xyz, (temp[3] - const[14]).xyz_, temp[7].xxx_; 19: DP3 temp[2].x, temp[0].xyz_, temp[6].xyz_; 20: MAX temp[6].x, none.0___, temp[2].x___; 21: LG2 temp[2].w, temp[6].x___; 22: MUL temp[2].w, temp[2].___w, const[7].___x; 23: EX2 temp[2].x, temp[2].w___; 24: MUL temp[6].xyz, temp[2].xxx_, const[13].xyz_; 25: MUL temp[2].xyz, temp[6].xyz_, const[6].xxx_; 26: MUL temp[6].xyz, const[13].xyz_, const[1].xyz_; 27: MUL temp[7].xy, const[5].xx__, temp[0].xz__; 28: RCP temp[8].x, input[5].x___; 29: RCP temp[9].x, input[1].w___; 30: MUL temp[10].xy, input[1].xy__, temp[9].xx__; 31: MAD temp[9].xy, none.HH__, temp[10].xy__, none.HH__; 32: MAD temp[10].xy, temp[7].xy__, temp[8].xx__, temp[9].xy__; 33: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 34: MUL temp[9].xyz, temp[7].xyz_, (1 - const[0]).xxx_; 35: MAD temp[7].xyz, temp[6].xyz_, const[0].xxx_, temp[9].xyz_; 36: ADD temp[6].xyz, temp[7].xyz_, temp[2].xyz_; 37: DP3 temp[7].x, temp[0].xyz_, const[14].-x-y-z_; 38: MAD temp[1].x, none.H___, temp[7].x___, none.H___; 39: MUL temp[7].xyz, const[13].xyz_, const[4].xyz_; 40: RCP temp[8].x, input[2].w___; 
41: MUL temp[9].xy, input[2].xy__, temp[8].xx__; 42: MAD temp[8].xy, none.HH__, temp[9].xy__, none.HH__; 43: MUL temp[9].x, const[16].x___, const[5].x___; 44: MUL temp[10].xy, temp[9].xx__, temp[0].xz__; 45: RCP temp[0].x, input[5].x___; 46: MUL temp[9].xy, temp[10].xy__, temp[0].xx__; 47: ADD temp[0].xy, temp[8].xy__, -temp[9].xy__; 48: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 49: MUL temp[9].xyz, temp[8].xyz_, (1 - temp[5]).xxx_; 50: MAD temp[0].xyz, temp[7].xyz_, temp[5].xxx_, temp[9].xyz_; 51: MUL temp[5].xyz, temp[1].xxx_, temp[0].xyz_; 52: MAD temp[0].xyz, const[16].www_, temp[2].xyz_, temp[5].xyz_; 53: MUL temp[2].xyz, temp[0].xyz_, (1 - temp[4]).xxx_; 54: MAD temp[0].xyz, temp[6].xyz_, temp[4].xxx_, temp[2].xyz_; 55: TEX temp[1].w, input[3].xy__, 2D[0]; 56: MUL_SAT output[0].xyz, temp[0].xyz_, temp[1].www_; 57: MUL temp[0].x, const[17].x___, input[4].x___; 58: ADD temp[1].x, const[17].z___, -temp[3].y___; 59: MAX temp[2].x, none.0___, temp[1].x___; 60: MAD temp[1].x, const[17].y___, temp[2].x___, const[17].w___; 61: MUL_SAT output[0].w, temp[0].___x, (temp[4] + temp[1]).___x; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xy__, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzy_, none.-H-H-H_; 2: DP3 temp[0].x, temp[1].xyz_, temp[1].xyz_; 3: RSQ temp[2].x, |temp[0].x___|; 4: MUL temp[0].xyz, temp[1].xyz_, temp[2].xxx_; 5: DP3 temp[3].x, (const[12] - input[6]).xyz_, (const[12] - input[6]).xyz_; 6: RSQ temp[4].x, |temp[3].x___|; 7: MUL temp[3].xyz, (const[12] - input[6]).xyz_, temp[4].xxx_; 8: RCP temp[4].x, const[2].x___; 9: MUL temp[5].x, input[4].x___, temp[4].x___; 10: MIN temp[4].x, temp[5].x___, none.1___; 11: MUL temp[5].x, const[3].x___, temp[4].x___; 12: DP3 temp[4].x, temp[0].xyz_, temp[3].xyz_; 13: LG2 temp[4].w, (1 - temp[4]).x___; 14: MUL temp[4].w, temp[4].___w, const[16].___x; 15: EX2 temp[4].x, temp[4].w___; 16: DP3 temp[6].x, (temp[3] - const[14]).xyz_, (temp[3] - const[14]).xyz_; 17: RSQ temp[7].x, 
|temp[6].x___|; 18: MUL temp[6].xyz, (temp[3] - const[14]).xyz_, temp[7].xxx_; 19: DP3 temp[2].x, temp[0].xyz_, temp[6].xyz_; 20: MAX temp[6].x, none.0___, temp[2].x___; 21: LG2 temp[2].w, temp[6].x___; 22: MUL temp[2].w, temp[2].___w, const[7].___x; 23: EX2 temp[2].x, temp[2].w___; 24: MUL temp[6].xyz, temp[2].xxx_, const[13].xyz_; 25: MUL temp[2].xyz, temp[6].xyz_, const[6].xxx_; 26: MUL temp[6].xyz, const[13].xyz_, const[1].xyz_; 27: MUL temp[7].xy, const[5].xx__, temp[0].xz__; 28: RCP temp[8].x, input[5].x___; 29: RCP temp[9].x, input[1].w___; 30: MUL temp[10].xy, input[1].xy__, temp[9].xx__; 31: MAD temp[9].xy, none.HH__, temp[10].xy__, none.HH__; 32: MAD temp[10].xy, temp[7].xy__, temp[8].xx__, temp[9].xy__; 33: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 34: MUL temp[9].xyz, temp[7].xyz_, (1 - const[0]).xxx_; 35: MAD temp[7].xyz, temp[6].xyz_, const[0].xxx_, temp[9].xyz_; 36: ADD temp[6].xyz, temp[7].xyz_, temp[2].xyz_; 37: DP3 temp[7].x, temp[0].xyz_, const[14].-x-y-z_; 38: MAD temp[1].x, none.H___, temp[7].x___, none.H___; 39: MUL temp[7].xyz, const[13].xyz_, const[4].xyz_; 40: RCP temp[8].x, input[2].w___; 41: MUL temp[9].xy, input[2].xy__, temp[8].xx__; 42: MAD temp[8].xy, none.HH__, temp[9].xy__, none.HH__; 43: MUL temp[9].x, const[16].x___, const[5].x___; 44: MUL temp[10].xy, temp[9].xx__, temp[0].xz__; 45: RCP temp[0].x, input[5].x___; 46: MUL temp[9].xy, temp[10].xy__, temp[0].xx__; 47: ADD temp[0].xy, temp[8].xy__, -temp[9].xy__; 48: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 49: MUL temp[9].xyz, temp[8].xyz_, (1 - temp[5]).xxx_; 50: MAD temp[0].xyz, temp[7].xyz_, temp[5].xxx_, temp[9].xyz_; 51: MUL temp[5].xyz, temp[1].xxx_, temp[0].xyz_; 52: MAD temp[0].xyz, const[16].www_, temp[2].xyz_, temp[5].xyz_; 53: MUL temp[2].xyz, temp[0].xyz_, (1 - temp[4]).xxx_; 54: MAD temp[0].xyz, temp[6].xyz_, temp[4].xxx_, temp[2].xyz_; 55: TEX temp[1].w, input[3].xy__, 2D[0]; 56: MUL_SAT output[0].xyz, temp[0].xyz_, temp[1].www_; 57: MUL temp[0].x, const[17].x___, 
input[4].x___; 58: ADD temp[1].x, const[17].z___, -temp[3].y___; 59: MAX temp[2].x, none.0___, temp[1].x___; 60: MAD temp[1].x, const[17].y___, temp[2].x___, const[17].w___; 61: MUL_SAT output[0].w, temp[0].___x, (temp[4] + temp[1]).___x; CONST[15] = { 0.8000 0.0000 0.5000 0.3000 } CONST[16] = { 0.1500 18.0000 0.7000 1.2000 } Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xy__, 2D[3]; 1: ADD temp[1].xyz, temp[0].xzy_, none.-H-H-H_; 2: DP3 temp[0].x, temp[1].xyz_, temp[1].xyz_; 3: RSQ temp[2].x, |temp[0].x___|; 4: MUL temp[0].xyz, temp[1].xyz_, temp[2].xxx_; 5: DP3 temp[3].x, (const[12] - input[6]).xyz_, (const[12] - input[6]).xyz_; 6: RSQ temp[4].x, |temp[3].x___|; 7: MUL temp[3].xyz, (const[12] - input[6]).xyz_, temp[4].xxx_; 8: RCP temp[4].x, const[2].x___; 9: MUL temp[5].x, input[4].x___, temp[4].x___; 10: MIN temp[4].x, temp[5].x___, none.1___; 11: MUL temp[5].x, const[3].x___, temp[4].x___; 12: DP3 temp[4].x, temp[0].xyz_, temp[3].xyz_; 13: LG2 temp[4].w, (1 - temp[4]).x___; 14: MUL temp[4].w, temp[4].___w, const[15].___x; 15: EX2 temp[4].x, temp[4].w___; 16: DP3 temp[6].x, (temp[3] - const[14]).xyz_, (temp[3] - const[14]).xyz_; 17: RSQ temp[7].x, |temp[6].x___|; 18: MUL temp[6].xyz, (temp[3] - const[14]).xyz_, temp[7].xxx_; 19: DP3 temp[2].x, temp[0].xyz_, temp[6].xyz_; 20: MAX temp[6].x, none.0___, temp[2].x___; 21: LG2 temp[2].w, temp[6].x___; 22: MUL temp[2].w, temp[2].___w, const[7].___x; 23: EX2 temp[2].x, temp[2].w___; 24: MUL temp[6].xyz, temp[2].xxx_, const[13].xyz_; 25: MUL temp[2].xyz, temp[6].xyz_, const[6].xxx_; 26: MUL temp[6].xyz, const[13].xyz_, const[1].xyz_; 27: MUL temp[7].xy, const[5].xx__, temp[0].xz__; 28: RCP temp[8].x, input[5].x___; 29: RCP temp[9].x, input[1].w___; 30: MUL temp[10].xy, input[1].xy__, temp[9].xx__; 31: MAD temp[9].xy, none.HH__, temp[10].xy__, none.HH__; 32: MAD temp[10].xy, temp[7].xy__, temp[8].xx__, temp[9].xy__; 33: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 34: MUL 
temp[9].xyz, temp[7].xyz_, (1 - const[0]).xxx_; 35: MAD temp[7].xyz, temp[6].xyz_, const[0].xxx_, temp[9].xyz_; 36: ADD temp[6].xyz, temp[7].xyz_, temp[2].xyz_; 37: DP3 temp[7].x, temp[0].xyz_, const[14].-x-y-z_; 38: MAD temp[1].x, none.H___, temp[7].x___, none.H___; 39: MUL temp[7].xyz, const[13].xyz_, const[4].xyz_; 40: RCP temp[8].x, input[2].w___; 41: MUL temp[9].xy, input[2].xy__, temp[8].xx__; 42: MAD temp[8].xy, none.HH__, temp[9].xy__, none.HH__; 43: MUL temp[9].x, const[15].x___, const[5].x___; 44: MUL temp[10].xy, temp[9].xx__, temp[0].xz__; 45: RCP temp[0].x, input[5].x___; 46: MUL temp[9].xy, temp[10].xy__, temp[0].xx__; 47: ADD temp[0].xy, temp[8].xy__, -temp[9].xy__; 48: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 49: MUL temp[9].xyz, temp[8].xyz_, (1 - temp[5]).xxx_; 50: MAD temp[0].xyz, temp[7].xyz_, temp[5].xxx_, temp[9].xyz_; 51: MUL temp[5].xyz, temp[1].xxx_, temp[0].xyz_; 52: MAD temp[0].xyz, const[15].www_, temp[2].xyz_, temp[5].xyz_; 53: MUL temp[2].xyz, temp[0].xyz_, (1 - temp[4]).xxx_; 54: MAD temp[0].xyz, temp[6].xyz_, temp[4].xxx_, temp[2].xyz_; 55: TEX temp[1].w, input[3].xy__, 2D[0]; 56: MUL_SAT output[0].xyz, temp[0].xyz_, temp[1].www_; 57: MUL temp[0].x, const[16].x___, input[4].x___; 58: ADD temp[1].x, const[16].z___, -temp[3].y___; 59: MAX temp[2].x, none.0___, temp[1].x___; 60: MAD temp[1].x, const[16].y___, temp[2].x___, const[16].w___; 61: MUL_SAT output[0].w, temp[0].___x, (temp[4] + temp[1]).___x; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TEX temp[0].xyz, input[0].xy__, 2D[3]; 1: src0.xyz = temp[0] MAD temp[1].xyz, src0.xzy, src0.111, -src0.HHH 2: src0.xyz = temp[1] DP3 temp[0].x, src0.xyz, src0.xyz 3: src0.xyz = temp[0] REPL_ALPHA temp[2].x RSQ, |src0.x| 4: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[0].xyz, src0.xyz, src1.xxx, src0.000 5: src0.xyz = input[6], src1.xyz = const[12], srcp.xyz = (src1 - src0) DP3 temp[3].x, srcp.xyz, srcp.xyz 6: src0.xyz = temp[3] REPL_ALPHA temp[4].x RSQ, |src0.x| 7: 
src0.xyz = input[6], src1.xyz = const[12], src2.xyz = temp[4], srcp.xyz = (src1 - src0) MAD temp[3].xyz, srcp.xyz, src2.xxx, src0.000 8: src0.xyz = const[2] REPL_ALPHA temp[4].x RCP, src0.x 9: src0.xyz = input[4], src1.xyz = temp[4] MAD temp[5].x, src0.x__, src1.x__, src0.000 10: src0.xyz = temp[5] MIN temp[4].x, src0.x__, src0.1__ 11: src0.xyz = const[3], src1.xyz = temp[4] MAD temp[5].x, src0.x__, src1.x__, src0.000 12: src0.xyz = temp[0], src1.xyz = temp[3] DP3 temp[4].x, src0.xyz, src1.xyz 13: src0.xyz = temp[4], srcp.xyz = (1 - src0) LG2 temp[4].w, srcp.x 14: src0.xyz = const[15], src0.w = temp[4] MAD temp[4].w, src0.w, src0.x, src0.0 15: src0.w = temp[4] REPL_ALPHA temp[4].x EX2, src0.w 16: src0.xyz = const[14], src1.xyz = temp[3], srcp.xyz = (src1 - src0) DP3 temp[6].x, srcp.xyz, srcp.xyz 17: src0.xyz = temp[6] REPL_ALPHA temp[7].x RSQ, |src0.x| 18: src0.xyz = const[14], src1.xyz = temp[3], src2.xyz = temp[7], srcp.xyz = (src1 - src0) MAD temp[6].xyz, srcp.xyz, src2.xxx, src0.000 19: src0.xyz = temp[0], src1.xyz = temp[6] DP3 temp[2].x, src0.xyz, src1.xyz 20: src0.xyz = temp[2] MAX temp[6].x, src0.0__, src0.x__ 21: src0.xyz = temp[6] LG2 temp[2].w, src0.x 22: src0.xyz = const[7], src0.w = temp[2] MAD temp[2].w, src0.w, src0.x, src0.0 23: src0.w = temp[2] REPL_ALPHA temp[2].x EX2, src0.w 24: src0.xyz = temp[2], src1.xyz = const[13] MAD temp[6].xyz, src0.xxx, src1.xyz, src0.000 25: src0.xyz = temp[6], src1.xyz = const[6] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 26: src0.xyz = const[13], src1.xyz = const[1] MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 27: src0.xyz = const[5], src1.xyz = temp[0] MAD temp[7].xy, src0.xx_, src1.xz_, src0.000 28: src0.xyz = input[5] REPL_ALPHA temp[8].x RCP, src0.x 29: src0.w = input[1] REPL_ALPHA temp[9].x RCP, src0.w 30: src0.xyz = input[1], src1.xyz = temp[9] MAD temp[10].xy, src0.xy_, src1.xx_, src0.000 31: src0.xyz = temp[10] MAD temp[9].xy, src0.HH_, src0.xy_, src0.HH_ 32: src0.xyz = temp[7], src1.xyz = temp[8], 
src2.xyz = temp[9] MAD temp[10].xy, src0.xy_, src1.xx_, src2.xy_ 33: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 34: src0.xyz = const[0], src1.xyz = temp[7], srcp.xyz = (1 - src0) MAD temp[9].xyz, src1.xyz, srcp.xxx, src0.000 35: src0.xyz = temp[6], src1.xyz = const[0], src2.xyz = temp[9] MAD temp[7].xyz, src0.xyz, src1.xxx, src2.xyz 36: src0.xyz = temp[7], src1.xyz = temp[2] MAD temp[6].xyz, src0.xyz, src0.111, src1.xyz 37: src0.xyz = temp[0], src1.xyz = const[14] DP3 temp[7].x, src0.xyz, -src1.xyz 38: src0.xyz = temp[7] MAD temp[1].x, src0.H__, src0.x__, src0.H__ 39: src0.xyz = const[13], src1.xyz = const[4] MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 40: src0.w = input[2] REPL_ALPHA temp[8].x RCP, src0.w 41: src0.xyz = input[2], src1.xyz = temp[8] MAD temp[9].xy, src0.xy_, src1.xx_, src0.000 42: src0.xyz = temp[9] MAD temp[8].xy, src0.HH_, src0.xy_, src0.HH_ 43: src0.xyz = const[15], src1.xyz = const[5] MAD temp[9].x, src0.x__, src1.x__, src0.000 44: src0.xyz = temp[9], src1.xyz = temp[0] MAD temp[10].xy, src0.xx_, src1.xz_, src0.000 45: src0.xyz = input[5] REPL_ALPHA temp[0].x RCP, src0.x 46: src0.xyz = temp[10], src1.xyz = temp[0] MAD temp[9].xy, src0.xy_, src1.xx_, src0.000 47: src0.xyz = temp[8], src1.xyz = temp[9] MAD temp[0].xy, src0.xy_, src0.111, -src1.xy_ 48: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 49: src0.xyz = temp[5], src1.xyz = temp[8], srcp.xyz = (1 - src0) MAD temp[9].xyz, src1.xyz, srcp.xxx, src0.000 50: src0.xyz = temp[7], src1.xyz = temp[5], src2.xyz = temp[9] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 51: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[5].xyz, src0.xxx, src1.xyz, src0.000 52: src0.xyz = temp[2], src0.w = const[15], src1.xyz = temp[5] MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz 53: src0.xyz = temp[4], src1.xyz = temp[0], srcp.xyz = (1 - src0) MAD temp[2].xyz, src1.xyz, srcp.xxx, src0.000 54: src0.xyz = temp[6], src1.xyz = temp[4], src2.xyz = temp[2] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 55: TEX temp[1].w, 
input[3].xy__, 2D[0]; 56: src0.xyz = temp[0], src0.w = temp[1] MAD_SAT color[0].xyz, src0.xyz, src0.www, src0.000 57: src0.xyz = const[16], src1.xyz = input[4] MAD temp[0].x, src0.x__, src1.x__, src0.000 58: src0.xyz = const[16], src1.xyz = temp[3] MAD temp[1].x, src0.z__, src0.111, -src1.y__ 59: src0.xyz = temp[1] MAX temp[2].x, src0.0__, src0.x__ 60: src0.xyz = const[16], src0.w = const[16], src1.xyz = temp[2] MAD temp[1].x, src0.y__, src1.x__, src0.w__ 61: src0.xyz = temp[1], src1.xyz = temp[4], src2.xyz = temp[0], srcp.xyz = (src1 + src0) MAD_SAT color[0].w, src2.x, srcp.x, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, input[0].xy__, 2D[3]; 2: TEX temp[1].w, input[3].xy__, 2D[0]; 3: src0.w = input[1] REPL_ALPHA temp[9].x RCP, src0.w 4: src0.xyz = input[5] REPL_ALPHA temp[8].x RCP, src0.x 5: src0.xyz = input[1], src1.xyz = temp[9] MAD temp[10].xy, src0.xy_, src1.xx_, src0.000 6: src0.xyz = temp[10] MAD temp[9].xy, src0.HH_, src0.xy_, src0.HH_ 7: src0.xyz = temp[0] MAD temp[1].xyz, src0.xzy, src0.111, -src0.HHH 8: src0.xyz = temp[1] DP3 temp[0].x, src0.xyz, src0.xyz 9: src0.xyz = temp[0] REPL_ALPHA temp[2].x RSQ, |src0.x| 10: src0.xyz = temp[1], src1.xyz = temp[2] MAD temp[0].xyz, src0.xyz, src1.xxx, src0.000 11: src0.xyz = input[6], src1.xyz = const[12], srcp.xyz = (src1 - src0) DP3 temp[3].x, srcp.xyz, srcp.xyz 12: src0.xyz = temp[3] REPL_ALPHA temp[4].x RSQ, |src0.x| 13: src0.xyz = input[6], src1.xyz = const[12], src2.xyz = temp[4], srcp.xyz = (src1 - src0) MAD temp[3].xyz, srcp.xyz, src2.xxx, src0.000 14: src0.xyz = const[2] REPL_ALPHA temp[4].x RCP, src0.x 15: src0.xyz = const[14], src1.xyz = temp[3], srcp.xyz = (src1 - src0) DP3 temp[6].x, srcp.xyz, srcp.xyz 16: src0.xyz = temp[6] REPL_ALPHA temp[7].x RSQ, |src0.x| 17: src0.xyz = const[14], src1.xyz = temp[3], src2.xyz = temp[7], srcp.xyz = (src1 - src0) MAD temp[6].xyz, srcp.xyz, src2.xxx, src0.000 18: src0.xyz = temp[0], src1.xyz = temp[6] DP3 
temp[2].x, src0.xyz, src1.xyz 19: src0.xyz = temp[2], src1.xyz = input[4], src2.xyz = temp[4] MAX temp[6].x, src0.0__, src0.x__ MAD temp[5].w, src1.x, src2.x, src0.0 20: src0.xyz = temp[5], src0.w = temp[5], src1.xyz = temp[6] MIN temp[4].x, src0.w__, src0.1__ LG2 temp[2].w, src1.x 21: src0.xyz = const[3], src0.w = temp[2], src1.xyz = temp[4], src2.xyz = const[7] MAD temp[5].x, src0.x__, src1.x__, src0.000 MAD temp[2].w, src0.w, src2.x, src0.0 22: src0.w = temp[2] REPL_ALPHA temp[2].x EX2, src0.w 23: src0.xyz = temp[2], src1.xyz = const[13] MAD temp[6].xyz, src0.xxx, src1.xyz, src0.000 24: src0.xyz = temp[6], src1.xyz = const[6] MAD temp[2].xyz, src0.xyz, src1.xxx, src0.000 25: src0.xyz = const[13], src1.xyz = const[1] MAD temp[6].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[0], src1.xyz = temp[3] DP3 temp[4].x, src0.xyz, src1.xyz 27: src0.xyz = temp[4], src1.xyz = temp[0], src2.xyz = const[5], srcp.xyz = (1 - src0) MAD temp[7].xy, src2.xx_, src1.xz_, src0.000 LG2 temp[4].w, srcp.x 28: src0.xyz = temp[7], src1.xyz = temp[8], src2.xyz = temp[9] MAD temp[10].xy, src0.xy_, src1.xx_, src2.xy_ 29: src0.w = input[2] REPL_ALPHA temp[8].x RCP, src0.w 30: src0.xyz = const[15], src0.w = temp[4] MAD temp[4].w, src0.w, src0.x, src0.0 31: src0.w = temp[4] REPL_ALPHA temp[4].x EX2, src0.w 32: BEGIN_TEX; 33: TEX temp[7].xyz, temp[10].xy__, 2D[2]; 34: src0.xyz = const[0], src1.xyz = temp[7], srcp.xyz = (1 - src0) MAD temp[9].xyz, src1.xyz, srcp.xxx, src0.000 35: src0.xyz = temp[6], src1.xyz = const[0], src2.xyz = temp[9] MAD temp[7].xyz, src0.xyz, src1.xxx, src2.xyz 36: src0.xyz = temp[7], src1.xyz = temp[2] MAD temp[6].xyz, src0.xyz, src0.111, src1.xyz 37: src0.xyz = temp[0], src1.xyz = const[14] DP3 temp[7].x, src0.xyz, -src1.xyz 38: src0.xyz = temp[7] MAD temp[1].x, src0.H__, src0.x__, src0.H__ 39: src0.xyz = const[13], src1.xyz = const[4] MAD temp[7].xyz, src0.xyz, src1.xyz, src0.000 40: src0.xyz = input[2], src1.xyz = temp[8] MAD temp[9].xy, src0.xy_, src1.xx_, 
src0.000 41: src0.xyz = temp[9] MAD temp[8].xy, src0.HH_, src0.xy_, src0.HH_ 42: src0.xyz = const[15], src1.xyz = const[5] MAD temp[9].x, src0.x__, src1.x__, src0.000 43: src0.xyz = temp[9], src1.xyz = temp[0] MAD temp[10].xy, src0.xx_, src1.xz_, src0.000 44: src0.xyz = input[5] REPL_ALPHA temp[0].x RCP, src0.x 45: src0.xyz = temp[10], src1.xyz = temp[0] MAD temp[9].xy, src0.xy_, src1.xx_, src0.000 46: src0.xyz = temp[8], src1.xyz = temp[9] MAD temp[0].xy, src0.xy_, src0.111, -src1.xy_ 47: BEGIN_TEX; 48: TEX temp[8].xyz, temp[0].xy__, 2D[1]; 49: src0.xyz = temp[5], src1.xyz = temp[8], srcp.xyz = (1 - src0) MAD temp[9].xyz, src1.xyz, srcp.xxx, src0.000 50: src0.xyz = temp[7], src1.xyz = temp[5], src2.xyz = temp[9] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 51: src0.xyz = temp[1], src1.xyz = temp[0] MAD temp[5].xyz, src0.xxx, src1.xyz, src0.000 52: src0.xyz = temp[2], src0.w = const[15], src1.xyz = temp[5] MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz 53: src0.xyz = temp[4], src1.xyz = temp[0], srcp.xyz = (1 - src0) MAD temp[2].xyz, src1.xyz, srcp.xxx, src0.000 54: src0.xyz = temp[6], src1.xyz = temp[4], src2.xyz = temp[2] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 55: src0.xyz = temp[0], src0.w = temp[1], src1.xyz = const[16], src2.xyz = temp[3] MAD_SAT color[0].xyz, src0.xyz, src0.www, src0.000 MAD temp[3].w, src1.z, src0.1, -src2.y 56: src0.xyz = const[16], src0.w = temp[3], src1.xyz = input[4] MAD temp[0].x, src0.x__, src1.x__, src0.000 MAX temp[6].w, src0.0, src0.w 57: src0.xyz = const[16], src0.w = const[16], src1.xyz = temp[2], src1.w = temp[6] MAD temp[1].x, src0.y__, src1.w__, src0.w__ 58: src0.xyz = temp[1], src1.xyz = temp[4], src2.xyz = temp[0], srcp.xyz = (src1 + src0) MAD_SAT color[0].w, src2.x, srcp.x, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TEX temp[0].xyz, temp[0].xy__, 2D[3]; 2: TEX temp[3].w, temp[3].xy__, 2D[0]; 3: src0.w = temp[1] REPL_ALPHA temp[12].x RCP, src0.w 4: src0.xyz = temp[5] 
REPL_ALPHA temp[11].x RCP, src0.x 5: src0.xyz = temp[1], src1.xyz = temp[12] MAD temp[13].xy, src0.xy_, src1.xx_, src0.000 6: src0.xyz = temp[13] MAD temp[12].xy, src0.HH_, src0.xy_, src0.HH_ 7: src0.xyz = temp[0] MAD temp[3].xyz, src0.xzy, src0.111, -src0.HHH 8: src0.xyz = temp[3] DP3 temp[0].x, src0.xyz, src0.xyz 9: src0.xyz = temp[0] REPL_ALPHA temp[1].x RSQ, |src0.x| 10: src0.xyz = temp[3], src1.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xxx, src0.000 11: src0.xyz = temp[6], src1.xyz = const[12], srcp.xyz = (src1 - src0) DP3 temp[7].x, srcp.xyz, srcp.xyz 12: src0.xyz = temp[7] REPL_ALPHA temp[8].x RSQ, |src0.x| 13: src0.xyz = temp[6], src1.xyz = const[12], src2.xyz = temp[8], srcp.xyz = (src1 - src0) MAD temp[7].xyz, srcp.xyz, src2.xxx, src0.000 14: src0.xyz = const[2] REPL_ALPHA temp[8].x RCP, src0.x 15: src0.xyz = const[14], src1.xyz = temp[7], srcp.xyz = (src1 - src0) DP3 temp[9].x, srcp.xyz, srcp.xyz 16: src0.xyz = temp[9] REPL_ALPHA temp[10].x RSQ, |src0.x| 17: src0.xyz = const[14], src1.xyz = temp[7], src2.xyz = temp[10], srcp.xyz = (src1 - src0) MAD temp[9].xyz, srcp.xyz, src2.xxx, src0.000 18: src0.xyz = temp[0], src1.xyz = temp[9] DP3 temp[1].x, src0.xyz, src1.xyz 19: src0.xyz = temp[1], src1.xyz = temp[4], src2.xyz = temp[8] MAX temp[9].x, src0.0__, src0.x__ MAD temp[6].w, src1.x, src2.x, src0.0 20: src0.xyz = temp[6], src0.w = temp[6], src1.xyz = temp[9] MIN temp[8].x, src0.w__, src0.1__ LG2 temp[1].w, src1.x 21: src0.xyz = const[3], src0.w = temp[1], src1.xyz = temp[8], src2.xyz = const[7] MAD temp[6].x, src0.x__, src1.x__, src0.000 MAD temp[1].w, src0.w, src2.x, src0.0 22: src0.w = temp[1] REPL_ALPHA temp[1].x EX2, src0.w 23: src0.xyz = temp[1], src1.xyz = const[13] MAD temp[9].xyz, src0.xxx, src1.xyz, src0.000 24: src0.xyz = temp[9], src1.xyz = const[6] MAD temp[1].xyz, src0.xyz, src1.xxx, src0.000 25: src0.xyz = const[13], src1.xyz = const[1] MAD temp[9].xyz, src0.xyz, src1.xyz, src0.000 26: src0.xyz = temp[0], src1.xyz = temp[7] DP3 
temp[8].x, src0.xyz, src1.xyz 27: src0.xyz = temp[8], src1.xyz = temp[0], src2.xyz = const[5], srcp.xyz = (1 - src0) MAD temp[10].xy, src2.xx_, src1.xz_, src0.000 LG2 temp[8].w, srcp.x 28: src0.xyz = temp[10], src1.xyz = temp[11], src2.xyz = temp[12] MAD temp[13].xy, src0.xy_, src1.xx_, src2.xy_ 29: src0.w = temp[2] REPL_ALPHA temp[11].x RCP, src0.w 30: src0.xyz = const[15], src0.w = temp[8] MAD temp[8].w, src0.w, src0.x, src0.0 31: src0.w = temp[8] REPL_ALPHA temp[8].x EX2, src0.w 32: BEGIN_TEX; 33: TEX temp[10].xyz, temp[13].xy__, 2D[2]; 34: src0.xyz = const[0], src1.xyz = temp[10], srcp.xyz = (1 - src0) MAD temp[12].xyz, src1.xyz, srcp.xxx, src0.000 35: src0.xyz = temp[9], src1.xyz = const[0], src2.xyz = temp[12] MAD temp[10].xyz, src0.xyz, src1.xxx, src2.xyz 36: src0.xyz = temp[10], src1.xyz = temp[1] MAD temp[9].xyz, src0.xyz, src0.111, src1.xyz 37: src0.xyz = temp[0], src1.xyz = const[14] DP3 temp[10].x, src0.xyz, -src1.xyz 38: src0.xyz = temp[10] MAD temp[3].x, src0.H__, src0.x__, src0.H__ 39: src0.xyz = const[13], src1.xyz = const[4] MAD temp[10].xyz, src0.xyz, src1.xyz, src0.000 40: src0.xyz = temp[2], src1.xyz = temp[11] MAD temp[12].xy, src0.xy_, src1.xx_, src0.000 41: src0.xyz = temp[12] MAD temp[11].xy, src0.HH_, src0.xy_, src0.HH_ 42: src0.xyz = const[15], src1.xyz = const[5] MAD temp[12].x, src0.x__, src1.x__, src0.000 43: src0.xyz = temp[12], src1.xyz = temp[0] MAD temp[13].xy, src0.xx_, src1.xz_, src0.000 44: src0.xyz = temp[5] REPL_ALPHA temp[0].x RCP, src0.x 45: src0.xyz = temp[13], src1.xyz = temp[0] MAD temp[12].xy, src0.xy_, src1.xx_, src0.000 46: src0.xyz = temp[11], src1.xyz = temp[12] MAD temp[0].xy, src0.xy_, src0.111, -src1.xy_ 47: BEGIN_TEX; 48: TEX temp[11].xyz, temp[0].xy__, 2D[1]; 49: src0.xyz = temp[6], src1.xyz = temp[11], srcp.xyz = (1 - src0) MAD temp[12].xyz, src1.xyz, srcp.xxx, src0.000 50: src0.xyz = temp[10], src1.xyz = temp[6], src2.xyz = temp[12] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 51: src0.xyz = temp[3], src1.xyz 
= temp[0] MAD temp[6].xyz, src0.xxx, src1.xyz, src0.000 52: src0.xyz = temp[1], src0.w = const[15], src1.xyz = temp[6] MAD temp[0].xyz, src0.www, src0.xyz, src1.xyz 53: src0.xyz = temp[8], src1.xyz = temp[0], srcp.xyz = (1 - src0) MAD temp[1].xyz, src1.xyz, srcp.xxx, src0.000 54: src0.xyz = temp[9], src1.xyz = temp[8], src2.xyz = temp[1] MAD temp[0].xyz, src0.xyz, src1.xxx, src2.xyz 55: src0.xyz = temp[0], src0.w = temp[3], src1.xyz = const[16], src2.xyz = temp[7] MAD_SAT color[0].xyz, src0.xyz, src0.www, src0.000 MAD temp[7].w, src1.z, src0.1, -src2.y 56: src0.xyz = const[16], src0.w = temp[7], src1.xyz = temp[4] MAD temp[0].x, src0.x__, src1.x__, src0.000 MAX temp[9].w, src0.0, src0.w 57: src0.xyz = const[16], src0.w = const[16], src1.xyz = temp[1], src1.w = temp[9] MAD temp[3].x, src0.y__, src1.w__, src0.w__ 58: src0.xyz = temp[3], src1.xyz = temp[8], src2.xyz = temp[0], srcp.xyz = (src1 + src0) MAD_SAT color[0].w, src2.x, srcp.x, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02430000: id: 3 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02400000: id: 0 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe403f403: src: 3 R/G/A/A dst: 3 R/G/B/A 3:TEX_DXDY: 0x00000000 2 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c00a:RCP dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x000000ca:SOP dest:12 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 3 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x000000ba:SOP dest:11 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 4 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x08003001:Addr0: 1t, Addr1: 12t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00802420:rgb_A_src:0 R/G/0 0 rgb_B_src:1 R/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900d0:MAD dest:13 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 5 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x0802000d:Addr0: 13t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x008404b4:rgb_A_src:0 H/H/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004b40c0:MAD dest:12 rgb_C_src:0 H/H/0 0 alp_C_src:0 R 0 6 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0140:rgb_A_src:0 R/B/G 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00db4030:MAD dest:3 rgb_C_src:0 H/H/H 1 alp_C_src:0 R 0 7 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00440220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000001:DP3 dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 8 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 
2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004000b:RSQ dest:0 alp_A_src:0 R 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000001a:SOP dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 9 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000403:Addr0: 3t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 10 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x48043006:Addr0: 6t, Addr1: 12c, Addr2: 128t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000071:DP3 dest:7 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 11 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004000b:RSQ dest:0 alp_A_src:0 R 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000008a:SOP dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 12 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x40843006:Addr0: 6t, Addr1: 12c, Addr2: 8t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00004223:rgb_A_src:3 R/G/B 0 rgb_B_src:2 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490070:MAD dest:7 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 13 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R 
omask: NONE 1:RGB_ADDR 0x08020102:Addr0: 2c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000008a:SOP dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 14 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x48001d0e:Addr0: 14c, Addr1: 7t, Addr2: 128t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00446223:rgb_A_src:3 R/G/B 0 rgb_B_src:3 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000091:DP3 dest:9 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 15 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020009:Addr0: 9t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0004000b:RSQ dest:0 alp_A_src:0 R 2 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x000000aa:SOP dest:10 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 16 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x40a01d0e:Addr0: 14c, Addr1: 7t, Addr2: 10t, srcp:1 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00004223:rgb_A_src:3 R/G/B 0 rgb_B_src:2 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 17 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08002400:Addr0: 0t, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000011:DP3 dest:1 
rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 18 0:CMN_INST 0x00004804:ALU TEX_WAIT wmask: AR omask: NONE 1:RGB_ADDR 0x00801001:Addr0: 1t, Addr1: 4t, Addr2: 8t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00900490:rgb_A_src:0 0/0/0 0 rgb_B_src:0 R/0/0 0 targ: 0 4 ALPHA_INST:0x00101060:MAD dest:6 alp_A_src:1 R 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x20000095:MAX dest:9 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 19 0:CMN_INST 0x00004804:ALU TEX_WAIT wmask: AR omask: NONE 1:RGB_ADDR 0x08002406:Addr0: 6t, Addr1: 9t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020006:Addr0: 6t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0093048c:rgb_A_src:0 A/0/0 0 rgb_B_src:0 1/0/0 0 targ: 0 4 ALPHA_INST:0x00001019:LN2 dest:1 alp_A_src:1 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000084:MIN dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 20 0:CMN_INST 0x00004804:ALU TEX_WAIT wmask: AR omask: NONE 1:RGB_ADDR 0x10702103:Addr0: 3c, Addr1: 8t, Addr2: 7c, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x0010c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:2 R 0 targ 0 w:0 5 RGBA_INST: 0x20490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 21 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020001:Addr0: 1t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000001a:SOP dest:1 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 22 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08043401:Addr0: 1t, Addr1: 13c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 
R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 23 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08041809:Addr0: 9t, Addr1: 6c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 24 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x0804050d:Addr0: 13c, Addr1: 1c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490090:MAD dest:9 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 25 0:CMN_INST 0x00000a04:ALU TEX_WAIT NOP wmask: R omask: NONE 1:RGB_ADDR 0x08001c00:Addr0: 0t, Addr1: 7t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00000081:DP3 dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 26 0:CMN_INST 0x00005804:ALU TEX_WAIT wmask: ARG omask: NONE 1:RGB_ADDR 0xd0500008:Addr0: 8t, Addr1: 0t, Addr2: 5c, srcp:3 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00882402:rgb_A_src:2 R/R/0 0 rgb_B_src:1 R/B/0 0 targ: 0 4 ALPHA_INST:0x00003089:LN2 dest:8 alp_A_src:3 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 27 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x00c02c0a:Addr0: 10t, Addr1: 11t, Addr2: 12t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00802420:rgb_A_src:0 R/G/0 
0 rgb_B_src:1 R/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004220d0:MAD dest:13 rgb_C_src:2 R/G/0 0 alp_C_src:0 R 0 28 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020002:Addr0: 2t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c00a:RCP dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x000000ba:SOP dest:11 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 29 0:CMN_INST 0x00004004:ALU TEX_WAIT wmask: A omask: NONE 1:RGB_ADDR 0x0802010f:Addr0: 15c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c080:MAD dest:8 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 30 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020008:Addr0: 8t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000c008:EX2 dest:0 alp_A_src:0 A 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000008a:SOP dest:8 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 31 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02420000: id: 2 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe40af40d: src: 13 R/G/A/A dst: 10 R/G/B/A 3:TEX_DXDY: 0x00000000 32 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0xc8002900:Addr0: 0c, Addr1: 10t, Addr2: 128t, srcp:3 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00006221:rgb_A_src:1 R/G/B 0 rgb_B_src:3 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900c0:MAD dest:12 
rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 33 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x00c40009:Addr0: 9t, Addr1: 0c, Addr2: 12t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x002220a0:MAD dest:10 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 34 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x0800040a:Addr0: 10t, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221090:MAD dest:9 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 35 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08043800:Addr0: 0t, Addr1: 14c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x01442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 1 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x000000a1:DP3 dest:10 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 36 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x0802000a:Addr0: 10t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00900494:rgb_A_src:0 H/0/0 0 rgb_B_src:0 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00494030:MAD dest:3 rgb_C_src:0 H/0/0 0 alp_C_src:0 R 0 37 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x0804110d:Addr0: 13c, Addr1: 4c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 
alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900a0:MAD dest:10 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 38 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x08002c02:Addr0: 2t, Addr1: 11t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00802420:rgb_A_src:0 R/G/0 0 rgb_B_src:1 R/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900c0:MAD dest:12 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 39 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x0802000c:Addr0: 12t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x008404b4:rgb_A_src:0 H/H/0 0 rgb_B_src:0 R/G/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004b40b0:MAD dest:11 rgb_C_src:0 H/H/0 0 alp_C_src:0 R 0 40 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x0804150f:Addr0: 15c, Addr1: 5c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900c0:MAD dest:12 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 41 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x0800000c:Addr0: 12t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00882400:rgb_A_src:0 R/R/0 0 rgb_B_src:1 R/B/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900d0:MAD dest:13 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 42 0:CMN_INST 0x00000804:ALU TEX_WAIT wmask: R omask: NONE 1:RGB_ADDR 0x08020005:Addr0: 5t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 
0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x0000000a:RCP dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0000000a:SOP dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 R 0 43 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x0800000d:Addr0: 13t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00802420:rgb_A_src:0 R/G/0 0 rgb_B_src:1 R/R/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900c0:MAD dest:12 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 44 0:CMN_INST 0x00001804:ALU TEX_WAIT wmask: RG omask: NONE 1:RGB_ADDR 0x0800300b:Addr0: 11t, Addr1: 12t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00c21000:MAD dest:0 rgb_C_src:1 R/G/0 1 alp_C_src:0 R 0 45 0:CMN_INST 0x00003807:TEX TEX_WAIT wmask: RGB omask: NONE 1:TEX_INST: 0x02410000: id: 1 op:LD, ACQ, SCALED 2:TEX_ADDR: 0xe40bf400: src: 0 R/G/A/A dst: 11 R/G/B/A 3:TEX_DXDY: 0x00000000 46 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0xc8002c06:Addr0: 6t, Addr1: 11t, Addr2: 128t, srcp:3 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00006221:rgb_A_src:1 R/G/B 0 rgb_B_src:3 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x004900c0:MAD dest:12 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 47 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x00c0180a:Addr0: 10t, Addr1: 6t, Addr2: 12t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 
RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 48 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08000003:Addr0: 3t, Addr1: 0t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442000:rgb_A_src:0 R/R/R 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490060:MAD dest:6 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 49 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x08001801:Addr0: 1t, Addr1: 6t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x0802010f:Addr0: 15c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0044036c:rgb_A_src:0 A/A/A 0 rgb_B_src:0 R/G/B 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00221000:MAD dest:0 rgb_C_src:1 R/G/B 0 alp_C_src:0 R 0 50 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0xc8000008:Addr0: 8t, Addr1: 0t, Addr2: 128t, srcp:3 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00006221:rgb_A_src:1 R/G/B 0 rgb_B_src:3 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 51 0:CMN_INST 0x00003804:ALU TEX_WAIT wmask: RGB omask: NONE 1:RGB_ADDR 0x00102009:Addr0: 9t, Addr1: 8t, Addr2: 1t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00002220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/R/R 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00222000:MAD dest:0 rgb_C_src:2 R/G/B 0 alp_C_src:0 R 0 52 0:CMN_INST 0x000bc005:OUT TEX_WAIT wmask: A omask: RGB 1:RGB_ADDR 0x00744000:Addr0: 0t, Addr1: 16c, Addr2: 7t, srcp:0 2:ALPHA_ADDR 0x08020003:Addr0: 3t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x006d8220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 A/A/A 0 targ: 0 4 
ALPHA_INST:0x00c09070:MAD dest:7 alp_A_src:1 B 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x4c490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:2 G 1 53 0:CMN_INST 0x00004804:ALU TEX_WAIT wmask: AR omask: NONE 1:RGB_ADDR 0x08001110:Addr0: 16c, Addr1: 4t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020007:Addr0: 7t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00902480:rgb_A_src:0 R/0/0 0 rgb_B_src:1 R/0/0 0 targ: 0 4 ALPHA_INST:0x00610093:MAX dest:9 alp_A_src:0 0 0 alp_B_src:0 A 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 54 0:CMN_INST 0x00000a04:ALU TEX_WAIT NOP wmask: R omask: NONE 1:RGB_ADDR 0x08000510:Addr0: 16c, Addr1: 1t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08002510:Addr0: 16c, Addr1: 9t, Addr2: 128t, srcp:0 3 RGB_INST: 0x0091a484:rgb_A_src:0 G/0/0 0 rgb_B_src:1 A/0/0 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x0048c030:MAD dest:3 rgb_C_src:0 A/0/0 0 alp_C_src:0 R 0 55 0:CMN_INST 0x00140005:OUT TEX_WAIT wmask: NONE omask: A 1:RGB_ADDR 0x80002003:Addr0: 3t, Addr1: 8t, Addr2: 0t, srcp:2 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00182000:MAD dest:0 alp_A_src:2 R 0 alp_B_src:3 R 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 ~~~~~~~~ FRAGMENT PROGRAM ~~~~~~~ ~ 56 Instructions ~ 50 Vector Instructions (RGB) ~ 18 Scalar Instructions (Alpha) ~ 0 Flow Control Instructions ~ 4 Texture Instructions ~ 9 Presub Operations ~ 14 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], 
input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], 
temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY 
FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL SAMP[0]
DCL CONST[0]
DCL TEMP[0]
  0: TXP TEMP[0], CONST[0], SAMP[0], 2D
  1: MOV OUT[0], TEMP[0]
  2: END
Fragment Program: before compilation
# Radeon Compiler Program
  0: TXP temp[0], const[0], 2D[0];
  1: MOV output[0], temp[0];
Fragment Program: after 'rewrite depth out'
# Radeon Compiler Program
  0: TXP temp[0], const[0], 2D[0];
  1: MOV output[0], temp[0];
Fragment Program: after 'transform KILP'
# Radeon Compiler Program
  0: TXP temp[0], const[0], 2D[0];
  1: MOV output[0], temp[0];
Fragment Program: after 'unroll loops'
# Radeon Compiler Program
  0: TXP temp[0], const[0], 2D[0];
  1: MOV output[0], temp[0];
Fragment Program: after 'saturate output writes'
# Radeon Compiler Program
  0: TXP temp[0], const[0], 2D[0];
  1: MOV_SAT output[0], temp[0];
Fragment Program: after 'transform TEX'
# Radeon Compiler Program
  0: MOV temp[1], const[0];
  1: TXP temp[0], temp[1], 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'native rewrite'
# Radeon Compiler Program
  0: MOV temp[1], const[0];
  1: TXP temp[0], temp[1], 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'deadcode'
# Radeon Compiler Program
  0: MOV temp[1].xyw, const[0].xy_w;
  1: TXP temp[0], temp[1].xy_w, 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'dataflow optimize'
# Radeon Compiler Program
  0: MOV temp[1].xyw, const[0].xy_w;
  1: TXP temp[0], temp[1].xy_w, 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'dataflow swizzles'
# Radeon Compiler Program
  0: MOV temp[1].xyw, const[0].xy_w;
  1: TXP temp[0], temp[1].xy_w, 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'dead constants'
# Radeon Compiler Program
  0: MOV temp[1].xyw, const[0].xy_w;
  1: TXP temp[0], temp[1].xy_w, 2D[0];
  2: MOV_SAT output[0], temp[0];
Fragment Program: after 'pair translate'
# Radeon Compiler Program
  0: src0.xyz = const[0], src0.w = const[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: TXP temp[0], temp[1].xy_w, 2D[0];
  2: src0.xyz = temp[0], src0.w = temp[0]
       MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
       MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'pair scheduling'
# Radeon Compiler Program
  0: src0.xyz = const[0], src0.w = const[0]
       MAD temp[1].xy, src0.xy_, src0.111, src0.000
       MAD temp[1].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[0], temp[1].xy_w, 2D[0];
  3: src0.xyz = temp[0], src0.w = temp[0]
       MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
       MAD_SAT color[0].w, src0.w, src0.1, src0.0
Fragment Program: after 'register allocation'
# Radeon Compiler Program
  0: src0.xyz = const[0], src0.w = const[0]
       MAD temp[0].xy, src0.xy_, src0.111, src0.000
       MAD temp[0].w, src0.w, src0.1, src0.0
  1: BEGIN_TEX;
  2: TXP temp[0], temp[0].xy_w, 2D[0];
  3: src0.xyz = temp[0], src0.w = temp[0]
       MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000
       MAD_SAT color[0].w, src0.w, src0.1, src0.0
R500 Fragment Program:
--------
0
  0:CMN_INST 0x00005804:ALU TEX_WAIT wmask: ARG omask: NONE
  1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
1
  0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE
  1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED
  2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A
  3:TEX_DXDY: 0x00000000
2
  0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB
  1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0
  3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0
  4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0
  5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0
r300: Initial vertex program
VERT
DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..2] DCL CONST[4..7] DCL TEMP[0] IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: DP4 TEMP[0].x, CONST[0], IN[0] 1: DP4 TEMP[0].y, CONST[1], IN[0] 2: DP4 TEMP[0].z, CONST[2], IN[0] 3: MOV TEMP[0].w, IMM[0].xxxx 4: DP4 OUT[0].x, CONST[4], TEMP[0] 5: DP4 OUT[0].y, CONST[5], TEMP[0] 6: DP4 OUT[0].z, CONST[6], TEMP[0] 7: DP4 OUT[0].w, CONST[7], TEMP[0] 8: MOV OUT[1], IN[1] 9: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].1111; 4: 
DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, temp[0].___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; 
Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: MOV temp[0].w, none.___1; 4: DP4 temp[1].x, const[4], temp[0]; 5: DP4 temp[1].y, const[5], temp[0]; 6: DP4 temp[1].z, const[6], temp[0]; 7: DP4 temp[1].w, const[7], temp[0]; 8: MOV output[1], input[1]; 9: MOV output[0], temp[1]; 10: MOV output[2], temp[1]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800003 dst: 0t op: VE_ADD src0: 0x017fe000 reg: 0t swiz: U/ U/ U/ 1 src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 4: op: 0x00102001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00202001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00402001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 7: op: 0x00802001 dst: 1t op: VE_DOT_PRODUCT src0: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src1: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 8: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 9: op: 
0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 10: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 11 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0] 0: MOV OUT[0], CONST[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[0], 
src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..3] 0: DP4 OUT[0].x, CONST[0], IN[0] 1: DP4 OUT[0].y, CONST[1], IN[0] 2: DP4 OUT[0].z, CONST[2], IN[0] 3: DP4 OUT[0].w, CONST[3], IN[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler 
Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: DP4 temp[0].x, const[0], input[0]; 1: DP4 temp[0].y, const[1], input[0]; 2: DP4 temp[0].z, const[2], input[0]; 3: DP4 temp[0].w, const[3], input[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00100001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 1: op: 0x00200001 dst: 0t op: VE_DOT_PRODUCT src0: 
0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00400001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 3: op: 0x00800001 dst: 0t op: VE_DOT_PRODUCT src0: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src1: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL CONST[0] 0: MOV OUT[0], CONST[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: 
MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV_SAT output[0], const[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0] 0: TXP TEMP[0], IN[0], SAMP[0], 2D 1: MUL OUT[0], TEMP[0], CONST[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], input[0], 
2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL output[0], temp[0], const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MUL_SAT output[0], temp[0], const[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0], src1.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0], input[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0], src1.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src1.xyz, src0.000 MAD_SAT color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0], temp[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0], src1.xyz = const[0], src1.w = const[0] MAD_SAT color[0].xyz, 
src0.xyz, src1.xyz, src0.000 MAD_SAT color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00442220:rgb_A_src:0 R/G/B 0 rgb_B_src:1 R/G/B 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR 0: MOV OUT[0], IN[0] 1: MOV OUT[1], IN[1] 2: END Vertex Program: before compilation # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[0], input[0]; 1: MOV output[1], input[1]; 2: MOV output[0], temp[0]; 3: MOV output[2], temp[0]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: 
MOV output[2], input[0]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[1], input[1]; 1: MOV output[0], input[0]; 2: MOV output[2], input[0]; Final vertex program code: 0: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 1: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 2: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10001 reg: 0i swiz: X/ Y/ Z/ W src1: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 src2: 0x01248001 reg: 0i swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 r300: Initial fragment program FRAG DCL IN[0], COLOR, PERSPECTIVE DCL OUT[0], COLOR 0: MOV OUT[0], IN[0] 1: END Fragment Program: before compilation # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV output[0], input[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 
0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = input[0], src0.w = input[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = temp[0], src0.w = temp[0] MAD color[0].xyz, src0.xyz, src0.111, src0.000 MAD color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00078005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0] 0: TXP TEMP[0], IN[0], SAMP[0], 2D 1: MOV OUT[0].xyz, CONST[0] 2: MUL OUT[0].w, TEMP[0], CONST[0] 3: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: MUL output[0].w, temp[0], const[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: MUL output[0].w, temp[0], const[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: MUL output[0].w, temp[0], const[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: MUL output[0].w, temp[0], const[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: 
TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[0]; 2: MUL_SAT output[0].w, temp[0], const[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[0]; 2: MUL_SAT output[0].w, temp[0], const[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[0]; 2: MUL_SAT output[0].w, temp[0], const[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MUL_SAT output[0].w, temp[0].___w, const[0].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MUL_SAT output[0].w, temp[0].___w, const[0].___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MUL_SAT output[0].w, temp[0].___w, const[0].___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MUL_SAT output[0].w, temp[0].___w, const[0].___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: src0.xyz = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: src0.w = temp[0], src1.w = const[0] MAD_SAT color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].w, input[0].xy_w, 2D[0]; 2: src0.xyz = const[0], src0.w = temp[0], src1.w = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src1.w, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].w, temp[0].xy_w, 2D[0]; 2: src0.xyz = const[0], src0.w = temp[0], src1.w = const[0] 
MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src1.w, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08040000:Addr0: 0t, Addr1: 0c, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x0068c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:1 A 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL TEMP[0] 0: TXP TEMP[0], IN[0], SAMP[0], 2D 1: MOV OUT[0], TEMP[0] 2: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0], temp[0]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 
'dataflow optimize' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: MOV_SAT output[0], temp[0]; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TXP temp[0], input[0].xy_w, 2D[0]; 1: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0], input[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0], temp[0].xy_w, 2D[0]; 2: src0.xyz = temp[0], src0.w = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, src0.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00007807:TEX TEX_WAIT wmask: ARGB omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], GENERIC[0] DCL CONST[0..7] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD 
TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MUL TEMP[0], IN[1].xxxx, CONST[4] 5: MAD TEMP[0], IN[1].yyyy, CONST[5], TEMP[0] 6: MAD TEMP[0], IN[1].zzzz, CONST[6], TEMP[0] 7: MAD OUT[1], IN[1].wwww, CONST[7], TEMP[0] 8: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD 
temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, 
const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MUL temp[0], input[1].xxxx, const[4]; 5: MAD temp[0], input[1].yyyy, const[5], temp[0]; 6: MAD temp[0], input[1].zzzz, const[6], temp[0]; 7: MAD output[1], input[1].wwww, const[7], temp[0]; 8: MOV output[0], temp[1]; 9: MOV output[2], temp[1]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f02004 dst: 1t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000021 reg: 1i swiz: X/ X/ X/ X src1: 0x00d10082 reg: 4c swiz: X/ Y/ Z/ W src2: 0x01248082 reg: 4c swiz: 0/ 0/ 0/ 0 5: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492021 reg: 1i swiz: Y/ Y/ Y/ Y src1: 0x00d100a2 reg: 5c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 6: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924021 reg: 1i swiz: Z/ Z/ Z/ Z src1: 0x00d100c2 reg: 6c swiz: X/ Y/ Z/ W 
src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 7: op: 0x00f02204 dst: 1o op: VE_MULTIPLY_ADD src0: 0x00db6021 reg: 1i swiz: W/ W/ W/ W src1: 0x00d100e2 reg: 7c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 8: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 9: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10020 reg: 1t swiz: X/ Y/ Z/ W src1: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 src2: 0x01248020 reg: 1t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 10 Instructions ~ 0 Flow Control Instructions ~ 2 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], GENERIC[0], PERSPECTIVE DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..1] IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: TXP TEMP[0], IN[0], SAMP[0], 2D 1: MOV OUT[0].xyz, CONST[0] 2: SUB TEMP[1].w, IMM[0].xxxx, TEMP[0].wwww 3: MOV OUT[0].w, TEMP[1] 4: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: 
MOV_SAT output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: TXP temp[0], input[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[0]; 2: ADD temp[1].w, temp[0].1111, -temp[0].wwww; 3: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: ADD temp[1].w, temp[0].___1, -temp[0].___w; 3: MOV_SAT output[0].w, temp[1].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: MOV_SAT output[0].xyz, const[0].xyz_; 2: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: TXP temp[0].w, input[0].xy_w, 2D[0]; 1: src0.xyz = const[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].w, input[0].xy_w, 2D[0]; 2: src0.xyz = const[0], src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: BEGIN_TEX; 1: TXP temp[0].w, 
temp[0].xy_w, 2D[0]; 2: src0.xyz = const[0], src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 MAD_SAT color[0].w, srcp.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 1 0:CMN_INST 0x001f8005:OUT TEX_WAIT wmask: NONE omask: ARGB 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0xc8020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:3 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0f000:MAD dest:0 alp_A_src:3 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL IN[1] DCL OUT[0], POSITION DCL OUT[1], COLOR DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: MOV OUT[1], IN[1] 5: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], 
input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[1]; 6: MOV output[2], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], 
input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[1], input[1]; 5: MOV output[0], temp[0]; 6: MOV output[2], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10021 reg: 1i swiz: X/ Y/ Z/ W src1: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 src2: 0x01248021 reg: 1i swiz: 0/ 0/ 0/ 0 5: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 6: op: 0x00f04203 dst: 2o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 7 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL IN[0], COLOR, LINEAR DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0] DCL TEMP[0..1] IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: TXP TEMP[0], CONST[0], SAMP[0], 2D 1: MOV OUT[0].xyz, IN[0] 2: SUB TEMP[1].w, IMM[0].xxxx, TEMP[0].wwww 3: MOV OUT[0].w, TEMP[1] 4: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP 
temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, input[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, input[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, input[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, input[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV_SAT output[0].xyz, input[0]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[2], const[0]; 1: TXP temp[0], temp[2], 2D[0]; 2: MOV_SAT output[0].xyz, input[0]; 3: SUB temp[1].w, temp[0].1111, temp[0].wwww; 4: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], const[0]; 1: TXP temp[0], temp[2], 2D[0]; 2: MOV_SAT output[0].xyz, input[0]; 3: ADD temp[1].w, temp[0].1111, -temp[0].wwww; 4: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, input[0].xyz_; 3: ADD temp[1].w, temp[0].___1, -temp[0].___w; 4: MOV_SAT output[0].w, temp[1].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, input[0].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dataflow swizzles' # 
Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, input[0].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dead constants' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, input[0].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[2].xy, src0.xy_, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: src0.xyz = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 3: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[2].xy, src0.xy_, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 1: src0.xyz = input[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[0].w, temp[2].xy_w, 2D[0]; 4: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[1].xy, src0.xy_, src0.111, src0.000 MAD temp[1].w, src0.w, src0.1, src0.0 1: src0.xyz = temp[0] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[0].w, temp[1].xy_w, 2D[0]; 4: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00005804:ALU TEX_WAIT wmask: ARG omask: NONE 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c010:MAD dest:1 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 
0x20490010:MAD dest:1 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f401: src: 1 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00140005:OUT TEX_WAIT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0xc8020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:3 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0f000:MAD dest:0 alp_A_src:3 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate 
negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], 
temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial fragment program FRAG PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1 DCL OUT[0], COLOR DCL SAMP[0] DCL CONST[0..1] DCL TEMP[0..1] IMM FLT32 { 1.0000, 0.0000, 0.0000, 0.0000} 0: TXP TEMP[0], CONST[0], SAMP[0], 2D 1: MOV OUT[0].xyz, CONST[1] 2: SUB TEMP[1].w, IMM[0].xxxx, TEMP[0].wwww 3: MOV OUT[0].w, TEMP[1] 4: END Fragment Program: before compilation # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, const[1]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'rewrite depth out' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: 
MOV output[0].xyz, const[1]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'transform KILP' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, const[1]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'unroll loops' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV output[0].xyz, const[1]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV output[0].w, temp[1]; Fragment Program: after 'saturate output writes' # Radeon Compiler Program 0: TXP temp[0], const[0], 2D[0]; 1: MOV_SAT output[0].xyz, const[1]; 2: SUB temp[1].w, temp[0].1111, temp[0].wwww; 3: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'transform TEX' # Radeon Compiler Program 0: MOV temp[2], const[0]; 1: TXP temp[0], temp[2], 2D[0]; 2: MOV_SAT output[0].xyz, const[1]; 3: SUB temp[1].w, temp[0].1111, temp[0].wwww; 4: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'native rewrite' # Radeon Compiler Program 0: MOV temp[2], const[0]; 1: TXP temp[0], temp[2], 2D[0]; 2: MOV_SAT output[0].xyz, const[1]; 3: ADD temp[1].w, temp[0].1111, -temp[0].wwww; 4: MOV_SAT output[0].w, temp[1]; Fragment Program: after 'deadcode' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, const[1].xyz_; 3: ADD temp[1].w, temp[0].___1, -temp[0].___w; 4: MOV_SAT output[0].w, temp[1].___w; Fragment Program: after 'dataflow optimize' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, const[1].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dataflow swizzles' # Radeon Compiler Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, const[1].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'dead constants' # Radeon Compiler 
Program 0: MOV temp[2].xyw, const[0].xy_w; 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: MOV_SAT output[0].xyz, const[1].xyz_; 3: MOV_SAT output[0].w, (1 - temp[0]).___w; Fragment Program: after 'pair translate' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[2].xy, src0.xy_, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 1: TXP temp[0].w, temp[2].xy_w, 2D[0]; 2: src0.xyz = const[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 3: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'pair scheduling' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[2].xy, src0.xy_, src0.111, src0.000 MAD temp[2].w, src0.w, src0.1, src0.0 1: src0.xyz = const[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[0].w, temp[2].xy_w, 2D[0]; 4: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 Fragment Program: after 'register allocation' # Radeon Compiler Program 0: src0.xyz = const[0], src0.w = const[0] MAD temp[0].xy, src0.xy_, src0.111, src0.000 MAD temp[0].w, src0.w, src0.1, src0.0 1: src0.xyz = const[1] MAD_SAT color[0].xyz, src0.xyz, src0.111, src0.000 2: BEGIN_TEX; 3: TXP temp[0].w, temp[0].xy_w, 2D[0]; 4: src0.w = temp[0], srcp.w = (1 - src0) MAD_SAT color[0].w, srcp.w, src0.1, src0.0 R500 Fragment Program: -------- 0 0:CMN_INST 0x00005804:ALU TEX_WAIT wmask: ARG omask: NONE 1:RGB_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020100:Addr0: 0c, Addr1: 128t, Addr2: 128t, srcp:0 3 RGB_INST: 0x00db0420:rgb_A_src:0 R/G/0 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00c0c000:MAD dest:0 alp_A_src:0 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 0 0 1 0:CMN_INST 0x000b8005:OUT TEX_WAIT wmask: NONE omask: RGB 1:RGB_ADDR 0x08020101:Addr0: 1c, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 
128t, srcp:0 3 RGB_INST: 0x00db0220:rgb_A_src:0 R/G/B 0 rgb_B_src:0 1/1/1 0 targ: 0 4 ALPHA_INST:0x00000000:MAD dest:0 alp_A_src:0 R 0 alp_B_src:0 R 0 targ 0 w:0 5 RGBA_INST: 0x00490000:MAD dest:0 rgb_C_src:0 0/0/0 0 alp_C_src:0 R 0 2 0:CMN_INST 0x00004007:TEX TEX_WAIT wmask: A omask: NONE 1:TEX_INST: 0x02c00000: id: 0 op:PROJ, ACQ, SCALED 2:TEX_ADDR: 0xe400f400: src: 0 R/G/A/A dst: 0 R/G/B/A 3:TEX_DXDY: 0x00000000 3 0:CMN_INST 0x00140005:OUT TEX_WAIT wmask: NONE omask: A 1:RGB_ADDR 0x08020080:Addr0: 128t, Addr1: 128t, Addr2: 128t, srcp:0 2:ALPHA_ADDR 0xc8020000:Addr0: 0t, Addr1: 128t, Addr2: 128t, srcp:3 3 RGB_INST: 0x00000000:rgb_A_src:0 R/R/R 0 rgb_B_src:0 R/R/R 0 targ: 0 4 ALPHA_INST:0x00c0f000:MAD dest:0 alp_A_src:3 A 0 alp_B_src:0 1 0 targ 0 w:0 5 RGBA_INST: 0x20000000:MAD dest:0 rgb_C_src:0 R/R/R 0 alp_C_src:0 0 0 r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV 
output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 
0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ r300: Initial vertex program VERT DCL IN[0] DCL OUT[0], POSITION DCL CONST[0..3] DCL TEMP[0] 0: MUL TEMP[0], IN[0].xxxx, CONST[0] 1: MAD TEMP[0], IN[0].yyyy, CONST[1], TEMP[0] 2: MAD TEMP[0], IN[0].zzzz, CONST[2], TEMP[0] 3: MAD OUT[0], IN[0].wwww, CONST[3], TEMP[0] 4: END Vertex Program: before compilation # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'transform loops' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: 
MOV output[1], temp[1]; Vertex Program: after 'emulate negative addressing' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'native rewrite' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'deadcode' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'dataflow optimize' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'source conflict resolve' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[1], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[1]; 5: MOV output[1], temp[1]; Vertex Program: after 'register allocation' # Radeon Compiler Program 0: MUL temp[0], input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Vertex Program: after 'dead constants' # Radeon Compiler Program 0: MUL temp[0], 
input[0].xxxx, const[0]; 1: MAD temp[0], input[0].yyyy, const[1], temp[0]; 2: MAD temp[0], input[0].zzzz, const[2], temp[0]; 3: MAD temp[0], input[0].wwww, const[3], temp[0]; 4: MOV output[0], temp[0]; 5: MOV output[1], temp[0]; Final vertex program code: 0: op: 0x00f00002 dst: 0t op: VE_MULTIPLY src0: 0x00000001 reg: 0i swiz: X/ X/ X/ X src1: 0x00d10002 reg: 0c swiz: X/ Y/ Z/ W src2: 0x01248002 reg: 0c swiz: 0/ 0/ 0/ 0 1: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00492001 reg: 0i swiz: Y/ Y/ Y/ Y src1: 0x00d10022 reg: 1c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 2: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00924001 reg: 0i swiz: Z/ Z/ Z/ Z src1: 0x00d10042 reg: 2c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 3: op: 0x00f00004 dst: 0t op: VE_MULTIPLY_ADD src0: 0x00db6001 reg: 0i swiz: W/ W/ W/ W src1: 0x00d10062 reg: 3c swiz: X/ Y/ Z/ W src2: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W 4: op: 0x00f00203 dst: 0o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 5: op: 0x00f02203 dst: 1o op: VE_ADD src0: 0x00d10000 reg: 0t swiz: X/ Y/ Z/ W src1: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 src2: 0x01248000 reg: 0t swiz: 0/ 0/ 0/ 0 Flow Control Ops: 0x00000000 ~~~~~~~~~ VERTEX PROGRAM ~~~~~~~~ ~ 6 Instructions ~ 0 Flow Control Instructions ~ 1 Temporary Registers ~~~~~~~~~~~~~~ END ~~~~~~~~~~~~~~ TIMER| shutdown actor stuff: 9.918 us TIMER| shutdown TexMan: 3.282 us TIMER| shutdown Renderer: 6.69907 ms TIMER| shutdown SDL: 48.9338 ms TIMER| shutdown UserReporter: 1.886 us TIMER| shutdown ScriptingHost: 1.94786 ms TIMER| shutdown ConfigDB: 1.536 us TIMER| resource modules: 10.632 ms TIMER TOTALS (8 clients) ----------------------------------------------------- xml_validation: 43.9824 Mc (52x) tc_ShaderValidation: 12.2715 Mc (23x) tc_linkProgram: 33.0099 Mc (7x) tc_compileShader: 148.123 Mc (11x) tc_transform: 14.9116 Mc (251x) tc_plain_transform: 
8447.28 kc (251x) tc_dds_transform: 6078.32 kc (751x) tc_png_decode: 32.1578 Mc (21x) ----------------------------------------------------- TIMER| shutdown misc: 365.967 us