NVIDIA® CUDA™ 5 sample evaluation result, part 2

This evaluation is to be continued. For future reference.

Published in: Technology, Sports

  1. NVIDIA® CUDA™ 5.0 Sample evaluation result, PART Ⅱ
     GPU: GTX 560 Ti
     CPU: i5-3450S (TDP 65W)
     RAM: 16GB
     OS: Windows 7 x64 Ultimate
     Yukio Saitoh | FXFROG.com
     24/Apr/2013
  2. INDEX
     Sample binary:
     19. concurrentKernels
     20. conjugateGradient
     21. concurrentKernels
     22. conjugateGradient
     23. conjugateGradientPrecond
     24. convolutionFFT2D
     25. convolutionSeparable
     26. convolutionTexture
     27. cppIntegration
     28. cudaDecodeD3D9 (runaway)
     29. cudaDecodeGL
     30. cudaEncode (runaway)
     31. dct8x8
     32. deviceQuery
     33. deviceQueryDrv
     34. dwtHaar1D
     35. dxtc
  3. Sample target path and files
     • C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release
  4. concurrentKernels.exe
     [C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\concurrentKernels.exe] - Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     > Detected Compute SM 2.1 hardware with 8 multi-processors
     Expected time for serial execution of 8 kernels = 0.080s
     Expected time for concurrent execution of 8 kernels = 0.010s
     Measured time for sample = 0.010s
     Test passed
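The expected times in the concurrentKernels log follow from simple arithmetic: 8 kernels of about 0.010 s each take 0.080 s back to back, and about 0.010 s if all of them overlap. A minimal Python sketch of that reasoning is shown here; the function name is hypothetical and the assumption that every kernel overlaps fully is an idealization, not something the log proves.

```python
def expected_times(n_kernels, kernel_time_s):
    """Return (serial, concurrent) expected run times in seconds.

    Assumes every kernel takes the same time and, in the concurrent
    case, that all kernels overlap completely (idealized).
    """
    serial = n_kernels * kernel_time_s
    concurrent = kernel_time_s
    return serial, concurrent


# Values taken from the log: 8 kernels, 0.010 s each.
serial, concurrent = expected_times(8, 0.010)
speedup = serial / concurrent
print(f"serial = {serial:.3f}s, concurrent = {concurrent:.3f}s, speedup = {speedup:.1f}x")
```

The measured 0.010 s matches the concurrent estimate, which is why the sample reports a pass.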
  5. conjugateGradient.exe
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     > GPU device has 8 Multi-Processors, SM 2.1 compute capabilities
     iteration = 1, residual = 4.451374e+001
     iteration = 2, residual = 3.248658e+000
     iteration = 3, residual = 2.695777e-001
     iteration = 4, residual = 2.314586e-002
     iteration = 5, residual = 1.997625e-003
     iteration = 6, residual = 1.852079e-004
     iteration = 7, residual = 1.705767e-005
     iteration = 8, residual = 1.618583e-006
     Test Summary: Error amount = 0.000000
  6. conjugateGradientPrecond.exe
     conjugateGradientPrecond starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     GPU selected Device ID = 0
     > GPU device has 8 Multi-Processors, SM 2.1 compute capabilities
     laplace dimension = 128
     Convergence of conjugate gradient without preconditioning:
     iteration = 542, residual = 8.660636e-013
     Convergence Test: OK
     Convergence of conjugate gradient using incomplete LU preconditioning:
     iteration = 188, residual = 9.056491e-013
     Convergence Test: OK
     Test Summary:
     Counted total of 0 errors
     qaerr1 = 0.000004 qaerr2 = 0.000003
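The two conjugate gradient samples print a residual per iteration until it falls below a tolerance. As a rough illustration of what that loop does, here is an unpreconditioned conjugate gradient in plain Python on a small 1D Laplacian. This is only a sketch: the matrix, its size, the tolerance and all function names are assumptions chosen for the example, and the CUDA samples use GPU sparse and dense linear algebra libraries rather than anything like this code.

```python
def laplace_1d_matvec(x):
    """Multiply by the tridiagonal 1D Laplacian (2 on the diagonal, -1 beside it)."""
    n = len(x)
    y = []
    for i in range(n):
        v = 2.0 * x[i]
        if i > 0:
            v -= x[i - 1]
        if i < n - 1:
            v -= x[i + 1]
        y.append(v)
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A given as a matvec.

    Returns (x, iterations, residual_norm).
    """
    x = [0.0] * len(b)
    r = list(b)            # residual b - A x for the initial guess x = 0
    p = list(r)            # first search direction
    rs_old = dot(r, r)
    if rs_old == 0.0:
        return x, 0, 0.0
    for k in range(1, max_iter + 1):
        ap = matvec(p)
        alpha = rs_old / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        if rs_new ** 0.5 < tol:
            return x, k, rs_new ** 0.5
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x, max_iter, rs_old ** 0.5


# Build a right-hand side from a known solution, then recover it.
x_true = [float(i) for i in range(1, 17)]
b = laplace_1d_matvec(x_true)
x, iterations, residual = conjugate_gradient(laplace_1d_matvec, b)
print(f"iteration = {iterations}, residual = {residual:e}")
```

Preconditioning, as in the second sample, changes the search directions so that fewer iterations are needed; the log shows 188 iterations with incomplete LU against 542 without.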
  7. convolutionFFT2D.exe 1/2
     [C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\convolutionFFT2D.exe] - Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     Testing built-in R2C / C2R FFT-based convolution
     ...allocating memory
     ...generating random input data
     ...creating R2C & C2R FFT plans for 2048 x 2048
     ...uploading to GPU and padding convolution kernel and input data
     ...transforming convolution kernel
     ...running GPU FFT convolution: 1267.922657 MPix/s (3.154767 ms)
     ...reading back GPU convolution results
     ...running reference CPU convolution
     ...comparing the results: rel L2 = 7.179421E-008 (max delta = 4.808732E-007)
     L2norm Error OK
     ...shutting down
     Testing custom R2C / C2R FFT-based convolution
     ...allocating memory
     ...generating random input data
     ...creating C2C FFT plan for 2048 x 1024
     ...uploading to GPU and padding convolution kernel and input data
     ...transforming convolution kernel
     ...running GPU FFT convolution: 1261.058719 MPix/s (3.171938 ms)
     ...reading back GPU FFT results
     ...running reference CPU convolution
     ...comparing the results: rel L2 = 7.505000E-008 (max delta = 4.873593E-007)
     L2norm Error OK
     ...shutting down
  8. convolutionFFT2D.exe 2/2
     Testing updated custom R2C / C2R FFT-based convolution
     ...allocating memory
     ...generating random input data
     ...creating C2C FFT plan for 2048 x 1024
     ...uploading to GPU and padding convolution kernel and input data
     ...transforming convolution kernel
     ...running GPU FFT convolution: 1588.813202 MPix/s (2.517602 ms)
     ...reading back GPU FFT results
     ...running reference CPU convolution
     ...comparing the results: rel L2 = 7.470519E-008 (max delta = 5.276085E-007)
     L2norm Error OK
     ...shutting down
     Test Summary: 0 errors
     Test passed
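Each convolutionFFT2D run reports a throughput in MPix/s next to a time in ms, so the number of pixels behind the figure can be recovered by multiplying the two. The Python sketch below does that for the three runs in the log. The helper name is hypothetical, and the reading that roughly 4,000,000 pixels means a 2000 x 2000 input padded to the 2048 x 2048 FFT size is an inference from the numbers, not something the log states.

```python
def implied_pixels(mpix_per_s, time_ms):
    """Pixels processed, recovered from throughput (MPix/s) and time (ms)."""
    return mpix_per_s * 1e6 * time_ms * 1e-3


# (throughput in MPix/s, time in ms) copied from the three runs in the log.
runs = [
    (1267.922657, 3.154767),  # built-in R2C / C2R
    (1261.058719, 3.171938),  # custom R2C / C2R
    (1588.813202, 2.517602),  # updated custom R2C / C2R
]

pixel_counts = [implied_pixels(t, ms) for t, ms in runs]
for (t, ms), px in zip(runs, pixel_counts):
    print(f"{t:.6f} MPix/s x {ms:.6f} ms -> {px:,.0f} pixels")

# Speed of the updated custom path relative to the built-in path.
speedup = runs[0][1] / runs[2][1]
print(f"updated custom vs built-in: {speedup:.2f}x faster")
```

All three runs come out at about the same pixel count, so the throughput figures are directly comparable with each other.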
  9. convolutionSeparable.exe
     [C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\convolutionSeparable.exe] - Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     Image Width x Height = 3072 x 3072
     Allocating and initializing host arrays...
     Allocating and initializing CUDA arrays...
     Running GPU convolution (16 identical iterations)...
     convolutionSeparable, Throughput = 3179.0263 MPixels/sec, Time = 0.00297 s, Size = 9437184 Pixels, NumDevsUsed = 1, Workgroup = 0
     Reading back GPU results...
     Checking the results...
     ...running convolutionRowCPU()
     ...running convolutionColumnCPU()
     ...comparing the results
     ...Relative L2 norm: 0.000000E+000
     Shutting down...
     Test passed
  10. convolutionTexture.exe
     [C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\convolutionTexture.exe] - Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     Initializing data...
     Running GPU rows convolution (10 identical iterations)...
     Average convolutionRowsGPU() time: 1.427774 msecs; //3304.859282 Mpix/s
     Copying convolutionRowGPU() output back to the texture...
     cudaMemcpyToArray() time: 0.481161 msecs; //9806.674660 Mpix/s
     Running GPU columns convolution (10 iterations)
     Average convolutionColumnsGPU() time: 1.429637 msecs; //3300.552071 Mpix/s
     Reading back GPU results...
     Checking the results...
     ...running convolutionRowsCPU()
     ...running convolutionColumnsCPU()
     Relative L2 norm: 0.000000E+000
     Shutting down...
     Test passed
  11. cppIntegration.exe
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     Hello World.
     Hello World.
  12. cudaDecodeD3D9.exe (runaway)
     Command Line Arguments:
     argv[0] = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\cudaDecodeD3D9.exe
  13. cudaDecodeGL.exe 1/2
     [CUDA/OpenGL Video Decode]
     Command Line Arguments:
     argv[0] = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\cudaDecodeGL.exe
     [cudaDecodeGL]: input file: <../../../3_Imaging/cudaDecodeGL/data/plush1_720p_10s.m2v>
     VideoCodec : MPEG-2
     Frame rate : 30000/1001fps ~ 29.97fps
     Sequence format : Progressive
     Coded frame size: [1280, 720]
     Display area : [0, 0, 1280, 720]
     Chroma format : 4:2:0
     Bitrate : 14116kBit/s
     Aspect ratio : 16:9
     argv[0] = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\cudaDecodeGL.exe
     > Device 0: <GeForce GTX 560 Ti >, Compute SM 2.1 detected
     -> GPU 0: < GeForce GTX 560 Ti > driver mode is: WDDM
     >> initGL() creating window [1280 x 720]
     > Using CUDA/GL Device [0]: GeForce GTX 560 Ti
     > Using GPU Device: GeForce GTX 560 Ti has SM 2.1 compute capability
     Total amount of global memory: 1024.0000 MB
     >> modInitCTX<NV12ToARGB_drvapi_x64.ptx > initialized OK
     >> modGetCudaFunction< CUDA file: NV12ToARGB_drvapi_x64.ptx >
     CUDA Kernel Function (0x0a4c6660) = < NV12ToARGB_drvapi >
     >> modGetCudaFunction< CUDA file: NV12ToARGB_drvapi_x64.ptx >
     CUDA Kernel Function (0x0a4c6210) = < Passthru_drvapi >
     > VideoDecoder::cudaVideoCreateFlags = <1>Use CUDA decoder
  14. cudaDecodeGL.exe 2/2
     setTextureFilterMode(GL_NEAREST,GL_NEAREST)
     ImageGL::CUcontext = 02047fd0
     ImageGL::CUdevice = 00000000
     reshape() glViewport(0, 0, 1280, 720)
     [cudaDecodeGL] - [Frame: 0016, 00.0 fps, frame time: 98854.47 (ms) ]
     [cudaDecodeGL] - [Frame: 0032, 736.9 fps, frame time: 1.36 (ms) ]
     [cudaDecodeGL] - [Frame: 0048, 687.3 fps, frame time: 1.45 (ms) ]
     [cudaDecodeGL] - [Frame: 0064, 788.9 fps, frame time: 1.27 (ms) ]
     [cudaDecodeGL] - [Frame: 0080, 748.5 fps, frame time: 1.34 (ms) ]
     [cudaDecodeGL] - [Frame: 0096, 724.5 fps, frame time: 1.38 (ms) ]
     [cudaDecodeGL] - [Frame: 0112, 747.5 fps, frame time: 1.34 (ms) ]
     [cudaDecodeGL] - [Frame: 0128, 738.9 fps, frame time: 1.35 (ms) ]
     [cudaDecodeGL] - [Frame: 0144, 749.4 fps, frame time: 1.33 (ms) ]
     [cudaDecodeGL] - [Frame: 0160, 764.7 fps, frame time: 1.31 (ms) ]
     [cudaDecodeGL] - [Frame: 0176, 802.6 fps, frame time: 1.25 (ms) ]
     [cudaDecodeGL] - [Frame: 0192, 766.6 fps, frame time: 1.30 (ms) ]
     [cudaDecodeGL] - [Frame: 0208, 827.8 fps, frame time: 1.21 (ms) ]
     [cudaDecodeGL] - [Frame: 0224, 774.1 fps, frame time: 1.29 (ms) ]
     [cudaDecodeGL] - [Frame: 0240, 793.3 fps, frame time: 1.26 (ms) ]
     [cudaDecodeGL] - [Frame: 0256, 742.5 fps, frame time: 1.35 (ms) ]
     [cudaDecodeGL] - [Frame: 0272, 789.0 fps, frame time: 1.27 (ms) ]
     [cudaDecodeGL] - [Frame: 0288, 803.1 fps, frame time: 1.25 (ms) ]
     [cudaDecodeGL] - [Frame: 0304, 723.6 fps, frame time: 1.38 (ms) ]
     [cudaDecodeGL] - [Frame: 0320, 728.5 fps, frame time: 1.37 (ms) ]
     [cudaDecodeGL] statistics
     Video Length (hh:mm:ss.msec) = 00:00:00.440
     Frames Presented (inc repeats) = 326
     Average Present Rate (fps) = 739.44
     Frames Decoded (hardware) = 327
     Average Rate of Decoding (fps) = 741.71
  15. cudaDecodeD3D9.exe 1/2
     Command Line Arguments:
     argv[0] = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\cudaDecodeD3D9.exe
     [cudaDecodeD3D9]: input file: <../../../3_Imaging/cudaDecodeD3D9/data/plush1_720p_10s.m2v>
     VideoCodec : MPEG-2
     Frame rate : 30000/1001fps ~ 29.97fps
     Sequence format : Progressive
     Coded frame size: [1280, 720]
     Display area : [0, 0, 1280, 720]
     Chroma format : 4:2:0
     Bitrate : 14116kBit/s
     Aspect ratio : 16:9
     > Using GPU Device 0: GeForce GTX 560 Ti has SM 2.1 compute capability
     Total amount of global memory: 1024.0000 MB
     >> modInitCTX<NV12ToARGB_drvapi_x64.ptx> initialized SUCCESS!
     >> modGetCudaFunction<NV12ToARGB_drvapi_x64.ptx>
     CUDA Kernel Function = <NV12ToARGB_drvapi, 0x04439d20>
     >> modGetCudaFunction<NV12ToARGB_drvapi_x64.ptx>
     CUDA Kernel Function = <Passthru_drvapi, 0x044398d0>
     > VideoDecoder::cudaVideoCreateFlags = <1>Use CUDA decoder
  16. cudaDecodeD3D9.exe 2/2
     [cudaDecodeD3D9] - [Frame: 0016, 833.6 fps, time: 1.20 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0032, 1031.0 fps, time: 0.97 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0048, 843.8 fps, time: 1.19 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0064, 864.4 fps, time: 1.16 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0080, 850.9 fps, time: 1.18 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0096, 819.0 fps, time: 1.22 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0112, 844.0 fps, time: 1.18 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0128, 815.6 fps, time: 1.23 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0144, 821.0 fps, time: 1.22 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0160, 874.7 fps, time: 1.14 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0176, 960.4 fps, time: 1.04 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0192, 947.7 fps, time: 1.06 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0208, 896.7 fps, time: 1.12 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0224, 872.5 fps, time: 1.15 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0240, 922.7 fps, time: 1.08 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0256, 943.2 fps, time: 1.06 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0272, 936.6 fps, time: 1.07 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0288, 899.8 fps, time: 1.11 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0304, 901.0 fps, time: 1.11 (ms) ]
     [cudaDecodeD3D9] - [Frame: 0320, 813.1 fps, time: 1.23 (ms) ]
     [cudaDecodeD3D9] statistics
     Video Length (hh:mm:ss.msec) = 00:00:00.375
     Frames Presented (inc repeats) = 326
     Average Present FPS = 868.73
     Frames Decoded (hardware) = 327
     Average Decoder FPS = 871.40
  17. cudaEncode.exe (runaway)
     Starting cudaEncode...
     [ CUDA H.264 Encoder ]
     argv[0] = C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\cudaEncode.exe
  18. dct8x8.exe
     dct8x8.exe Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     CUDA sample DCT/IDCT implementation
     ===================================
     Loading test image: barbara.bmp... [512 x 512]... Success
     Running Gold 1 (CPU) version... Success
     Running Gold 2 (CPU) version... Success
     Running CUDA 1 (GPU) version... Success
     Running CUDA 2 (GPU) version... 10459.499992 MPix/s //0.025063 ms
     Success
     Running CUDA short (GPU) version... Success
     Dumping result to barbara_gold1.bmp... Success
     Dumping result to barbara_gold2.bmp... Success
     Dumping result to barbara_cuda1.bmp... Success
     Dumping result to barbara_cuda2.bmp... Success
     Dumping result to barbara_cuda_short.bmp... Success
     Processing time (CUDA 1) : 0.209782 ms
     Processing time (CUDA 2) : 0.025063 ms
     Processing time (CUDA short): 0.170617 ms
     PSNR Original <---> CPU(Gold 1) : 32.777073
     PSNR Original <---> CPU(Gold 2) : 32.777046
     PSNR Original <---> GPU(CUDA 1) : 32.777092
     PSNR Original <---> GPU(CUDA 2) : 32.777077
     PSNR Original <---> GPU(CUDA short): 32.749447
     PSNR CPU(Gold 1) <---> GPU(CUDA 1) : 64.019310
     PSNR CPU(Gold 2) <---> GPU(CUDA 2) : 71.777740
     PSNR CPU(Gold 2) <---> GPU(CUDA short): 42.258053
     Test Summary...
     Test passed
  19. dct8x8.exe / result: barbara_cuda_short.bmp
  20. dct8x8.exe / result: barbara_cuda1.bmp
  21. dct8x8.exe / result: barbara_cuda2.bmp
  22. dct8x8.exe / result: barbara_gold1.bmp
  23. dct8x8.exe / result: barbara_gold2.bmp
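The dct8x8 log compares images by PSNR, where a higher value means the two images are closer. For readers who want to reproduce such a figure from the dumped BMP files, a minimal Python sketch of the usual PSNR definition follows. It assumes 8-bit pixels with a peak value of 255 and images passed as flat lists of equal length; the sample's own formula is not shown in the log, so treat this as the textbook version rather than the sample's code.

```python
import math


def psnr(image_a, image_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Assumes 8-bit data (peak = 255). Returns infinity for identical images.
    """
    if len(image_a) != len(image_b) or not image_a:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


# Tiny made-up example: every pixel is off by 2 grey levels, so MSE = 4.
original = [10, 20, 30, 40]
decoded = [12, 22, 32, 42]
value = psnr(original, decoded)
print(f"PSNR = {value:.6f} dB")
```

Read this way, the roughly 32.78 dB values against the original reflect the loss from the DCT quantization itself, while the 64 dB and 72 dB values between CPU and GPU results indicate the two implementations agree closely.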
  24. deviceQuery.exe 1/2
     C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\deviceQuery.exe Starting...
     CUDA Device Query (Runtime API) version (CUDART static linking)
     Detected 1 CUDA Capable device(s)
     Device 0: "GeForce GTX 560 Ti"
     CUDA Driver Version / Runtime Version 5.0 / 5.0
     CUDA Capability Major/Minor version number: 2.1
     Total amount of global memory: 1024 MBytes (1073741824 bytes)
     ( 8) Multiprocessors x ( 48) CUDA Cores/MP: 384 CUDA Cores
     GPU Clock rate: 1800 MHz (1.80 GHz)
     Memory Clock rate: 2050 Mhz
     Memory Bus Width: 256-bit
     L2 Cache Size: 524288 bytes
     Max Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
     Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
     Total amount of constant memory: 65536 bytes
     Total amount of shared memory per block: 49152 bytes
     Total number of registers available per block: 32768
     Warp size: 32
  25. deviceQuery.exe 2/2
     Maximum number of threads per multiprocessor: 1536
     Maximum number of threads per block: 1024
     Maximum sizes of each dimension of a block: 1024 x 1024 x 64
     Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
     Maximum memory pitch: 2147483647 bytes
     Texture alignment: 512 bytes
     Concurrent copy and kernel execution: Yes with 1 copy engine(s)
     Run time limit on kernels: Yes
     Integrated GPU sharing Host Memory: No
     Support host page-locked memory mapping: Yes
     Alignment requirement for Surfaces: Yes
     Device has ECC support: Disabled
     CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
     Device supports Unified Addressing (UVA): Yes
     Device PCI Bus ID / PCI location ID: 1 / 0
     Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
     deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = GeForce GTX 560 Ti
  26. deviceQueryDrv.exe 1/2
     C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\deviceQueryDrv.exe Starting...
     CUDA Device Query (Driver API) statically linked version
     Detected 1 CUDA Capable device(s)
     Device 0: "GeForce GTX 560 Ti"
     CUDA Driver Version: 5.0
     CUDA Capability Major/Minor version number: 2.1
     Total amount of global memory: 1024 MBytes (1073741824 bytes)
     ( 8) Multiprocessors x ( 48) CUDA Cores/MP: 384 CUDA Cores
     GPU Clock rate: 1800 MHz (1.80 GHz)
     Memory Clock rate: 2050 Mhz
     Memory Bus Width: 256-bit
     L2 Cache Size: 524288 bytes
     Max Texture Dimension Sizes 1D=(65536) 2D=(65536,65535) 3D=(2048,2048,2048)
     Max Layered Texture Size (dim) x layers 1D=(16384) x 2048, 2D=(16384,16384) x 2048
     Total amount of constant memory: 65536 bytes
     Total amount of shared memory per block: 49152 bytes
     Total number of registers available per block: 32768
     Warp size: 32
  27. deviceQueryDrv.exe 2/2
     Maximum number of threads per multiprocessor: 1536
     Maximum number of threads per block: 1024
     Maximum sizes of each dimension of a block: 1024 x 1024 x 64
     Maximum sizes of each dimension of a grid: 65535 x 65535 x 65535
     Texture alignment: 512 bytes
     Maximum memory pitch: 2147483647 bytes
     Concurrent copy and kernel execution: Yes with 1 copy engine(s)
     Run time limit on kernels: Yes
     Integrated GPU sharing Host Memory: No
     Support host page-locked memory mapping: Yes
     Concurrent kernel execution: Yes
     Alignment requirement for Surfaces: Yes
     Device has ECC support: Disabled
     CUDA Device Driver Mode (TCC or WDDM): WDDM (Windows Display Driver Model)
     Device supports Unified Addressing (UVA): Yes
     Device PCI Bus ID / PCI location ID: 1 / 0
     Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
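Several useful figures can be derived from the raw device properties in the two device query logs. The Python sketch below recomputes the CUDA core count and adds the resident thread capacity, warps per multiprocessor and a theoretical memory bandwidth. The dictionary keys and function name are hypothetical, and the bandwidth line assumes the commonly used double-data-rate factor of 2 applied to the reported memory clock; that factor is an assumption, not something the log states.

```python
# Values copied from the deviceQuery log for the GeForce GTX 560 Ti.
props = {
    "multiprocessors": 8,
    "cores_per_mp": 48,
    "max_threads_per_mp": 1536,
    "warp_size": 32,
    "memory_clock_mhz": 2050,
    "bus_width_bits": 256,
}


def derived_figures(p, ddr_factor=2):
    """Derive headline numbers from raw device properties.

    ddr_factor=2 assumes double-data-rate memory (an assumption).
    """
    bytes_per_transfer = p["bus_width_bits"] / 8
    return {
        "cuda_cores": p["multiprocessors"] * p["cores_per_mp"],
        "resident_threads": p["multiprocessors"] * p["max_threads_per_mp"],
        "warps_per_mp": p["max_threads_per_mp"] // p["warp_size"],
        "bandwidth_gb_s": p["memory_clock_mhz"] * 1e6 * ddr_factor
                          * bytes_per_transfer / 1e9,
    }


figures = derived_figures(props)
for name, value in figures.items():
    print(f"{name}: {value}")
```

The core count agrees with the 384 CUDA Cores line printed by both samples; the other three figures are not printed by the samples and are shown only as derived estimates.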
  28. dwtHaar1D.exe
     C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\dwtHaar1D.exe Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     source file = "../../../3_Imaging/dwtHaar1D/data/signal.dat"
     reference file = "result.dat"
     gold file = "../../../3_Imaging/dwtHaar1D/data/regression.gold.dat"
     Reading signal from "../../../3_Imaging/dwtHaar1D/data/signal.dat"
     Writing result to "result.dat"
     Reading reference result from "../../../3_Imaging/dwtHaar1D/data/regression.gold.dat"
     Test success!
     Signal.dat
     9.5012929e-001
     2.3113851e-001
     6.0684258e-001
     4.8598247e-001
     8.9129897e-001
     ・・・
     Regression.gold.dat
     Result.dat
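The dwtHaar1D sample transforms signal.dat with a one-dimensional Haar wavelet and compares the output with a stored reference. For orientation, a plain Python sketch of a full Haar decomposition is given below. The orthonormal 1/sqrt(2) scaling and the ordering of the output coefficients are assumptions made for this example, so its output is not expected to match result.dat value for value.

```python
import math


def haar_step(values):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) * s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) * s for i in range(0, len(values), 2)]
    return approx, detail


def haar_1d(signal):
    """Full Haar decomposition of a signal whose length is a power of two.

    Output layout (an assumption): [coarsest approximation, coarsest detail,
    ..., finest details].
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details = detail + details  # coarser levels go in front
    return approx + details


# First four values shown on the slide for signal.dat.
signal = [9.5012929e-001, 2.3113851e-001, 6.0684258e-001, 4.8598247e-001]
coeffs = haar_1d(signal)
print(coeffs)
```

With this scaling the transform is orthonormal, so the sum of squares of the coefficients equals that of the input, which gives a quick sanity check on any implementation.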
  29. dxtc.exe
     C:\ProgramData\NVIDIA Corporation\CUDA Samples\v5.0\bin\win64\Release\dxtc.exe Starting...
     GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1
     Image Loaded ../../../3_Imaging/dxtc/data/lena_std.ppm, 512 x 512 pixels
     Running DXT Compression on 512 x 512 image...
     16384 Blocks, 64 Threads per Block, 1048576 Threads in Grid...
     dxtc, Throughput = 17.7004 MPixels/s, Time = 0.01481 s, Size = 262144 Pixels, NumDevsUsed = 1, Workgroup = 64
  30. dxtc.exe 1/4
     Checking accuracy...
     Deviation at ( 9, 1): 0.791667 rms
     Deviation at ( 99, 1): 1.041667 rms
     Deviation at ( 12, 2): 0.937500 rms
     Deviation at ( 90, 3): 0.166667 rms
     Deviation at ( 38, 4): 1.916667 rms
     Deviation at ( 34, 7): 1.687500 rms
     Deviation at ( 57, 7): 0.458333 rms
     Deviation at ( 100, 8): 2.416667 rms
     Deviation at ( 30, 9): 2.375000 rms
     Deviation at ( 31, 9): 0.770833 rms
     Deviation at ( 58, 9): 0.791667 rms
     Deviation at ( 29, 10): 0.020833 rms
     Deviation at ( 79, 10): 1.833333 rms
     Deviation at ( 13, 11): 1.041667 rms
     Deviation at ( 4, 13): 8.562500 rms
     Deviation at ( 28, 13): 0.562500 rms
     Deviation at ( 90, 13): 0.708333 rms
     Deviation at ( 25, 14): 0.520833 rms
     Deviation at ( 69, 14): 0.770833 rms
     Deviation at ( 87, 16): 0.708333 rms
     Deviation at ( 90, 17): 1.041667 rms
     Deviation at ( 24, 19): 0.916667 rms
     Deviation at ( 25, 19): 0.625000 rms
     Deviation at ( 26, 19): 1.041667 rms
     Deviation at ( 55, 20): 4.791667 rms
     Deviation at ( 20, 23): 1.541667 rms
     Deviation at ( 99, 23): 3.312500 rms
     Deviation at ( 45, 24): 18.104166 rms
     Deviation at ( 8, 28): 0.895833 rms
  31. dxtc.exe 2/4
     Deviation at ( 21, 30): 1.562500 rms
     Deviation at ( 115, 32): 24.104166 rms
     Deviation at ( 2, 33): 0.854167 rms
     Deviation at ( 102, 33): 2.250000 rms
     Deviation at ( 50, 35): 26.958334 rms
     Deviation at ( 68, 35): 11.937500 rms
     Deviation at ( 115, 36): 0.458333 rms
     Deviation at ( 12, 38): 2.166667 rms
     Deviation at ( 40, 40): 0.270833 rms
     Deviation at ( 86, 43): 0.604167 rms
     Deviation at ( 116, 43): 0.125000 rms
     Deviation at ( 43, 44): 2.250000 rms
     Deviation at ( 54, 44): 4.791667 rms
     Deviation at ( 46, 46): 2.875000 rms
     Deviation at ( 116, 46): 0.604167 rms
     Deviation at ( 4, 47): 0.708333 rms
     Deviation at ( 117, 48): 0.937500 rms
     Deviation at ( 23, 51): 3.520833 rms
     Deviation at ( 11, 52): 0.041667 rms
     Deviation at ( 67, 54): 5.687500 rms
     Deviation at ( 26, 55): 0.854167 rms
     Deviation at ( 21, 56): 5.000000 rms
     Deviation at ( 24, 56): 0.562500 rms
     Deviation at ( 30, 57): 0.937500 rms
     Deviation at ( 21, 59): 2.541667 rms
     Deviation at ( 120, 59): 0.104167 rms
     Deviation at ( 112, 60): 1.125000 rms
     Deviation at ( 77, 61): 1.083333 rms
  32. dxtc.exe 3/4
     Deviation at ( 114, 62): 4.958333 rms
     Deviation at ( 78, 66): 0.541667 rms
     Deviation at ( 106, 68): 0.375000 rms
     Deviation at ( 16, 70): 3.104167 rms
     Deviation at ( 10, 71): 0.937500 rms
     Deviation at ( 108, 71): 0.354167 rms
     Deviation at ( 0, 72): 0.854167 rms
     Deviation at ( 118, 72): 5.562500 rms
     Deviation at ( 11, 73): 0.541667 rms
     Deviation at ( 68, 74): 1.937500 rms
     Deviation at ( 70, 76): 1.791667 rms
     Deviation at ( 124, 76): 3.354167 rms
     Deviation at ( 103, 78): 0.375000 rms
     Deviation at ( 127, 78): 0.541667 rms
     Deviation at ( 108, 79): 0.083333 rms
     Deviation at ( 120, 81): 0.541667 rms
     Deviation at ( 43, 82): 24.979166 rms
     Deviation at ( 67, 82): 3.125000 rms
     Deviation at ( 78, 82): 2.437500 rms
     Deviation at ( 123, 84): 0.541667 rms
     Deviation at ( 127, 85): 0.187500 rms
     Deviation at ( 122, 87): 0.083333 rms
     Deviation at ( 124, 87): 0.541667 rms
     Deviation at ( 127, 88): 0.229167 rms
     Deviation at ( 93, 91): 0.666667 rms
     Deviation at ( 115, 93): 0.083333 rms
     Deviation at ( 69, 95): 1.875000 rms
     Deviation at ( 106, 95): 1.125000 rms
  33. dxtc.exe 4/4
     Deviation at ( 107, 95): 3.708333 rms
     Deviation at ( 13, 96): 1.354167 rms
     Deviation at ( 115, 98): 0.187500 rms
     Deviation at ( 118, 98): 0.187500 rms
     Deviation at ( 116, 101): 0.187500 rms
     Deviation at ( 78, 105): 0.541667 rms
     Deviation at ( 67, 107): 0.708333 rms
     Deviation at ( 74, 107): 0.375000 rms
     Deviation at ( 65, 109): 0.770833 rms
     Deviation at ( 89, 109): 0.708333 rms
     Deviation at ( 118, 109): 3.854167 rms
     Deviation at ( 67, 110): 1.083333 rms
     Deviation at ( 88, 111): 0.208333 rms
     Deviation at ( 64, 113): 0.708333 rms
     Deviation at ( 84, 113): 0.333333 rms
     Deviation at ( 88, 113): 0.187500 rms
     Deviation at ( 84, 114): 1.666667 rms
     Deviation at ( 66, 115): 0.770833 rms
     Deviation at ( 19, 118): 5.270833 rms
     Deviation at ( 76, 121): 0.104167 rms
     Deviation at ( 70, 122): 0.708333 rms
     Deviation at ( 91, 122): 0.208333 rms
     Deviation at ( 71, 123): 0.854167 rms
     Deviation at ( 75, 123): 0.854167 rms
     Deviation at ( 61, 124): 0.937500 rms
     Deviation at ( 91, 124): 0.270833 rms
     RMS(reference, result) = 0.015488
     Test passed
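The launch configuration and throughput printed by dxtc.exe can be checked by hand. DXT compression works on 4 x 4 pixel blocks, so a 512 x 512 image yields 128 x 128 blocks, which also fits the deviation coordinates in the log, none of which exceeds 127. The Python sketch below redoes that arithmetic; the function name is hypothetical, and the mapping of one thread block to one 4 x 4 pixel block is inferred from the counts in the log rather than from the sample source.

```python
def dxt_launch_figures(width, height, threads_per_block, time_s, block_edge=4):
    """Recompute block count, thread count and throughput for a DXT run.

    Assumes one thread block per block_edge x block_edge pixel block
    (inferred from the log, not confirmed from the sample source).
    """
    blocks = (width // block_edge) * (height // block_edge)
    threads = blocks * threads_per_block
    pixels = width * height
    mpixels_per_s = pixels / time_s / 1e6
    return blocks, threads, pixels, mpixels_per_s


# 512 x 512 image, 64 threads per block, 0.01481 s, all from the log.
blocks, threads, pixels, throughput = dxt_launch_figures(512, 512, 64, 0.01481)
print(f"{blocks} Blocks, {threads} Threads in Grid, "
      f"{pixels} Pixels, {throughput:.4f} MPixels/s")
```

The recomputed throughput differs from the logged 17.7004 MPixels/s only in the last digit, because the time in the log is rounded to five decimal places.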
  34. Summary
     GTX 560 Ti: some samples do not work properly.
     → MUST support CUDA compute capability 3.0.
     → Requires GPU devices with compute SM 3.5 or higher.
     This evaluation is to be continued. For future reference.
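The summary comes down to comparing the card's compute capability, 2.1, with the minimum a sample asks for. A small Python sketch of that comparison, working directly on a log line of the kind shown in the earlier slides, might look like this. The function names are hypothetical, and which samples need 3.0 or 3.5 is not listed in the slides, so the requirement values below are only the two thresholds quoted in the summary.

```python
import re


def parse_compute_capability(log_line):
    """Extract (major, minor) from text like '... with compute capability 2.1'."""
    match = re.search(r"compute capability (\d+)\.(\d+)", log_line)
    if match is None:
        raise ValueError("no compute capability found in line")
    return int(match.group(1)), int(match.group(2))


def meets_requirement(device_cc, required_cc):
    """Tuples compare element by element, so (2, 1) < (3, 0) < (3, 5)."""
    return device_cc >= required_cc


line = 'GPU Device 0: "GeForce GTX 560 Ti" with compute capability 2.1'
device_cc = parse_compute_capability(line)

for required in [(2, 0), (3, 0), (3, 5)]:
    verdict = "OK" if meets_requirement(device_cc, required) else "not supported"
    print(f"requires {required[0]}.{required[1]}: {verdict}")
```

Comparing (major, minor) tuples avoids the pitfall of treating the version as a decimal number, where a future 3.10 would wrongly sort below 3.5.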
