
Trinity: Help me run a GPU test!

Posted: Sun May 26, 2013 11:34 pm
by codedivine
Hi folks. I need your help!

I am trying to find out the approximate double-precision performance of the GPU in AMD Trinity for ADD, MUL and FMA instructions. I have written a small OpenCL test to do so. If possible, can you please download and run my test? Link: http://www.rgbench.com/MaxFlopsCL.exe

It needs to be run from the command line under 64-bit Windows. On machines with a single GPU, the test should choose the default GPU. If you are running a Trinity-based system with an additional AMD GPU installed and the test reports numbers for that GPU instead, try running "MaxFlopsCL.exe 0 1".

If you are worried about downloading arbitrary binaries from the internet, the test is open source, so you can compile it yourself if you wish :) Source: https://bitbucket.org/codedivine/maxflopscl

(UPDATE: The test only takes about 1-2 seconds to run!)

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 1:27 am
by biffzinker
Not working for me; it shows "device id not in range."

Edit: Tried running it on a Geforce 560 GTX 2GB

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 4:07 am
by Chrispy_
GPU runs at base 300MHz clockspeed throughout, GPU-Z reports no shift to 920MHz, and CCC reports zero activity.
Regardless, here are the numbers

GPU selected Tahiti
Op = Add
Time 0.764876ms GOps/s 87.7382
Op = Mul
Time 0.835313ms GOps/s 80.3398
Op = Fma
Time 0.872548ms GOps/s 76.9113
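
For anyone reading the numbers: the two columns look consistent with GOps/s = operations / time. The following is only a sanity-check sketch that backs out the implied operation count from the three lines above. The helper name `implied_ops` is made up for illustration and is not part of the tool.

```python
# Back out the operation count implied by each reported line:
#   ops = (GOps/s * 1e9) * (time in ms * 1e-3)
# The figures are copied from the Tahiti run above; nothing here touches a GPU.
reported = {
    "Add": (0.764876, 87.7382),   # (time in ms, GOps/s)
    "Mul": (0.835313, 80.3398),
    "Fma": (0.872548, 76.9113),
}

def implied_ops(time_ms, gops_per_s):
    """Hypothetical helper: operations executed during the timed run."""
    return gops_per_s * 1e9 * time_ms * 1e-3

for op, (time_ms, gops) in reported.items():
    print(op, round(implied_ops(time_ms, gops)))
```

All three come out at roughly 67.1 million operations, very close to 2^26, so each test appears to issue the same amount of work and only the elapsed time differs.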

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 4:17 am
by Chrispy_
Same again on the Pitcairn, doesn't seem to wake up the GPU from low-power state for some reason....

GPU selected Pitcairn
Op = Add
Time 1.92994ms GOps/s 34.7725
Op = Mul
Time 3.76447ms GOps/s 17.8269
Op = Fma
Time 3.71207ms GOps/s 18.0786

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 4:22 am
by Chrispy_
...and it won't run on an Optimus laptop, regardless of whether I force Intel or Nvidia graphics.

"Device ID is not in range"

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 8:44 am
by codedivine
Chrispy_ wrote:
Same again on the Pitcairn, doesn't seem to wake up the GPU from low-power state for some reason....

GPU selected Pitcairn
Op = Add
Time 1.92994ms GOps/s 34.7725
Op = Mul
Time 3.76447ms GOps/s 17.8269
Op = Fma
Time 3.71207ms GOps/s 18.0786


Yeah, the problem size is quite small and doesn't seem to push discrete graphics cards into their high-power states.

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 9:07 am
by codedivine
Chrispy_ wrote:
...and it won't run on an Optimus laptop, regardless of whether I force Intel or Nvidia graphics.

"Device ID is not in range"


I think you likely have two OpenCL platforms installed. The tool takes two integer command-line parameters: the first is the platform ID and the second is the device ID, both counted from 0. So you can try a few different combinations, e.g. "MaxFlopsCL.exe 1 0".

I will fix the test so that it first prints out all the available GPUs and their platform and device IDs.
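
To illustrate how the two IDs interact, here is a minimal sketch in plain Python. It is not the tool's code and does not call OpenCL. The platform and device names are invented, and the guess that platform 0 exposes no GPU devices is only one plausible cause of the "device id not in range" message.

```python
# Hypothetical snapshot of what an OpenCL runtime might report: each entry is
# (platform name, list of GPU device names). All names are made up.
PLATFORMS = [
    ("Platform A (no GPU devices)", []),
    ("Platform B", ["GPU 0"]),
]

def list_devices(platforms):
    """Return one line per GPU with its platform ID and device ID."""
    return [
        f"platform {p} device {d}: {pname} / {dname}"
        for p, (pname, devices) in enumerate(platforms)
        for d, dname in enumerate(devices)
    ]

def select_device(platforms, platform_id=0, device_id=0):
    """Return (platform name, device name) or raise if an ID is out of range."""
    if not 0 <= platform_id < len(platforms):
        raise IndexError("platform id not in range")
    pname, devices = platforms[platform_id]
    if not 0 <= device_id < len(devices):
        raise IndexError("device id not in range")
    return pname, devices[device_id]

if __name__ == "__main__":
    print("\n".join(list_devices(PLATFORMS)))
    # Same idea as running "MaxFlopsCL.exe 1 0"
    print(select_device(PLATFORMS, 1, 0))
```

With this made-up layout the default "0 0" fails the device range check, while "1 0" finds the GPU. Printing the list first saves the guessing.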

Re: Trinity: Help me run a GPU test!

Posted: Mon May 27, 2013 9:07 am
by codedivine
biffzinker wrote:
Not working for me; it shows "device id not in range."

Edit: Tried running it on a Geforce 560 GTX 2GB


Thanks for the attempt! Look for my reply above. I will try and fix the issue.

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 5:11 am
by Chrispy_
Not sure how relevant it is, but here's some feedback from a GT330: the build log claims it doesn't have SM 1.3 (it does) or the map_f64_to_f32 directive (not sure what that is).
I guess the underlying architecture (GT215) is quite old even if the card is from 2011.

GPU selected GeForce GT 330
Op = Add
Error creating program from source 0 -42 -45
Buildlog 4096 ptxas application ptx input, line 38; : error : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 3:15 pm
by codedivine
Hi folks. Please redownload the binary. I have modified the test so that it will test all the fp64 capable GPUs in your system. Thus, command line parameters are no longer required.

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 3:29 pm
by Orwell
Tried running it on a 5850 with Catalyst 13.4:

The program was unable to start correctly (0xc000007b). Click OK to close the application.

Okay. :(


Forget about that, I just needed to install the MSVS x64 2012 runtime.

Got this result now for a 5850 running at 850/1250MHz (about 20% above stock):
Device selected Cypress
Op = Add
Time 1.39822ms GOps/s 47.9958
Op = Mul
Time 2.1868ms GOps/s 30.6882
Op = Fma
Time 2.22738ms GOps/s 30.1291


Side note: scores are still much higher when I keep the GPU from falling asleep by running a D3D9 program:
Device selected Cypress
Op = Add
Time 0.476667ms GOps/s 140.788
Op = Mul
Time 0.611111ms GOps/s 109.814
Op = Fma
Time 0.650222ms GOps/s 103.209

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 3:34 pm
by codedivine
Orwell wrote:
Tried running it on a 5850 with Catalyst 13.4:

The program was unable to start correctly (0xc000007b). Click OK to close the application.

Okay. :(


Well not sure what to make of that. Looking for solution.

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 4:15 pm
by codedivine
I have increased the problem size and uploaded the updated version. This should perhaps be more accurate for discrete GPUs, though really it is not that great for big GPUs.

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 4:29 pm
by Orwell
:P

I blindly increased the problem size at line 60 to 4096*4096 (from 1024*16 I believe) and it looks much better now:

GPU selected Cypress
Op = Add
Time 145.021ms GOps/s 473.859
Op = Mul
Time 284.319ms GOps/s 241.698
Op = Fma
Time 284.232ms GOps/s 241.773


Compiled with GCC 4.7.1 x64 bundled with this: http://sourceforge.net/projects/orwelldevcpp/, using the OpenCL libraries provided by AMD.

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 4:58 pm
by codedivine
Fixed the problem size issue by making it dynamic. Each test should now run for at least 100 ms on each GPU.
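
One way such dynamic sizing could work is sketched below. This is an assumption for illustration, not necessarily what the repo does: the doubling strategy, the starting size of 1024*16 work items, and the `FakeDevice` stand-in are all hypothetical.

```python
import time

def calibrate(run_kernel, start_items=1024 * 16, min_ms=100.0,
              clock=time.perf_counter):
    """Double the work-item count until a single run lasts at least min_ms.

    run_kernel(items) is assumed to block until the work has finished.
    Returns (items, elapsed_ms) for the first run that is long enough.
    """
    items = start_items
    while True:
        t0 = clock()
        run_kernel(items)
        elapsed_ms = (clock() - t0) * 1e3
        if elapsed_ms >= min_ms:
            return items, elapsed_ms
        items *= 2

class FakeDevice:
    """Stand-in for a GPU: advances a simulated clock instead of doing work."""
    def __init__(self, items_per_ms):
        self.now = 0.0                 # simulated time in seconds
        self.items_per_ms = items_per_ms

    def clock(self):
        return self.now

    def run(self, items):
        self.now += items / self.items_per_ms / 1e3

if __name__ == "__main__":
    dev = FakeDevice(items_per_ms=1000.0)
    print(calibrate(dev.run, clock=dev.clock))
```

A faster device simply ends up with a larger problem size, so the timed region stays long enough for the GPU to clock up and for timer resolution not to dominate.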

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 5:03 pm
by codedivine
Orwell wrote:
:P

I blindly increased the problem size at line 60 to 4096*4096 (from 1024*16 I believe) and it looks much better now:

GPU selected Cypress
Op = Add
Time 145.021ms GOps/s 473.859
Op = Mul
Time 284.319ms GOps/s 241.698
Op = Fma
Time 284.232ms GOps/s 241.773


Compiled with GCC 4.7.1 x64 bundled with this: http://sourceforge.net/projects/orwelldevcpp/, using the OpenCL libraries provided by AMD.


Thanks! I have updated the repo. You may be interested in the changes :)

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 5:07 pm
by Orwell
Here's the end result for my 5850:
Image

This proves the add part in FMA is basically free on this card. Yay. Here's the source code:
http://wilcobrouwer.nl/bestanden/MaxFLOPS%20Orwell.7z

So, yes, I was bored.
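
For reference, the "free add" can be seen by converting GOps/s to GFLOP/s. This is a small sketch using the Cypress figures from the 4096*4096 run posted earlier (not the numbers in the image), and it assumes the usual convention that one FMA counts as two floating-point operations.

```python
# GOps/s from the earlier Cypress run with the 4096*4096 problem size.
gops = {"Add": 473.859, "Mul": 241.698, "Fma": 241.773}

# Assumed convention: an FMA does a multiply and an add, so it counts as 2 FLOPs.
flops_per_op = {"Add": 1, "Mul": 1, "Fma": 2}

gflops = {op: rate * flops_per_op[op] for op, rate in gops.items()}

for op, rate in gflops.items():
    print(f"{op}: {rate:.3f} GFLOP/s")

# FMA issues at about the same rate as MUL, so the extra add costs almost nothing.
print(f"Fma/Mul issue-rate ratio: {gops['Fma'] / gops['Mul']:.4f}")
```

Counted this way, FMA delivers about twice the FLOPs of a plain MUL at essentially the same issue rate.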

Re: Trinity: Help me run a GPU test!

Posted: Tue May 28, 2013 5:10 pm
by codedivine
Orwell wrote:
Here's the end result for my 5850:
Image

This proves the add part in FMA is basically free on this card. Yay. Here's the source code:
http://wilcobrouwer.nl/bestanden/MaxFLOPS%20Orwell.7z

So, yes, I was bored.


Awesome! :D

Re: Trinity: Help me run a GPU test!

Posted: Wed May 29, 2013 11:11 pm
by biffzinker
D:\>maxflopscl
Device selected GeForce GTX 560
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 157.156ms GOps/s 54.6588
Op = Mul
Time 157.154ms GOps/s 54.6593
Op = Fma
Time 157.637ms GOps/s 54.4918

D:\>


560 running 985/1970

Re: Trinity: Help me run a GPU test!

Posted: Fri May 31, 2013 12:11 am
by codedivine
biffzinker wrote:
D:\>maxflopscl
Device selected GeForce GTX 560
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 157.156ms GOps/s 54.6588
Op = Mul
Time 157.154ms GOps/s 54.6593
Op = Fma
Time 157.637ms GOps/s 54.4918

D:\>


560 running 985/1970


Thanks! Interesting to note that on this Nvidia card ADDs, MULs and FMAs all have equal throughput, unlike the AMD card Orwell tested above.

Re: Trinity: Help me run a GPU test!

Posted: Fri May 31, 2013 4:45 am
by JustAnEngineer
Radeon HD7950
Device selected Tahiti
Device compute units: 28
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_image2d_from_buffer
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 165.452ms GOps/s 830.686
Op = Mul
Time 164.605ms GOps/s 417.48
Op = Fma
Time 164.678ms GOps/s 417.296

Re: Trinity: Help me run a GPU test!

Posted: Sat Jun 01, 2013 12:47 am
by codedivine
JustAnEngineer wrote:
Radeon HD7950
Device selected Tahiti
Device compute units: 28
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_image2d_from_buffer
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 165.452ms GOps/s 830.686
Op = Mul
Time 164.605ms GOps/s 417.48
Op = Fma
Time 164.678ms GOps/s 417.296


Thanks! We again see that it does ADD at twice the rate of MUL and FMA. Tahiti appears to do fp64 ADDs at 1/2 the fp32 rate, while fp64 MUL and FMA run at 1/4 the fp32 rate.
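
The ratios can be checked directly from the posted figures. The snippet below is just arithmetic on the GOps/s numbers reported in this thread for the Tahiti (HD7950) and the GeForce GTX 560. It makes no claim about fp32 rates, which were not measured here.

```python
# GOps/s as posted in this thread.
results = {
    "Tahiti (HD7950)": {"Add": 830.686, "Mul": 417.48, "Fma": 417.296},
    "GeForce GTX 560": {"Add": 54.6588, "Mul": 54.6593, "Fma": 54.4918},
}

def add_ratios(r):
    """Return (Add/Mul, Add/Fma) throughput ratios for one device."""
    return r["Add"] / r["Mul"], r["Add"] / r["Fma"]

for name, r in results.items():
    add_mul, add_fma = add_ratios(r)
    print(f"{name}: Add/Mul = {add_mul:.3f}, Add/Fma = {add_fma:.3f}")
```

Tahiti lands just under 2x for both ratios, while the GTX 560 sits at about 1x.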

Re: Trinity: Help me run a GPU test!

Posted: Sat Jun 01, 2013 4:54 am
by JustAnEngineer
Incidentally, there may be something odd going on with your timing calculation. When I tried running it from the command line and redirecting the output to a file, I got different (lower) results.
maxflopscl >max.txt

Re: Trinity: Help me run a GPU test!

Posted: Sat Jun 08, 2013 4:35 am
by JustAnEngineer
Radeon HD6970
Device selected Cayman
Device compute units: 24
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_amd_image2d_from_buffer_read_only
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 104.77ms GOps/s 655.91
Op = Mul
Time 103.034ms GOps/s 333.479
Op = Fma
Time 103.164ms GOps/s 333.058

Re: Trinity: Help me run a GPU test!

Posted: Sat Jun 08, 2013 1:32 pm
by JustAnEngineer
GeForce GTX460-768
Device selected GeForce GTX 460
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 107.804ms GOps/s 39.8405
Op = Mul
Time 107.73ms GOps/s 39.868
Op = Fma
Time 107.988ms GOps/s 39.7726