Trinity: Help me run a GPU test!

Posted on Sun May 26, 2013 11:34 pm

Hi folks. I need your help!

I am trying to find out the approximate double-precision performance of the GPU in AMD Trinity for ADD, MUL and FMA instructions. I have written a small OpenCL test to do so. If possible, can you please download and run my test? Link: http://www.rgbench.com/MaxFlopsCL.exe

It needs to be run from the command line under 64-bit Windows. On machines with a single GPU, the test should pick the default GPU. If you are running a Trinity-based system with an additional AMD GPU installed and the test reports numbers for that GPU instead, try running "MaxFlopsCL.exe 0 1".

If you are worried about downloading arbitrary binaries from the internet, the test is open source, so you can compile it yourself if you wish :) Source: https://bitbucket.org/codedivine/maxflopscl

(UPDATE: The test only takes about 1-2 seconds to run!)
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 1:27 am

Not working for me; it shows "device id not in range."

Edit: Tried running it on a GeForce GTX 560 2GB
biffzinker
Gerbil First Class
 
Posts: 111
Joined: Tue Mar 21, 2006 3:53 pm
Location: Anchorage, Alaska

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 4:07 am

The GPU runs at its base 300MHz clock speed throughout; GPU-Z reports no shift to 920MHz, and CCC reports zero activity.
Regardless, here are the numbers:

GPU selected Tahiti
Op = Add
Time 0.764876ms GOps/s 87.7382
Op = Mul
Time 0.835313ms GOps/s 80.3398
Op = Fma
Time 0.872548ms GOps/s 76.9113
<insert large, flashing, epileptic-fit-inducing signature (based on the latest internet-meme) here>
Chrispy_
Gerbil Jedi
Gold subscriber
 
 
Posts: 1782
Joined: Fri Apr 09, 2004 3:49 pm

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 4:17 am

Same again on the Pitcairn; the test doesn't seem to wake the GPU from its low-power state for some reason...

GPU selected Pitcairn
Op = Add
Time 1.92994ms GOps/s 34.7725
Op = Mul
Time 3.76447ms GOps/s 17.8269
Op = Fma
Time 3.71207ms GOps/s 18.0786
<insert large, flashing, epileptic-fit-inducing signature (based on the latest internet-meme) here>
Chrispy_
Gerbil Jedi
Gold subscriber
 
 
Posts: 1782
Joined: Fri Apr 09, 2004 3:49 pm

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 4:22 am

...and it won't run on an Optimus laptop, regardless of whether I force Intel or NVIDIA graphics.

"Device ID is not in range"
<insert large, flashing, epileptic-fit-inducing signature (based on the latest internet-meme) here>
Chrispy_
Gerbil Jedi
Gold subscriber
 
 
Posts: 1782
Joined: Fri Apr 09, 2004 3:49 pm

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 8:44 am

Chrispy_ wrote: Same again on the Pitcairn, doesn't seem to wake up the GPU from low-power state for some reason....

GPU selected Pitcairn
Op = Add
Time 1.92994ms GOps/s 34.7725
Op = Mul
Time 3.76447ms GOps/s 17.8269
Op = Fma
Time 3.71207ms GOps/s 18.0786


Yeah, the problem size is quite small and doesn't seem to force discrete graphics cards into their high-power states.
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 9:07 am

Chrispy_ wrote: ....and It won't run on an optimus laptop, regardless of whether I force Intel or nVidia graphics.

"Device ID is not in range"


I think you likely have two OpenCL platforms installed. The tool takes two integer command-line parameters: the first is the platform ID and the second is the device ID, both starting at 0. So you can try a few different combinations, e.g. "MaxFlopsCL.exe 1 0".

I will fix the test so that it first prints out all the available GPUs and their platform and device IDs.
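
The platform ID / device ID scheme described above can be sketched in a few lines. This is a minimal Python illustration, not the tool's actual code: the nested list stands in for what an OpenCL query would return, and `select_device`, `parse_ids`, `list_devices` and the example device name are all hypothetical.

```python
def parse_ids(args):
    """Read up to two integer arguments: platform ID, then device ID (default 0)."""
    ids = [int(a) for a in args[:2]]
    ids += [0] * (2 - len(ids))
    return ids[0], ids[1]


def select_device(platforms, platform_id=0, device_id=0):
    """Return the device at (platform_id, device_id) or raise ValueError."""
    if not 0 <= platform_id < len(platforms):
        raise ValueError("platform id not in range")
    devices = platforms[platform_id]
    if not 0 <= device_id < len(devices):
        raise ValueError("device id not in range")
    return devices[device_id]


def list_devices(platforms):
    """Enumerate every (platform ID, device ID, name) so users can pick one."""
    return [(p, d, name)
            for p, devices in enumerate(platforms)
            for d, name in enumerate(devices)]


# Assumed layout: platform 0 exposes no GPU, platform 1 exposes one.
example_platforms = [[], ["example GPU"]]
for entry in list_devices(example_platforms):
    print(entry)
print(select_device(example_platforms, *parse_ids(["1", "0"])))
```

With a layout like the assumed one, the default "0 0" would fail the device range check, while "1 0" would succeed.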
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Mon May 27, 2013 9:07 am

biffzinker wrote: Not working for me shows "device id not in range."

Edit: Tried running it on a Geforce 560 GTX 2GB


Thanks for the attempt! Look for my reply above. I will try and fix the issue.
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 5:11 am

Not sure how relevant it is, but here's some feedback on a GT330: it claims the card doesn't have SM1.3 (it does) or the map_f64_to_f32 directive (not sure what that is).
I guess the underlying architecture (GT215) is quite old even if the card is from 2011.

GPU selected GeForce GT 330
Op = Add
Error creating program from source 0 -42 -45
Buildlog 4096 ptxas application ptx input, line 38; : error : Instruction 'cvt' requires SM 1.3 or higher, or map_f64_to_f32 directive
<insert large, flashing, epileptic-fit-inducing signature (based on the latest internet-meme) here>
Chrispy_
Gerbil Jedi
Gold subscriber
 
 
Posts: 1782
Joined: Fri Apr 09, 2004 3:49 pm

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 3:15 pm

Hi folks. Please redownload the binary. I have modified the test so that it now tests all the fp64-capable GPUs in your system, so command-line parameters are no longer required.
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 3:29 pm

Tried running it on a 5850 with Catalyst 13.4:

The program was unable to start correctly (0xc000007b). Click OK to close the application.

Okay. :(


Forget about that, I just needed to install the Microsoft Visual Studio 2012 x64 runtime.

Got this result now for a 5850 running at 850/1250MHz (about 20% above stock):
Device selected Cypress
Op = Add
Time 1.39822ms GOps/s 47.9958
Op = Mul
Time 2.1868ms GOps/s 30.6882
Op = Fma
Time 2.22738ms GOps/s 30.1291


Side note: scores are still much higher when I keep the GPU from falling asleep by running a D3D9 program:
Device selected Cypress
Op = Add
Time 0.476667ms GOps/s 140.788
Op = Mul
Time 0.611111ms GOps/s 109.814
Op = Fma
Time 0.650222ms GOps/s 103.209
Last edited by Orwell on Tue May 28, 2013 3:47 pm, edited 5 times in total.
Phenom II X4 @ 3600/2400 | Radeon 5850 @ 850/1250 | Samsung 830 128GB | Samsung SyncMaster T260 | Boston Acoustics A26
Orwell
Gerbil
 
Posts: 32
Joined: Wed Apr 03, 2013 3:28 pm

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 3:34 pm

Orwell wrote: Tried running it on a 5850 with Catalyst 13.4:

The program was unable to start correctly (0xc000007b). Click OK to close the application.

Okay. :(


Not sure what to make of that. Looking for a solution.
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 4:15 pm

I have increased the problem size and uploaded the updated version. This should perhaps be more accurate for discrete GPUs, though really it is not that great for big GPUs.
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 4:29 pm

:P

I blindly increased the problem size at line 60 to 4096*4096 (from 1024*16 I believe) and it looks much better now:

GPU selected Cypress
Op = Add
Time 145.021ms GOps/s 473.859
Op = Mul
Time 284.319ms GOps/s 241.698
Op = Fma
Time 284.232ms GOps/s 241.773


Compiled with GCC 4.7.1 x64 bundled with this: http://sourceforge.net/projects/orwelldevcpp/, using the OpenCL libraries provided by AMD.
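
As a sanity check on these numbers, the reported time and GOps/s can be multiplied back into a total operation count. The sketch below assumes that the "problem size" of 4096*4096 is the number of work items, which the thread does not state explicitly; `implied_ops` is a hypothetical helper name.

```python
def implied_ops(time_ms, gops_per_s):
    """Total operations implied by a reported run time and throughput."""
    return (time_ms * 1e-3) * (gops_per_s * 1e9)


# Assumption: the problem size equals the number of work items.
work_items = 4096 * 4096

# (op, time in ms, GOps/s) copied from the Cypress results above.
cypress = [("Add", 145.021, 473.859),
           ("Mul", 284.319, 241.698),
           ("Fma", 284.232, 241.773)]

ops_per_item = {op: implied_ops(t, g) / work_items for op, t, g in cypress}
for op, value in ops_per_item.items():
    print(op, round(value))
```

Under that assumption, all three tests work out to roughly the same number of operations per work item, which would mean the Mul and Fma runs take about twice as long as Add for the same amount of work.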
Orwell
Gerbil
 
Posts: 32
Joined: Wed Apr 03, 2013 3:28 pm

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 4:58 pm

Fixed the problem size issue by making it dynamic. Now it should ensure that each test is run for at least 100ms on each GPU.
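
One way such dynamic sizing could work is to keep doubling the problem size until a single run lasts long enough. This is only a sketch of the idea in Python, not the code in the repo: `calibrate`, `run_kernel` and the doubling strategy are assumptions, and a CPU loop stands in for the GPU kernel.

```python
import time


def calibrate(run_kernel, start_size=1024 * 16, min_seconds=0.1,
              timer=time.perf_counter):
    """Double the problem size until one run takes at least min_seconds.

    run_kernel(size) is a stand-in for launching the test kernel and
    waiting for it to finish. Returns (size, elapsed_seconds).
    """
    size = start_size
    while True:
        start = timer()
        run_kernel(size)
        elapsed = timer() - start
        if elapsed >= min_seconds:
            return size, elapsed
        size *= 2


def cpu_stand_in(size):
    """Placeholder workload whose run time grows with size."""
    sum(range(size))


size, elapsed = calibrate(cpu_stand_in)
print(f"settled on size {size}, last run took {elapsed * 1e3:.1f} ms")
```

Doubling keeps the number of calibration runs small, and the warm-up runs themselves may help push a discrete card out of its low-power state before the timed run.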
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 5:03 pm

Orwell wrote: :P

I blindly increased the problem size at line 60 to 4096*4096 (from 1024*16 I believe) and it looks much better now:

GPU selected Cypress
Op = Add
Time 145.021ms GOps/s 473.859
Op = Mul
Time 284.319ms GOps/s 241.698
Op = Fma
Time 284.232ms GOps/s 241.773


Compiled with GCC 4.7.1 x64 bundled with this: http://sourceforge.net/projects/orwelldevcpp/, using the OpenCL libraries provided by AMD.


Thanks! I have updated the repo. You may be interested in the changes :)
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 5:07 pm

Here's the end result for my 5850:
Image

This proves the add part in FMA is basically free on this card. Yay. Here's the source code:
http://wilcobrouwer.nl/bestanden/MaxFLOPS%20Orwell.7z

So, yes, I was bored.
Last edited by Orwell on Tue May 28, 2013 5:13 pm, edited 1 time in total.
Orwell
Gerbil
 
Posts: 32
Joined: Wed Apr 03, 2013 3:28 pm

Re: Trinity: Help me run a GPU test!

Posted on Tue May 28, 2013 5:10 pm

Orwell wrote: Here's the end result for my 5850:
Image

This proves the add part in FMA is basically free on this card. Yay. Here's the source code:
http://wilcobrouwer.nl/bestanden/MaxFLOPS%20Orwell.7z

So, yes, I was bored.


Awesome! :D
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Wed May 29, 2013 11:11 pm

D:\>maxflopscl
Device selected GeForce GTX 560
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 157.156ms GOps/s 54.6588
Op = Mul
Time 157.154ms GOps/s 54.6593
Op = Fma
Time 157.637ms GOps/s 54.4918

D:\>


560 running 985/1970
biffzinker
Gerbil First Class
 
Posts: 111
Joined: Tue Mar 21, 2006 3:53 pm
Location: Anchorage, Alaska

Re: Trinity: Help me run a GPU test!

Posted on Fri May 31, 2013 12:11 am

biffzinker wrote:
D:\>maxflopscl
Device selected GeForce GTX 560
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 157.156ms GOps/s 54.6588
Op = Mul
Time 157.154ms GOps/s 54.6593
Op = Fma
Time 157.637ms GOps/s 54.4918

D:\>


560 running 985/1970


Thanks! Interesting to note that on this NVIDIA card, ADD, MUL and FMA all have equal throughput, unlike the AMD card Orwell tested above.
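
To make the comparison concrete, the tool's text output can be parsed and the spread between the three operations computed. This is a small Python sketch; the regular expression is simply fitted to the "Op = ... / Time ...ms GOps/s ..." lines posted in this thread, and `parse_results` is a hypothetical helper.

```python
import re

# Matches an "Op = X" line followed by its "Time ...ms GOps/s ..." line.
RESULT = re.compile(r"Op = (\w+)\s+Time ([\d.]+)ms GOps/s ([\d.]+)")


def parse_results(text):
    """Map each operation name to its reported GOps/s."""
    return {op: float(gops) for op, _time_ms, gops in RESULT.findall(text)}


# GeForce GTX 560 numbers quoted above.
gtx560_output = """Op = Add
Time 157.156ms GOps/s 54.6588
Op = Mul
Time 157.154ms GOps/s 54.6593
Op = Fma
Time 157.637ms GOps/s 54.4918"""

results = parse_results(gtx560_output)
fastest = max(results.values())
spread = (fastest - min(results.values())) / fastest
print(results)
print(f"relative spread: {spread:.2%}")
```

For the GTX 560 numbers the spread stays well under one percent, which is what "equal throughput" means here.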
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Fri May 31, 2013 4:45 am

Radeon HD7950
Device selected Tahiti
Device compute units: 28
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_image2d_from_buffer
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 165.452ms GOps/s 830.686
Op = Mul
Time 164.605ms GOps/s 417.48
Op = Fma
Time 164.678ms GOps/s 417.296
JustAnEngineer
Gerbil God
Gold subscriber
 
 
Posts: 15343
Joined: Sat Jan 26, 2002 7:00 pm
Location: The Heart of Dixie

Re: Trinity: Help me run a GPU test!

Posted on Sat Jun 01, 2013 12:47 am

JustAnEngineer wrote: Radeon HD7950
Device selected Tahiti
Device compute units: 28
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_image2d_from_buffer
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 165.452ms GOps/s 830.686
Op = Mul
Time 164.605ms GOps/s 417.48
Op = Fma
Time 164.678ms GOps/s 417.296


Thanks! We again see that it does ADD at twice the rate of MUL and FMA. Tahiti appears to do fp64 ADD at 1/2 the fp32 rate, while fp64 MUL and FMA run at 1/4 the fp32 rate.
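
A rough cross-check of those ratios: if each of the 28 compute units issues 64 fp32 operations per clock (an assumption about the GCN architecture, not something the test reports), the measured rates imply a core clock. The card's actual clock is not given in the thread, so the sketch below only checks that both measurements point to the same clock; `implied_clock_mhz` is a hypothetical helper.

```python
def implied_clock_mhz(gops_per_s, compute_units, rate, lanes_per_cu=64):
    """Clock (MHz) at which the GPU would hit gops_per_s.

    rate is the assumed fp64 throughput as a fraction of the fp32 rate;
    lanes_per_cu=64 is an assumption about the architecture.
    """
    ops_per_clock = compute_units * lanes_per_cu * rate
    return gops_per_s * 1e9 / ops_per_clock / 1e6


# Tahiti numbers quoted above: 28 compute units.
add_clock = implied_clock_mhz(830.686, 28, rate=0.5)   # ADD at 1/2 fp32 rate
mul_clock = implied_clock_mhz(417.48, 28, rate=0.25)   # MUL at 1/4 fp32 rate
print(f"ADD implies {add_clock:.0f} MHz, MUL implies {mul_clock:.0f} MHz")
```

Both figures land within about half a percent of each other, which is consistent with the 1/2 and 1/4 ratios as long as the card really was running near that clock.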
codedivine
Gerbil Elite
Silver subscriber
 
 
Posts: 702
Joined: Sat Jan 24, 2009 8:13 am

Re: Trinity: Help me run a GPU test!

Posted on Sat Jun 01, 2013 4:54 am

Incidentally, there may be something odd going on with your timing calculation. When I tried running it from the command line and redirecting the output to a file, I got different (lower) results.
maxflopscl >max.txt
i7-4770K, H70, Gryphon Z87, 16 GiB, R9-290, SSD, 2 HD, Blu-ray, SB ZX, TJ08-E, SS-660XP², 3007WFP+2001FP, RK-9000BR, MX518
JustAnEngineer
Gerbil God
Gold subscriber
 
 
Posts: 15343
Joined: Sat Jan 26, 2002 7:00 pm
Location: The Heart of Dixie

Re: Trinity: Help me run a GPU test!

Posted on Sat Jun 08, 2013 4:35 am

Radeon HD6970
Device selected Cayman
Device compute units: 24
Device extensions:
cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_amd_image2d_from_buffer_read_only
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 104.77ms GOps/s 655.91
Op = Mul
Time 103.034ms GOps/s 333.479
Op = Fma
Time 103.164ms GOps/s 333.058
JustAnEngineer
Gerbil God
Gold subscriber
 
 
Posts: 15343
Joined: Sat Jan 26, 2002 7:00 pm
Location: The Heart of Dixie

Re: Trinity: Help me run a GPU test!

Posted on Sat Jun 08, 2013 1:32 pm

GeForce GTX460-768
Device selected GeForce GTX 460
Device compute units: 7
Device extensions:
cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_d3d9_sharing cl_nv_d3d10_sharing cl_khr_d3d10_sharing cl_nv_d3d11_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64
FP64 supported with configuration: CL_FP_DENORM CL_FP_INF_NAN CL_FP_ROUND_TO_NEAREST CL_FP_ROUND_TO_ZERO CL_FP_ROUND_TO_INF CL_FP_FMA
Testing DP performance
Op = Add
Time 107.804ms GOps/s 39.8405
Op = Mul
Time 107.73ms GOps/s 39.868
Op = Fma
Time 107.988ms GOps/s 39.7726
JustAnEngineer
Gerbil God
Gold subscriber
 
 
Posts: 15343
Joined: Sat Jan 26, 2002 7:00 pm
Location: The Heart of Dixie

