AparapiFractals
- Mandelbrot explorer : mouse controls and zoom
- benchmark : no graphics, tests on different devices
Benchmark - Soft
canvas : 500 x 500
maxIterations : 100,000
region : -2.0000000000000000d,-2.0000000000000000d 2.0000000000000000d,2.0000000000000000d
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
| KEY                                      | DEVICE                         | LOCAL SIZES  | NAME                           | Elapsed(ms)|
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
| AUTO                                     | NVIDIA<GPU>                    | 10 x 25      | GeForce GTX 1650 SUPER         |        363 |
| NVIDIA<GPU> (11681328)                   | NVIDIA<GPU>                    | 8 x 8        | GeForce GTX 1650 SUPER         |        330 |
| Java Alternative Algorithm (-2)          | Java Alternative Algorithm     | 8 x 8        | Java Alternative Algorithm     |        427 |
| Java Thread Pool (-3)                    | Java Thread Pool               | 8 x 8        | Java Thread Pool               |        390 |
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
============================================================================================================================================
Benchmark - Hard
canvas : 500 x 500
maxIterations : 1,000,000
region : -2.0000000000000000d,-2.0000000000000000d 2.0000000000000000d,2.0000000000000000d
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
| KEY                                      | DEVICE                         | LOCAL SIZES  | NAME                           | Elapsed(ms)|
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
| AUTO                                     | NVIDIA<GPU>                    | 10 x 25      | GeForce GTX 1650 SUPER         |       3222 |
| NVIDIA<GPU> (20593056)                   | NVIDIA<GPU>                    | 8 x 8        | GeForce GTX 1650 SUPER         |       3042 |
| Java Alternative Algorithm (-2)          | Java Alternative Algorithm     | 8 x 8        | Java Alternative Algorithm     |       3986 |
| Java Thread Pool (-3)                    | Java Thread Pool               | 8 x 8        | Java Thread Pool               |       4786 |
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
Activity
Just ran this. It mostly works great, but there are a few issues:
1. The GUI freezes while a background operation is running. In other words, if I run a benchmark, the GUI is frozen until the benchmark finishes. If I scroll in and the scrolling is slow enough, the GUI freezes as well. GUI code should never run long, blocking work directly; that work belongs on a separate thread.
2. It doesn't seem to handle multiple graphics cards correctly. At first it did a fine job of running on my GPU, but when I tried to switch to one of the other GPUs in the list, it didn't actually switch and continued as normal on the first GPU.
3. This is probably related: when I try to run the hard benchmark from the command line, it errors out for me (error pasted at the bottom).
4. Closely related to #3: if I run the benchmark from the GUI, I see similar errors and cannot resume operation of the GUI.
38) AparapiFractals - soft benchmark
39) AparapiFractals - hard benchmark
Enter your selection, or q/Q to quit: 39
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
typedef struct This_s{
   int wmax;
   int hmax;
   double cx1;
   double wx;
   double cy1;
   double hy;
   int max_iterations;
   __global int *result;
   int result__javaArrayLength0;
   int result__javaArrayDimension0;
   int result__javaArrayLength1;
   int result__javaArrayDimension1;
   int passid;
}This;
int get_pass_id(This *this){
   return this->passid;
}
__kernel void run(
   int wmax,
   int hmax,
   double cx1,
   double wx,
   double cy1,
   double hy,
   int max_iterations,
   __global int *result,
   int result__javaArrayLength0,
   int result__javaArrayDimension0,
   int result__javaArrayLength1,
   int result__javaArrayDimension1,
   int passid
){
   This thisStruct;
   This* this=&thisStruct;
   this->wmax = wmax;
   this->hmax = hmax;
   this->cx1 = cx1;
   this->wx = wx;
   this->cy1 = cy1;
   this->hy = hy;
   this->max_iterations = max_iterations;
   this->result = result;
   this->result__javaArrayLength0 = result__javaArrayLength0;
   this->result__javaArrayDimension0 = result__javaArrayDimension0;
   this->result__javaArrayLength1 = result__javaArrayLength1;
   this->result__javaArrayDimension1 = result__javaArrayDimension1;
   this->passid = passid;
   {
      int w = get_global_id(0);
      int h = get_global_id(1);
      if (w<this->wmax && h<this->hmax){
         double cx = this->cx1 + ((double)w * this->wx);
         double cy = this->cy1 + ((double)h * this->hy);
         double xn = cx;
         double yn = cy;
         double y2 = cy * cy;
         int t = 0;
         t = 0;
         for (; t<this->max_iterations && ((xn * xn) + y2)<4.0; t++){
            yn = ((2.0 * xn) * yn) + cy;
            xn = ((xn * xn) - y2) + cx;
            y2 = yn * yn;
         }
         (&this->result[w * this->result__javaArrayDimension0])[h] = t;
      }
      return;
   }
}
Starting benchmark...
Example : 'GeForce GTX 1650 SUPER' soft 348ms, hard 3058ms
Example : 'Java Alternative Algorithm', AMD 3700X, soft 390ms, hard 3993ms
OperatingSystem: Linux CPU:32 amd64
============================================================================================================================================
Benchmark - Hard
canvas : 500 x 500
maxIterations : 1,000,000
region : -2.0000000000000000d,-2.0000000000000000d 2.0000000000000000d,2.0000000000000000d
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
| KEY                                      | DEVICE                         | LOCAL SIZES  | NAME                           | Elapsed(ms)|
+------------------------------------------+--------------------------------+--------------+--------------------------------+------------+
after clEnqueueNDRangeKernel, globalSize[0] = 500, localSize[0] = 20
after clEnqueueNDRangeKernel, globalSize[1] = 500, localSize[1] = 25
!!!!!!! clEnqueueNDRangeKernel() failed invalid work group size
Nov 18, 2020 5:44:28 PM com.aparapi.internal.kernel.KernelRunner fallBackToNextDevice
WARNING: Device failed for AfKernel, devices={AMD<GPU>|AMD<GPU>|AMD<GPU>|AMD<GPU>|Java Alternative Algorithm|Java Thread Pool}: OpenCL execution seems to have failed (runKernelJNI returned -54)
com.aparapi.internal.exception.AparapiException: OpenCL execution seems to have failed (runKernelJNI returned -54)
    at com.aparapi.internal.kernel.KernelRunner.executeOpenCL(KernelRunner.java:1263)
    at com.aparapi.internal.kernel.KernelRunner.executeInternalInner(KernelRunner.java:1722)
    at com.aparapi.internal.kernel.KernelRunner.executeInternalOuter(KernelRunner.java:1383)
    at com.aparapi.internal.kernel.KernelRunner.execute(KernelRunner.java:1374)
    at com.aparapi.Kernel.execute(Kernel.java:2897)
    at com.aparapi.Kernel.execute(Kernel.java:2854)
    at com.aparapi.Kernel.execute(Kernel.java:2794)
    at com.aparapi.examples.afmandelbrot.AfAparapiUtils.execute(AfAparapiUtils.java:156)
    at com.aparapi.examples.afmandelbrot.AfAparapiUtils.benchmark(AfAparapiUtils.java:201)
    at com.aparapi.examples.afmandelbrot.AfMain.benchmarkHard(AfMain.java:331)
    at com.aparapi.examples.afmandelbrot.AfBenchmark.main(AfBenchmark.java:42)
    at com.aparapi.examples.All.selected(All.java:234)
    at com.aparapi.examples.All.main(All.java:99)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:564)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:282)
    at java.base/java.lang.Thread.run(Thread.java:832)
Nov 18, 2020 5:44:28 PM com.aparapi.internal.kernel.KernelRunner fallBackToNextDevice
WARNING: Trying next device: AMD<GPU>
| AUTO                                     | AMD<GPU>                       | 20 x 25      | gfx900                         |       1340 |
[WARNING]
java.lang.AssertionError: user supplied Device incompatible with current EXECUTION_MODE or getTargetDevice(); device = AMD<GPU>; kernel = AfKernel, devices={AMD<GPU>|AMD<GPU>|AMD<GPU>|Java Alternative Algorithm|Java Thread Pool}
    at com.aparapi.internal.kernel.KernelRunner.executeInternalInner (KernelRunner.java:1425)
    at com.aparapi.internal.kernel.KernelRunner.executeInternalOuter (KernelRunner.java:1383)
    at com.aparapi.internal.kernel.KernelRunner.execute (KernelRunner.java:1374)
    at com.aparapi.Kernel.execute (Kernel.java:2897)
    at com.aparapi.Kernel.execute (Kernel.java:2854)
    at com.aparapi.Kernel.execute (Kernel.java:2794)
    at com.aparapi.examples.afmandelbrot.AfAparapiUtils.execute (AfAparapiUtils.java:156)
    at com.aparapi.examples.afmandelbrot.AfAparapiUtils.benchmark (AfAparapiUtils.java:201)
    at com.aparapi.examples.afmandelbrot.AfMain.benchmarkHard (AfMain.java:331)
    at com.aparapi.examples.afmandelbrot.AfBenchmark.main (AfBenchmark.java:42)
    at com.aparapi.examples.All.selected (All.java:234)
    at com.aparapi.examples.All.main (All.java:99)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:564)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:832)
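The log above reports a global size of 500 x 500 with a local size of 20 x 25 and return code -54, which is OpenCL's CL_INVALID_WORK_GROUP_SIZE. Both local sizes divide 500 evenly, so one plausible cause (an assumption, not confirmed by the log) is that 20 x 25 = 500 work items per group exceeds the maximum work-group size of that device. Below is a minimal sketch of choosing a local size that satisfies both constraints. LocalSizePicker, pick, and the limit of 256 are hypothetical names and values used for illustration; they are not part of Aparapi or of this merge request.

```java
final class LocalSizePicker {

    /** Returns the largest divisor of n that is <= limit (at least 1). */
    public static int largestDivisorAtMost(int n, int limit) {
        for (int d = Math.min(n, limit); d > 1; d--) {
            if (n % d == 0) {
                return d;
            }
        }
        return 1;
    }

    /**
     * Picks {localWidth, localHeight} so that each value evenly divides its
     * global dimension and localWidth * localHeight <= maxWorkGroupSize.
     * Prefers the largest group, then the most square shape.
     */
    public static int[] pick(int globalWidth, int globalHeight, int maxWorkGroupSize) {
        int bestW = 1;
        int bestH = 1;
        for (int lw = 1; lw <= Math.min(globalWidth, maxWorkGroupSize); lw++) {
            if (globalWidth % lw != 0) {
                continue; // local width must divide global width
            }
            int lh = largestDivisorAtMost(globalHeight, maxWorkGroupSize / lw);
            int area = lw * lh;
            int bestArea = bestW * bestH;
            boolean squarer = Math.abs(lw - lh) < Math.abs(bestW - bestH);
            if (area > bestArea || (area == bestArea && squarer)) {
                bestW = lw;
                bestH = lh;
            }
        }
        return new int[] {bestW, bestH};
    }

    public static void main(String[] args) {
        // 256 is an assumed device limit, not a value taken from the log.
        int[] local = pick(500, 500, 256);
        System.out.println(local[0] + " x " + local[1]); // prints "10 x 25"
    }
}
```

In Aparapi, the resulting pair could presumably be passed as the local sizes to Range.create2D(globalWidth, globalHeight, localWidth, localHeight), with the limit read from the selected device instead of being hard-coded.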
1. I improved the GUI; almost everything now runs on separate threads. Zoom scrolling at deeper zoom levels is not smooth and automatically disables real-time refresh.
2. I bought a second GPU and tested GPU switching.
3. I have no idea; I have never seen this error. "clEnqueueNDRangeKernel() failed invalid work group size" may be related to the global and local widths. Let's see what happens with this new version.
4. The GUI and the benchmark use the same kernel.
Benchmark with 2 GPUs :
AparapiFractals - Mandelbrot Benchmark - Soft
image size : 500 x 500
maxIterations : 100.000
complex region : -2,0000000000000000d,-2,0000000000000000d 2,0000000000000000d,2,0000000000000000d
+-----+--------------------------------+----------------+--------------------------------------------+----------+--------+------------+
|Type | shortDescription               | deviceId       | Name                                       | LSizes   | ExMode | Elapsed(ms)|
+-----+--------------------------------+----------------+--------------------------------------------+----------+--------+------------+
| GPU | NVIDIA<GPU>                    | 2116615257056  | GeForce GTX 1650 SUPER                     | 10 x 25  | AUTO   |        358 |
| GPU | NVIDIA<GPU>                    | 2116615257056  | GeForce GTX 1650 SUPER                     | 8 x 8    | GPU    |        335 |
| GPU | AMD<GPU>                       | 2116618425296  | Oland                                      | 8 x 8    | GPU    |       1259 |
| ALT | Java Alternative Algorithm     | -2             | Java Alternative Algorithm                 | 8 x 8    | JTP    |        427 |
| JTP | Java Thread Pool               | -3             | Java Thread Pool                           | 8 x 8    | JTP    |        399 |
+-----+--------------------------------+----------------+--------------------------------------------+----------+--------+------------+
=======================================================================================================================================
Profiles by Kernel Subclass (mean elapsed times in milliseconds)
Device              Count  CLASS_MODEL_BUILT  INIT_JNI  OPENCL_GENERATED  OPENCL_COMPILED  PREPARE_EXECUTE  EXECUTED    Total
----------------- [[ AfKernel ]] ---------------------------------------------------------------------------------------------------
NVIDIA<GPU>             5             28,393    31,012             0,836            0,388            0,044   139,218  199,892
AMD<GPU>                2              0,006     0,015             0,000           37,955            0,029   630,363  668,368
Java Thread Pool        4              0,000     0,000             0,000            0,000            0,009   207,219  207,227
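On the GUI freezing raised in point 1: here is a minimal sketch of running slow work (a benchmark or a render) on a worker thread and delivering the result through a UI executor. BackgroundTask and its method names are hypothetical and are not taken from this merge request. The sketch assumes a Swing front end, where SwingUtilities::invokeLater could serve as the UI executor; the demo uses a same-thread executor so that it runs without a display.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class BackgroundTask {

    // Single daemon worker thread, so slow jobs never run on the UI thread
    // and never keep the JVM alive on exit.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "fractal-worker");
        t.setDaemon(true);
        return t;
    });

    // Executor that runs callbacks on the UI thread
    // (assumption: SwingUtilities::invokeLater in a Swing application).
    private final Executor ui;

    public BackgroundTask(Executor ui) {
        this.ui = ui;
    }

    /** Runs slowWork on the worker thread, then passes its result to onDone via the UI executor. */
    public <T> CompletableFuture<Void> run(Supplier<T> slowWork, Consumer<T> onDone) {
        return CompletableFuture.supplyAsync(slowWork, worker)
                .thenAcceptAsync(onDone, ui);
    }

    public static void main(String[] args) {
        BackgroundTask task = new BackgroundTask(Runnable::run);
        task.run(
                () -> {
                    // Stand-in for a long benchmark or render.
                    long sum = 0;
                    for (int i = 0; i < 1_000_000; i++) {
                        sum += i;
                    }
                    return sum;
                },
                result -> System.out.println("done: " + result)
        ).join(); // join only in this demo; a GUI would return immediately
    }
}
```

The caller returns as soon as run(...) is invoked, so mouse and zoom events keep being processed while the kernel executes.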