
Multithreading and OpenGL



rombust
11-01-2010, 03:24 PM
I tried to create a new example (committed to 2.2 and 2.3 SVN: Examples/Display/Thread)

See image:

http://esoteric.clanlib.org/~rombust/ReleaseImages/thread.png

The main thread should draw a rotating texture running at full frame rate (currently set at 60fps)

The worker thread should create the texture.

The textures are double buffered, so texture (A) is read from while texture (B) is written to.
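As a rough illustration of the A/B roles described above, here is a minimal C++ sketch of a double buffer. The class name `DoubleBuffer` and its members are hypothetical and not part of ClanLib, and synchronisation between threads is deliberately left out: it only shows how the "read from" and "written to" slots trade places.

```cpp
#include <array>

// Hypothetical helper, not a ClanLib class: holds two slots and tracks
// which one is currently the "front" (read from) slot.
template <typename Texture>
class DoubleBuffer
{
public:
    // Slot the main thread would draw from (texture A).
    const Texture &front() const { return slots_[front_index_]; }

    // Slot the worker would write to (texture B).
    Texture &back() { return slots_[1 - front_index_]; }

    // Exchange the roles of the two slots once the back slot is complete.
    void swap() { front_index_ = 1 - front_index_; }

private:
    std::array<Texture, 2> slots_{};
    int front_index_ = 0;
};
```

In a real multithreaded program the swap would need to be coordinated so that the writer never touches a slot the reader is still using.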

The texture is written to using a CL_FrameBufferObject via an OpenGL shader.

The shader renders a simple Mandelbrot set.
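The shader source is not shown in this thread, so the following is only a CPU-side C++ sketch of the usual escape-time calculation behind a simple Mandelbrot render. The function name `mandelbrot_iterations` and the `max_iter` parameter are assumptions made for illustration.

```cpp
#include <complex>

// Escape-time iteration count for the point c in the complex plane.
// Returns max_iter if the orbit of z -> z*z + c never leaves the disc
// |z| <= 2 within max_iter steps, which is treated as "inside the set".
int mandelbrot_iterations(std::complex<double> c, int max_iter)
{
    std::complex<double> z(0.0, 0.0);
    for (int i = 0; i < max_iter; ++i)
    {
        // std::norm returns |z|^2, so this tests |z| > 2 (guaranteed divergence).
        if (std::norm(z) > 4.0)
            return i;
        z = z * z + c;
    }
    return max_iter;
}
```

A renderer would typically call this once per pixel and map the returned count to a colour.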

That's the theory.

On Windows the example runs very slowly. It seems that the main thread is waiting for the worker thread to complete.

I am using the latest ATI graphics driver.

On Linux, the example runs at 60fps. I did not check the example running flat out (that is, using flip(0) instead of flip(1)).

Before the thread starts, I use:
CL_GraphicContext worker_gc = gc.create_worker_gc();
That is passed to the thread.

The thread contains: CL_SharedGCData::add_ref(); .... CL_SharedGCData::release_ref();

Without the add_ref(), the application crashes in CL_SharedGCData.

Is add_ref() required?

On Windows, the application crashes on exit with release_ref(). On Linux it works, except once, when it crashed the entire Ubuntu desktop (freezing the keyboard and display).

Can someone have a look please :)

rombust
11-01-2010, 05:32 PM
I spoke to somebody in ##opengl (on irc.freenode.net).

They suggested that there is no guarantee whether the GLSL shader on the graphics card will run in parallel or serial in this situation.

(i.e. interleaving or queuing the request)

Can anyone verify this?

If that is true (and it makes sense), the example will need adjusting to create the texture in the worker thread via a pixel buffer object.

Also, the question about CL_SharedGCData::add_ref() still remains.

bit0
11-01-2010, 09:51 PM
Thanks rombust, very interesting and useful example.
I'm not sure I understand OpenGL well enough yet, but I would try the following strategy:

In the renderer thread, create a PBO; no GC sharing. Then loop as follows: map the PBO and then tell the worker thread to write data to it (by passing the PBO's pointer).
The worker thread draws into the PBO, then notifies the renderer thread when it has finished.
When the renderer thread receives the notification, it unmaps the PBO, calls glTexImage2D() and then renders the texture in the 3D scene.

Now evolve this scenario to use two PBOs so that you always have one texture free for rendering at your desired rate in the first thread, while the other is being updated by the worker (and pipeline stalls should be minimized).
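The hand-off described above could be sketched roughly as follows in C++. This is not OpenGL or ClanLib code: plain `std::vector` memory stands in for the two mapped PBOs, and the names `PixelBuffers`, `worker_fill`, `wait_for_ready` and `run_one_frame` are all hypothetical. The unmap and glTexImage2D() steps appear only as a comment.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for two mapped PBOs: plain CPU memory.
struct PixelBuffers
{
    std::vector<std::uint8_t> buffers[2];
    int ready_index = -1;            // buffer the worker has finished, -1 if none
    std::mutex mutex;
    std::condition_variable ready;
};

// Worker side: fill one buffer, then notify the renderer that it is complete.
void worker_fill(PixelBuffers &pb, int index, std::uint8_t value)
{
    // The worker is assumed to own this buffer until it publishes it below.
    for (auto &byte : pb.buffers[index])
        byte = value;
    {
        std::lock_guard<std::mutex> lock(pb.mutex);
        pb.ready_index = index;
    }
    pb.ready.notify_one();
}

// Renderer side: block until a buffer is finished and return its index.
int wait_for_ready(PixelBuffers &pb)
{
    std::unique_lock<std::mutex> lock(pb.mutex);
    pb.ready.wait(lock, [&pb] { return pb.ready_index != -1; });
    int index = pb.ready_index;
    pb.ready_index = -1;             // consume the notification
    return index;
}

// One round trip: the worker fills buffer 1 while the renderer would keep
// drawing from buffer 0. Returns the first byte of the finished buffer.
std::uint8_t run_one_frame(std::size_t size, std::uint8_t value)
{
    if (size == 0)
        return 0;
    PixelBuffers pb;
    pb.buffers[0].assign(size, 0);
    pb.buffers[1].assign(size, 0);
    std::thread worker(worker_fill, std::ref(pb), 1, value);
    int index = wait_for_ready(pb);
    // In the real renderer, this is where the PBO would be unmapped and
    // glTexImage2D() called before drawing the texture.
    worker.join();
    return pb.buffers[index][0];
}
```

A render loop that must not stall would poll for a finished buffer rather than block in `wait_for_ready`; the blocking wait is used here only to keep the sketch short.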

Does it make any sense? Apologies in advance if I talked nonsense :-)

bit0

rombust
11-02-2010, 09:17 AM
Okay.

In the renderer thread, I created a PBO, but still with GC sharing, to show that you can.

http://esoteric.clanlib.org/~rombust/ReleaseImages/thread2.png

The main thread runs fast (3332fps in that image).

There still remain some problems:

1) It crashes on exit (because of the CL_SharedGCData I think)
2) clanSWRender target crashes (I have not looked at why)
3) clanGL1 target crashes (probably because PBOs are not implemented, I guess)

rombust
11-03-2010, 10:13 AM
The problems have now been fixed.

The ClanLib developers decided to remove create_worker_gc().

The reasons are:
1) It makes the ClanLib implementation more complex. We currently share data between different OpenGL contexts using CL_SharedGCData and wglShareLists. It would be tricky to get that right when also using multiple threads, and that is assuming the Windows and Linux graphics card drivers fully support it.

2) There is no speed advantage on current graphics cards. At this moment in time, it is unknown whether multi-core GPUs would change this, and that is assuming OpenGL supports it.

The example has been updated to reflect this, and it should be much more stable.