Thread example



ArtHome
05-12-2016, 08:14 AM
Hello!

Please help me understand the techniques used in the Thread example:
1. Two instances of `clan::Texture2D` are created.
2. Two instances of `clan::PixelBuffer` are created.
3. A parallel thread calculates the fractal and writes it to one `clan::PixelBuffer`, while the main thread reads from the other. That part is clear.
4. The two `clan::Texture2D` instances are then swapped in the same way.

So the data is buffered first at the `clan::PixelBuffer` level, then again at the `clan::Texture2D` level, and then a third time by the graphics system when `window.flip(0);` is called.

If I comment out the buffering at the `clan::Texture2D` level, the example works with no visible change, so it does not seem to be technically necessary, and perhaps not methodologically either.

Can someone explain what the author intended with this example?

rombust
06-01-2016, 03:26 PM
I cannot remember exactly how it works (I wrote this example many years ago).
IIRC, this is how it "should" work:
1. The thread renders the fractal.
2. The main thread uploads the pixelbuffer to a texture. The texture is double buffered to prevent pipeline stalls, i.e. the GPU driver would stall if it tried to display and upload the same texture at the same time.

(ClanLib only supports uploading data to OpenGL on the main thread.)
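The hand-off between the worker thread and the main thread described above can be sketched without ClanLib. This is only an illustration under stated assumptions, not the example's actual code: `PingPongWorker`, `start()` and `wait_and_swap()` are hypothetical names, a plain `std::vector<int>` stands in for `clan::PixelBuffer`, and nothing is uploaded to a GPU. The real example polls a completion flag once per frame; this sketch blocks instead so that its behaviour is deterministic.

```cpp
#include <array>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the example's worker thread: it fills one of two
// CPU-side buffers while the main thread reads the previously finished one.
class PingPongWorker {
public:
    explicit PingPongWorker(std::size_t size)
        : buffers_{{std::vector<int>(size), std::vector<int>(size)}},
          thread_([this] { run(); }) {}

    ~PingPongWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        event_.notify_all();
        thread_.join();
    }

    // Main thread: ask the worker to fill the current write buffer with `value`.
    void start(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            value_ = value;
            start_flag_ = true;
            complete_flag_ = false;
        }
        event_.notify_all();
    }

    // Main thread: wait until the worker has finished, swap the buffer roles,
    // and return the buffer that was just completed.
    const std::vector<int>& wait_and_swap() {
        std::unique_lock<std::mutex> lock(mutex_);
        event_.wait(lock, [this] { return complete_flag_; });
        complete_flag_ = false;
        write_index_ = 1 - write_index_;
        return buffers_[1 - write_index_];
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            event_.wait(lock, [this] { return start_flag_ || quit_; });
            if (quit_) return;
            start_flag_ = false;
            std::vector<int>& target = buffers_[write_index_];
            const int value = value_;
            lock.unlock();  // do the slow work without holding the mutex
            for (int& pixel : target) pixel = value;
            lock.lock();
            complete_flag_ = true;
            event_.notify_all();
        }
    }

    std::array<std::vector<int>, 2> buffers_;
    std::mutex mutex_;
    std::condition_variable event_;
    int write_index_ = 0;
    int value_ = 0;
    bool start_flag_ = false;
    bool complete_flag_ = false;
    bool quit_ = false;
    std::thread thread_;  // declared last so every other member exists before it starts
};
```

Calling `start()` and then `wait_and_swap()` returns the buffer the worker just filled, and the next `start()` writes into the other buffer, so the main thread can keep reading the completed one in the meantime.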

ArtHome
06-02-2016, 12:30 PM
rombust wrote:
I cannot remember how it works (I wrote this example many years ago).
iirc, this is how it "should" work.
1. The thread renders the fractal
2. The main thread upload the pixelbuffer to a texture.

OK, I see where the upload begins in the code:


pixelbuffer_write->unlock();

texture_write->set_subimage(canvas, 0, 0, *pixelbuffer_write, pixelbuffer_write->get_size());

Is the upload of the pixelbuffer to `texture_write` really not finished when `set_subimage()` returns, so that it continues for some time afterwards? Or does this call only do some initialization without starting the data transfer?


rombust wrote:
The texture is buffered to prevent pipeline stalls. I.E. the GPU driver would stall if trying to display and upload the texture at the same time.

And at what point does the transfer of data from the texture to the GPU begin: at the call to `clan::Image image(*texture...); image.draw()`?

If the transfer of the pixelbuffer to the GPU starts at the call to `texture_write->set_subimage()` and finishes some time after `image.draw()` returns, then I understand the logic of the example. But it is really hard for a noob like me :)


Here is the relevant part of the code for your convenience, in the hope of further clarification :)


bool App::update()
{
    // ... skip the code related to the fractal

    // If the pixel buffer was uploaded on the last frame, double buffer it
    if (texture_write_active)
    {
        texture_write_active = false;
        if (texture_buffers_offset == 0)
        {
            texture_buffers_offset = 1;
            texture_write = &texture_buffers[1];
            texture_completed = &texture_buffers[0];
        }
        else
        {
            texture_buffers_offset = 0;
            texture_write = &texture_buffers[0];
            texture_completed = &texture_buffers[1];
        }
    }

    // Wait for pixel buffer completion
    std::unique_lock<std::mutex> lock(thread_mutex);
    if (thread_complete_flag == true)
    {
        thread_complete_flag = false;
        pixelbuffer_write->unlock();

        texture_write->set_subimage(canvas, 0, 0, *pixelbuffer_write, pixelbuffer_write->get_size());
        texture_write_active = true;
        // Note the worker thread will start on the other pixelbuffer straight away, in the next "if" statement
    }

    // Start a new transfer when required
    if (thread_start_flag == false)
    {
        worker_thread_framerate_counter.frame_shown();

        // Swap the pixelbuffer's
        if (pixel_buffers_offset == 0)
        {
            pixel_buffers_offset = 1;
            pixelbuffer_write = &pixel_buffers[1];
            pixelbuffer_completed = &pixel_buffers[0];
        }
        else
        {
            pixel_buffers_offset = 0;
            pixelbuffer_write = &pixel_buffers[0];
            pixelbuffer_completed = &pixel_buffers[1];
        }

        pixelbuffer_write->lock(canvas, clan::access_write_only);
        dest_pixels = (unsigned char *) pixelbuffer_write->get_data();
        thread_start_flag = true;
        thread_complete_flag = false;

        // Adjust the mandelbrot scale
        float mandelbrot_time_delta_ms = (float) (current_time - last_mandelbrot_time);
        last_mandelbrot_time = current_time;
        scale -= scale * mandelbrot_time_delta_ms / 1000.0f;
        if (scale <= 0.001f)
            scale = 4.0f;

        thread_worker_event.notify_all();
    }
    pixelbuffer_write->unlock();

    // Draw rotating mandelbrot
    canvas.set_transform(clan::Mat4f::translate(canvas.get_width()/2, canvas.get_height()/2, 0.0f) * clan::Mat4f::rotate(clan::Angle(angle, clan::angle_degrees), 0.0f, 0.0f, 1.0f));
    clan::Image image(*texture_completed, clan::Size(texture_size, texture_size));
    image.draw(canvas, -texture_size/2, -texture_size/2);

    canvas.set_transform(clan::Mat4f::identity());

    // ... skip the part related to the FPS and error handling

    window.flip(0);

    return !quit;
}

rombust
06-02-2016, 04:55 PM
This performs the upload:
texture_write->set_subimage(canvas, 0, 0, *pixelbuffer_write, pixelbuffer_write->get_size());

When we call set_subimage(), we tell the GPU driver to upload the pixelbuffer.

If the GPU is already using the texture internally (to draw the current frame), the driver has two options: wait, or defer the copy (copy the memory to a transfer buffer).

We want to avoid waiting for the GPU to finish, because that wastes CPU cycles.

It is possible that the latest graphics cards and drivers no longer have this performance issue.

We have no control over when the graphics card internals actually render our image to the screen.
Internal GPU rendering may actually happen after flip() is called, with the screen drawn later (or not at all, see Runt Frames http://techreport.com/review/24553/inside-the-second-with-nvidia-frame-capture-tools/9 )

However, we can make it easier for the GPU. That is why we have the texture buffers: so that rendering and uploading can occur simultaneously.
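The texture ping-pong can be sketched in isolation. This is only a model of the bookkeeping, assuming nothing about the driver: `FakeTexture` and `TexturePair` are hypothetical names, the "upload" is a plain memory copy, and no OpenGL or ClanLib calls are involved. It mirrors the `texture_write_active` logic in `App::update()`: an upload issued in one frame only becomes the drawn texture in the next frame, so the texture being drawn is never the one being written.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GPU texture (a real clan::Texture2D lives on the GPU).
struct FakeTexture {
    std::vector<int> pixels;
};

class TexturePair {
public:
    // Call once at the start of each frame: if an upload was issued during the
    // previous frame, swap the roles of the two textures.
    void begin_frame() {
        if (write_active_) {
            write_active_ = false;
            write_index_ = 1 - write_index_;
        }
    }

    // "Upload" new pixel data into the texture that is not being drawn.
    void upload(const std::vector<int>& pixels) {
        textures_[write_index_].pixels = pixels;
        write_active_ = true;
    }

    // The texture that is safe to draw during this frame.
    const FakeTexture& completed() const {
        return textures_[1 - write_index_];
    }

private:
    std::array<FakeTexture, 2> textures_;
    int write_index_ = 0;
    bool write_active_ = false;
};
```

The cost of this scheme is one frame of latency: data uploaded in frame N is first drawn in frame N+1.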

`clan::Image image(*texture...); image.draw()` uses the clan::Image functions to draw the clan::Texture.
I guess we could have something like "static clan::Image::draw_texture(texture,...);" instead.

- - - Updated - - -

So using an unbuffered clan::Texture may be very fast on a new graphics card (possibly depending on how many textures you have).

ArtHome
06-03-2016, 08:39 AM
Thanks for the detailed explanation!