3D acceleration not working



Pap
07-11-2013, 10:13 PM
I am having trouble getting OpenGL to work properly. Even a simple ClanLib program showing a rotating rectangle runs at a very low frame rate. The problem occurs on two computers, one with an Nvidia graphics driver and one with an Intel integrated graphics adapter. Note that OpenGL is properly installed on both computers: glxgears works as expected, and both systems can run "heavy" 3D games that use OpenGL extensively at a very good frame rate (60-120 fps).
Operating system: Debian 8.0, 64-bit (same issues with Debian 7.0; I doubt the problem has to do with the OS anyway); ClanLib version: 2.3.7, compiled from source without problems (all required libraries installed and up to date); gcc version: 4.7.3.

The workarounds I have found so far:

1. Using CL_Display::flip(0) instead of CL_Display::flip(1) to draw the screen increases the frame rate dramatically. This is not an acceptable solution, however, for two reasons. First, CL_Display::flip(0) means the screen is drawn without waiting for the next display refresh, which should be avoided, as clearly stated at http://clanlib.org/wiki/MainDocs:Timing. Second, it works only on the computer with the Nvidia graphics card and does not increase fps at all on the computer with the Intel graphics card.
2. Using software rendering (either by using CL_SetupSWRender in the code itself, by running the executable with LIBGL_ALWAYS_SOFTWARE=1, or by disabling DRI in the Xorg configuration file, i.e. setting Option "DRI" "False" in xorg.conf). Again, this workaround is not an acceptable solution; software rendering is only a temporary fallback, not to mention the fps glitches.

Given that the graphics drivers and OpenGL are installed correctly on both computers, and ClanLib itself compiled without any warnings or errors, I assume the problem has to do with how ClanLib cooperates with OpenGL, unless I am missing something else here.
The following code implements a simple test example:

//test.cpp
// Choose the target renderer (define exactly one)
// #define USE_SOFTWARE_RENDERER
// #define USE_OPENGL1
#define USE_OPENGL

#include <ClanLib/application.h>
#include "Game.hpp"

#ifdef USE_SOFTWARE_RENDERER
#include <ClanLib/swrender.h>
#endif
#ifdef USE_OPENGL
#include <ClanLib/gl.h>
#endif
#ifdef USE_OPENGL1
#include <ClanLib/gl1.h>
#endif

class Program {
public:
    static int main(const std::vector<CL_String> &args) {
        CL_SetupCore setup_core;
        CL_SetupDisplay setup_display;
#ifdef USE_SOFTWARE_RENDERER
        CL_SetupSWRender setup_renderer;
#endif
#ifdef USE_OPENGL1
        CL_SetupGL1 setup_renderer;
#endif
#ifdef USE_OPENGL
        CL_SetupGL setup_renderer;
#endif
        try {
            Game game;
            game.run();
        }
        catch (CL_Exception &exception) {
            // Create a console window for text output if not available
            CL_ConsoleWindow console("Console", 80, 160);
            CL_Console::write_line("Exception caught: " +
                exception.get_message_and_stack_trace());
            console.display_close_message();

            return -1;
        }
        return 0;
    }
};
CL_ClanApplication app(&Program::main);


//Game.hpp
#ifndef GAME_HPP
#define GAME_HPP
#include <ClanLib/core.h>
#include <ClanLib/display.h>
class Game {
private:
    bool quit;
    int delay;
    void on_window_close();
    void on_key_pressed(const CL_InputEvent &, const CL_InputState &);
public:
    void run();
    Game();
    ~Game();
};
#endif


//Game.cpp
#include "Game.hpp"

void Game::on_window_close() { quit = true; }

void Game::on_key_pressed(const CL_InputEvent &event,
                          const CL_InputState &state) {
    if (event.id == CL_KEY_ESCAPE || event.id == CL_KEY_Q) quit = true;
    else if (event.id == CL_KEY_UP) { if (delay > 0) delay--; }  // never below zero
    else if (event.id == CL_KEY_DOWN) delay++;
}

Game::Game() : quit(false), delay(0) {
}

Game::~Game() {
}

void Game::run() {
    CL_DisplayWindow window("ClanLib test", 800, 600);
    CL_GraphicContext gc = window.get_gc();
    CL_InputDevice keyboard = window.get_ic().get_keyboard();
    CL_Slot slot_quit =
        window.sig_window_close().connect(this, &Game::on_window_close);
    CL_Slot slot_key_pressed =
        keyboard.sig_key_down().connect(this, &Game::on_key_pressed);
    CL_Font font(gc, "ClanFont", 30);
    CL_Size font_size = font.get_text_size(gc, "Test!");
    int font_x = (gc.get_width() - font_size.width) / 2;
    CL_ResourceManager rectangle_resources("./data/rectangle.xml");
    CL_Sprite rectangle(gc, "rectangle", &rectangle_resources);
    while (!quit) {
        gc.clear(CL_Colorf::black);
        font.draw_text(gc, font_x, 100, "Φούφουτος!", CL_Colorf::orangered);
        rectangle.rotate(CL_Angle::from_degrees(1));  // one degree per frame
        rectangle.draw(gc, 350, 250);
        window.flip(1);            // wait for the next display refresh
        CL_KeepAlive::process();
        CL_System::sleep(delay);   // extra delay, adjusted with UP/DOWN
    }
}

where rectangle.xml just defines a simple rectangle sprite.

rombust
07-12-2013, 11:33 AM
Sounds like something strange is going on!

You mention NVidia, if that is via the nouveau driver, note that OpenGL support is limited. "Accelerated OpenGL, although progressing, is not yet supported."

It works with OpenGL 1.3 code quite well, that's "#define USE_OPENGL1" in your example.

Intel OpenGL 3.x hardware support was also poor on Linux (don't know if it still is).

The NVidia official driver runs nicely.

Try the "DisplayTarget" example, it's one of the better examples to compare performance with the various targets.

It might be helpful to try ClanLib 3.0, with the DisplayTarget example. But ClanLib 2.3 should work.

rombust
07-12-2013, 12:22 PM
I have just tried the DisplayTarget example, with ClanLib 3.0 beta on Suse Linux using the nouveau driver

GL1 - 360fps
GL3 - 360fps
SWRender - 220fps

You mentioned changing flip(1) to flip(0) sped up fps ... If that's the case, your app would be faster than 60 fps anyway?

Pap
07-12-2013, 02:20 PM
Thank you for your replies; I tried everything you suggested.

Quote: You mention NVidia, if that is via the nouveau driver, note that OpenGL support is limited.
I am not using the nouveau driver but the official NVidia driver, compiled from source and up to date.

Quote: It works with OpenGL 1.3 code quite well, that's "#define USE_OPENGL1" in your example.
Neither GL nor GL1 works as expected. The frame rate is very low unless I use flip(0), and even that helps only on the computer with the Nvidia card; the one with the Intel card is still slow even with flip(0), and the only way to get decent fps there is to turn off 3D acceleration, although that Intel card runs other OpenGL applications more than decently.

Quote: Intel OpenGL 3.x hardware support was also poor on Linux (don't know if it still is).
Not sure about that, but I do know the computer with that Intel card can run 3D games pretty well. The graphics driver and OpenGL are installed properly, as the only applications where I get low fps are those implemented with ClanLib.

Quote: You mentioned changing flip(1) to flip(0) sped up fps ... If that's the case, your app would be faster than 60 fps anyway?
With flip(1) even the small example I posted earlier runs very, very slowly, I guess at 1-2 fps. The only way to get decent fps is either to use flip(0), which as I said works only for Nvidia, or to turn off hardware acceleration (using SWRender or by other means), which works on both computers but with fps glitches.

Here is a description of the two graphics cards and the OpenGL installed, together with the results I get from the DisplayTarget example:

Graphics card: NVIDIA Corporation GT216 [GeForce GT 220] (rev a2)
OpenGL:
    server glx vendor string: NVIDIA Corporation
    server glx version string: 1.4
    client glx vendor string: NVIDIA Corporation
    client glx version string: 1.4
    GLX version: 1.4
    OpenGL vendor string: NVIDIA Corporation
    OpenGL version string: 3.3.0 NVIDIA 304.88
    OpenGL shading language version string: 3.30 NVIDIA via Cg compiler
DisplayTarget results:
    GL: ~680 fps
    GL1: ~680 fps
    SWRender: ~190 fps

Graphics card: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
OpenGL:
    server glx vendor string: SGI
    server glx version string: 1.4
    client glx vendor string: Mesa Project and SGI
    client glx version string: 1.4
    GLX version: 1.4
    OpenGL vendor string: Intel Open Source Technology Center
    OpenGL version string: 3.0 Mesa 9.1.4
    OpenGL shading language version string: 1.30
DisplayTarget results:
    GL: 59 fps
    GL1: 59 fps
    SWRender: 160 fps

I really don't know what to do next; any suggestions and help are more than welcome.

Judas
07-12-2013, 04:35 PM
I see that the Intel card in the DisplayTarget example is running at 59 fps (60 hz with some rounding error in the calculation), which indicates that it is vsync limited. Some display drivers can be configured to ignore the vertical sync setting specified by the application. My guess is that the Intel driver is doing that.

In your own test code the problem is probably related to the System::sleep(delay) line. Try removing it, change flip(1) to flip(0), and see what FPS you get.

Pap
07-12-2013, 09:35 PM
Quote: I see that the Intel card in the DisplayTarget example is running at 59 fps (60 hz with some rounding error in the calculation), which indicates that it is vsync limited. Some display drivers can be configured to ignore the vertical sync setting specified by the application. My guess is that the Intel driver is doing that.
I have to check that. The Intel driver is poorly documented, so I am not sure how to turn off vsync (if it is on by default). Still, that doesn't explain the very low fps I get on Nvidia as well.

Quote: In your own test code the problem is probably related to the System::sleep(delay) line. Try removing it, change flip(1) to flip(0), and see what FPS you get.
sleep(delay) was there to control the rotation speed. I removed it but nothing really changed; specifically:
Computer with Nvidia card: very low fps with flip(1), much better with flip(0), but still not what I get with a similar example based on SFML.
Computer with Intel driver: very low fps with either flip(1) or flip(0), no difference at all.
I still don't get it. Both computers can run heavy OpenGL 3D games at very good fps. Even the DisplayTarget example gives nice fps, but I get far lower fps with my simple program. Moreover, the "Text" example in /usr/share/ClanLib-2.3.7/Examples/Display_Text/Text reports 1000 fps on Nvidia and 62 fps on Intel, but both rotate slowly, pretty much as slowly as the rectangle in the test code I posted above. Again, if I switch to flip(0) in Text.cpp, I get a much better rotation speed on Nvidia but no change on Intel. In general, fps is far from acceptable unless I switch to SWRender.

rombust
07-12-2013, 10:37 PM
I can't think of what can be causing it.

If you say that other SDKs work, even when they lock the frame to the refresh rate, then it points to a problem with ClanLib.

When you say the FPS is low ... Is it visually low (i.e. it looks low), or measured low?

The reason I ask, is that you mention that the "Display_Text/Text says 1000 fps on Nvidia" but still rotates slowly.

I have observed that before, on a certain display manager (I can't remember which one).

For example, Basic2D looked really smooth on one display manager and jerky on another, while running at the same FPS.

All flip(1) does is call glXSwapIntervalSGI or glXSwapIntervalMESA with the swap_interval (in CL_OpenGLWindowProvider_GLX::flip(int interval)). The code really should attempt MESA first, then SGI (not the other way around), imho.

If other SDKs can flip correctly, it'll be worth looking at how they do it.
Maybe there is a glxSwapIntervalSuperCool() function.

More likely though, is that SGI is being used instead of MESA. Just a guess
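The "MESA first, then SGI" ordering suggested above can be sketched in plain C++ by inspecting the GLX extension string. This is only an illustration, not the code in CL_OpenGLWindowProvider_GLX: the enum and both function names are hypothetical, and the extension list is assumed to be the space-separated string a GLX query would return. The two extension names, GLX_MESA_swap_control and GLX_SGI_swap_control, are the ones that advertise glXSwapIntervalMESA and glXSwapIntervalSGI.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Which swap-interval entry point a GLX backend could call (hypothetical enum).
enum class SwapControl { None, Mesa, Sgi };

// Returns true if `name` appears as a whole token in the space-separated
// extension list; a plain substring search could match a longer name.
bool has_extension(const std::string &extensions, const std::string &name) {
    assert(!name.empty());
    std::istringstream stream(extensions);
    std::string token;
    while (stream >> token) {
        if (token == name) return true;
    }
    return false;
}

// Prefer the MESA extension and fall back to SGI only if MESA is missing.
SwapControl choose_swap_control(const std::string &extensions) {
    if (has_extension(extensions, "GLX_MESA_swap_control")) return SwapControl::Mesa;
    if (has_extension(extensions, "GLX_SGI_swap_control")) return SwapControl::Sgi;
    return SwapControl::None;
}
```

A real backend would then look up the matching function pointer and call it with the interval passed to flip().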

Judas
07-13-2013, 12:33 AM
Quote:
Graphics card: NVIDIA Corporation GT216 [GeForce GT 220] (rev a2)
DisplayTarget results:
GL: ~680 fps
GL1: ~680 fps
SWRender: ~190 fps
Graphics card: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
DisplayTarget results:
GL: 59 fps
GL1: 59 fps
SWRender: 160 fps

Those numbers were from your computer, right? I wouldn't say that 680 fps is unacceptably low. :)

How do you know what FPS your own test program is running at? From what I can tell on the code you pasted there is nothing there measuring what the FPS actually is. Could the error be in how you calculate the FPS?
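For anyone who wants a measured number instead of a visual impression, a small frame counter is enough. The sketch below is not part of ClanLib: FpsCounter and its members are hypothetical names, and the caller is assumed to supply the current time in seconds from whatever timer is available.

```cpp
#include <cassert>

// Counts frames and reports frames per second over a fixed time window.
class FpsCounter {
public:
    explicit FpsCounter(double window_seconds = 1.0) : window(window_seconds) {
        assert(window_seconds > 0.0);
    }

    // Call once per frame, right after the flip, with the current time in
    // seconds. Returns true when a new FPS value has just been computed.
    bool frame(double now_seconds) {
        if (!started) {
            started = true;
            window_start = now_seconds;
            return false;
        }
        ++frames;
        const double elapsed = now_seconds - window_start;
        if (elapsed < window) return false;
        last_fps = frames / elapsed;
        frames = 0;
        window_start = now_seconds;
        return true;
    }

    // Most recently computed value, or 0 before the first full window.
    double fps() const { return last_fps; }

private:
    double window;
    double window_start = 0.0;
    double last_fps = 0.0;
    int frames = 0;
    bool started = false;
};
```

In the test program this would be called inside the while loop after window.flip(1), with the time taken from a millisecond timer divided by 1000 (for example CL_System::get_time(), assuming your ClanLib version provides it).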

Pap
07-13-2013, 10:34 AM
Quote: I see that the Intel card in the DisplayTarget example is running at 59 fps (60 hz with some rounding error in the calculation), which indicates that it is vsync limited. Some display drivers can be configured to ignore the vertical sync setting specified by the application. My guess is that the Intel driver is doing that.
Indeed, this is the case. It turned out that the Intel driver turns on vsync by default and ignores flip(0), so that it always acts as flip(1). I am posting a workaround here, in case someone else has the same issue with an Intel driver: in xorg.conf (usually located in /etc/X11) add the line

Option "SwapBufferWait" "False"
(Depending on your configuration, xorg.conf may not even exist, in which case you should either create one with Xorg -configure or add the line above to the file corresponding to your graphics driver, located in /usr/share/X11/xorg.conf.d and usually named "20-intel.conf".)
This is not a perfect solution, however, as the driver still ignores the application's vsync setting: it now always acts as flip(0), even if you use flip(1). I am still looking for a better solution.



Quote: Those numbers were from your computer, right? I wouldn't say that 680 fps is unacceptably low. :)
Those numbers are what DisplayTarget reports, but looking at /usr/share/ClanLib-2.3.7/Examples/Display/DisplayTarget/Sources/target.cpp I see that it uses flip(0). If I change it to flip(1), I get 60 fps.

Quote: When you say the FPS is low ... Is it visually low (i.e. it looks low), or measured low?
Quote: The reason I ask, is that you mention that the "Display_Text/Text says 1000 fps on Nvidia" but still rotates slowly.
Quote: How do you know what FPS your own test program is running at? From what I can tell on the code you pasted there is nothing there measuring what the FPS actually is. Could the error be in how you calculate the FPS?
Good question. I tried to keep the test code minimal, and what I see is a rotation that is at least visually slow. However, I don't think fps measurement code is needed for a rough estimate that is still accurate enough to get an idea. Running my small test program with flip(1), a full rotation of the rectangle takes about 6 seconds (measured with a stopwatch). Each frame increases the angle by one degree, via rectangle.rotate(CL_Angle::from_degrees(1)), so at 60 fps I should expect 60 degrees per second, or 6 seconds for a full rotation. This is exactly what the stopwatch shows. The result surprises me, but on both computers I get ~60 fps with flip(1), although it looks very slow. That is much better than I thought, so I stand corrected about my "very low fps" statement. It was a hasty conclusion, based not on numbers but on a visually slow rotation and a comparison with a similar example implemented in another SDK (which turned out to use the equivalent of flip(0) by default, hence it rotates much faster). So please consider this thread "solved", and thank you for all the help provided. :)
Now, if I understand well, using flip(1), I should expect a frame rate as fast as my monitor's display refresh rate, yes? I am asking because a similar rough estimation gives 60 fps using the example Display_Text/Text (it says 1000 fps but that's clearly wrong, as it takes about 12 seconds for a full rotation, and given that angle is increased by 0.5 in that example, fps is ~60); furthermore, as I mentioned earlier, I also get 60 fps using the DisplayTarget example with flip(1). In general, I seem to always get 60 fps if I use flip(1). On both computers tested, monitor's refresh rate is indeed 60Hz.

I also wonder if that 60 fps will stay the same in complex big applications with much drawing going on per frame. Any evidence about that?
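The stopwatch estimate used in this post is plain arithmetic: a full turn is 360 degrees, so frames per rotation is 360 divided by the per-frame step, and dividing that by the measured seconds gives the frame rate. A minimal sketch, with estimate_fps as a hypothetical helper name:

```cpp
#include <cassert>

// Estimates frames per second from a stopwatch measurement: a full turn is
// 360 degrees, and each frame advances the sprite by degrees_per_frame.
double estimate_fps(double degrees_per_frame, double seconds_per_rotation) {
    assert(degrees_per_frame > 0.0 && seconds_per_rotation > 0.0);
    const double frames_per_rotation = 360.0 / degrees_per_frame;
    return frames_per_rotation / seconds_per_rotation;
}
```

With the figures from the post, estimate_fps(1.0, 6.0) covers the rectangle test (1 degree per frame, 6 seconds) and estimate_fps(0.5, 12.0) covers the Text example (0.5 degrees per frame, 12 seconds); both work out to 60.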

Judas
07-13-2013, 04:45 PM
Quote: Now, if I understand well, using flip(1), I should expect a frame rate as fast as my monitor's display refresh rate, yes? I am asking because a similar rough estimation gives 60 fps using the example Display_Text/Text (it says 1000 fps but that's clearly wrong, as it takes about 12 seconds for a full rotation, and given that angle is increased by 0.5 in that example, fps is ~60); furthermore, as I mentioned earlier, I also get 60 fps using the DisplayTarget example with flip(1). In general, I seem to always get 60 fps if I use flip(1). On both computers tested, monitor's refresh rate is indeed 60Hz.

Quote: I also wonder if that 60 fps will stay the same in complex big applications with much drawing going on per frame. Any evidence about that?

flip(1) means that it will wait for the next vertical monitor synchronization before showing the frame buffer. Today monitors typically run at 60 Hz, although you can find monitors running at 75 Hz or even higher.

In an ideal setup, an application running with vsync gives the best visual experience for the user because the user will always see a full frame, while if vsync is off then sometimes you might notice screen tearing caused by seeing half of one frame and half of another.

In the worst case setup, an application running with vsync will bounce between 30 and 60 fps because some frames take too long to render to meet the 60 Hz deadline. If a frame hasn't been rendered within 16 ms (60 Hz), then flip(1) causes it to wait until the next vsync. The speed becomes 30 fps because every other frame is the same as last time, and in the worst case some frames make the deadline while some don't, causing the game to periodically run 60 fps, then 30 fps, then 60 fps, etc.

The rule of thumb is therefore that if an application always meets the 60 Hz deadline, then it is much better to have vsync enabled. The game gets a consistent frame rate of exactly 60. If the application cannot meet the deadline, then it is better to have vsync off because the application might be running at say 48-52 fps, and this is much better than 30 fps. Whether an application can render a frame within the 16 ms deadline depends greatly on the graphics card and what is being rendered. In the DisplayTarget Nvidia test, it seems to be rendering a frame in 1.5 ms, but then that app is also very simple. The Intel card also clearly meets the deadline or the vsync FPS wouldn't have been 60. :)
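The 60/30 fps behaviour described above can be put into a simplified model. The sketch assumes plain double buffering and a constant render time per frame, which real applications rarely have; both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// With vsync on, a finished frame is shown at the next refresh, so each
// frame occupies a whole number of refresh intervals (at least one).
double vsync_fps(double render_ms, double refresh_hz) {
    assert(render_ms > 0.0 && refresh_hz > 0.0);
    const double interval_ms = 1000.0 / refresh_hz;
    const double intervals = std::ceil(render_ms / interval_ms);
    return refresh_hz / intervals;
}

// With vsync off, the frame rate is limited only by the render time.
double free_running_fps(double render_ms) {
    assert(render_ms > 0.0);
    return 1000.0 / render_ms;
}
```

Under this model a 1.5 ms frame on a 60 Hz monitor gives 60 fps with vsync, while a 20 ms frame misses the deadline and drops to 30 fps with vsync but would run at 50 fps without it, which is the trade-off the rule of thumb describes.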

Many gamers always set vsync off in games because 1) They like bigger FPS numbers, 2) They've noticed the 30-60 Hz bouncing effect, which usually happens in combat situations in games, 3) The game ticking logic is poorly written and depends on the actual frame rate, 4) Aiming in shooters is significantly affected by low FPS. Good aiming players need 50+ FPS as an absolute minimum.
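Point 3 is also what made the test program earlier in the thread look slow or fast depending on the flip setting: it rotates by a fixed one degree per frame, so the visible speed follows the frame rate. A common alternative is to scale the step by the time the last frame took. The sketch below is a generic illustration with a hypothetical helper name, not ClanLib code.

```cpp
#include <cassert>
#include <cmath>

// Advances an angle at a fixed speed in degrees per second, so the visible
// rotation speed does not depend on how many frames are drawn per second.
double advance_angle(double angle_degrees, double degrees_per_second,
                     double frame_seconds) {
    assert(frame_seconds >= 0.0);
    // Wrap the result so it stays within one full turn.
    double next = std::fmod(angle_degrees + degrees_per_second * frame_seconds,
                            360.0);
    if (next < 0.0) next += 360.0;
    return next;
}
```

After one second the angle is the same whether the loop ran 60 or 1000 times. In a ClanLib program the result would be applied with an absolute setter instead of rotate(), for example CL_Sprite::set_angle, assuming your version provides it.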

I hope that gives you a clearer picture of the pros and cons of vsync. :)

As for the visual jarring effect you are seeing, that seems to be some kind of issue with the window manager and/or X11 server you are using. You'd probably have to ask the developers of those subsystems what could be causing that. Just make sure that if you do, you say the fps is way over 60, which it is in your case. It could be a problem in how ClanLib presents the frame buffer (a standard GLX swap buffers call), but without outside help we have no idea what to do about it.

Pap
07-13-2013, 09:24 PM
Thank you for that; it made clear that flip(1) does what I thought, and it also explained details I was not aware of. (I am not new to programming, but I used to write mathematical applications; when you solve systems of differential equations you don't really care about "fps" and "vsync", but when you are interested in making a game or something similar you really need to know about those things.) :)

rombust
07-13-2013, 09:27 PM
Here are the comments by "datenwolf" on the subject on Stack Overflow (ref: http://stackoverflow.com/questions/7359366/tips-to-reduce-opengl-3-4-frame-rate-stuttering-under-linux ):

• Do you run a compositing window manager?

Compositing creates a whole bunch of synchronization and timing issues. Also (some of) the OpenGL code you can find in the compositing WMs at some places drives tears into the eyes of a seasoned OpenGL coder, especially if one has experience writing realtime 3D (game) engines.

KDE4 and GNOME3 by default use compositing, if available. The same holds for the Ubuntu Unity desktop shell. Also for some non-compositing WMs the default scripts start xcompmgr for transparency and shadow effects.

Unfortunately the whole Compositing thing completely messes up VSync. Technically it was rather simple to fix, but whenever I tried to explain it to the X.org developers I got back blank stares (in email form); and I can't fix it myself, because it requires patching the drivers. There are 2 things to do:
1. If a window is redirected (into an off-screen buffer), doublebuffering is switched off (compositing is doublebuffering).
2. Directed Windows (v)sync against a deadline, set by the compositor, depending on the estimated time for drawing the composition.

Pap
07-13-2013, 10:23 PM
Quote: • Do you run a compositing window manager?
I was never a fan of fancy, bloated window managers. They are all resource devourers, so KDE, GNOME, Compiz and the like are not even installed on my systems. I usually run "LXDE", which has no "advanced" visual effects such as transparency, fading, scaling, etc. "Fluxbox" or "Window Manager" are also good choices ("Window Manager" has a tiny memory footprint, smaller than all the other WMs mentioned). I am pretty sure none of them uses compositing at all.

The only remaining issue is that the Intel driver forces either vsync or no vsync (depending on the setting in xorg.conf), ignoring the application's synchronization setting in both cases. If the driver is set to vsync, it ignores flip(0) and uses vsync no matter what; conversely, if it is set to no vsync, it ignores flip(1) and forces no vsync. There must be a way to make it work as the Nvidia driver does, but I don't know how yet. I will post a solution as soon as I know how to do it, in case someone else has the same issue.

I think the best choice is to stick with flip(1) as the default but also offer an option for no vsync. In a well-designed application, options for GL, GL1, and software rendering should be provided as well, pretty much as the DisplayTarget example does. I have to say that ClanLib offers a neat way to do all that, not found in other SDKs. :)