todor943 Posted June 7, 2018

I've noticed a considerable FPS drop whenever the camera moves. I tend not to trust my AMD driver on Linux, so I switched to the Intel GPU. Both yielded about the same FPS in game, which I found odd until I looked at the screenshots I took of CPU usage, attached below.

The left side of Figure 1 shows how the Intel Core i7-4810MQ tries to juggle single-thread performance when the game uses the dedicated AMD FirePro W4170M. On the right side, with the same FPS, the game is running on the Intel graphics. The CPU usage pattern, however, is different. To me as a dev this looks like a scheduling problem as well as a threading problem.

I have the steps to reproduce it, so how can I profile the runtime to see what is causing this particular issue: momentary stutter/lag when moving around, with all effects on low and no units? Is there an issue already tracked for this?

Figure 1: CPU usage with the AMD FirePro W4170M (left) and with the Intel graphics (right). [image not available]

aeonios Posted June 7, 2018

AFAIK the game is not threaded in any way. Sim, rendering, and the net client all run on one thread. There are plans to move the net client to a separate thread, but that hasn't been done yet. Rendering and sim are tightly interdependent, so it probably wouldn't do much good to run them on separate threads. I do have plans to try to reduce the CPU-side costs of rendering, but it'll be a while before I make any real progress on it.

Keep in mind that Intel graphics takes its memory from main memory, so it wouldn't be strange to see different or odd CPU usage patterns with it, not to mention that it uses a totally different driver.

Imarok Posted June 7, 2018

6 hours ago, todor943 said: [...] how can I profile the runtime to see what is causing this particular issue [...]

See https://trac.wildfiregames.com/wiki/EngineProfiling

Stan` Posted June 7, 2018

@todor943 Welcome to the forums. See also https://trac.wildfiregames.com/ticket/3054 for weird rendering slowdowns on AMD cards.

vladislavbelov Posted June 7, 2018

13 hours ago, aeonios said: Rendering and sim are tightly interdependent, so it probably wouldn't do much good to run them on separate threads.

But they shouldn't be interdependent, because we need to split rendering into two steps: submitting and drawing. Without threading the split would be less powerful, but still useful, because we wouldn't need to wait until post-processing and buffer swapping have finished.

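A minimal sketch of the submit/draw split described in this post. All names here (`SimObject`, `DrawCommand`, `SubmitPhase`, `DrawPhase`) are hypothetical and are not engine APIs. The point it illustrates is that the submit step copies everything drawing needs out of the simulation state, so the draw step never reads the simulation again.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simulation-side object; not an engine type.
struct SimObject {
    int modelId;
    float x, y, z;
    bool visible;
};

// Everything the draw step needs, copied out of the simulation state.
struct DrawCommand {
    int modelId;
    float x, y, z;
};

// Step 1 (submit): walk the simulation state once and snapshot visible objects.
std::vector<DrawCommand> SubmitPhase(const std::vector<SimObject>& sim) {
    std::vector<DrawCommand> commands;
    commands.reserve(sim.size());
    for (const SimObject& obj : sim)
        if (obj.visible)
            commands.push_back({obj.modelId, obj.x, obj.y, obj.z});
    return commands;
}

// Step 2 (draw): consume the snapshot only and return the number of draw calls.
// A real renderer would issue GPU work here; this sketch only counts.
std::size_t DrawPhase(const std::vector<DrawCommand>& commands) {
    std::size_t drawCalls = 0;
    for (const DrawCommand& cmd : commands) {
        (void)cmd;  // placeholder for the actual draw call
        ++drawCalls;
    }
    return drawCalls;
}
```

Because `DrawPhase` takes only the snapshot, the simulation is free to change its objects as soon as `SubmitPhase` returns, whether or not drawing runs on another thread.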
aeonios Posted June 7, 2018

31 minutes ago, vladislavbelov said: But they shouldn't be interdependent, because we need to split rendering into two steps: submitting and drawing. Without threading the split would be less powerful, but still useful, because we wouldn't need to wait until post-processing and buffer swapping have finished.

Yes, but if the renderer is running much faster than sim, it will sit there and spam enumerateObjects at sim. Synchronizing that to eliminate the wasted requests would be a serious pain. I do, however, plan on separating submission from visibility testing and centralizing visibility testing in the renderer. I suspect that on large maps the cost of visibility testing becomes very high, so a hierarchical system could improve performance by a lot.

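To illustrate what a hierarchical visibility test buys, here is a deliberately simplified two-level sketch. It assumes a 2D rectangle as a stand-in for the view frustum and a flat list of cells as the hierarchy; `Rect`, `Cell`, and `CullHierarchical` are hypothetical names, and nothing here reflects how the engine stores its objects.

```cpp
#include <cstddef>
#include <vector>

// Axis-aligned rectangle on the ground plane (a 2D stand-in for real bounds).
struct Rect {
    float minX, minZ, maxX, maxZ;
};

inline bool Overlaps(const Rect& a, const Rect& b) {
    return a.minX <= b.maxX && a.maxX >= b.minX &&
           a.minZ <= b.maxZ && a.maxZ >= b.minZ;
}

// One level of hierarchy: a cell whose bounds enclose all of its objects.
struct Cell {
    Rect bounds;
    std::vector<Rect> objects;
};

struct CullResult {
    std::size_t visible = 0;  // objects that overlap the view
    std::size_t tests = 0;    // overlap tests performed
};

// Test each cell first; only descend into cells that overlap the view.
CullResult CullHierarchical(const std::vector<Cell>& cells, const Rect& view) {
    CullResult result;
    for (const Cell& cell : cells) {
        ++result.tests;
        if (!Overlaps(cell.bounds, view))
            continue;  // every object in the cell is rejected by this one test
        for (const Rect& obj : cell.objects) {
            ++result.tests;
            if (Overlaps(obj, view))
                ++result.visible;
        }
    }
    return result;
}
```

The saving comes from cells that lie entirely outside the view: their objects cost one test in total instead of one test each. Moving objects have to be re-bucketed when they cross cell borders, and that cost is not modeled in this sketch.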
vladislavbelov Posted June 7, 2018

59 minutes ago, aeonios said: Yes, but if the renderer is running much faster than sim, it will sit there and spam enumerateObjects at sim.

The renderer shouldn't do this. It should wait if it has already finished its work.

1 hour ago, aeonios said: Synchronizing that to eliminate the wasted requests would be a serious pain.

No, it wouldn't. We only need a simple semaphore, because we need the two to run at a 1:1 ratio.

1 hour ago, aeonios said: I suspect that on large maps the cost of visibility testing becomes very high, so a hierarchical system could improve performance by a lot.

I'm not sure that visibility-checking improvements alone will save a lot of performance, but testing will show. Also, the hierarchical system has to be quite dynamic without losing much asymptotic performance.

aeonios Posted June 7, 2018

19 minutes ago, vladislavbelov said: The renderer shouldn't do this. It should wait if it has already finished its work. No, it wouldn't. We only need a simple semaphore, because we need the two to run at a 1:1 ratio.

If the game at normal speed updates at 30fps and the renderer is trying to draw at 60+fps for smooth camera scrolling, then you won't be doing anybody any favors by capping the renderer at 30fps. What we would need is a way for the renderer to know that it doesn't need to update animations or object positions if sim hasn't advanced, and then you might get less-than-smooth animation if the renderer draws right before the sim is about to advance. I don't know how that's usually handled.

I don't think a semaphore would work that well either. The renderer has to make a synchronized call to enumerateObjects on the one hand, and then sim has to make a lot of synchronized calls to the renderer to submit each object. Compared to simply reducing the CPU overhead of rendering, I don't know if it'd be worth it given the amount of synchronization overhead that would be required. At the very least it'd require a synchronized queue on both ends.

21 minutes ago, vladislavbelov said: I'm not sure that visibility-checking improvements alone will save a lot of performance, but testing will show. Also, the hierarchical system has to be quite dynamic without losing much asymptotic performance.

I'm not so sure about that either, but I do know that we're using expensive bounding box tests rather than sphere tests, and it might have to check thousands or even tens of thousands of objects every frame. At the moment I don't have any concrete numbers, because there's no way to collect statistics: visibility tests are done in simulation, and sim doesn't have direct access to the renderer. That's another thing that centralization would allow.

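On the open question of how a renderer running faster than the sim is usually handled: one widely used general technique is a fixed simulation step with render-time interpolation between the previous and current sim states. The sketch below shows that technique under stated assumptions. `FrameClock`, `Position`, and `Interpolate` are hypothetical names, and this is not a description of what the engine does.

```cpp
// Hypothetical 2D position; a real engine would interpolate full transforms.
struct Position {
    float x, z;
};

// Blend between the previous and current sim states; alpha is in [0, 1).
inline Position Interpolate(const Position& prev, const Position& curr, float alpha) {
    return {prev.x + (curr.x - prev.x) * alpha,
            prev.z + (curr.z - prev.z) * alpha};
}

// Fixed-step accumulator: the sim advances in whole steps, the renderer runs freely.
class FrameClock {
public:
    explicit FrameClock(float simStepSeconds) : m_SimStep(simStepSeconds) {}

    // Adds the elapsed frame time and returns how many sim turns to run now.
    int Advance(float frameSeconds) {
        int turns = 0;
        m_Accumulator += frameSeconds;
        while (m_Accumulator >= m_SimStep) {
            m_Accumulator -= m_SimStep;
            ++turns;
        }
        return turns;
    }

    // Fraction of the way from the last sim turn to the next one.
    float Alpha() const { return m_Accumulator / m_SimStep; }

private:
    float m_SimStep;
    float m_Accumulator = 0.0f;
};
```

Each rendered frame calls `Advance`, runs that many sim turns (often zero), then draws every object at `Interpolate(prev, curr, clock.Alpha())`. The camera can move every frame, while object state is only recomputed when a turn actually runs.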
vladislavbelov Posted June 7, 2018

2 hours ago, aeonios said: If the game at normal speed updates at 30fps and the renderer is trying to draw at 60+fps for smooth camera scrolling, then you won't be doing anybody any favors by capping the renderer at 30fps. What we would need is a way for the renderer to know that it doesn't need to update animations or object positions if sim hasn't advanced, and then you might get less-than-smooth animation if the renderer draws right before the sim is about to advance. I don't know how that's usually handled.

You don't need 60fps if nothing has changed. And I'm not suggesting a fully separate thread, only a simple worker. That requires much less work from the current state.

2 hours ago, aeonios said: I don't think a semaphore would work that well either. The renderer has to make a synchronized call to enumerateObjects on the one hand, and then sim has to make a lot of synchronized calls to the renderer to submit each object. Compared to simply reducing the CPU overhead of rendering, I don't know if it'd be worth it given the amount of synchronization overhead that would be required. At the very least it'd require a synchronized queue on both ends.

You need only 2 synchronizations in my variant, because after the submitting step you have a set of buckets that is independent of the sim state.

2 hours ago, aeonios said: I'm not so sure about that either, but I do know that we're using expensive bounding box tests rather than sphere tests, and it might have to check thousands or even tens of thousands of objects every frame.

Why do you call AABB tests expensive? An AABB test is usually faster than a sphere test; an ordinary (non-axis-aligned) BB is slower, but not significantly (btw, it'd be good to compare).

2 hours ago, aeonios said: At the moment I don't have any concrete numbers, because there's no way to collect statistics: visibility tests are done in simulation, and sim doesn't have direct access to the renderer. That's another thing that centralization would allow.

You have profiler.txt for that.

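A hedged sketch of the "simple worker with two synchronizations" idea. `RenderWorker` and its members are hypothetical, `int` values stand in for draw commands, and a `std::condition_variable` is swapped in for the semaphore mentioned in the thread. The two sync points are handing over a finished bucket and waiting for the worker to become idle.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical render worker; names are illustrative, not engine APIs.
class RenderWorker {
public:
    RenderWorker() : m_Thread([this] { Run(); }) {}

    ~RenderWorker() {
        {
            std::lock_guard<std::mutex> lock(m_Mutex);
            m_Quit = true;
        }
        m_Cond.notify_all();
        m_Thread.join();
    }

    // Sync point 1: hand over a bucket of draw commands.
    // Blocks while the previous bucket is still being drawn, which keeps
    // submission and drawing at a 1:1 ratio.
    void Submit(std::vector<int> bucket) {
        std::unique_lock<std::mutex> lock(m_Mutex);
        m_Cond.wait(lock, [this] { return !m_HasWork; });
        m_Bucket = std::move(bucket);
        m_HasWork = true;
        m_Cond.notify_all();
    }

    // Sync point 2: wait until the worker is idle; returns total commands drawn.
    std::size_t WaitIdle() {
        std::unique_lock<std::mutex> lock(m_Mutex);
        m_Cond.wait(lock, [this] { return !m_HasWork; });
        return m_Drawn;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(m_Mutex);
        for (;;) {
            m_Cond.wait(lock, [this] { return m_HasWork || m_Quit; });
            if (!m_HasWork)
                return;  // asked to quit and nothing is pending
            std::vector<int> bucket = std::move(m_Bucket);
            lock.unlock();
            // Drawing would happen here, without the lock and without sim access.
            const std::size_t drawn = bucket.size();
            lock.lock();
            m_Drawn += drawn;
            m_HasWork = false;
            m_Cond.notify_all();
        }
    }

    std::mutex m_Mutex;
    std::condition_variable m_Cond;
    std::vector<int> m_Bucket;
    std::size_t m_Drawn = 0;
    bool m_HasWork = false;
    bool m_Quit = false;
    std::thread m_Thread;  // declared last so the state above exists before Run() starts
};
```

The main thread can start simulating the next turn as soon as `Submit` returns, since the bucket no longer refers to sim state. This sketch does not address the camera-only frames raised earlier in the thread, where the view changes but the sim has not advanced.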
aeonios Posted June 7, 2018

3 hours ago, vladislavbelov said: You don't need 60fps if nothing has changed. And I'm not suggesting a fully separate thread, only a simple worker. That requires much less work from the current state.

If the camera moved then something changed, i.e. the camera view.

3 hours ago, vladislavbelov said: Why do you call AABB tests expensive? An AABB test is usually faster than a sphere test; an ordinary (non-axis-aligned) BB is slower, but not significantly (btw, it'd be good to compare).

In what universe? A sphere requires only one signed distance calculation per frustum plane. An AABB requires much more complicated math, again per frustum plane.

3 hours ago, vladislavbelov said: You have profiler.txt for that.

Profiler.txt can't tell you anything that it doesn't know. I don't think any info is recorded for objects that are culled by visibility testing. The renderer only knows about objects that are actually submitted.

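For reference, here is what the two per-plane tests under discussion can look like in their textbook forms. This is a sketch under assumptions: planes have unit normals pointing into the frustum, the AABB is stored as center plus half-extents, and all names (`Vec3`, `Plane`, `SphereOutside`, `AabbOutside`) are hypothetical. The engine's actual tests may be written differently.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Plane n.p + d = 0, with unit normal n pointing toward the inside of the frustum.
struct Plane {
    Vec3 n;
    float d;
};

inline float Dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

inline float SignedDistance(const Plane& plane, const Vec3& point) {
    return Dot(plane.n, point) + plane.d;
}

// Sphere: fully outside if its center lies more than one radius behind the plane.
inline bool SphereOutside(const Plane& plane, const Vec3& center, float radius) {
    return SignedDistance(plane, center) < -radius;
}

// AABB: project the half-extents onto the plane normal to get the box's reach
// along that normal, then compare it with the center's signed distance.
inline bool AabbOutside(const Plane& plane, const Vec3& center, const Vec3& halfExtents) {
    const float reach = halfExtents.x * std::fabs(plane.n.x) +
                        halfExtents.y * std::fabs(plane.n.y) +
                        halfExtents.z * std::fabs(plane.n.z);
    return SignedDistance(plane, center) < -reach;
}
```

In this form the sphere test costs one dot product per plane, and the AABB test costs that plus three multiplies and three absolute values. Which one is faster in practice depends on data layout and how the bounds are stored, so measuring is the only reliable way to settle it.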
vladislavbelov Posted June 8, 2018

1 hour ago, aeonios said: If the camera moved then something changed, i.e. the camera view.

"camera moved" != "nothing was changed".

1 hour ago, aeonios said: In what universe?

Obviously. And an AABB-vs-AABB test is usually faster than a sphere-vs-sphere test. As for frustum planes, the math for an AABB is pretty simple: you need only 2 dots. Something is wrong if you need complicated math for it.

1 hour ago, aeonios said: Profiler.txt can't tell you anything that it doesn't know. I don't think any info is recorded for objects that are culled by visibility testing. The renderer only knows about objects that are actually submitted.

Feel free to add new values to the profiler: https://trac.wildfiregames.com/wiki/EngineProfiling.

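A sketch of the kind of counter that would make culled objects measurable. It counts at the point where the visibility test runs, since that is the only place culled objects are seen. `CullStats` and `CullAndSubmit` are hypothetical names, and how such values would be fed into the engine's profiler is left open here; see the EngineProfiling page linked above for the real mechanism.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-frame culling counters; not part of the engine's profiler.
struct CullStats {
    std::size_t tested = 0;  // objects that went through the visibility test
    std::size_t culled = 0;  // objects rejected by the test
    std::size_t Submitted() const { return tested - culled; }
};

// Runs the visibility test on every object, submits the survivors, and
// records how many objects were tested and how many were culled.
template <typename Object, typename IsVisible, typename Submit>
void CullAndSubmit(const std::vector<Object>& objects,
                   IsVisible isVisible,
                   Submit submit,
                   CullStats& stats) {
    for (const Object& obj : objects) {
        ++stats.tested;
        if (isVisible(obj))
            submit(obj);
        else
            ++stats.culled;
    }
}
```

Resetting a `CullStats` at the start of each frame and logging it at the end would give the tested/culled ratio that the thread notes is currently unavailable.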