View Issue Details

ID: 0024120
Project: AI War 2
Category: GUI
Last Update: Nov 17, 2020 11:01 pm
Reporter: Daniexpert
Assigned To: Chris_McElligottPark
Severity: minor
Status: resolved
Resolution: fixed
Product Version: 2.631 Multiplayer Swaps And Performance
Fixed in Version: 2.633 Roaring Performance
Summary: 0024120: New UI changes reduced my performance
Description: I'll start by saying that I really like the new UI stuff; it makes the game more enjoyable and eye-catching. The main menu is really nice!
But, unfortunately for me, performance, measured in FPS, dropped by 20-30% in game, both when paused and unpaused, while in the menu I see it go as low as 20FPS, not going above 26, due to the new animations. I always kept an FPS cap of 75, reaching it often before the UI changes, especially with the game paused, but now I barely reach 60FPS when paused. I'm currently on a laptop with not so great hardware, but not bad either (i7-8565U, 8GB RAM, MX130 as GPU, game installed on the SSD, Win10). No other changes with drivers or other software on the computer.
So, I'm asking if there are any ways to decrease the image quality (texture, filters, etc.) in order to improve performance, without changing the output resolution of the game? Ship stacking is on the default values and I already disabled the rendering of ships and some animations, which is unfortunate because I really like having both on.
Tags: No tags attached.

Activities

Chris_McElligottPark

Nov 16, 2020 10:38 am

administrator   ~0059610

Hi there!

For the main menu, this is pretty much expected on older hardware, and I have a variety of things in the environment that slow down a ton or turn off based on what your FPS is. Arguably I should make the reflection probe, which is the slowest thing, just stop updating entirely if your FPS is still below 30 while it's updating. Potentially I should also disable the point lights on the ships if FPS is too low.

When it comes to the in-game performance, though, that's something that surprises me a great deal. There are only a few windows that are visible during most of gameplay, and even when you open a sub-window, generally only one of those is shown at a time.

So, this catches me off guard in a couple of ways. I'm not entirely sure why this is happening for you, and I may need to make some options. First of all, a few observations and a request:

1. Historically, the most CPU-draining part of the GUI rendering has been the text itself, plus whatever calculations are being done to show where to draw things. So if that has shifted on your machine (and thus also some other subset of machines) and now the actual draw is the bottleneck, then this is new. Can you do me a favor before I go barking up this new tree and give a check on the new beta version v2618_last_before_new_ui and see how your performance compares in the non-main-menu parts of the game? If it really is a 20% to 30% drop in the same situation, then that is very surprising and something I need to look into. But before we come to that conclusion, I'd like to be sure we're comparing apples to apples.

2. One thing that is hitting the CPU a fair bit, but should not be THAT much, is the new ability for buttons to glow. That's a full-screen processing effect that I may just need to make disable-able.

3. The shaders for those sprites are also probably a bit more expensive now than before, but not a huge amount and most of that is composited anyhow. Which, on the subject of this, that is one of the things that is making me a little suspicious here -- if the UI is causing lag, and it isn't the overall added bloom effect for icons or just the decisions about UI stuff in general, then this does make me wonder how much extra is happening in the compositing stages of the UI in general. There are a lot of things in the UI that do not change at all from frame to frame if you are in a 70FPS situation, and ideally it would be running those updates less frequently in general and thus also not incurring so many extra draw calls or shader expense or whatever is the particular case here. In a truly ideal world, I'd stagger these various main-screen windows so that they are updated in rotation rather than all of them updating every frame. To be clear: none of us can perceive textual changes that happen on a subsecond interval, and having our text change 70 times per second is not a good thing if that were the case. Nothing about the UI is directly animated where it needs changes that fast. So this may be a general revelation that I need to be doing better time-slicing of UI panel updates in general, for everyone's sake.

4. The last thing is texture resolution, for sure. I can go back over my textures and make sure I didn't include any uncompressed in a fit of perfectionism, but generally speaking all of those large textures in the background are compressed with DXT1 compression (to my knowledge, but I need to check), and that's the oldest and most-compatible compression format and leads to most of those images being only 1-2 MB. That shouldn't be putting any substantial strain on the GPU pipeline even if compositing is happening too frequently. That said, sometimes these are drawn very small despite being originally very big, and I should check to make sure that all of these have proper mipmaps. I just went in and took OFF all the mipmaps on the icons themselves, but that's not an issue for background textures.

5. So actually this is the last thing, and it's related to those textures still. If the GPU pipeline isn't the problem and compression is fine, then it's got to be an overdraw issue. The very basic background is no less efficient than it used to be: the resolution is the same, the details in the texture are greater, but the GPU doesn't care. It's literally identical performance despite visually looking better. BUT where that changes is with the large background images which are placed over top of them. Those are very frequently drawn with a huge amount of transparency, and so overdraw happens on all of those pixels. Depending on what screen resolution you are running at, that could be a substantial thing.

6. Ah man, I keep having more thoughts. Quasi related to the last, but I'm actually often drawing those images HUGE and outside of the drawing bounds in order to have them look good. I'm using clipping masks to then get them smaller, and I had assumed that pixels that are behind a clipping mask don't get drawn... but suddenly I have this sort of sick feeling and wonder. I usually know better than to trust someone else's code like that (in this case the unity clipping code). It's entirely possible that all the pixels are being drawn, but just being discarded if they are outside of the clipping mask bounds. I will have to look into that and potentially create my own new... something. Either new version of the image class, or a new shader that I can use on there that lets me adjust the uv offsets rather than raw image size. To do that, I'm going to need to look at the code for their existing UI elements, but thankfully that's all open-source.
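
To put rough numbers on items 4 and 5 above, here is a minimal back-of-the-envelope sketch in Python (not the game's Unity code). It relies on the fact that DXT1 (BC1) stores each 4x4 pixel block in 8 bytes. The 2048x2048 texture size, the 1920x1080 resolution, and the layer count are illustrative assumptions, not values taken from the game.

```python
def dxt1_bytes(width, height, with_mipmaps=False):
    """Size in bytes of a DXT1 (BC1) texture: 8 bytes per 4x4 pixel block."""
    total = 0
    w, h = width, height
    while True:
        blocks_w = max(1, (w + 3) // 4)
        blocks_h = max(1, (h + 3) // 4)
        total += blocks_w * blocks_h * 8
        # Stop after the base level, or once the 1x1 mip has been counted.
        if not with_mipmaps or (w == 1 and h == 1):
            break
        w, h = max(1, w // 2), max(1, h // 2)
    return total


def shaded_pixels(screen_w, screen_h, full_screen_layers):
    """Pixels shaded per frame if every layer covers the whole screen."""
    return screen_w * screen_h * full_screen_layers


# Hypothetical 2048x2048 background: 2 MiB without mipmaps.
base = dxt1_bytes(2048, 2048)
# The full mip chain adds roughly one third on top of the base level.
with_mips = dxt1_bytes(2048, 2048, with_mipmaps=True)
# Hypothetical case: opaque background plus two transparent full-screen overlays.
overdrawn = shaded_pixels(1920, 1080, 3)
```

The texture memory stays small either way, which is consistent with the suspicion above that fill rate from stacked transparent layers, rather than texture size, is the more likely cost.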

Your graphics card doesn't strike me as one that is particularly weak, and your CPU and such is just fine. So unless there's a problem I don't see at a glance with your GPU bus being really more narrow than expected, I really keep coming back to pixel fill rate. In writing to you what the problems might be, I kept stumbling across increasing numbers of ways I might be hitting the fill rate in negative fashions, so I think I have some useful things to look into regardless of your response on performance loss between that new beta branch and the current live build. Either way, there are some tweaks to be made on my end. But I would definitely appreciate it if you're able to get me apples to apples info (same savegame and scene and resolution, report of fps from the v2618_last_before_new_ui and the latest build to see differences).

Thanks for bringing this up!

Daniexpert

Nov 16, 2020 11:32 am

manager   ~0059612

Hi!

Regarding the menu animation, is it possible to just have an on/off switch so everyone can disable if it is needed?

I'll gladly test it, because I'd prefer this being only an issue on my machine, instead of a bigger problem. I'll try to run some benchmarks to have some real data then.
I'll also try to run the game forcing the integrated GPU, therefore leaving the MX130 asleep, to see how much weight is removed from the CPU with a dedicated GPU. It could be helpful, who knows.

Glad to be able to help :)

Chris_McElligottPark

Nov 17, 2020 3:39 pm

administrator   ~0059620

So, coming in version 2.633, this should be interesting to get feedback on:

* On the main menu scene, improved the culling mask on the scene-view camera to greatly improve efficiency of that scene.
** It looks like the main menu may have been accidentally drawing 1.8 million tris rather than 800k tris because of this being set wrong.

* The reflection probe on the main menu scene has also been updated to have an appropriate culling mask, for the same reason.
** The reflection probe updates, which are quite heavy and frequent, should also thus be correspondingly faster and draw so many fewer triangles as well.

* Poly few has been employed on the main menu scene to combine all of those meshes of the hangar into just a single mesh with 16 submeshes for the various materials.
** This cuts the number of draw calls on the main menu down from about 3000 to about 250. The visual end result is identical. The performance gain is potentially massive, but varies heavily by hardware.

* We have historically had static and dynamic batching disabled for this game, because we use GPU instancing instead (which is far more efficient and direct).
** However, when we made the new main menu, we had implemented things such that this type of batching would be useful there, so we turned it on.
** We have now changed things around again to remove that, and so once again removed those from being on in the application as a whole.
** It's quite possible that these were dragging down performance on some machines in general, as the game may have been spending some CPU cycles fruitlessly looking for things to dynamically batch during the main game itself.
** It's irrelevant to the end result of how things look, but there's no chance of that popping in and impacting performance negatively anymore, which is good. If it wasn't a performance impact, then no worries there, either.

* Using Blender, we've manually removed some off-screen sections of the main menu meshes. This has overall reduced our polygon count in the game on the main menu by another 300k or so triangles.
** This sort of hand-optimization is something that we had been saving until it was clear this is where the bottleneck was, and after it was clear that the new main menu was a winner (and that we had time aside for it).

* With these changes, on Chris's main two computers he sees:
** On the main menu on his main dev machine (GTX 1070 and a few year old i7 laptop) a jump from about 55-60 fps to instead being about 100fps.
** On the main menu on his MacBook Pro from late 2013 which has an i7 but does not meet the minimum system requirements in general, it jumps from 26fps to... 26 fps. So there's a different limiting factor other than polygon count or draw calls on hardware this ancient.
** Most likely, any machines that are actually meeting the minimum system requirements, or vaguely approaching the recommended environment, should see a substantial performance bump on the main menu. And for everyone, the disabling of the static and dynamic batching may improve performance beyond the main menu.

Chris_McElligottPark

Nov 17, 2020 4:21 pm

administrator   ~0059621

* In our main menu scene, the way that the reflection probe is updated has been changed fairly substantially.
** Previously it was every-frame every-face if you had at least 30fps, and every-frame individual-face if you had at least 15fps, and below that would not update over time.
** The individual-face updates were really jarring, however, and not something that is a good idea for any sort of smooth feeling.
** Now only if you have at least 50 fps will it do every-frame every-face updates, and below that it will just not update over time, instead only having the reflection from the initial onawake event.
** On Chris's main machine this makes no difference since it runs at 100fps now, but on the under-min-specs OSX machine this brings performance up to 31fps from the previous 26fps.

* On the main menu, a number of lights were set to affect more than just the Scenes layer. This probably did not affect performance, but we are correcting that anyhow.

* On the main menu, we had one extra spot light that was drawn in a not-important weighting, and that was very dramatic and looked good in general BEFORE we started having ships with lights on them moving around.
** Since having ships moving around, that spot light would disable itself as the spotlights overtook it, then re-enable itself, and the transitions were jarring. It did not seem to affect performance much on the high-end or ultra-low-end machines, but in the middle tier it might.
** This spot light is simply removed, as it was not needed for the new scene composition.
** We experimented with turning off the point lights used on the ships, or even with turning off the reflection probe from being on at all, but the former gave 2fps on the super-old mac (from 31 to 33 fps), and the latter gave no boost at all.
** Whatever is holding back the ancient below-specs mac is really not the sort of thing that is holding back the rest of the potential computing audience. And this is one excellent reason why we have system requirements in the first place. Not that 30fps is a cardinal sin; the original AI War was hard-locked to 20fps most of the time.
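
As a rough illustration of the reflection probe thresholds described in this note, the old and new policies can be written as two small functions. This is a Python sketch rather than the game's Unity code, and the function names and mode strings are hypothetical; only the FPS thresholds (30 and 15 before, 50 now) come from the text above.

```python
def probe_mode_old(fps):
    """Policy before this change: degrade in two steps as FPS drops."""
    if fps >= 30:
        return "all_faces_every_frame"
    if fps >= 15:
        return "one_face_per_frame"
    return "no_updates"


def probe_mode_new(fps):
    """Policy after this change: full updates at 50+ FPS, else keep the initial reflection."""
    if fps >= 50:
        return "all_faces_every_frame"
    return "initial_only"


# The under-min-specs machine at 26 fps used the jarring per-face mode before,
# and now skips ongoing probe updates entirely.
old_at_26 = probe_mode_old(26)
new_at_26 = probe_mode_new(26)
```

Dropping the intermediate per-face mode removes the visual jarring and also frees the slowest machines from paying for probe updates at all.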

Daniexpert

Nov 17, 2020 4:56 pm

manager   ~0059622

Hi again!

So I did run some benchmarks. Good news: it's not 20-30% worse, it was probably a bad day for my computer when I noticed this huge drop. I'm too tired to explain it all in detail, so I'll just post the screenshot of the results I calculated with a quick explanation. The percentages you see show that the new version has lower performance than the one before the UI rework, especially in the menu (expected).

Generally speaking, I feel that in the old build, when compared to the new one, the planet view is smoother, but when going back to the galaxy view, the fps drops are heavier, in the order of 20-30 fps.

(Forgot to add, FPS cap is set at 75).
If something is not clear, ask and I'll explain.

Daniexpert

Nov 17, 2020 4:59 pm

manager   ~0059623

I didn't refresh the page before posting. I'll add the benchmark of 2.633 when I have time. I'm curious to see if something changes.

Chris_McElligottPark

Nov 17, 2020 5:05 pm

administrator   ~0059624

That is good news! That does mean that there's a lot less work for me to do. I did wind up looking into some UI compositing updates in general, mainly because I was curious, but I'm not sure if this is something that will be needed on any machines.

For the record, I am also seeing surprisingly low performance on the galaxy map, which is a big surprise to me. I'm not sure what is happening there at the moment.

Chris_McElligottPark

Nov 17, 2020 5:22 pm

administrator   ~0059625

I don't think these are very important, but nonetheless:

* Added a new Performance tab option: Unrestricted UI Update Speeds
** Normally, most UI windows only update their contents every 50-100 milliseconds. If your framerate is much higher than this, however, you may prefer that the UI update at whatever your actual framerate is.
** This will likely reduce your framerate, potentially substantially, but it leads to the ultimate in responsiveness. Prior to version 2.633, and since sometime in the game's alpha, the UIs were all running on unrestricted update speeds.
** We are not noticing any substantial benefit from this on our powerful machines, but on lower-end and middle-tier machines this may make more of a difference.
** At the moment, things seem to perform equally well either way, but it's nice to put a lesser load on things where we can. Since this does not seem to make a visual/feel difference that we can detect at the moment, this seems fine to have with a differing default from the past.

* In the ArcenUI_Element class, we have a SetActiveIfNeeded() method that long ago had some gating that was based on a cached wasLastActive in the class.
** This was working poorly, back in alpha or beta of the game, because of how Unity handles commands to enable objects that are disabled in the hierarchy, and things like that.
** The game has now been updated to do a check against the activeInHierarchy property of the gameobject, which will always give the real result. This should not result in bugs, and should in theory result in some slightly better performance in certain cases where large numbers of ui elements are turning on or off frequently.
** We don't really see much of a difference based on this, but in general this was something we noticed that was an optimization we had wanted a long time ago, and being able to have a tamer version of that back in here now is nice.
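
The throttled UI update described in this note can be sketched as follows. This is a minimal Python illustration, not the game's actual implementation: the class name, the `on_frame` method, and the use of integer millisecond timestamps are all assumptions made for the example, and 50 ms is simply the low end of the 50-100 ms range mentioned above.

```python
class ThrottledWindow:
    """Refreshes its contents at most once per interval unless unrestricted."""

    def __init__(self, min_interval_ms=50, unrestricted=False):
        self.min_interval_ms = min_interval_ms
        self.unrestricted = unrestricted
        self._last_update_ms = None
        self.update_count = 0

    def on_frame(self, now_ms):
        """Called every rendered frame; returns True if contents were refreshed."""
        too_soon = (
            self._last_update_ms is not None
            and now_ms - self._last_update_ms < self.min_interval_ms
        )
        if too_soon and not self.unrestricted:
            return False
        self._last_update_ms = now_ms
        self.update_count += 1
        return True


# One simulated second at 100 fps (one frame every 10 ms).
throttled = ThrottledWindow(min_interval_ms=50)
unrestricted = ThrottledWindow(unrestricted=True)
for frame in range(100):
    throttled.on_frame(frame * 10)
    unrestricted.on_frame(frame * 10)
```

In this simulation the throttled window refreshes on one frame in five, while the unrestricted one refreshes on every frame, which is the trade-off the new option exposes.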

Daniexpert

Nov 17, 2020 6:29 pm

manager   ~0059627

Attaching a corrected version of the spreadsheet with the benchmark differences I originally posted in 0007721:0024120:0059622
Hopefully this time everything is correct.

Chris_McElligottPark

Nov 17, 2020 9:04 pm

administrator   ~0059629

Sweet, thanks for that! Also coming next build:

* Over the last few months, as we've added functionality, the performance of the galaxy map has dropped notably.
** In combination with a much-more-recent performance drop related to how we draw sprites-in-text and how that affects the galaxy map only, full galaxy maps were down in the 25fps range and really choppy to move around, today.
** We've now restructured a lot of things to update in a time-sliced fashion, and the performance is now in the range of 90fps when zoomed all the way in, and 60fps when zoomed all the way out on a full map.
** There are still some performance improvements we need to pursue related to sprites-in-text in this specific instance, but those will be in the next build.
*** We did experiment around with trying some things like adjusting the sprite-in-text shader to allow for GPU instancing, but that went absolutely bonkers in a way that we don't care to untangle. There's a better approach that we'll implement soon.
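
One common way to update things "in a time-sliced fashion" is round-robin slicing, where each frame refreshes only a fraction of the items. The sketch below is in Python and is an assumption about the general technique, not a description of how the galaxy map code is actually structured; the function name and the slice count of 4 are hypothetical.

```python
def items_for_frame(items, frame_index, slices):
    """Return the subset of items to refresh this frame.

    Item i is refreshed on frames where frame_index % slices == i % slices,
    so every item is refreshed exactly once per `slices` consecutive frames.
    """
    turn = frame_index % slices
    return [item for i, item in enumerate(items) if i % slices == turn]


# Ten hypothetical planet labels spread over four frames.
planets = list(range(10))
schedule = [items_for_frame(planets, frame, 4) for frame in range(4)]
```

Each frame then does roughly a quarter of the work, at the cost of any single label being up to four frames stale.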

Chris_McElligottPark

Nov 17, 2020 11:01 pm

administrator   ~0059630

Okay, this is definitely done. It's possible there might be some things I could do with the image backgrounds in the UI that might be slightly more efficient, but I think that they are actually being handled fine and that was the only open question. After your comments on further updates, the changes in performance were not enough to warrant that. But there were some notable performance improvements available elsewhere, and I've now hit them all:

* A whole messload of the new background images and other accents that are used in the new UI have been made vastly more efficient.
** This may actually vary by OS just how much more efficient they are, but in essence these are all now able to be stored in DXT1 format, and all of the ones where relevant now have mipmaps for more efficient drawing at smaller resolutions.
** The amount of VRAM that this should save, and the extra load removed from the GPU pipeline, should be substantial.
** Thanks to Daniexpert for getting us to look into this.

* Discarded these changes: Rather than using raw "TextMeshPro" text renderers for the text that is shown all around the galaxy map, we are now using individual Unity GUI Canvases with embedded TextMeshProUGUI objects.
** Visually this looks identical, but now we are able to take advantage of the compositing stages that unity canvases go through, and thus we can have things like strength icons be embedded directly in these canvases without them causing extra draw calls.
** At the moment we have one canvas per planet, with three text sections inside of that. This is less efficient per update of text, but more efficient for drawing text, which is the more common operation.
** None of these respond to mouse raycasts at all, so on the off chance that the occasional (could not click a planet on the galaxy map) was related to these, that is no longer possible.

* Replacement changes for the above: In the end we went back to raw TextMeshPro text renderers, as their performance was superior to anything we tried with an abundance of canvases.
** We did wind up also making it so that the shaders for TextMesh Pro sprites now use the Geometry queue instead of the Transparent Queue, which improves performance on rendering and also allows for batching.
** And for the various ship icons in both the main view and the galaxy view, those also now use the Geometry queue. Those should generally get picked up by GPU instancing, but in the event they do not they will now get picked up by dynamic batching instead.
** We actually have re-enabled dynamic batching for the game, but still left static batching off, and this seems to give the optimal performance when that's paired with these shader changes for the sprites.
** Sprites used to always do perfect instancing, but now the sort order sometimes messes that up since there are multiple materials and it feels like it needs to handle them in proper order (really, z buffer ought to be sufficient and overdraw is probably preferred, but anyhow). The queue change makes these more likely to instance, and in the event they don't instance it makes them batch, thus leaning on the z buffer as noted.
** The end performance boost on the top machine we have is now getting us back into the 90s fps on the galaxy map, up from the high 20s in the prior build, and still in the 90s on the main planet view. And both feel smooth rather than jittery, now, which is good.

* On the galaxy map, we are now properly buffering text such that we don't put back in the same value into a field that just had that value.
** This was causing some needless thrashing and re-parsing of rich text tags.
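
The text buffering in the last item amounts to comparing the new value with the cached one before assigning it. Here is a minimal Python sketch of that idea; the class, its `parse_count` counter, and the sample strings are hypothetical stand-ins, with the counter representing the rich-text re-parse that an assignment would trigger.

```python
class BufferedTextField:
    """Skips assignment when the incoming value matches the current one."""

    def __init__(self):
        self.text = None
        self.parse_count = 0  # stand-in for the cost of re-parsing rich text

    def set_text(self, value):
        """Assign only on change; returns True if the field was actually updated."""
        if value == self.text:
            return False
        self.text = value
        self.parse_count += 1
        return True


# The same value written every frame is parsed once, not once per frame.
field = BufferedTextField()
for _ in range(60):
    field.set_text("<b>Strength</b> 120")
field.set_text("<b>Strength</b> 135")
```

The string comparison is cheap next to re-parsing tags and rebuilding the text mesh, so the check pays off whenever values repeat across frames.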

Fingers crossed it does as well for you! :)

Issue History

Date Modified Username Field Change
Nov 16, 2020 6:01 am Daniexpert New Issue
Nov 16, 2020 10:38 am Chris_McElligottPark Note Added: 0059610
Nov 16, 2020 10:38 am Chris_McElligottPark Assigned To => Chris_McElligottPark
Nov 16, 2020 10:38 am Chris_McElligottPark Status new => assigned
Nov 16, 2020 11:32 am Daniexpert Note Added: 0059612
Nov 17, 2020 3:39 pm Chris_McElligottPark Note Added: 0059620
Nov 17, 2020 4:21 pm Chris_McElligottPark Note Added: 0059621
Nov 17, 2020 4:56 pm Daniexpert File Added: 2020-11-17_22-52-13-AIWar2_overall.xlsx_-_Excel.png
Nov 17, 2020 4:56 pm Daniexpert Note Added: 0059622
Nov 17, 2020 4:59 pm Daniexpert Note Added: 0059623
Nov 17, 2020 5:05 pm Chris_McElligottPark Note Added: 0059624
Nov 17, 2020 5:22 pm Chris_McElligottPark Note Added: 0059625
Nov 17, 2020 6:12 pm Daniexpert File Deleted: 2020-11-17_22-52-13-AIWar2_overall.xlsx_-_Excel.png
Nov 17, 2020 6:29 pm Daniexpert File Added: 2020-11-18_00-29-30-AIWar2_overall.xlsx_-_Excel.png
Nov 17, 2020 6:29 pm Daniexpert Note Added: 0059627
Nov 17, 2020 9:04 pm Chris_McElligottPark Note Added: 0059629
Nov 17, 2020 11:01 pm Chris_McElligottPark Status assigned => resolved
Nov 17, 2020 11:01 pm Chris_McElligottPark Resolution open => fixed
Nov 17, 2020 11:01 pm Chris_McElligottPark Fixed in Version => 2.633 Roaring Performance
Nov 17, 2020 11:01 pm Chris_McElligottPark Note Added: 0059630