#1923 closed defect (wontfix)
[PATCH] very performance basic non behaviour changing optimisations (in hotspot only)
Reported by: | tuan kuranes | Owned by: | tuan kuranes |
---|---|---|---|
Priority: | Should Have | Milestone: | |
Component: | Core engine | Keywords: | patch |
Cc: | Patch: |
Description
Done only on very hot CPU intensive code, using very sleepy profiler hotspot tracing:
what the patch does
- Debranch function
- const variables methods and parameters wherever possible.
The most and foremost one being getting rid of all new/free/malloc/delete during runtime as that's your main performance killer (all memory should be pre-allocated and stored in memory pools, period).
Attachments (3)
Change History (50)
comment:1 by , 11 years ago
comment:2 by , 11 years ago
- Const is foremost a compile-time safety and code safety, useful for code reading and code analysis (tools like cppcheck). Still, it provides compiler are hint, and nowadays compilers are moving faster, and it might still do good deeds.
- Meanwhile const methods and const methods parameters are optimised by compiler, const references can prevent aliasing ( the load and store hit. Note that using "restrict" make it more even so, but people already have hard time with my const-correcness...) The most useful is const vector3D &myvector which allows compiler to pass the pointer only (size_t copy in register), not doing a copy of the whole vector3D object. (rvo doesn't really work afaik)
I get noticeable perf improvements with my patch here, but previous comment was misleading, I added const only in hotspot (top 10 in profiler) I reviewed, but most of its effect are cache wise, and performance also comes from changes to prevent stack allocation. (but all that only in hotspots.)
I added a new version of the patch, with new hotspot. There's lots of other optimisation opportunities, on the same vein, but now, it's more high-level and can be behaviour changing.
(more stack allocation to prevent, like the shaderpogram uniform stack allocation thing using string as ID that could be cached by shaderprogram by name getTransformUniform, getShadowSizeUniform() that would help ease the boosthaslap cpu impact, etc.).
comment:3 by , 11 years ago
The latest patch is a *very huge perf* gain here, having done all the hotspots now. (pathfinding is the most impacted, but all other cpu intensive operations get noticable gains)
The idea of the patch is really "Optimisations as you go" withtout any code behaviour modification, so it's really easy to apply and test.
- http://www.tantalon.com/pete/cppopt/main.htm (most gain with "Pass Class Parameters by Reference")
- cpu cycle saving with computation cache (computing edge distance once and sort, instead of computing distance at each sort comparison in pathfinding)
- Effective C++ scott meyers usual things.
There's still const whenever possible on code I crossed, because it's good practice for clean code and optimisiation opportunities, and in order to be excused, at least, I'm not alone there : http://clipboard.com/clip/LQP6eFU2JSft1ZBHlDQGutltTI1UqvgqxJTe
Now, that let JS code in hotspots for cpu cycle saving. ("search drop build site" is a frame killer here)
The next step are the "hiccups" still there.
I would guess they come from memory, with all those stack allocation, new and delete even show in hotspots. Not only it's slow, but memory fragmention grows over time execution (notice when you save, close and restarts, game is faster, at start). The way there is "memory pooling", which is preallocation "once and for all" from where we get new object and store unused objects. Not only it stops memory fragmentation and is faster than new/delete, but it gives data "localization" in RAM, giving cpu cache usage a boost. ( http://blog.kejser.org/2012/06/14/the-effect-of-cpu-caches-and-memory-access-patterns/ and all data-oriented talks/slides )
(each patch replace the precedent, it's not a cumulative patch)
comment:4 by , 11 years ago
Keywords: | patch review added; performance removed |
---|
As interesting as these patches may be, you've provided no numbers on what performance gains were seen, nor how you measured them, or what compiler you used and what system this was tested on.
Also it's extremely unlikely that we apply and review a 142 KB patch, so if you would like some of these changes added, I suggest breaking it into multiple patches, where each one addresses a particular issue and a few files. Then provide a description of the patch and what it gains.
follow-up: 7 comment:5 by , 11 years ago
Sorry about huge patch.
A link where patch is discussed a bit : http://www.wildfiregames.com/forum/index.php?showtopic=17274&st=60#entry268582
How do you want patch performance gains posted ? is there a wiki page explaining a route to follow ?
It depends on game context, so I don't know how to get meaningful numbers, but note that the patch just removed unecessary cycles and branch mispredict (attested by codeXL) so it can only be gains.
As said in the forum thread, My method was to used real profiler (codeXL) to validate each changes on cpu cycles, branch prediction, and cache trashing, and only kept those which lowered those numbers (but didn't take note or screenshots of numbers).
(Note that It's only the latest file patch, the latest in the file section (didn't find how to remove file previously submitted.)
I intend to give some time to cut patch in three (probably not until I get time, that is , next week-end)
Here's perhaps how I could make three separate patches:
- "optimisation as you go", that cover a lot of files, but not much code changes in the end mostly "method(paramClass)" => "method(const param &Class) const" and other const opportunities whenever possible. (that's the one that make the patch big, because of the huge number of files.)
- Pathfinding (CFvector2D and all path files does cover a lot of files too ).
- Basic branches optimisation (remove if from inside while and call stack, and not inline funcs), and whenever RVO doesn't kick (and cause temp object alloc).
I intend to give some time to cut patch in three (probably not until I get time, that is , next week-end)
comment:6 by , 11 years ago
As pointed on forum, last file patch seems to look rather like a wrong patch, and doesn't apply on linux out of the box. (and seems a wrong file with commented code and fast_blur_test intrinsics, and seemingly the boost::unordered changes test that should not be in.)
comment:7 by , 11 years ago
Replying to kuranes:
How do you want patch performance gains posted ? is there a wiki page explaining a route to follow ?
We probably need to make this a more formal process in the future, but basically what I hinted at: testing method, test system (OS, compiler, hardware, etc.), results, explanation of changes.
Here's perhaps how I could make three separate patches:
- "optimisation as you go", that cover a lot of files, but not much code changes in the end mostly "method(paramClass)" => "method(const param &Class) const" and other const opportunities whenever possible. (that's the one that make the patch big, because of the huge number of files.)
- Pathfinding (CFvector2D and all path files does cover a lot of files too ).
- Basic branches optimisation (remove if from inside while and call stack, and not inline funcs), and whenever RVO doesn't kick (and cause temp object alloc).
Sounds good to me, and if you can clean up any commented out code that was added, that would help too. Since it's a patch, we'll already be able to see and compare the old and new code without it being commented out :)
comment:8 by , 11 years ago
Here's all the patches:
- pathfind.patch: huge patch, but very huge perf boost, mostly cycles saving reusing results, and using more appropriate structures, minimizing allocation and object copies. (20% to 400% perf, depending on path requests)
- size_empty.patch: using empty() instead of size(), because size does iterate and check content size. (0.1 perf...)
- skinning.patch: specific RVO not kicking code for CPU model skinning, avoiding lots of temp objects,and loop unrolling as 4 bones is a static number. (3-7%perf)
- const_only.2.patch: const param& , const param, const method, const simple variable (making compiler avoiding object copies whenever possible) some rvo no kicks in there as well.(5-10%perf)
- minimap.patch: add VBO support for unit drawing when supported, huge perf boost per frame, as opengl driver doesn't have to sync anymore. (20%perf with a good gpu, invisible with slow gpu)
- branch.patch: remove branches whenever possible in loops: here CLOSTexture::GenerateBitmap getting huge boost, removing lots of code and adding cache prefetching benefits with the single memset at start (5%perf)
- perf.patch: all in one patch updated with all changes
Perf is in the main thread only, relative to 100%. Here it makes 0ad very playable on all non naval maps, including huge maps. The real perf stopper now here is a IA economy script "drop sites searching" that kicks in regularly and gives perfs stuttering. Huge pathfind request, (200 units moves at once), still goes up, but gives "reasonable stuttering".
Tested on linux (old cpu), and windows x64 with 32bits exe (good cpu).
comment:10 by , 11 years ago
Trac fails to parse correctly the perf.patch and both const patches, and I fail to apply them (the diff does work though).
It seems to me like you have done substantial changes to the pathfinder and the rangemanager, which will the most need investigating.
Edit: compiled: not sure about performance improvements, particularly as I don't really have any point of comparison, but it doesn't feel slower. Biggest slowdown for me always was the rendering anyway.
comment:11 by , 11 years ago
Added a .tar.gz with patches inside to be trac-proof.
Path and range is the culprit of the patch, as perf is abissmal as soon as you're having "armies" fighting, or many player/bots (even a simple long "punjab1" scenario play againt "aegis" bots becomes unplayable as times advance (even after a close/reload to remove memory fragmentation). I guess perf is barely visible in simple scenario on a good cpu, but should becomes very apparent on big maps/numerous player/bots.
comment:12 by , 11 years ago
I tried applying the patches inside 0ad_perf.tar.gz, they apply fine except for the const one:
~/sorgenti/0ad/trunk/source$ patch -p1 < const_only.patch patching file graphics/UnitAnimation.cpp patching file main.cpp patching file maths/BoundingBoxAligned.cpp patching file maths/BoundingBoxOriented.cpp patching file renderer/ModelRenderer.cpp patching file renderer/OverlayRenderer.cpp patching file renderer/TerrainOverlay.h patching file renderer/TerrainRenderer.cpp patching file renderer/VertexArray.cpp patch: **** malformed patch at line 266: Index: source/simulation2/system/ComponentManager.cpp
Then I tried a short replay of combat demo with:
time ./0ad -replay=/home/fabio/.config/0ad/logs/sim_log/./3557/commands.txt
With standard 0ad from r13404 I get:
[...CUT...] Turn 597 (200)... TIMER| profile2 get buffer: 7.17836 ms TIMER| profile2 visitor: 67.0483 ms TIMER| profile2 query: 74.7369 ms # Final state: af6cec4fa14b5cfba8b22ddee4373686 TIMER TOTALS (9 clients) ----------------------------------------------------- tc_pool_alloc: 34.5203 Mc (1739x) tc_png_decode: 0 c (0x) tc_dds_transform: 0 c (0x) tc_transform: 0 c (0x) tc_plain_transform: 0 c (0x) tc_ShaderValidation: 0 c (0x) tc_ShaderGLSLLink: 0 c (0x) tc_ShaderGLSLCompile: 0 c (0x) xml_validation: 103.31 Mc (41x) ----------------------------------------------------- real 0m6.664s user 0m6.116s sys 0m0.508s
with the patched one:
[...CUT...] Turn 597 (200)... TIMER| profile2 get buffer: 7.66424 ms TIMER| profile2 visitor: 74.6106 ms TIMER| profile2 query: 82.8202 ms # Final state: 56f8dceac7c11f8548376a9cf2e11899 TIMER TOTALS (9 clients) ----------------------------------------------------- tc_pool_alloc: 36.4033 Mc (1739x) tc_png_decode: 0 c (0x) tc_dds_transform: 0 c (0x) tc_transform: 0 c (0x) tc_plain_transform: 0 c (0x) tc_ShaderValidation: 0 c (0x) tc_ShaderGLSLLink: 0 c (0x) tc_ShaderGLSLCompile: 0 c (0x) xml_validation: 105.967 Mc (42x) ----------------------------------------------------- real 0m5.437s user 0m4.892s sys 0m0.520s
So it is at least > 15% faster on this map with few units and running only a couple of minutes, which is really promising. I should try with a longer play when I'll find some time.
Note however that the final state hash is different with the patch.
comment:13 by , 11 years ago
I also tried with Combat demo (huge). However with the patch it's a lot slower (6m52.802s vs. 3m23.077s). I attach the commands.txt for this demo.
comment:14 by , 11 years ago
I can confirm that with your commands.txt.
Unpatched version:
TIMER| profile2 get buffer: 846.507 us TIMER| profile2 visitor: 53.3953 ms TIMER| profile2 query: 54.4477 ms # Final state: 56e7b0c246f7bd2ea0fe92790b3016b6 TIMER TOTALS (9 clients) ----------------------------------------------------- tc_pool_alloc: 12.2239 Mc (1738x) tc_png_decode: 0 c (0x) tc_transform: 0 c (0x) tc_plain_transform: 0 c (0x) tc_dds_transform: 0 c (0x) tc_ShaderGLSLLink: 0 c (0x) tc_ShaderGLSLCompile: 0 c (0x) tc_ShaderValidation: 0 c (0x) xml_validation: 18.5788 Mc (7x) ----------------------------------------------------- real 0m32.780s user 0m32.635s sys 0m0.076s
Patched version:
TIMER| profile2 get buffer: 924.714 us TIMER| profile2 visitor: 53.2509 ms TIMER| profile2 query: 54.4416 ms # Final state: e669e0cbc9e4afcf060b66203d3a0676 TIMER TOTALS (9 clients) ----------------------------------------------------- tc_pool_alloc: 26.1131 Mc (1738x) tc_png_decode: 0 c (0x) tc_transform: 0 c (0x) tc_plain_transform: 0 c (0x) tc_dds_transform: 0 c (0x) tc_ShaderGLSLLink: 0 c (0x) tc_ShaderGLSLCompile: 0 c (0x) tc_ShaderValidation: 0 c (0x) xml_validation: 17.6386 Mc (6x) ----------------------------------------------------- real 1m14.301s user 1m13.198s sys 0m0.255s
I also did a callgrind run on both versions. While the unpatched version did 13691 calls to CCmpPathFinder::ProcessShortRequest, the patched version did 22644 calls. Is that because of timing differences?
comment:15 by , 11 years ago
Notice the final state hashes differ with and without the patches, so it's altering the simulation behavior.
comment:16 by , 11 years ago
All patches applies here with "svn patch myfile.patch" ? is using patch directly mandatory for the project ?
Reposted a const.patch/diff
Other patches applies and let see a 3% gain on your benchmark method.
Now for pathfinding, the hash thing is certainly a problem. I'll have a look.
On the "time" performance, they're not representative at all on pathfinding patch, as in order to keep "save" compatibility, serialisation does some vector to map/ptr_map translation and deserialisation each time (which happens a lot during replay apparently), which surely gives wrong timing there. (the entity map was used for find and loop, whereas a vector for loop and an entityid indexed vector gives ). Testable with a non-tested hack with pathfind_test, just for perf, which gives 10% on that replay
I'll look for a workaround for both problems (hash and serialisation overhead breaking benchmarks (perhaps related)).
follow-up: 18 comment:17 by , 11 years ago
Don't you think that at least some of the timing difference might be because of this:
I also did a callgrind run on both versions. While the unpatched version did 13691 calls to CCmpPathFinder::ProcessShortRequest?, the patched version did 22644 calls.
comment:18 by , 11 years ago
Replying to scroogie:
Don't you think that at least some of the timing difference might be because of this:
I also did a callgrind run on both versions. While the unpatched version did 13691 calls to CCmpPathFinder::ProcessShortRequest?, the patched version did 22644 calls.
Surely, that too, but I suspect that the drastic perf difference in pathfinding comes from the serialization thing.
comment:19 by , 11 years ago
Removed the problematic parts, fixing perf "timing" and hash. Posted the patch pathfind.2.patch still giving reasonable perfs on pathfinding. So now all patch should apply and gives correct hash and perf on both linux and win32 ( 0ad_perf.tar.gz (without pathfind and const ) + const.patch + pathfind.2.patch )
I'll make a new patch and investigate on the entity_id entity map perf/hash problem in another ticket/patch, as the perf impact of std::map::find and iterate grows with number of agents (log(n) for find, non contigous memory access for iteration) instead of not constant not growing find and prefetching iteration, and therefore is a serious candidate for slowing down huge numbers game.
comment:20 by , 11 years ago
I tried the "3" package and the performance now looks similar to stock 0ad (no regression but no improvement). I tried again the two commands.txt I tested before.
comment:21 by , 11 years ago
@fabio strange, on your commands.txt time bench I get a consistent 3-5% perf. Be sure to do the time at least twice each to avoid "cold cache" effect
Added a patch to apply on vanilla 0ad svn in order to test/benchmark directly the pathfinding using a simple command:
test -test TestCmpPathfinder
Using that benchmark, I get 10-15% on computePath and 15-20% on computeshortpath.
comment:22 by , 11 years ago
Yes, I noticed a small gain with the patch, but the first command.txt test is still slower than when using the first patch (where I got a >15% improvement).
by , 11 years ago
Attachment: | Archive.zip added |
---|
latest all-in-one patch against latest svn (removed size_empty, added Territory_memfrag and pathfind_bench useful for vanilla 0ad comparison)
comment:24 by , 11 years ago
I tested the latest minimap.patch
and indeed measured a performance improvement on my system
Win7 SP 1 (6.1.7601); CPU: x86, Intel Core i7-2600K @ 3.40GHz (1x4x2), 3.39 GHz; Memory: 8192 MiB; 5352 MiB free; Graphics Card: AMD Radeon HD 6800 Series; OpenGL Drivers: 4.2.12217 Compatibility Profile Context 12.104.0.0, atioglxx.dll (6.14.10.12217)
Example profiling data from Gambia River map shortly after loading:
With patch: render minimap 0.873 ms/frame Without patch: render minimap 1.516 ms/frame
But I notice it breaks minimap unit rendering on fixed function render path. This can be tested by setting renderpath=fixed
in your local.cfg
file (see Manual_Settings). It's not a big deal because we've been meaning to remove the fixed render path anyway, but progress on that has stalled and we still technically support it. Can you figure out what might have broken?
The other thing is that it looks very strange when toggling reveal map (Alt+D to open dev overlay), as if it's being rendered in a different order than before. I don't think this is a significant problem either but IMO it looks worse than it did.
What's the motivation for maxEntitiesDrawn = 8144
? We could have maps with more entities than that, but we don't want some entities left unrendered in the minimap or debug warnings on those maps. Of course there is a limit to how many could be usefully rendered in the minimap but we should use some algorithm for that (along with other minimap rendering improvements - #763).
Otherwise, it needs more testing for bugs on different systems, then we can commit it.
follow-ups: 26 27 32 comment:25 by , 11 years ago
- added new Minimap.patch fixing the ff path.
- tested on nvidia linux&win7
- 8114 is arbitrary. Can goes up to 65535 (max indice if we stick to 16bits buffer,best openglES wise), didn't find a max entity number. A simple merge by distance algo is indeed possible, but should only kick if above the size. (a fixed size is mandatory perf wise, avoiding buffer allocation/deallocation is important perfwise)
- "Reveal Map order" change is just showing that mixing deprecated opengl immediate calls with modern (async) opengl calls leads to driver random drawing order. Can change that in making all other deprecated opengl calls (glvertexf() and all) going modern opengl (async vbo). (is there a ticket for getting rid of ff path ? of gl deprecated calls (could be mixed with openglES ticket)? )
comment:26 by , 11 years ago
is there a ticket for getting rid of ff path ? of gl deprecated calls (could be mixed with openglES ticket)?
There is a forum topic: http://www.wildfiregames.com/forum/index.php?showtopic=16734
comment:27 by , 11 years ago
Thanks for fixing the bug. There is no ticket for removing fixed function support yet, the person who was working on that (myconid) has disappeared. fabio's link explains the reasoning though: the renderer was going to be completely overhauled and fixed function support was an obstacle. We probably shouldn't remove it without reason, but only if there is an advantage for everyone else, as myconid claimed there would be.
Replying to kuranes:
- 8114 is arbitrary. Can goes up to 65535 (max indice if we stick to 16bits buffer,best openglES wise), didn't find a max entity number. A simple merge by distance algo is indeed possible, but should only kick if above the size. (a fixed size is mandatory perf wise, avoiding buffer allocation/deallocation is important perfwise)
There can be many more entities than that in theory (see Entity.h), but 65535 is more than enough for visible entities on the minimap, with our current map sizes.
- "Reveal Map order" change is just showing that mixing deprecated opengl immediate calls with modern (async) opengl calls leads to driver random drawing order. Can change that in making all other deprecated opengl calls (glvertexf() and all) going modern opengl (async vbo).
I don't think it's worth doing that now, but it's something to keep in mind for other reasons. The difficulty (I think) would be implementing the fixed function version and maintaining them both as we improve the minimap. It needs a lot of work, like using pixel art for resources, instead of drawing every entity as a point.
follow-up: 29 comment:28 by , 11 years ago
- You want me to do the 65535 change or the merge by distance thing ?
- Note that removing glvertexf calls in favor of VBO call doesn't break ff. (it's removing "glmatrix" and adding shader calls that does)
- About ff, I replied in the forum thread. In my "steps" in the forum, we can even keep ff as a possible "renderer". and make only the VBO changes (shader changes would be done in "renderers") (once you have VBO, you can do async drawing and draw them in any renderer you like).
comment:29 by , 11 years ago
Replying to kuranes:
- You want me to do the 65535 change or the merge by distance thing ?
A large limit is fine for now, after it's committed, we can make a ticket that it might crash with a large number of entities. Until we decide how to handle this. I think far far less than 65535 should ever be rendered in practice. We need some minimap mockups to see how it should look, maybe Pureon will be up for that.
- Note that removing glvertexf calls in favor of VBO call doesn't break ff. (it's removing "glmatrix" and adding shader calls that does)
We already have VBO support, so that doesn't sound too difficult. If you want to attach a new patch with it, we'll test :)
follow-up: 31 comment:30 by , 11 years ago
kuranes, did you want to continue working on the minimap rendering? If not, we can move forward on committing the latest patch and leave the remaining work as a TODO.
comment:31 by , 11 years ago
Replying to historic_bruno:
kuranes, did you want to continue working on the minimap rendering? If not, we can move forward on committing the latest patch and leave the remaining work as a TODO.
Sry, I tought you were hinting at letting it run like that until next gen minimap show up. I can make a patch either with a 65k limit or just limit drawing a 8144 ? Which one it is ?
(got a bit sidetracked with making pathfinding a separate static lib with its own gui tools for test/code in wxwidget)
comment:32 by , 11 years ago
Sorry, I was only referring to this suggestion:
Replying to kuranes:
- "Reveal Map order" change is just showing that mixing deprecated opengl immediate calls with modern (async) opengl calls leads to driver random drawing order. Can change that in making all other deprecated opengl calls (glvertexf() and all) going modern opengl (async vbo). (is there a ticket for getting rid of ff path ? of gl deprecated calls (could be mixed with openglES ticket)? )
I thought you might be working on that, so I wasn't going to commit it. Sounds like you've been working on more important things though, don't worry about it :)
comment:34 by , 11 years ago
@ben: thanks
@historic_bruno: ok, misunderstood, sorry. On that ff removal, I've made some proposition on that FF removal topic: http://www.wildfiregames.com/forum/index.php?showtopic=16734&st=20#entry269321 but didn't get much discussion/approval. I'll add that specific step on my 0ad todo if it's ok.
comment:35 by , 11 years ago
Hey kuranes. Could you perhaps remove any old irrelevant patch/diff files? You can do that by clicking on the attachment, scrolling to the end of diff comparisons and clicking "Delete Attachment".
As for these patches in general, I think it's very hard to review it if you have all these patch files. It would make more sense to separate your minimap improvements and const patch into separate tickets - then we could give a more objective review on your code.
I know that there is a lot to improve in 0 A.D. source and it's hard not to fix everything :) But try to focus on a single task per ticket.
In general about the patches though:
- I think "const size_t value;" and "int getStuff(const MyStuff& arg) const" is all right, but "const size_t* const pvalue;" is a bit pushing the limits of readability :). Furthermore it should give no performance boost, since it's just a compile time check to make the variable read-only. The assembly generated has no difference. So having "int getStuff(const size_t arg)" is totally unnecessary, however "void doStuff(const MyStuff& arg)" still makes sense - since you are guaranteeing the value passed by reference will not change.
- Storing calculations in temporary variables is common sense and good practice in general, so that's the correct way to go.
- Passing larger than 4-byte values by reference is also a good thing to do. In general 0 A.D. has too many copy-on-move and temporary objects, so it's important to reduce that.
If you have created new tickets to categorize the patches and have removed the old patches, I can give you a better review than this. :)
comment:36 by , 11 years ago
Owner: | set to |
---|
follow-up: 38 comment:37 by , 11 years ago
Thanks RedFox for following up.
Here's list of created ticket for easier review: #2022 #2023 #2024 #2025 #2026
Sadly, No "delete attachment" for me, lacking trac permission I guess. All files can be deleted but archive.zip. I'll create fast ticket just now.
Non "optimizing const" are code documenation oriented, they can be removed if they don't suit team style.
- const size_t* const is but a way to point for aliasing and therefore "restrict" optimisations and use. (themself leading to potential openmp "optimisations spot")
- "int getStuff(const size_t arg)" is more developper safeguard and code documentation than real perf. (preventing people using args as local vars and easing code reading, by pointing possible optimisation when code evolves)
Btw, If you want just perf "const &" spots, and avoid my patches, get free version of PVS-studio, and run an analysis, it should list theme perfectly.
IRL hard times here due to heavy rainfall flooding in my location, impossible to allocate times on 0ad just now so pathfinding test env and the skinning patch followup is on hold.
follow-up: 39 comment:38 by , 11 years ago
Replying to kuranes:
IRL hard times here due to heavy rainfall flooding in my location, impossible to allocate times on 0ad just now so pathfinding test env and the skinning patch followup is on hold.
Sorry to heat that, thanks for your work so far, and thanks for creating separate tickets :) Hope you stay safe and dry!
comment:40 by , 11 years ago
Milestone: | Alpha 14 → Alpha 15 |
---|
comment:41 by , 10 years ago
Milestone: | Alpha 15 → Alpha 16 |
---|---|
Priority: | Must Have → Should Have |
We should review if we want any of this at all. Setting as "Should have".
follow-up: 43 comment:42 by , 10 years ago
When is this being commited/reviewed? It seems that we would get an enormous performance boost: just what 0 A.D. needs!
Edit: I'm referencing to the whole Kuranes patch.
comment:43 by , 10 years ago
Replying to niektb:
When is this being commited/reviewed? It seems that we would get an enormous performance boost: just what 0 A.D. needs!
Edit: I'm referencing to the whole Kuranes patch.
As the patches are 10+ months old they probably won't apply anymore without a lot of work.
comment:44 by , 10 years ago
Judging from my work on other Kuranes patches it's unlikely to be a huge difference.
comment:45 by , 10 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Too much work to fix + not enough improvement.
comment:46 by , 10 years ago
Milestone: | Alpha 16 |
---|
comment:47 by , 8 years ago
Keywords: | review removed |
---|
I believe const should not affect the code generated by the compiler at all, so it'll have no effect on performance. (In most cases the compiler can already tell whether a variable gets modified or not; and in the cases where it can't tell, it can't assume some other code doesn't modify the variable despite it being const). Are there any cases where you can see these const changes resulting in faster code?