Opened 11 years ago

Last modified 5 years ago

#1860 new enhancement

[PATCH] Use a vector for iterating in the ComponentManager

Reported by: sbte Owned by:
Priority: Should Have Milestone: Backlog
Component: Simulation Keywords: patch
Cc: Patch:

Description (last modified by sbte)

Using a vector in the ComponentManager instead of a map speeds up iterating considerably. It gives me a speedup of 4ms in Gambia River (14ms->10ms) for the sim interpolation. This, however, is not the only place where the code is called, so the overall speedup might be bigger.

The downside is that you have to keep both a map and a vector of the Components, but since those are just pointers, I don't think it is a problem.

(I edited the description for the vector version of the patch. This is much faster than the list version.)

Change History (21)

comment:1 by wraitii, 11 years ago

I've been playing with that patch (modified the Vec to List in the variable names) and I haven't encountered anything to indicate it could be failing. Not sure about the speedup, but there is no reason why this wouldn't speed things up slightly.

Edit: Should test MP with that though.

Last edited 11 years ago by wraitii (previous) (diff)

comment:2 by sbte, 11 years ago

I just did some profiling, and found out that a vector is a lot faster than a list, even though I thought otherwise. I'll fix my patch.

Edit: The vector version gives a speedup of 14ms -> 10ms. MUCH faster that is

Last edited 11 years ago by sbte (previous) (diff)

comment:3 by sbte, 11 years ago

Description: modified (diff)
Summary: [PATCH] Use a list for iterating in the ComponentManager[PATCH] Use a vector for iterating in the ComponentManager

comment:4 by wraitii, 11 years ago

I'm not too surprised, this sounds really similar to QUantumstate's optimization on the Range Manager. I believe you fixed what he called the "memory leak" issue by keeping the map, any other idea on how to do it?

comment:5 by sbte, 11 years ago

You mean that I should get rid of the map? I think I concluded last time that this was not possible, because the entity_id_t is still needed in some place. The memory 'leak' is limited to ~10k pointers, so it's not that much. But I'll have another look.

Edit: Oh, I see. In your patch you just used the entity_id_t as an index. If that id is always smaller than 2 time the amount of entities, then this should be fine indeed. Otherwise doing things that way would bee a bigger memory leak. Any idea about that?

Last edited 11 years ago by sbte (previous) (diff)

comment:6 by wraitii, 11 years ago

sbte: frankly, none. The real problem with using a vector without the map is that if you use tons of temporary entities, you get a huge vector for no real reason, which can be seen as a memory "leak". I "solved" that issue on my version of the patch by using pointers and deleting the informations about deleted stuff, so in itself the only memory used is that of the vector (which is, I believe, largely negligible) but you also have to make sure that serialization works (and deserialization too, which is harder as you'd have to initialize null pointers) and that things don't get too ugly when quitting the game/simulation, and restarting it. Basically It's not really nice, and probably not extremely robust, but it's probably the fastest way to do it.

Quantumstate said otherwise that we could keep a map of IDs that would be [Real ID of the entity <-> index in the vector] for fast lookups without having to use this hack, but this means we'll resize the vector more often and I'm not sure it's quite as fast.

Last edited 11 years ago by wraitii (previous) (diff)

comment:7 by sbte, 11 years ago

Ok, I tried two different approaches, one being storing the entries in the vector by entity id as you did in your patch. This resulted in huge vectors that were slow to iterate. Even if there's nothing in the vector, you still need to iterate and check if there is a pointer to a component there. This is really slow compared to the iteration over a close packed vector.

The other method was pushing to the vector and using the GetEntityId method of the IComponent to find the id. This resulted in a slowdown, because of the need to iterate in PostMessage.

I, however, still don't understand what you mean by the memory leak. I don't store any unnecessary information, because I push back to a vector of pointers. This means that per entity, only one extra pointer is stored. This is basically the same as storing a map with the index to the right position in the vector, but then faster.

Also, after testing more approaches, I still think the approach in the patch I posted here (the one using the vector) is the best.

comment:8 by wraitii, 11 years ago

Seems like it's not exactly the same case, in fact, but your approach seems better than mine in the other thread anyway. I recommend you take a look at both at the same time, for further committing.

With both these patches, I think pathfinding and rendering (and AI to some extent) are really the only bottlenecks in speed now. Though it's not perfect, framerate should be noticeably improved.

Last edited 11 years ago by wraitii (previous) (diff)

comment:9 by sbte, 11 years ago

Actually, the two approaches are totally different, because the RangeManager optimization depends on fast accessing by entity id, where here, it depends on fast iteration over all elements. This approach, which is best here, won't work there. And the same the other way round.

comment:10 by Kieran P, 11 years ago

Milestone: Alpha 13Alpha 14

comment:11 by Philip Taylor, 11 years ago

Some thoughts on 0001-Iterate-over-a-vector-instead-of-a-map-in-the-Compon.patch:

It uses std::vector::erase(), which is probably not ideal for performance in some cases - deleting an entity near the front of the list will result in a fairly big memcpy(). (Looks like there's about 12K entities on Gambia River, so that's potentially 100KB to copy for each deleted entity). If order doesn't matter, it's better to swap the target element with the last element and then do .pop_back().

But order does matter - if we broadcast messages to entities in a different order for different players, the game will likely go OOS. We need to either maintain the vector in a consistent order (e.g. sorted by entity_id_t), or extend the serializer to preserve the arbitrary order of the vector (which sounds ugly). I think the code currently always creates entities with IDs that are either increasing, or are local entities (which don't affect synchronisation/serialization), so simply using .push_back() will maintain a sorted order for non-local entities, which should be enough; but I don't know whether it's reasonable to rely on that being true forever. (And you still have to use .erase() to maintain the sorted order when removing.)

Slightly more generally: For interpolation performance, I think the current use of BroadcastMessage to do a virtual method call for every individual component is kind of wasteful, and the multiple CmpPtr lookups inside Interpolate() don't help. It'd be better if all the data needed by Interpolate() was in a single struct per component (including a copy of some of the CCmpPosition state) and if we had a single global std::vector of those structs. (In this case the order doesn't matter, so it's cheap to insert and delete from the vector). Then we could iterate over the whole vector at once, doing the interpolation and frustum culling etc, giving much better cache locality and more opportunity for parallelism.

That'd need some changes to the component system so it can store per-component-type state (in addition to (or maybe, in some cases, instead of) the per-component state) and call per-component-type methods to perform all the interpolation at once, but I suspect it'd give a greater benefit than the more generic change to CComponentManager, with less risk of expensive deletion or order-related sync problems.

comment:12 by Kieran P, 11 years ago

Keywords: review removed

in reply to:  11 comment:13 by sbte, 11 years ago

Replying to Philip:

Some thoughts on 0001-Iterate-over-a-vector-instead-of-a-map-in-the-Compon.patch:

It uses std::vector::erase(), which is probably not ideal for performance in some cases - deleting an entity near the front of the list will result in a fairly big memcpy(). (Looks like there's about 12K entities on Gambia River, so that's potentially 100KB to copy for each deleted entity). If order doesn't matter, it's better to swap the target element with the last element and then do .pop_back().

I did not do this, because I wasn't sure about the issues you named below. But it's indeed a lot faster. Deletion, however, happens a lot less than the Broadcasting. And one deletion is still cheaper than one interpolation I believe.

But order does matter - if we broadcast messages to entities in a different order for different players, the game will likely go OOS. We need to either maintain the vector in a consistent order (e.g. sorted by entity_id_t), or extend the serializer to preserve the arbitrary order of the vector (which sounds ugly). I think the code currently always creates entities with IDs that are either increasing, or are local entities (which don't affect synchronisation/serialization), so simply using .push_back() will maintain a sorted order for non-local entities, which should be enough; but I don't know whether it's reasonable to rely on that being true forever. (And you still have to use .erase() to maintain the sorted order when removing.)

How is this different from what I do now? I currently use push_back to add an element to the vector.

Slightly more generally: For interpolation performance, I think the current use of BroadcastMessage to do a virtual method call for every individual component is kind of wasteful, and the multiple CmpPtr lookups inside Interpolate() don't help. It'd be better if all the data needed by Interpolate() was in a single struct per component (including a copy of some of the CCmpPosition state) and if we had a single global std::vector of those structs. (In this case the order doesn't matter, so it's cheap to insert and delete from the vector). Then we could iterate over the whole vector at once, doing the interpolation and frustum culling etc, giving much better cache locality and more opportunity for parallelism.

I'm not sure what you mean here. Keeping a copy of the position means that you also have to update that position. Since this is the most heavy part of the Interpolation, you do not what to do this. The same holds for the terrain. If the position changes, you have to do a lookup for the terrain beneath it. Other than that, all the data needed is already local I think.

That'd need some changes to the component system so it can store per-component-type state (in addition to (or maybe, in some cases, instead of) the per-component state) and call per-component-type methods to perform all the interpolation at once, but I suspect it'd give a greater benefit than the more generic change to CComponentManager, with less risk of expensive deletion or order-related sync problems.

Not sure what you mean here too.

comment:14 by tuan kuranes, 11 years ago

Could we go toward an "entityDB" template ? Where you store data per entity_id and retrieve it by entity_id or can iterate over all the data stored there ?

It would lead to reusable tool across components, avoiding code duplication and ensure performance code.

Using two vector a filled and a sparse entity_id indexed as explained in my comment here (#1707) You get a fast iterator and a fast finder, and a single code base.

Last edited 11 years ago by historic_bruno (previous) (diff)

comment:15 by Kieran P, 11 years ago

Milestone: Alpha 14Alpha 15

comment:16 by wraitii, 10 years ago

Milestone: Alpha 15Alpha 16

comment:17 by sanderd17, 10 years ago

Milestone: Alpha 16Alpha 17

comment:18 by leper, 10 years ago

Milestone: Alpha 17Backlog

comment:19 by Imarok, 5 years ago

Component: UI & SimulationSimulation

Move tickets to Simulation as UI & Simulation got some sub components.

Note: See TracTickets for help on using tickets.