Opened 10 years ago

Closed 9 years ago

Last modified 9 years ago

#2475 closed task (fixed)

Decide on using new ES6 keyed collections (e.g. Map, Set)

Reported by: Yves Owned by: Yves
Priority: Should Have Milestone: Alpha 18
Component: Core engine Keywords:
Cc: Patch:

Description (last modified by Yves)

I'd like to have a team decision if we start using the new keyed collections which are available with SpiderMonkey 24 and later. These features are marked "experimental".

Overview

Mozilla writes this:

This is an experimental technology, part of the Harmony (ECMAScript 6) proposal. Because this technology's specification has not stabilized, check the compatibility table for >usage in various browsers. Also note that the syntax and behavior of an experimental technology is subject to change in future version of browsers as the spec changes.

Check the descriptions of these objects and also the browser compatibility list at the end of the articles:

Why should we use them?

It's mainly for performance reasons.

Performance example

Because I don't like generic benchmarks, here's an example of a test with our code I made today.

It's a non-visual replay of a 2vs2 AI game with aegis bots on Oasis 04 going over around 15000 turns. I've measured the total runtime in seconds of the get function in entity.js. I've used my v31 testing branch for the measurements but I don't know if it makes a difference compared to v24. The function gets called 51'206'230 times in total and the hash is the same with and without patch. The patch is attached. It's very simple and we could improve performance even further if we used it at some more key locations that use key/value pairs.

Before:
Time in seconds for get function:                      43.91533856391563
Total execution time  (some variantion is normal):    995 seconds

Get taking 4.4% of total execution time

After (2nd version of the patch):
Time in seconds for get function:                     21.613286443973905
Total execution time (some variation is normal):     987 seconds

Get taking 2.19% of total execution time

Attachments (4)

map_replacement.diff (2.5 KB ) - added by Yves 10 years ago.
performance testpatch (can be committed if the decision is positive)
map_replacement_v2.diff (3.2 KB ) - added by Yves 10 years ago.
second patch, now also using Map for the aura stuff
timer.diff (2.6 KB ) - added by sanderd17 10 years ago.
Test for the timer component. Results in a ~20% speed improvement in combat demo (huge) when no actions are executed.
MapSerialization_v1.0.diff (5.4 KB ) - added by Yves 10 years ago.
Serialization of Map objects combined with Sander's patch (except the serialization part in JS because it's obsolete now)

Download all attachments as: .zip

Change History (30)

by Yves, 10 years ago

Attachment: map_replacement.diff added

performance testpatch (can be committed if the decision is positive)

comment:1 by Yves, 10 years ago

My personal opinion is that we should use these. I don't think they are going to change completely before the specification is finalized and as you see in this patch it's quite simple to make the change (or change it back again).

by Yves, 10 years ago

Attachment: map_replacement_v2.diff added

second patch, now also using Map for the aura stuff

comment:2 by Yves, 10 years ago

Description: modified (diff)

comment:3 by Yves, 10 years ago

Description: modified (diff)

comment:4 by Josh, 10 years ago

I don't mean to be a big sceptic, but what's the standard deviation on these numbers? It seems to me the difference is within standard error and using experimental features for questionable improvement does not strike me best path for 0AD at this point.

in reply to:  4 ; comment:5 by Yves, 10 years ago

Replying to Josh:

I don't mean to be a big sceptic, but what's the standard deviation on these numbers? It seems to me the difference is within standard error ...

If you mean the total execution time numbers, then I agree. But the execution time of the get function in entity.js was twice as long before!

in reply to:  5 ; comment:6 by Josh, 10 years ago

Replying to Yves:

Replying to Josh:

I don't mean to be a big sceptic, but what's the standard deviation on these numbers? It seems to me the difference is within standard error ...

If you mean the total execution time numbers, then I agree. But the execution time of the get function in entity.js was twice as long before!

If you think it's twice as fast, I won't argue (you are our spidermonkey expert). Just wanted to make sure we don't miss "experimental".

in reply to:  6 comment:7 by Yves, 10 years ago

Replying to Josh:

If you think it's twice as fast, I won't argue (you are our spidermonkey expert). Just wanted to make sure we don't miss "experimental".

I expected some need for discussions, so I brought this up for making a decision. Twice as fast is what my measurement in the description shows (22 vs 44 seconds) I can measure again to confirm it or you can measure again if you think something's wrong.

comment:8 by sanderd17, 10 years ago

IMO, we should use it if it improves performance. I'd also like to switch EntityCollections to use maps or sets (Maps have a foreach function that's normally faster than regular for loops).

I don't expect the specification to change a lot. Only the specification wrt special values as keys could change I think (s.a. NaN and -/+0). But if we stay away from those special values, the API should be rather stable. The performance of Maps might still change with extra features being added though.

Last edited 10 years ago by sanderd17 (previous) (diff)

comment:9 by agentx, 10 years ago

Here is a pure performance test: http://jsperf.com/map-vs-object-property-access/6 with FF30 property lookup with Map is ~4 times faster compared to Object and Array.

Edit: This one is slightly different and gives 10 times better performance: http://jsperf.com/map-vs-object-as-hashes/5

Last edited 10 years ago by agentx (previous) (diff)

comment:10 by sanderd17, 10 years ago

This test is skewed because objects can only have strings as keys. This means that all other types first have to do a toString() conversion.

If you take all keys to be strings, you barely see a difference. This means we should test it in actual code, and the result will depend on the keys used, and on the operations done with the maps/objects (s.a. deletion).

comment:11 by agentx, 10 years ago

Well, it depends. You can either wait until Mozilla optimizes against 0 A.D's code base or you optimize 0 A.D's code base and exploit existing optimizations.

comment:12 by sanderd17, 10 years ago

I'm not saying that we shouldn't use maps, just that the test is wrong, and we shouldn't blindly transform all code into maps.

comment:13 by agentx, 10 years ago

the test is wrong

It broke your browser?

comment:14 by Yves, 10 years ago

I think what Sander says is the same I meant with "Because I don't like generic benchmarks, here's an example of a test with our code I made today". I don't like generic benchmarks and micro benchmarks because the result in a real application can be completely different. My example shows that there is an improvement in my specific testcase. But if you compare it to your results, it's still something different. It's twice as fast and not 4x as fast or 10x as fast. Yes, the function I measured also contains other code, but saying Map.get(key) is x times faster than obj[key] is a too general statement and doesn't make sense (no matter what x is).

However, I think that these types were made for a purpose and this purpose is an association of keys to values. It should be faster in these cases and I think we should change to these types where performance matters and where we can measure an improvement. Once the specification is finalized we should consider adding a similar guideline to our coding style guide as in the descriptions of these types.

About Map from the reference:

Use maps over objects when keys are unknown until run time, and when all keys are the same type and all values are the same type.

Use objects when there is logic that operates on individual elements.

comment:15 by historic_bruno, 10 years ago

We started using typed arrays when they weren't universally supported, so I think it's not a bad idea in principle. If we can serialize them properly, it's even less of an issue, we could potentially use them in simulation code. As others are saying though, I don't think we should use them unless they are known to be necessary (like typed arrays, we don't use them for every array that might be fixed size, only when fast access is critical).

comment:16 by agentx, 10 years ago

I don't like generic benchmarks and micro benchmarks because the result in a real application can be completely different.

I completely agree. You have very good reasons to not overestimate micro benchmarks, on the other hand macro benchmarks do not pin point the exact location where code gets slower than expected. I'd love to see a real performance comparison of Maps vs. Objects in ECs, not with all the other code running too. Say by creating/deleting/updating 10000 entities and updating 20-30 collections each time.

ECs use Objects as Maps and have been correctly identified as major performance sink. So one might think the problem is somewhere else, given the minimal improvement and the leaner data structure lacks the whole prototyping thing.

But the real question is which functions are compiled to machine code by IonMonkey. If they all bail out discussing Maps vs. Objects is a waste of energy. Wasn't that the goal of the migration? To go for maximum speed and to get rid of byte code where ever possible?

How come SM24 is so much faster in real world apps and benchmarks than SM4, except 0 A.D.?

comment:17 by Yves, 10 years ago

I'll have a look how serialization works with these types. Our team decision made during the meeting was:

ES6 keyed collections can be used if measurements confirm a significant performance improvement. ES6 keyed collections should currently not be used without measuring the performance difference because the specification is still not finalized.

... assuming serialization works.

comment:18 by Yves, 10 years ago

Keywords: review removed

comment:19 by Stan, 10 years ago

Milestone: Alpha 16Alpha 17

Will need further testing to decide wether the performance is worth it or not.

comment:20 by Yves, 10 years ago

Note: Consider using Map objects for the caching in l10n.js

by sanderd17, 10 years ago

Attachment: timer.diff added

Test for the timer component. Results in a ~20% speed improvement in combat demo (huge) when no actions are executed.

comment:21 by Yves, 10 years ago

Milestone: Alpha 17Alpha 18

by Yves, 10 years ago

Attachment: MapSerialization_v1.0.diff added

Serialization of Map objects combined with Sander's patch (except the serialization part in JS because it's obsolete now)

comment:22 by Yves, 10 years ago

In 15770:

Adds Serialization support for ES6 Maps.

Also includes the patch from Sanderd17 to use Maps and Sets for the Timer components. Sets can't be serialized yet, but in this case they don't require serialization.
Refs #2475

comment:23 by Yves, 10 years ago

In 15773:

Reverts the use of Set in r15770.

Sets don't support multiple elements with the same key which is required here to fire timers multiple times per turn.
Refs #2475

comment:24 by Yves, 10 years ago

In 15781:

Use ES6 Maps in the AI API for applying tech and aura modification to entity templates.
This should improve performance a bit (check the ticket for details).

Refs #2475

comment:25 by Yves, 9 years ago

Resolution: fixed
Status: newclosed

I'm closing this because the decision is made. Maps and Sets may be used (not sure if there are any use cases for WeakMaps and WeakSets). At the moment they should mainly be used if they improve performance significantly or make the code a lot more clean.

comment:26 by agentx, 9 years ago

It should be noted, that while Maps improve single property access, they are ~ 10 times slower when looping over all keys by for .. of compared to Object.keys(obj).forEach() So, depending on the use case there is a upper limit and for huge key collections it is not advised to use Maps. And regarding strings, SM generates a index of all short strings at read time, so actually it doesn't compare strings but pointers when accessing object values.

Note: See TracTickets for help on using tickets.