Opened 10 years ago

Closed 10 years ago

#2574 closed enhancement (fixed)

Headless Testing

Reported by: agentx
Owned by:
Priority: Should Have
Milestone: Alpha 17
Component: Core engine
Keywords: Testing
Cc:
Patch:

Description

This is partially already there, but I think it needs a few more kicks to become really useful for automated release testing and bot development.

Procedure: Start the game from the command line, let it read a config, perhaps run at warp speed, log to stdout, and print a report at the end. Change the config, start over.
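The procedure can be sketched as a small driver script. This is only a sketch under assumptions, not an existing tool: `run_batch`, `make_command` and the config dictionaries are hypothetical names. The real game command line (executable and flags) is left to the caller, because the ticket does not specify it.

```python
import subprocess
import sys
from typing import Callable, Dict, List, Sequence

Config = Dict[str, str]


def run_batch(make_command: Callable[[Config], Sequence[str]],
              configs: List[Config],
              timeout_s: float = 3600.0) -> List[dict]:
    """Run one game per config and collect what each run printed to stdout."""
    reports = []
    for config in configs:
        # make_command is supplied by the caller; it turns a config into
        # the real command line (hypothetical, not defined by the ticket).
        result = subprocess.run(list(make_command(config)),
                                capture_output=True, text=True,
                                timeout=timeout_s)
        reports.append({
            "config": config,
            "exit_code": result.returncode,
            "stdout": result.stdout,
        })
    return reports


if __name__ == "__main__":
    # Stand-in for the game: a Python one-liner that echoes the setting.
    def fake_game(config: Config) -> List[str]:
        return [sys.executable, "-c",
                f"print('rush_minute={config['rush_minute']}')"]

    for report in run_batch(fake_game, [{"rush_minute": "12"},
                                        {"rush_minute": "13"}]):
        print(report["exit_code"], report["stdout"].strip())
```

The stand-in command only demonstrates the loop; for real use, `make_command` would return the game's actual headless invocation.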

Use cases:

Balance: It is possible to create a map for n factions where, with a simply defined bot, the outcome is that only one unit survives. From there one knows how many of e.g. Athenian cavalry cancel out how many Roman infantry. Having these numbers also lets one compare costs. Any change in balance can then be tested on the given map and allows conclusions like: the Romans are now 1.5 times stronger, but cost 1.6 times more metal. Is that realistic?
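The closing comparison boils down to a single ratio. A minimal sketch using the 1.5 and 1.6 figures from the example; `cost_efficiency_change` is a hypothetical helper name, and "1.6 times more metal" is read here as a cost factor of 1.6.

```python
def cost_efficiency_change(strength_factor: float, cost_factor: float) -> float:
    """Return how strength per unit of cost changed (1.0 means unchanged)."""
    if cost_factor <= 0:
        raise ValueError("cost_factor must be positive")
    return strength_factor / cost_factor


# Romans become 1.5 times stronger but cost 1.6 times more metal.
change = cost_efficiency_change(1.5, 1.6)
print(f"strength per metal changed by a factor of {change:.4f}")
```

A result below 1 (here 1.5 / 1.6 = 0.9375) would mean the faction gets slightly less strength per metal than before the change.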

The map might look like an arena, and the balance test bot doesn't have to be very sophisticated; it just has to make sure every idle unit it recognizes attack-moves to the map center. Doing this with a head (graphics enabled) would allow capturing a video showing that things actually work. Perhaps it isn't that complex to render videos directly in headless mode. (ffmpeg?)

Bot Development: A bot needs heuristic config data in any case. Is rushing at 12 min or at 13 min more effective, how many female units at the start, etc.? Headless testing would allow Monte Carlo methods to find these numbers. For realistic seafaring on maps with many islands that would be a huge development boost. This is also a first step towards genetic AI, as config input and game result deliver the data to optimize bots. Bots would actually learn from experience, release to release, which is quite an exciting option.
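A Monte Carlo comparison of two candidate values could look roughly like this. It is a sketch under assumptions: `estimate_win_rates` and `toy_game` are hypothetical, and the win probabilities inside `toy_game` are made up purely to stand in for real headless games.

```python
import random
from typing import Callable, Dict, Tuple


def estimate_win_rates(play_game: Callable[[int, random.Random], bool],
                       candidates: Tuple[int, ...],
                       games_per_candidate: int,
                       seed: int = 0) -> Dict[int, float]:
    """Estimate the win rate of each candidate value by repeated sampling."""
    rng = random.Random(seed)
    rates = {}
    for value in candidates:
        wins = sum(play_game(value, rng) for _ in range(games_per_candidate))
        rates[value] = wins / games_per_candidate
    return rates


def toy_game(rush_minute: int, rng: random.Random) -> bool:
    # Stand-in for a real headless game; the win probabilities are made up.
    win_probability = {12: 0.55, 13: 0.45}[rush_minute]
    return rng.random() < win_probability


rates = estimate_win_rates(toy_game, (12, 13), games_per_candidate=200)
best = max(rates, key=rates.get)
print(rates, "best:", best)
```

With real games, `play_game` would launch one headless match with the given setting and return whether the bot won; the number of games per candidate decides how reliable the estimate is.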

Headless testing should support all maps, plus random ones. Done cleverly, this method would allow dogfooding, e.g. take a map and let it run until it has found out which units cancel each other out. Everything should be fully automated, so that it can run overnight and present the results from 50 games the next morning.
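The morning report could be as simple as a win tally over the finished games. A minimal sketch, assuming each game's winner has already been extracted from its log; `summarize` and the faction names in the sample list are illustrative only.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarize(winners: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (faction, wins, share of games) sorted by wins, descending."""
    counts = Counter(winners)
    total = sum(counts.values())
    return [(faction, wins, wins / total)
            for faction, wins in counts.most_common()]


# Made-up winners of six games; a real run would parse them from the logs.
for faction, wins, share in summarize(["rome", "athens", "rome",
                                       "rome", "athens", "rome"]):
    print(f"{faction:10s} {wins:3d} {share:6.1%}")
```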

I hope this proposal to achieve consistently good quality finds some friends.

--agentx

Change History (7)

comment:1 by Radagast, 10 years ago

It would definitely be a time saver in both categories and perhaps even more. For the genetic algorithms we will surely need it at some point, if we can't somehow figure out a workaround with shell scripts. (But I guess the time warp and other options are not yet provided as command line arguments, so that might be the first step.)

Video capturing without actually running a game implies that we redirect the rendered images/frames directly to the hard disk. I think it could be possible, but I have no experience with this myself. And I wonder whether it would make a significant difference performance-wise.

"Procedure: Start the game from the command line, let it read a config, perhaps run at warp speed, log to stdout, and print a report at the end. Change the config, start over."

Very useful. Perhaps we already have the possibility for such a procedure and don't know it? Otherwise I wonder how difficult it would be, especially the modify-config part. Could it be random changes? Or should there be a way to pass a function parameter as a command line argument, e.g. pyrogenesis --function_target=attack_frequency_in_minutes --function=12*x

comment:2 by agentx, 10 years ago

Live capturing into a H.264/mp4 video needs only a few lines of code: http://blog.mmacklin.com/2013/06/11/real-time-video-capture-with-ffmpeg/
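One common way to do such a capture is to pipe raw frames into an ffmpeg process. The sketch below shows the idea in Python for brevity (the engine itself is not written in Python). `ffmpeg_command` and `record` are hypothetical helpers, and it assumes an `ffmpeg` binary with libx264 is on the PATH and that frames arrive as raw RGBA bytes read back bottom-up, as with an OpenGL framebuffer.

```python
import subprocess
from typing import Iterable, List


def ffmpeg_command(width: int, height: int, fps: int, output: str) -> List[str]:
    """Build an ffmpeg command that encodes raw RGBA frames read from stdin."""
    return [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "rgba",   # input: raw RGBA frames
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                              # read frames from stdin
        "-vf", "vflip",                         # assumed bottom-up readback
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        output,
    ]


def record(frames: Iterable[bytes], width: int, height: int,
           fps: int, output: str) -> int:
    """Pipe frames (width * height * 4 bytes each) into ffmpeg."""
    process = subprocess.Popen(ffmpeg_command(width, height, fps, output),
                               stdin=subprocess.PIPE)
    for frame in frames:
        process.stdin.write(frame)
    process.stdin.close()
    return process.wait()
```

Running ffmpeg as a separate process keeps the encoder out of the game binary, at the cost of copying every frame through a pipe.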

comment:3 by sanderd17, 10 years ago

Balance isn't just about attack strength. It's also about production time/cost, unlock time, resource gather speed, applicable technologies, technology strength, technology research time/cost, ...

In most cases, a fixed bot wouldn't be smart enough to suddenly find that one overpowered unit or technology, as it will keep gathering from fields with women even if it's faster to create soldiers.

Just sending some units to the map center will give you no info at all.

I hope to be able to extend and use the reports we get from lobby players in the future, to use them for balancing purposes.

It's more feasible for bot testing, as bots will indeed improve slowly. But balancing and bot testing can't be combined, as the balancing would undo the bot optimisations, so bot learning has to start over again, and in the end the bot learning will be too slow to find the overpowered units.

comment:4 by fabio, 10 years ago (in reply to comment:2)

Replying to agentx:

Live capturing into a H.264/mp4 video needs only a few lines of code: http://blog.mmacklin.com/2013/06/11/real-time-video-capture-with-ffmpeg/

There is already ffmpeg support but it was disabled 6 years ago (r5539).

comment:5 by agentx, 10 years ago (in reply to comment:4)

There is already ffmpeg support but it was disabled 6 years ago (r5539).

Looks like including ffmpeg is difficult. Does this also apply to making calls via _popen? Also, supporting at least one system would already be a good start. (Still wondering why Google invests $$$ into an online game video platform...)

comment:6 by agentx, 10 years ago (in reply to comment:3)

Balance isn't just about attack strength.

Sander, I completely agree. My point is that a start is needed: something that allows improvement without changing the C code base. I'm asking to open a door. Knowing how units compare is not exactly no info. Some forum users would appreciate an answer. This is basic information and all that counts right before a battle. Currently all one can do is check XML files and guess, more or less educated.

A full balance comparison is definitely not trivial, but it is achievable step by step rather than with one major update. However, Hannibal could tell e.g. which civ reaches town phase fastest on map XYZ using all resources optimally.

But balancing and bot testing can't be combined.

Sure, but a simply defined bot is helpful for automated and repeated testing. I haven't thought of starting a deathmatch against a bot only capable of sending units to the middle of a map :) I already have two bots just for testing: Numerus and Checker. I would never call them an AI, though.

comment:7 by agentx, 10 years ago

Resolution: fixed
Status: new → closed

The proposed functionality can be achieved with external tools using Linux, see: http://www.wildfiregames.com/forum/index.php?showtopic=18781
