Opened 15 years ago

Closed 9 years ago

Last modified 3 years ago

#245 closed task (fixed)

[PATCH] Support XML validation

Reported by: Philip Taylor Owned by: leper
Priority: Nice to Have Milestone: Alpha 19
Component: Core engine Keywords: patch
Cc: Patch:

Description (last modified by leper)

It'd be nice to use proper validator tools to report errors in XML tools. So:

  • Write RelaxNG Compact schemas for all the XML formats. (See existing DTDs for GUI and terrain files.)
  • Add a script to convert them to RelaxNG XML.
  • Change the CXeromyces API to accept a schema argument.
  • Use that schema as part of the file hash, and use it to validate the document.
  • Possibly refuse to load invalid documents? Or just report the error (and cache the error message in the XMB)? Currently we report an error.
  • Write an external tool that validates all the XML files at once.

Attachments (7)

gui.rnc (5.0 KB ) - added by François Poirotte 15 years ago.
Compact Relax NG schema for GUI
terrains.rnc (659 bytes ) - added by François Poirotte 15 years ago.
Compact Relax NG schema for terrain
scenario.rnc (1008 bytes ) - added by Philip Taylor 15 years ago.
incomplete year-old scenario schema
xml_validation.patch (29.3 KB ) - added by historic_bruno 10 years ago.
xml_validation_r16600.patch (33.2 KB ) - added by leper 9 years ago.
Working patch (including tests) agains r16600.
xml_validation_r16661.patch (34.0 KB ) - added by leper 9 years ago.
xml_validation_r16711.patch (32.8 KB ) - added by leper 9 years ago.

Download all attachments as: .zip

Change History (31)

by François Poirotte, 15 years ago

Attachment: gui.rnc added

Compact Relax NG schema for GUI

by François Poirotte, 15 years ago

Attachment: terrains.rnc added

Compact Relax NG schema for terrain

comment:1 by François Poirotte, 15 years ago

See attachments above. These RNC files were directly derived from their DTD equivalents. They could probably be improved by restricting the types used for certain attributes (see also comments enclosed in the files).

Conversion to Relax NG XML syntax can be done using existing tools (such as trang, available at http://code.google.com/p/jing-trang/). Also, have a look at http://www.relaxng.org/#conversion for other possible converters.

Schema validation using Relax NG can be achieved with libxml2 though it suffers from a few caveats. I'm not sure whether it is possible with Xerces though.

by Philip Taylor, 15 years ago

Attachment: scenario.rnc added

incomplete year-old scenario schema

comment:2 by Philip Taylor, 15 years ago

We're using libxml2 for parsing, after having moved away from Xerces, so any in-game validation will likely have to use that. Are those caveats documented anywhere? (I don't think we'll need to do anything very fancy at all, so hopefully we can avoid any difficulties...)

comment:3 by (none), 14 years ago

Milestone: Open Source Release

Milestone Open Source Release deleted

comment:4 by Andrew, 14 years ago

Milestone: Backlog

comment:5 by historic_bruno, 10 years ago

Description: modified (diff)

r15377 added Relax NG compact and converted Relax NG XML grammars and a Perl script to validate the game's XML files. Engine integration is still a work in progress.

comment:6 by ben, 10 years ago

In 15378:

Fixes invalid actors reported by XML validator script. Some were only technically invalid but others were broken. Refs #245

comment:7 by ben, 10 years ago

In 15379:

Fixes a few other invalid XML files reported by the validator script, refs #245

comment:8 by historic_bruno, 10 years ago

I used trang (written in Java, New BSD-licensed) to convert the Relax NG compact grammars into XML. Need to examine it more closely and see if it's suitable for adding to SVN, with a trivial script to perform the conversions. Other converters are listed here.

Also need to look into Actor Editor to see if it's outputting invalid XML files, and if so, fix that.

comment:9 by historic_bruno, 10 years ago

Keywords: patch review added
Milestone: BacklogAlpha 17
Summary: Support XML validation[PATCH] Support XML validation

by historic_bruno, 10 years ago

Attachment: xml_validation.patch added

comment:10 by historic_bruno, 10 years ago

Attached patch adds XML validation for most of the XML formats in the game. However, it causes a crash on my system when running the tests. I didn't find any crash while testing the game itself, so it might only be a bug in the test setup.

The patch completes all the points of this ticket, except the RNC->RNG conversion script.

comment:11 by leper, 10 years ago

The scenario rng could be moved to CGame, or left where it currently is. GetValidator can return references to stuff that is later on replaced while the reference is still around. There's an easily fixed merge conflict in Minimap.cpp and adding the copyright dates could also be done.

The test segfault is in xmlRelaxNGValidateDatatype on line 8598 (libxml2 2.9.1) where lib->check is no valid pointer (I had values close to 0 (0xf occured once) and values with some of the high bits set 0x7fff0...0f (or close by)). Some of the other fields were also clearly bogus.

comment:12 by Itms, 10 years ago

Milestone: Alpha 17Alpha 18

comment:13 by Itms, 9 years ago

Milestone: Alpha 18Alpha 19

by leper, 9 years ago

Attachment: xml_validation_r16600.patch added

Working patch (including tests) agains r16600.

in reply to:  11 comment:14 by leper, 9 years ago

Replying to leper:

The scenario rng could be moved to CGame, or left where it currently is. GetValidator can return references to stuff that is later on replaced while the reference is still around.

These still need to be done/discussed.

Also CXeromyces::Terminate() should acquire the lock before operating on it (even though having another contender for it would be a logic bug).

The test segfault was caused by not clearing cached schemas.

by leper, 9 years ago

Attachment: xml_validation_r16661.patch added

comment:15 by leper, 9 years ago

Fixes the issue of GetValidator() leaking references by making it private. Also changes some places where we operate on the static data without having the mutex (not anymore). The order in Terminate() is possibly still a bit strange (I think it would make more sense if we'd reverse it).

The mutexes for the schemas seem a bit strange sometimes (and there is the possibility that we could leak references to cached schemas, but proper init/shutdown order should make this a non-issue).

Should we merge CanValidate() and ValidateEncoded(doc)?

I've decided not to do the scenario rng move to CGame, because it doesn't really matter.

Apart from that this would be ready.

by leper, 9 years ago

Attachment: xml_validation_r16711.patch added

comment:16 by leper, 9 years ago

After merging CanValidate() and ValidateEncoded() we either have to return true from RelaxNGValidator::ValidateEncoded(xmlDocPtr doc) in case !m_Schema, or supply a schema for simulation templates, or live with the slew of errors in both test cases and the game.

Should I just revert that merge and leave a TODO in the code, or is the above patch and the TODO included in there sufficient?

in reply to:  16 comment:17 by historic_bruno, 9 years ago

Replying to leper:

After merging CanValidate() and ValidateEncoded() we either have to return true from RelaxNGValidator::ValidateEncoded(xmlDocPtr doc) in case !m_Schema, or supply a schema for simulation templates, or live with the slew of errors in both test cases and the game.

I tested the last patch, but I don't get errors in the tests, and the only errors in the game are related to material XMLs (possibly an outdated schema?)

comment:18 by leper, 9 years ago

Because I included

if (!m_Schema)
    return true;

in RelaxNGValidator::ValidateEncoded(xmlDocPtr doc). Which might or might not be really wanted.

comment:19 by leper, 9 years ago

In 16733:

XML validation. Based on patch by historic_bruno. Refs #245.

comment:20 by leper, 9 years ago

Description: modified (diff)
Keywords: review removed

We still need a script to convert the rncs into rngs.

comment:21 by leper, 9 years ago

Owner: set to leper
Resolution: fixed
Status: newclosed

In 16734:

Add RNC to RNG conversion script that uses trang. Fixes #245.

Also fix the material.rnc.

comment:22 by leper, 9 years ago

In 16773:

Move GUI schemas to the mod mod. Fix some validation related errors in the mod mod.

Check if the used directory exists before trying to add a validator. Refs #245.

comment:23 by elexis, 5 years ago

In 22921:

Reliably report and reject invalid XML files following rP16733, refs #245, fixes #5222.

Don't write XMB files for XML files that failed the validation, otherwise the XML validation error will not be reported on consecutive program starts anymore (as the XMB is loaded, skipping validation).
This had resulted in invalid XML going unnoticed and committed in credentials.xml in rP21847.

Differential Revision: https://code.wildfiregames.com/D1574
Reported by: gameboy
Comments By: bb
Tested on: clang 8.0.1, Jenkins

comment:24 by wraitii, 3 years ago

In 25258:

Delete "Compact RelaxNG" / .rnc files

(Follows rP25210 where I deleted actor.rnc)

This deletes the .rnc validation files, upgrading the .rng files to being the source of truth.

  • The engine uses, via LibXML2, .rng files to validate XML schemas, to the .rnc files are redundant.
  • The RelaxNG "Compact" format is a DSL for the RelaxNG XML format. Its syntax is unique, unlike .rng files which are XML.
  • Some errors are likely - I'm not sure anyone has converted the compact files in years.
  • The tool to generate .rng from .rnc files is trang (https://github.com/relaxng/jing-trang), which runs on the JVM and is quite annoying to install compared to "your usual text editor".
  • The JS components use the full .rng format in XML, so editing this format is already familiar to most people that mod the game.

The .rnc files were added in rP15377 along the .rng files.

Refs #413, #245

Differential Revision: https://code.wildfiregames.com/D3824

Note: See TracTickets for help on using tickets.