#4705 closed enhancement (fixed)
Use TLS for multiplayer lobby
Reported by: | rugk | Owned by: | Itms |
---|---|---|---|
Priority: | Must Have | Milestone: | Alpha 27 |
Component: | Multiplayer lobby | Keywords: | |
Cc: | | Patch: | Phab:D4910 |
Description (last modified by )
TLS should be used in the lobby in order to protect passwords.
Threat model: Any network attacker; three-letter agency or attacker in wifi
Edit: SSL has been considered a requirement since May 25th 2018 following GDPR article 32.1.a https://gdpr-info.eu/art-32-gdpr/ (and there have already been cease-and-desist letters for trivial contact forms not using SSL).
Report by leper on Sep 7th 2013: https://github.com/JoshuaJB/0ad/issues/3
Change History (57)
comment:1 by , 7 years ago
Milestone: | Backlog |
---|---|
Resolution: | → invalid |
Status: | new → closed |
comment:2 by , 7 years ago
Description: | modified (diff) |
---|---|
Resolution: | invalid |
Status: | closed → reopened |
Summary: | Use HTTPS for multiplayer lobby → Use TLS for multiplayer lobby |
Edited.
comment:3 by , 7 years ago
Component: | Network → Multiplayer lobby |
---|
Please let someone else manage this issue.
comment:4 by , 6 years ago
Keywords: | security multiplayer removed |
---|---|
Milestone: | → Backlog |
Just a note for the future work on this: gloox will have to be compiled with gnutls (or possibly openssl, but gnutls is the default) and anonymous authentication is currently broken with gnutls >= 3.6.0 (see #5033 and the related openSUSE bug). This might be fixed when we work on this.
comment:5 by , 6 years ago
TLS enabled in r21875.
As mentioned in #5033 I only see the gloox unit tests being affected, so this should not affect 0 A.D. code in any way on any platform. I also only see the choice between OpenSSL and GnuTLS in the INSTALL file and online FAQ, but no choice to compile without any TLS support. I asked Imarok, who built the last gloox binary, to confirm.
comment:6 by , 6 years ago
Milestone: | Backlog → Alpha 23 |
---|
comment:7 by , 6 years ago
Problem.
TLS does seem to work according to Angen's test yesterday.
That means the Windows gloox binary comes with either GnuTLS or OpenSSL.
According to Itms' uncertain comments in #5033, it does not use GnuTLS.
Philip posted a comment the next day on IRC that no one cared about:
2018-02-13-QuakeNet-#0ad-dev.log:
13:37 < Philip> Itms: Regarding the OpenSSL thing: You might need to check that the OpenSSL license is compatible with GPL (v3 for Gloox, I think?) - I vaguely remember there being problems in the past, and I don't know if they've been resolved yet
13:38 < Philip> See e.g. https://lwn.net/Articles/428111/
So it seems we need a new gloox binary because we have been violating a license all along.
comment:8 by , 6 years ago
I do not know about gloox, but libcurl on Windows gets TLS support from a library (called winssl if my memory is correct) provided by the platform. Maybe the same thing is used by gloox by default on Windows. In any case we are not using gnutls nor openssl to build gloox on Windows (maybe I wasn't the last one to update it but when I did I didn't use either of these libs, and I assume other people have followed the same instructions as I did).
comment:9 by , 6 years ago
In that case we probably use WinTLS, which is the default on Windows and, if I'm not mistaken, not supported on Windows XP.
comment:10 by , 6 years ago
Milestone: | Alpha 23 → Backlog |
---|
Certificate and server are correct; gloox implements certificate verification incorrectly and rejects the certificate: https://bugs.camaya.net/ticket/?id=280
comment:11 by , 6 years ago
So the error is "unknown certificate issuer".
Are you sure you actually tested it on a system where either the Let's Encrypt root certificate or the "IdenTrust DST Root CA X3" is included? If neither is, that would result in exactly such an error. See https://letsencrypt.org/certificates/ for the details.
follow-up: 33 comment:13 by , 6 years ago
Solved: Licensing
Issues: Windows XP support dropped? gloox broken.
I guess you wanted to hear the story:
- So it seems we use SChannel implementation on Windows builds and GnuTLS on unix builds.
- Therefore no GPL v3 licensing issue with OpenSSL.
- But we might have dropped Windows XP without noticing, as SChannel is either not or not well supported there.
Certificate Verification:
- Since the Let's Encrypt certificate was installed, the certificate verification still failed on Windows (user1) and Ubuntu (mine) but not Debian/unstable (Dunedan).
- In Phab:D1620 the certificate error strings were displayed from gloox and it said the Issuer (Let's Encrypt) was not recognized.
- The Let's encrypt certificate was correctly installed according to the manual.
- ejabberd documentation says it serves certificate chains well.
- wireshark confirms that ejabberd sends a certificate chain containing the Let's Encrypt root CA cert signed by IdenTrust and the lobby certificate signed by Let's Encrypt.
- libCURL in 0 A.D. accepts the Let's Encrypt certificate from ModIO.cpp (https://api.mod.io/) and UserReporter.cpp (when changing the upload url to https://play0ad.com/).
- That already indicates that the error should be searched in the gloox code and how it uses GnuTLS and SChannel.
- The rootCA was in my directory and also adding the Let's Encrypt rootCA didn't change the result.
- The gnutls-cli app works as intended with https but didn't establish a connection to xmpp (although it claims it can).
- Joining jabber.de with 0 A.D. works, but also rejects the cert. Half of all XMPP servers I found use Let's Encrypt.
- Installing the most recent gloox version didn't change the result.
- Setting the GNUTLS debugging level also didn't show anything relevant.
- Since it was a trust issue, looking at how gloox reads the root certificate authority certificates that the operating system trusts revealed the issue in gloox. Namely that it doesn't read any.
- libCURL code hardcodes certificate paths like /etc/ssl/certs/, that's not an option for us (nor advisable for gloox nor libCURL since it's the OS decision which certs to trust).
- Browsing through the GnuTLS specifications shows gnutls_certificate_set_x509_system_trust should be used to load the OS certificates. Doing so fixes the Issuer bug.
- Then there's a wrong negation in the gloox GnuTLS client code. If both are fixed, the certificate is accepted. Hence the patch uploaded to https://bugs.camaya.net/ticket/?id=280
- The gloox GnuTLS implementation also doesn't seem to pass on the given caCerts in the constructor, but I don't want to go down that rabbithole too.
- The gloox GnuTLS code seems about untouched since 2012 before the gnutls_certificate_set_x509_system_trust command became available. But that's a discussion for the gloox bugtracker.
- Since the certificate is rejected on the Windows build as well and since we assume SChannel is used, the gloox SChannel implementation may be assumed broken as well.
- Once upstream gloox/GnuTLS bug is fixed, we may rebuild the Windows gloox library with GnuTLS if that's the case.
comment:14 by , 6 years ago
Description: | modified (diff) |
---|
comment:16 by , 6 years ago
Itms tested 0 A.D. on Windows XP, found that libCURL complains about outdated TLS support and gloox connects as expected (ignoring certs).
follow-up: 48 comment:19 by , 5 years ago
TLS 1.0 should be disabled serverside, because it is prone to BEAST attacks. However the Windows client currently requires that version.
comment:20 by , 5 years ago
Description: | modified (diff) |
---|
comment:23 by , 5 years ago
The lobby player "Bolt" has reported that on Windows 10, the a23b version has complained about the lobby certificate being invalid. That is true since it recently expired, however default.cfg disables certificate validation by default (since it's bugged in gloox). Bolt did not know about such technicalities and is unlikely to have edited the config. Itms has tested on Windows 10 too and didn't experience this. Not sure how this can happen, perhaps it's another bug in gloox.
comment:24 by , 5 years ago
Just tried on Windows 10, and didn't get any error with or without TLS. I guess it depends on the Windows 10 version he has. Did he give any other information?
follow-up: 28 comment:25 by , 5 years ago
He should try again now that the cert is valid, just to exclude that possibility.
comment:26 by , 5 years ago
(01:21:41 PM) randomid: 0ad 23b is not working out of the box on arch linux
(01:21:59 PM) randomid: TLS connection error occurs on Lobby connect
(01:22:20 PM) randomid: needed to set up local.cfg and deactivate verification
It sounds like XmppClient.cpp fails to get the default.cfg value and uses the C++ hardcoded default true.
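The suspected failure mode can be sketched roughly as follows. This is a hypothetical model, not the actual XmppClient.cpp code: `ConfigSketch`, `getBool`, `shouldVerifyCertificate` and the key name `lobby.verify_certificate` are assumptions made for illustration. It only shows how a lookup that falls back to a hardcoded `true` turns verification on whenever the key is missing from the loaded config.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the values loaded from default.cfg / local.cfg.
using ConfigSketch = std::map<std::string, std::string>;

// Returns the configured boolean, or hardcodedDefault if the key is absent.
bool getBool(const ConfigSketch& cfg, const std::string& key, bool hardcodedDefault)
{
    const auto it = cfg.find(key);
    if (it == cfg.end())
        return hardcodedDefault; // key missing (e.g. stale config): the C++ default wins
    return it->second == "true";
}

// The key name is an assumption; the hardcoded default of true follows the comment above.
bool shouldVerifyCertificate(const ConfigSketch& cfg)
{
    return getBool(cfg, "lobby.verify_certificate", true);
}
```

With an empty `ConfigSketch` the function returns `true`, while a config containing `lobby.verify_certificate = false` returns `false`, which would match the observation that setting the value in local.cfg works around the problem.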
follow-up: 29 comment:27 by , 5 years ago
(01:38:53 PM) randomid: default.cfg my installation of 23b used is different from default.cfg from github
(01:42:31 PM) randomid: its cfg from 23a.
(01:43:12 PM) randomid: default.cfg was not updated, while installation of 23b
Edit:
The bundle at http://releases.wildfiregames.com/ contains the most recent default.cfg. https://www.archlinux.org/packages/community/any/0ad-data/ appears up to date too, so perhaps he updated 0ad but not 0ad-data. (Otherwise it sounded exactly like what Bolt had on Win10.)
comment:28 by , 5 years ago
Replying to Itms:
He should try again now that the cert is valid, just to exclude that possibility.
I can confirm the certificate was renewed, Bolt did try again and still gets the error somehow. Bolt also has the new public.zip (which contains default.cfg), because he saw the TLS options in the menu.
comment:29 by , 5 years ago
Replying to elexis:
The bundle at http://releases.wildfiregames.com/ contains the most recent default.cfg. https://www.archlinux.org/packages/community/any/0ad-data/ appears up to date too, so perhaps he updated 0ad but not 0ad-data. (Otherwise it sounded exactly like what Bolt had on Win10.)
I just installed 0ad on a fresh install of Arch and everything works well. The default.cfg is correct. So I guess it was an installation bug, maybe they should remove everything and reinstall.
comment:30 by , 5 years ago
Looks like he updated the 0ad Arch package but not 0ad-data, and 0ad-data should be a dependency of 0ad.
(02:08:13 PM) randomid: 0ad-data should be dependecie of 0ad
(02:09:01 PM) randomid: i update 0ad from repo. i checked now, 0ad-data was not updated. thats explains my failure
Only Bolt's Windows 10 case remains open (maybe a gloox bug or some firewall thing?).
comment:31 by , 5 years ago
It's not related to the ticket itself but to the same issue as the last comments. It seems there was some confusion and 0ad-data wasn't updated in Debian, which left Debian users who wanted to use the multiplayer lobby in weird states: https://metadata.ftp-master.debian.org/changelogs//main/0/0ad/0ad_0.0.23.1-2_changelog
comment:32 by , 5 years ago
For everyone's information, gloox 1.0.22 was released two weeks ago, it includes elexis' patch for TLS verification on Unix with gnutls. However I built 1.0.22 on Windows and it doesn't fix the TLS verification with SChannel.
So upgrading is not really allowing us to verify certs yet.
comment:33 by , 5 years ago
Since that diff only modified the GnuTLS implementation, it couldn't fix the SChannel implementation.
One could only build it on windows with GnuTLS if one wants to burn some time, but that only hides the upstream issue and the time could be invested in fixing SChannel on gloox.
That would be in tlsschannel.cpp.
Notice there are commented out certificate variables in the tlsgnutlsclient.cpp that I marked as TODO in the attachment and I see more commented out certificate variables in tlsschannel.cpp.
(09:47:30 PM) Emperior: elexis TLS handshake did not complete successfully. the certificate has been revoked. The certificate hasn't got a known issuer. The certificate has not been issued for the peer we're connected to.
(09:48:01 PM) Emperior: windows 10
Judging from the error message, I guess it's broken from left to right in said file.
follow-up: 35 comment:34 by , 5 years ago
I experienced an issue with tls option turned on. No issue without tls encryption (option turned off).
TLS handshake did not complete successfully
The certificate hasn't got a known issuer.
The certificate has been revoked.
The certificate has not been issued for the peer we are connecting to.
I am on Windows 10 version 1903. The problem started when I first install 0ad on the computer ~1 month ago. The problem only exist for one user (me) on the computer. Other user accounts including new accounts work fine with tls encryption on. I am not sure why this would happened and all users use the same copy of 0ad on C:\Program Files. How can I fix this problem? I didn't think I have touched anything related to tls except git with ssh but should not be related to tls?? Maybe due to wsl turned on with ubuntu on it? I am very confused now.
comment:35 by , 5 years ago
Replying to DeadByIdentityV:
I experienced an issue with tls option turned on. No issue without tls encryption (option turned off).
TLS handshake did not complete successfully
The certificate hasn't got a known issuer.
The certificate has been revoked.
The certificate has not been issued for the peer we are connecting to.
I am on Windows 10 version 1903. The problem started when I first install 0ad on the computer ~1 month ago. The problem only exist for one user (me) on the computer. Other user accounts including new accounts work fine with tls encryption on. I am not sure why this would happened and all users use the same copy of 0ad on C:\Program Files. How can I fix this problem? I didn't think I have touched anything related to tls except git with ssh but should not be related to tls?? Maybe due to wsl turned on with ubuntu on it? I am very confused now.
I'm not sure what is causing this but meanwhile you can disable TLS in the game options. I would bet on permission issues, I think 0 A.D. usually installs in APPDATA for that very specific reason.
comment:36 by , 5 years ago
The problem only exist for one user (me) on the computer. Other user accounts including new accounts work fine with tls encryption on. I am not sure why this would happened and all users use the same copy of 0ad on C:\Program Files
Thanks for the report!
In order to connect to the lobby, 0 A.D. uses the gloox library, which has (somewhat broken) TLS support. I remember fixing a bug in the Linux TLS part of gloox where gloox was not able to read from the system store of certificates, and I think the error message was the same. So it might be that the Windows system is configured in such a way that 0 A.D. does not have permission to read the system store of certificates for this user, and thus fails to verify them, whereas other users do have that permission.
https://docs.microsoft.com/en-us/dotnet/framework/wcf/feature-details/working-with-certificates
The "Trusted Root Certification Authorities" and "Chain Trust" part is relevant. There is the root certificate authority (looks like "Internet Security Research Group" currently) that certifies the "Let's Encrypt Authority X3" certificate authority, and that one certified Wildfire Games certificate.
So if this one Windows user cannot access one of these certificates in the 'chain of trust', i.e. one of the root certificate authority certificates, then the chain of trust is broken and the error pops up.
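To make the chain-of-trust argument concrete, here is a deliberately simplified sketch. It only compares names and performs no cryptographic signature checks, so it is not how SChannel or GnuTLS verify certificates; `CertSketch`, `ChainResult` and `checkChain` are hypothetical names introduced for illustration. The point is that the same chain is accepted or rejected depending on which roots the current user can read.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Simplified certificate: only the names matter for this sketch.
struct CertSketch
{
    std::string subject;
    std::string issuer;
};

enum class ChainResult { Trusted, BrokenChain, UnknownIssuer };

// chain is ordered leaf first; trustedRoots models the roots this user can read.
ChainResult checkChain(const std::vector<CertSketch>& chain,
                       const std::set<std::string>& trustedRoots)
{
    if (chain.empty())
        return ChainResult::BrokenChain;

    // Every certificate must be issued by the next one in the chain.
    for (std::size_t i = 0; i + 1 < chain.size(); ++i)
        if (chain[i].issuer != chain[i + 1].subject)
            return ChainResult::BrokenChain;

    // The last issuer must be a root the current user can access.
    if (trustedRoots.count(chain.back().issuer) == 0)
        return ChainResult::UnknownIssuer;

    return ChainResult::Trusted;
}
```

For a chain of the lobby certificate (issued by "Let's Encrypt Authority X3") followed by that intermediate (issued by "Internet Security Research Group"), a store containing the ISRG root yields `Trusted`, while an empty or unreadable store yields `UnknownIssuer`, which corresponds to the "hasn't got a known issuer" message.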
So if you have some time and will to test, it would be good if you could try to find any permission difference for access of certificates in the Windows certificate manager.
Notice that this only relates to the public keys, the private keys of certificates are not published and no certificates should have to be added.
Other than a permission issue, I could also imagine that a certificate in the chain of trust might be outdated for one user but updated for the other users (sounds unlikely though).
In the Windows gloox code tlsschannel.cpp, the function validateCert has some cases where validation is abandoned; it might be that some of these cases are triggered depending on the user:
// Get server's certificate.
if( QueryContextAttributes( &m_context, SECPKG_ATTR_REMOTE_CERT_CONTEXT, (PVOID)&remoteCertContext ) != SEC_E_OK )
{
  //printf("Error querying remote certificate\n");
  // !!! THROW SOME ERROR
  break;
}

if( uServerName == 0 )
{
  //printf("SEC_E_INSUFFICIENT_MEMORY ~ Not enough memory!!!\n");
  break;
}

// convert into unicode
csizeServerName = MultiByteToWideChar( CP_ACP, 0, serverName, -1, uServerName, csizeServerName );
if( csizeServerName == 0 )
{
  //printf("SEC_E_WRONG_PRINCIPAL\n");
  break;
}

if( !CertGetCertificateChain( 0, remoteCertContext, 0, remoteCertContext->hCertStore, &chainParameter, 0, 0, &chainContext ) )
{
  // DWORD status = GetLastError();
  // printf("Error 0x%x returned by CertGetCertificateChain!!!\n", status);
  break;
}

if( !CertVerifyCertificateChainPolicy( CERT_CHAIN_POLICY_SSL, chainContext, &policyParameter, &policyStatus ) )
{
  // DWORD status = GetLastError();
  // printf("Error 0x%x returned by CertVerifyCertificateChainPolicy!!!\n", status);
  break;
}

if( policyStatus.dwError )
{
  //printf("Trust Error!!!}n");
  break;
}
The error messages you posted above come from the 0ad code XmppClient::onTLSConnect, where the argument glooxwrapper::CertInfo& info was passed by gloox tlsschannel.cpp, where status = ConnTlsFailed, and 0ad's XmppClient::TLSErrorToString shows the error message for the bits:
CertSignerUnknown = 2, /**< The certificate hasn't got a known issuer. */
CertRevoked = 4, /**< The certificate has been revoked. */
CertWrongPeer = 32, /**< The certificate has not been issued for the
i.e. that's the number 38 (2 + 4 + 32).
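As a small sketch of how such a status value maps back to messages: the three bit values are the ones quoted above, the remaining gloox status bits are left out, and `CertStatusSketch` and `describeStatus` are hypothetical names rather than the actual XmppClient::TLSErrorToString implementation. The wrong-peer message text is taken from the player report earlier in this ticket.

```cpp
#include <string>
#include <vector>

// Bit values as quoted above; other gloox status bits are omitted in this sketch.
enum CertStatusSketch
{
    CertSignerUnknown = 2,
    CertRevoked = 4,
    CertWrongPeer = 32
};

// Returns one message per set bit, in ascending bit order.
std::vector<std::string> describeStatus(int status)
{
    std::vector<std::string> messages;
    if (status & CertSignerUnknown)
        messages.push_back("The certificate hasn't got a known issuer.");
    if (status & CertRevoked)
        messages.push_back("The certificate has been revoked.");
    if (status & CertWrongPeer)
        messages.push_back("The certificate has not been issued for the peer we're connected to.");
    return messages;
}
```

Calling `describeStatus(38)` yields all three messages, while a value with a single bit set (for example 2) yields exactly one, which is why a combination of three messages is surprising if the code only ever sets one bit at a time.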
I don't see how the function SChannel::validateCert in gloox tlsschannel.cpp can combine three error messages when its code (switch( policyStatus.dwError )) only sets at most one of these bits at a time. And that's the only code in that file that sets m_certInfo.status.
This makes me think that your user account might not even use the tlsschannel.cpp code path to apply TLS.
The only alternative for Windows might be tlsopensslbase.cpp, i.e. OpenSSL used for your user account but SChannel used for the other users. (One can compile with Gnu TLS on windows too, but you said you did not.)
Looking at OpenSSLBase::handshake() in that file, voila!
m_certInfo.status |= CertWrongPeer;
m_certInfo.status |= CertNotActive;
m_certInfo.status |= CertExpired;
So for some reason your user account uses OpenSSL and the other user accounts use SChannel, and OpenSSL is bugged (neither the gloox nor the 0ad developers tested it) or misconfigured on your computer.
Any idea how it could happen that your account uses OpenSSL and the other accounts SChannel?
Otherwise I suspect readers of this ticket can reproduce and fix the OpenSSL thing, although OpenSSL was not intended to be used.
Which of the two certificate libraries gloox chooses seems to depend on HAVE_OPENSSL and HAVE_WINTLS. That is defined via configure.ac:
configure.ac: AC_DEFINE(HAVE_GNUTLS, 1, [Define to 1 if you want TLS support (GnuTLS). Undefine HAVE_OPENSSL.])
configure.ac: AC_DEFINE(HAVE_OPENSSL, 1, [Define to 1 if you want TLS support (OpenSSL). Undefine HAVE_GNUTLS.])
So that choice is set at compile time, which means it makes no sense to me because you use the same pyrogenesis.exe file.
So I wonder, are you actually sure you use the same pyrogenesis.exe?
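The compile-time argument can be illustrated with a tiny sketch. The macro names are the ones mentioned above, but `compiledTlsBackend` is a hypothetical helper and the order of the checks is arbitrary here; it is not taken from the gloox sources. Whatever branch the preprocessor keeps is baked into the binary, so every user running the same executable gets the same backend.

```cpp
#include <string>

// Exactly one branch survives preprocessing, so the result is fixed per binary.
// The function and the order of the checks are assumptions for illustration.
inline std::string compiledTlsBackend()
{
#if defined(HAVE_GNUTLS)
    return "GnuTLS";
#elif defined(HAVE_OPENSSL)
    return "OpenSSL";
#elif defined(HAVE_WINTLS)
    return "SChannel";
#else
    return "none";
#endif
}
```

Nothing in such a selection can depend on the logged-in user account, which is why differing behaviour between accounts points to something outside the backend choice.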
Edit: Or my assumption/false memory/wrong information that 0ad on Windows uses SChannel was wrong and OpenSSL is used in any case, but OpenSSL has some user-specific behavior.
comment:38 by , 5 years ago
I was wrong because the flags reported differ from the flags that the OpenSSL gloox client sets, for example CertRevoked is not set in tlsopensslbase.cpp.
We were actually not 100% sure whether gloox from r19608 was built with SChannel, OpenSSL or GnuTLS.
Vladislav tested and saw that schannel.dll was loaded.
Looking further at the gloox SChannel code, it seems the certificate status might be uninitialized and thus return the weird error messages.
DeadByIdentityV can you try to see if there is any difference in certificates for Let's Encrypt or Internet Security Research Group (ISRG) for the two users?
https://www.tbs-certificates.co.uk/FAQ/en/disable-certificate-windows.html
Or maybe a local firewall that is only enabled for that one user?
comment:39 by , 5 years ago
Milestone: | Backlog → Alpha 24 |
---|
comment:40 by , 5 years ago
Cc: | added |
---|
comment:41 by , 5 years ago
Cc: | removed |
---|
comment:42 by , 5 years ago
Cc: | added |
---|
comment:45 by , 5 years ago
For the ones who are here due to the Windows 10 error showing weird combinations of TLS certificate errors:
I didn't initialize m_certStatus in the XmppClient.
So if there is a path in gloox that calls XmppClient::onDisconnect with gloox::ConnectionError::ConnTlsFailed but does not call XmppClient::onTLSConnect, then m_certStatus was an uninitialized value, and thus would show random certificate error strings if the connection failed this way.
From a superficial look at the gloox code, at least there are multiple cases that call disconnect( ConnTlsFailed );, so it doesn't sound unlikely that the TLS handshake failed prior to onTLSConnect being called.
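A reduced sketch of the kind of fix this implies, under the assumption that the client keeps the certificate status as a plain member: `LobbyClientSketch`, `ConnectionErrorSketch`, `m_hasCertInfo` and the message texts are hypothetical and do not reproduce the real XmppClient. Default member initializers make the disconnect path well-defined even when onTLSConnect was never reached.

```cpp
#include <string>

// Hypothetical, reduced model of the client-side state; not the real XmppClient.
enum ConnectionErrorSketch { ConnNoError, ConnTlsFailed };

class LobbyClientSketch
{
public:
    // Called only if the handshake got far enough to produce certificate info.
    void onTLSConnect(int certStatus)
    {
        m_certStatus = certStatus;
        m_hasCertInfo = true;
    }

    // Builds the message shown to the user on disconnect.
    std::string onDisconnect(ConnectionErrorSketch error) const
    {
        if (error != ConnTlsFailed)
            return "Disconnected.";
        if (!m_hasCertInfo)
            return "TLS handshake failed before certificate information was available.";
        return "TLS handshake failed, certificate status bits: " + std::to_string(m_certStatus);
    }

private:
    // Default member initializers keep the state well-defined on every path.
    int m_certStatus = 0;
    bool m_hasCertInfo = false;
};
```

Calling `onDisconnect(ConnTlsFailed)` on a fresh object then reports a generic handshake failure instead of decoding whatever happened to be in memory.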
This however does still not explain why the certificate didn't work for these users in specific. It sounded like the local certificate store had some root CA removed or disabled or otherwise inaccessible; or the selected cipher not being supported or disabled on the target platform.
comment:47 by , 5 years ago
In a24 svn lobby today:
(09/25/2019 08:29:38 PM) Emperior: The certificate hasn't got a known issuer lol
(09/25/2019 08:30:01 PM) Emperior: The certificate is not yet active.
(09/25/2019 08:30:19 PM) Emperior: the certificate has not been issued for the peer connected to.
(09/25/2019 08:30:31 PM) Emperior: The certificate signer is not certificate authority. xD
(09/25/2019 08:30:45 PM) elexis: you updated svn?
(09/25/2019 08:30:53 PM) Emperior: y
comment:48 by , 4 years ago
2x TODO:
Replying to elexis:
TLS 1.0 should be disabled serverside, because it is prone to BEAST attacks. However the Windows client currently requires that version.
- Debian 10 has disabled TLS 1.0 by default (and the wfg server uses that).
- Not only Windows XP uses that version but also the pyrogenesis.exe compiled by WFG uses gloox that only supports TLS 1.0. I didn't check whether it's a gloox defect or an issue how it was used.
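Why these two points together break the connection can be shown with a simplified sketch of version negotiation. Real TLS negotiation involves more than comparing version lists, and `TlsVersion`, `kNoCommonVersion`, `negotiate` and the concrete version sets used below are assumptions for illustration only.

```cpp
#include <algorithm>
#include <vector>

// Protocol versions encoded as major*10 + minor, e.g. 10 = TLS 1.0, 12 = TLS 1.2.
using TlsVersion = int;

// Sentinel returned when client and server share no protocol version.
constexpr TlsVersion kNoCommonVersion = 0;

// Returns the highest version both sides support, or kNoCommonVersion.
TlsVersion negotiate(const std::vector<TlsVersion>& client,
                     const std::vector<TlsVersion>& server)
{
    TlsVersion best = kNoCommonVersion;
    for (TlsVersion v : client)
    {
        const bool serverHasIt = std::find(server.begin(), server.end(), v) != server.end();
        if (serverHasIt && v > best)
            best = v;
    }
    return best;
}
```

A client limited to `{10}` talking to a server that offers only `{12, 13}` ends up with `kNoCommonVersion`, i.e. no handshake is possible until either the server re-enables TLS 1.0 or the client learns a newer version.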
So TODO (1) is getting the client to support protocols and ciphers more recent than ten years old, and TODO (2) is making the "connect without TLS" option available on the lobby login/register page. There already was a patch for that in Phab:D1679, but the less complex version was committed instead in r21932. The option should display a message box reminding the user of the effects and dangers of not using encryption when connecting. Experience has shown that many players did not find that option in the menu, or didn't look for it, when they wanted to connect regardless of unavailable transport layer security.
Aside from that there were also bugs mentioned above that weren't reproduced fully yet.
comment:49 by , 3 years ago
Milestone: | Alpha 24 → Alpha 25 |
---|
comment:52 by , 20 months ago
I believe both open issues for Windows, missing support for TLS > 1.0 and failing certificate verification, which are caused by missing functionality in the SChannel support of gloox, are already fixed in gloox trunk. So once a successor for the current version 1.0.24 of gloox gets released, upgrading the bundled gloox binaries for Windows should fix these issues.
comment:54 by , 15 months ago
Cc: | removed |
---|---|
Milestone: | Backlog → Alpha 27 |
Owner: | set to |
Status: | reopened → new |
I'm going to try and update gloox to the trunk version on Windows (#3004) in the hope of fixing this.
comment:55 by , 15 months ago
Patch: | → Phab:D4910 |
---|
Upgrading gloox fixes both remaining issues on Windows: TLS connection works, it also works when forcing TLS 1.2 client-side (assuming the latter actually happens inside gloox).
comment:57 by , 13 months ago
For everyone's information, it appears that fixes for Windows in gloox are not being included in 1.0.x releases upstream. Thus we shall keep using the development version of gloox on Windows, until 1.1.x releases start to happen.
TLS, not HTTPS. Since the rest of the ticket is quite wrong based on that confusion I'm closing this as invalid. If you care about rewriting that into something sensible, reopen it; also, there is a multiplayer lobby component.