QRadar authentication issues using LDAP Auth Module - troubleshooting


Dependencies - need to know

Starting with QRadar Release 7.3.3FP5 and 7.4.1FP1, and in all later versions of QRadar, the "Active Directory" authentication module is deprecated in the QRadar Console.

After the authentication module has been switched to LDAP or LDAPS in the QRadar authentication settings, keep the following dependencies in mind if your QRadar users run into "login failed" issues:

If security requirements mandate LDAP authentication over SSL, the following limitation can be the culprit.

What does this mean?

For example: you are using a pool of 4 LDAP servers combined under an alias "pool.domain.com". In this use case the pool was entered as "Server URL", like ldaps://pool.domain.com:636. Normally a single LDAP server like ldaps://server.domain.com:636 is used in the LDAP authentication settings. Suddenly some users report that their login to QRadar fails "randomly" with a "wrong username or password" message. What was causing this issue? To say it up front: it was neither a wrong username/password combination nor a typing error :)

This behaviour started shortly after the change of the authentication settings. I asked myself what was going on and why this appeared to be a random error.

I investigated the qradar.log file and found some specific messages. The following sample messages can occur in these cases:

Oct 12 08:13:46 tomcat[14230]: 2020-10-12 08:13:46,701 [QRADAR] [user@ (6416) /console/login] edu.vt.middleware.ldap.auth.SearchDnResolver: [WARN] Error performing LDAP operation, retrying (attempt 0)

Oct 12 08:13:46 tomcat[14230]: javax.naming.CommunicationException: simple bind failed: pool.ldap.com:636 [Root exception is java.net.SocketException: Connection reset]
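
To see how often and when these messages occur, the log can be filtered for the "simple bind failed" lines. The following is only a minimal sketch: the function name ldap_bind_failures is made up for illustration, and the default path /var/log/qradar.log is an assumption that may differ on your system. It prints the timestamp and the LDAP endpoint of each failed bind, which helps to correlate the failures with the logins reported by users.

```shell
#!/usr/bin/env bash
# Sketch: list timestamp and endpoint of every failed LDAP bind in a log file.
# Usage: ldap_bind_failures [LOGFILE]
# The default log path is an assumption, adjust it to your environment.
ldap_bind_failures() {
  local log=${1:-/var/log/qradar.log}
  awk '/simple bind failed:/ {
         # the endpoint (host:port) is the field right after "failed:"
         for (i = 1; i < NF; i++)
           if ($i == "failed:") print $1, $2, $3, $(i + 1)
       }' "$log"
}

# Example call (commented out so that sourcing this file has no side effects):
# ldap_bind_failures /var/log/qradar.log
```

When a pool alias is configured as "Server URL", the endpoint shown is the alias and not the individual server, so this output only tells you when the resets happen, not which pool member caused them.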

Troubleshooting progress - Part 1

Continuing the analysis with my own randomly failing logins and using tcpdump to investigate, I got on the track of what happens at that moment. Every time I refreshed the browser, restarted the browser, or just waited a few minutes after my last failed login, another server of the pool responded. This was expected and is the whole purpose of the pool: preventing an LDAP outage. However, I identified two servers that always answered when the login was successful and two servers that always answered when the login failed.

Equipped with these freshly obtained details, I reached out to the LDAP team, hoping to get a "quick and short answer" as to why this happens. But it was not as simple as that. Together we investigated the settings of the LDAP servers and after a while identified a configuration mismatch on the affected LDAP servers, which caused the mentioned "connection resets" that are displayed as "failed logins" in the UI.

Troubleshooting progress - Part 2

Those two servers were new physical machines, set up from scratch in Q1 this year. Some "security provider" services were disabled and important new registry settings regarding TLS were applied on those machines. On each new server (where the login failed) the setting "TLSv1.2 only" was enabled! The others (where the login was successful!) had settings that allow any connection except TLSv1.2. And that was the culprit!

Under the concept of hardening the communication, as in this use case, only this "new" setting "TLSv1.2 only" will be allowed. By the way, at this point we also uncovered a historical, deviating misconfiguration, formerly known as a "workaround" or "compatibility reasons" :)
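
A mismatch like this can also be made visible from the command line by forcing one protocol version per handshake against every member of the pool. The following is a rough sketch, not the method we used during the investigation: the function names probe_tls and probe_pool are hypothetical, the pool members are resolved with getent, and the result relies on the exit status of openssl s_client, which is an assumption about your local OpenSSL build. Newer OpenSSL builds may refuse -tls1 or -tls1_1 on the client side, which would show up as "failed" regardless of the server settings.

```shell
#!/usr/bin/env bash
# Sketch: try one TLS handshake with a forced protocol version.
# Prints "HOST PROTO ok" or "HOST PROTO failed".
# proto is one of: tls1, tls1_1, tls1_2 (passed to s_client as -tls1 etc.)
probe_tls() {
  local host=$1 port=$2 proto=$3
  if openssl s_client -connect "${host}:${port}" "-${proto}" \
       </dev/null >/dev/null 2>&1; then
    echo "$host $proto ok"
  else
    echo "$host $proto failed"
  fi
}

# Sketch: probe every address behind a pool alias with each protocol version.
# Usage: probe_pool POOL_ALIAS [PORT]
probe_pool() {
  local pool=$1 port=${2:-636} ip proto
  for ip in $(getent ahosts "$pool" | awk '{print $1}' | sort -u); do
    for proto in tls1 tls1_1 tls1_2; do
      probe_tls "$ip" "$port" "$proto"
    done
  done
}

# Example call (commented out so that sourcing this file has no side effects):
# probe_pool pool.domain.com 636
```

With the settings described above, a "TLSv1.2 only" server would be expected to report "ok" only for tls1_2, while the older servers would report "failed" for tls1_2 and "ok" for at least one of the other versions.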

Why is this good to know?

Because there is a known permanent restriction, documented in APAR IV96427: QRadar is currently unable to connect to TLSv1.2 LDAP servers over SSL connections!

Details found here: APAR IV96427

In summary: we hit this "unable to connect to TLSv1.2 LDAP servers over SSL connections" issue. Combined with the deviating and contradictory TLS settings on the LDAP servers involved, this issue occurred "randomly".

Final Solution Settings

So at this point, if you hit this issue in a related use case, review the TLS settings on each LDAP server you want to use. This helps you to avoid this kind of impact. Also make sure that the certificates of the available LDAP servers are applied to QRadar. You can pull them down with the following command and add them in /opt/qradar/conf/trusted_certificates:

openssl s_client -connect ad_host.example.com:636 -showcerts </dev/null 2>/dev/null | openssl x509 -outform pem > ad_ldap_server.pem
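
Because a pool consists of several servers, the command above can be wrapped in a small function and run once per pool member. This is only a sketch: the function name fetch_ldap_cert, the host names in the example call, and the output directory are hypothetical. It saves the first certificate presented by the server as HOST.pem and reports an error if nothing usable was received.

```shell
#!/usr/bin/env bash
# Sketch: save the certificate presented by one LDAP server as PEM.
# Usage: fetch_ldap_cert HOST [PORT] [OUTDIR]
fetch_ldap_cert() {
  local host=$1 port=${2:-636} outdir=${3:-.}
  local out="$outdir/${host}.pem"
  openssl s_client -connect "${host}:${port}" -showcerts </dev/null 2>/dev/null \
    | openssl x509 -outform pem > "$out" 2>/dev/null
  if [ -s "$out" ]; then
    echo "saved $out"
  else
    # nothing usable was written, remove the empty file again
    rm -f "$out"
    echo "no certificate received from ${host}:${port}" >&2
    return 1
  fi
}

# Example call with made-up host names (commented out on purpose):
# for h in ldap1.domain.com ldap2.domain.com; do
#   fetch_ldap_cert "$h" 636 /tmp/ldap_certs
# done
```

Review the saved files before copying them to /opt/qradar/conf/trusted_certificates.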

And finally, if TLSv1.2 is required, modify the settings of the authentication module in the authentication setup as follows:

Set SSL to "FALSE"
Set TLS to "TRUE"

Concluding Insight

Once again, at the end of this troubleshooting experience, I can conclude on a positive note that teamwork is an incredibly effective way to find the root cause of an issue in our more and more complex IT security world. It helps to remedy misconfigurations and makes it possible to apply a constructive solution.

And finally, there is the interaction of IT security and IT service management with regard to the PDCA (plan-do-check-act) cycle. For me, to cut a long story short, it makes absolute sense.

I hope this article contains useful information.


Ralph Belfiore