Troubleshooting LDAP authentication issues in DSE, past and present | DataStax Academy: Free Cassandra Tutorials and Training

There is a big improvement in DSE v5.1.6 that makes it much easier to debug LDAP authentication issues compared to earlier versions of DSE.  Here, we will look at using the DSE v5.1.6 improvement to debug LDAP authentication.  We will also talk about LDAP authentication debugging methods to use with older versions of DSE.

Debugging using DSE v5.1.6+

Thanks to work done for DSP-14418 and DSP-15159, you can now get LDAP authentication debug information in your cassandra debug.log, when you add the following to your cassandra logback.xml:

<logger name="com.datastax.bdp.cassandra.auth" level="TRACE"/>

Let's jump right in and look at a few examples to see how it works...

Let's say you've just enabled LDAP Authentication and Authorization in DSE, you have this in your dse.yaml:

$ cqlsh -u thanh -p datastax
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Failed to authenticate to 127.0.0.1: Error from server: code=0100 [Bad credentials] message="Failed to login. Please re-try."',)})

So, you've got cassandra internal users and you've got LDAP users (in MS Active Directory) who should be allowed to log into your dse cluster...but, your first test to log into cqlsh as an LDAP user fails.  What do you do?

You, of course, turn on tracing by adding the above-mentioned logger to your logback.xml.  As a result, you now see TRACE level messages in your debug.log whenever a user tries logging in.  When your ldap user failed to log into cqlsh:

$ cqlsh -u thanh -p datastax
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Failed to authenticate to 127.0.0.1: Error from server: code=0100 [Bad credentials] message="Failed to login. Please re-try."',)})

you saw the following was written to the debug.log:

WARN  [Native-Transport-Requests-3] 2018-02-02 04:53:45,234  DseAuthenticator.java:659 - Plain text authentication without client / server encryption is strongly discouraged
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,235  LdapUtils.java:279 - [ldap-authenticate] username: thanh not found in cache
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,236  LdapUtils.java:454 - [ldap-fetch-user] username: thanh connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,236  LdapUtils.java:469 - [ldap-fetch-user] bind to connection
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,239  LdapUtils.java:478 - [ldap-fetch-user] user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=thanh)
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,247  LdapUtils.java:487 - [ldap-fetch-user] found entry for username: thanh
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,248  LdapUtils.java:406 - [ldap-bind] userDN: CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lanconnection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,251  LdapUtils.java:411 - [ldap-bind] SUCCESS - bind succcessful for userDN: CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lan
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,252  LdapUtils.java:301 - [ldap-authenticate] SUCCESS - username: thanh, userDN: CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lan
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,252  LdapUtils.java:543 - [ldap-fetch-user-groups] username: thanh connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,252  LdapUtils.java:557 - [ldap-fetch-user-groups] bind to connection
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,254  LdapUtils.java:628 - [ldap-fetch-user-groups] performing member-of lookup for groups - fetching username: thanh with user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=thanh)
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,257  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: OpscenterAdmin for username: thanh
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,257  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: Administrators for username: thanh
TRACE [Native-Transport-Requests-3] 2018-02-02 04:53:45,260  DseRoleManager.java:281 - [role-management] found the following roles for grantee: - [] using ldap mode
DEBUG [Native-Transport-Requests-3] 2018-02-02 04:53:45,261  DseAuthenticator.java:592 - Failed to authenticate:
com.datastax.bdp.cassandra.auth.DseAuthenticationException: Failed to login. Please re-try.

But, what does it mean?

The TRACE log entries above tell us:

1 - “username: thanh not found in cache” just means that the ldap user (thanh in this example) credentials were not found in the credentials cache.  If you have "credentials_validity_in_ms" (in dse.yaml under ldap_options) set to something other than 0 (0 means credentials caching is disabled), then the ldap user's credentials will be stored (encrypted) in the credentials cache for credentials_validity_in_ms

2 - DSE is using CN=Users,DC=qawin2008,DC=datastax,DC=lan as the user_search_base and (sAMAccountName=thanh) as the user_search_filter (these should line up with with what you have configured for ldap_options user_search_base and user_search_filter in dse.yaml)

3 - DSE was able to find the ldap user record for login user 'thanh': the DN of the ldap user 'thanh' is CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lan

4 - To find thanh's ldap group memberships, DSE here is using a "member-of" lookup (this should line up with what you have configured for ldap_options group_search_type in dse.yaml)

5 - Using the "memberof" lookup, DSE has found that the ldap user thanh belongs to 2 ldap groups:  (1) OpscenterAdmin and (2) Administrators

6 - DSE, using LDAP mode, has found no matching Roles for thanh in the DSE Cassandra database.  The matching roles is an empty list:   found the following roles for grantee: - [] using ldap mode

7 - DSE is using "ldap mode" for role management (this should line up with what you have configured for role_management_options mode in dse.yaml)

So, let's check to see what DSE Cassandra roles we do have:

Log into cqlsh as the cassandra superuser to do the check:

$ cqlsh -u cassandra -p cassandra
Connected to TT_516 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1.2070 | DSE 5.1.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh> list roles;

 role           | super | login | options
----------------+-------+-------+---------
      cassandra |  True |  True |        {}
       intuser1 | False |  True |        {}

(2 rows)

Sure enough, we do not have any Roles in DSE Cassandra that match any of the ldap groups that thanh is a member of.  You were supposed to make sure that there was a Role in DSE Cassandra that matches the name of an ldap group that your ldap user is a member of.  

You need to have a Role in Cassandra that corresponds to an LDAP group so that you can set up permissions to Cassandra resources based on LDAP group membership using cql (eg cqlsh > GRANT CREATE on keyspace TEST_KEYSPACE to LDAP_GRP1)

Since you didn't create the necessary Cassandra Role before, let's do it now and then have ldap user 'thanh' try logging in again:

cassandra@cqlsh> create role OpscenterAdmin with login=true;


$ cqlsh -u thanh -p datastax
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Failed to authenticate to 127.0.0.1: Error from server: code=0100 [Bad credentials] message="Failed to login. Please re-try."',)})

Same problem :(

Let's check the TRACE entries in the debug.log that resulted from thanh's latest cqlsh login attempt:

...
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,185  LdapUtils.java:301 - [ldap-authenticate] SUCCESS - username: thanh, userDN: CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lan
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,185  LdapUtils.java:543 - [ldap-fetch-user-groups] username: thanh connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,185  LdapUtils.java:557 - [ldap-fetch-user-groups] bind to connection
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,187  LdapUtils.java:628 - [ldap-fetch-user-groups] performing member-of lookup for groups - fetching username: thanh with user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=thanh)
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,188  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: OpscenterAdmin for username: thanh
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,189  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: Administrators for username: thanh
TRACE [Native-Transport-Requests-2] 2018-02-02 05:12:15,191  DseRoleManager.java:281 - [role-management] found the following roles for grantee: <role thanh> - [] using ldap mode
DEBUG [Native-Transport-Requests-2] 2018-02-02 05:12:15,191  DseAuthenticator.java:592 - Failed to authenticate:
com.datastax.bdp.cassandra.auth.DseAuthenticationException: Failed to login. Please re-try.

It looks exactly the same!

But, we just created a matching role in DSE Cassandra...didn't we?  Let's check on that:

cassandra@cqlsh> list roles;

 role           | super | login | options
----------------+-------+-------+---------
      cassandra |  True |  True |        {}
       intuser1 | False |  True |        {}
 opscenteradmin | False |  True |        {}

(3 rows)

Hmm, we have "opscenteradmin", not "OpscenterAdmin".  Let's try again:

cassandra@cqlsh> drop role opscenteradmin;
cassandra@cqlsh> create role "OpscenterAdmin" with login=true;
cassandra@cqlsh> list roles;

 role           | super | login | options
----------------+-------+-------+---------
 OpscenterAdmin | False |  True |        {}
      cassandra |  True |  True |        {}
       intuser1 | False |  True |        {}

(3 rows)

That looks better, but again we see the same error:

$ cqlsh -u thanh -p datastax
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Failed to authenticate to 127.0.0.1: Error from server: code=0100 [Bad credentials] message="Failed to login. Please re-try."',)})

And in the debug.log:

...
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,145  LdapUtils.java:301 - [ldap-authenticate] SUCCESS - username: thanh, userDN: CN=thanh,CN=Users,DC=qawin2008,DC=datastax,DC=lan
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,145  LdapUtils.java:543 - [ldap-fetch-user-groups] username: thanh connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,145  LdapUtils.java:557 - [ldap-fetch-user-groups] bind to connection
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,147  LdapUtils.java:628 - [ldap-fetch-user-groups] performing member-of lookup for groups - fetching username: thanh with user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=thanh)
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,149  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: OpscenterAdmin for username: thanh
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,149  LdapUtils.java:657 - [ldap-fetch-user-groups] found group: Administrators for username: thanh
TRACE [Native-Transport-Requests-1] 2018-02-02 05:18:24,151  DseRoleManager.java:281 - [role-management] found the following roles for grantee: <role thanh> - [<role OpscenterAdmin>] using ldap mode
DEBUG [Native-Transport-Requests-1] 2018-02-02 05:18:24,153  DseAuthenticator.java:592 - Failed to authenticate:
com.datastax.bdp.cassandra.auth.DseAuthenticationException: Failed to login. Please re-try.

So, this time, from the above, we see that DSE found the matching DSE Cassandra role ("OpscenterAdmin") !

But, why is it still failing then?

Well, unfortunately, the log doesn't tell us in this case.

So, what can we do now then?

Answer:  Remove some of the complexity.  Right now, you're testing both Authentication and Authorization at the same time.  Turn off authorization and see if you can at least get the ldap authentication piece working.

Ok.  So, you disable Authorization:

authorization_options:
    enabled: false
    transitional_mode: disabled
    allow_row_level_security: false

restart dse for the change to take effect, and now...

$ cqlsh -u thanh -p datastax
Connected to TT_516 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1.2070 | DSE 5.1.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
thanh@cqlsh>

Success!!

But, why doesn't it work when Authorization is enabled?

Let's go back and review all the steps needed to get Authorization working:  https://support.datastax.com/hc/en-us/articles/115005881643

It looks like you missed this step:

cassandra@cqlsh> grant EXECUTE on LDAP SCHEME to "OpscenterAdmin";

Ok, so you try it now then, but:

cassandra@cqlsh> grant execute on ldap scheme to "OpscenterAdmin";
ServerError: java.lang.UnsupportedOperationException: GRANT operation is not supported by the DseAuthorizer if it is not enabled

The above error is basically saying that you need to have Authorization enabled in order to GRANT permissions.

Alright, so you re-enable Authorization, re-start DSE, try the GRANT again, this time, it works:

cassandra@cqlsh> grant execute on ldap scheme to "OpscenterAdmin";
cassandra@cqlsh> quit;

And, when thanh tries to log in via cqlsh now?

$ cqlsh -u thanh -p datastax
Connected to TT_516 at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1.2070 | DSE 5.1.6 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
thanh@cqlsh>

It works!!

What did we learn from this?

1.  TRACE level logging gives you a lot more to go on to figure out what's missing in the config.  Still...

2.  Even with TRACE level logging enabled, the log might not tell you exactly what you missed in your config.

3.  You still need to walk through all the setup steps given by https://support.datastax.com/hc/en-us/articles/115005881643-Setting-Up-L... and make sure that you followed all the steps as instructed.

4.  Most importantly, it really helps if you can remove or reduce the complexity of the problem, like we did when we disabled Authorization to get Authentication working first.  Then, once we confirmed that Authentication worked, we could focus on getting Authorization working as well.

The "old way" (Debugging using DSE pre-v5.1.6)

Because earlier versions of dse don't really have any LDAP debugging facility, you'll have no choice but to do more work to get to the bottom of the problem.  

I have found the following approach to work quite well to debug ldap auth issues in DSE versions that are pre-v5.1.6:

(1) First, just walk through all the required steps again to see if anything was missed in one of the steps.  Refer to this KB Article for LDAP Authentication and Authorization setup steps needed:  https://support.datastax.com/hc/en-us/articles/115005881643-Setting-Up-L...

You might find that you actually missed something or made a typo somewhere.

You might find that you walked through everything again, but found nothing that looks amiss.

In that case,

(2) Try your ldap login test with a different ldap user and/or different ldap group to see if that works.  The other ldap user is more likely to work if it's more simply configured, doesn't have any special characters ( such as "-", "_", "(", ")" ) anywhere in its DN, belongs to fewer ldap groups, or ldap groups with simpler names (no special characters in the ldap group name), etc.  The key is to remove any unnecessary complexity that might compound your problem(s).

Basically, you want to check/try to see if you can get even just one ldap user to successfully log into DSE.  If you can do that, you can compare the working user with the non-working user to find what difference causes the problem.

(3) If you try both the above but still can't get any ldap users logged in to dse, OR, if you have one ldap user working but only one or only some working, and you can't tell what the difference is that allows them to work and prevent your other ldap users from logging in, then try using a different ldap client/utility/browser to access/query your ldap users.  Doing this, you can confirm you are using the correct CNs, DNs, search bases, search filters, etc.  Using this, you can also see/compare different Attributes of different ldap users and ldap groups to see why some users work while others don't.

DSE just uses the standard Apache API to do LDAP queries, so any utility that uses the same API would be a good utility to use to troubleshoot.  My utility of choice here is the Apache Directory Studio.  It's easy to install, easy to use, and is a lightweight application.  It can be found here http://directory.apache.org/studio/

The only configuration you should have to do with your other ldap utility is the configuration needed to connect to the LDAP server <- for this, use the same as what you used in your dse.yaml, the same being what you used for the following settings in dse.yaml:

ldap_options:
  server_host: test-win2008-AD.datastax.lan
  server_port: 389
  search_dn: CN=Administrator,CN=Users,DC=qawin2008,DC=datastax,DC=lan
  search_password: XXXXXX

Here, I want to share an actual LDAP authentication problem I saw recently, and how I went about troubleshooting it:

The problem was, of course, this:

$ cqlsh -u jane -p XXXXXX
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Failed to authenticate to 127.0.0.1: Error from server: code=0100 [Bad credentials] message="jane is not permitted to log in"',)})

How I went about troubleshooting it:

1.  I first checked the dse.yaml AND the cassandra.yaml to see how authentication was configured in those files.

I saw that cassandra.yaml was not using the correct Authenticator, it was using com.datastax.bdp.cassandra.auth.LdapAuthenticator

NOTE:  To use LDAP Auth in DSE 5.0+, you need to use the following since LdapAuthenticator is no longer supported in DSE 5.x+:

authenticator: com.datastax.bdp.cassandra.auth.DseAuthenticator

2.  Apart from the Authenticator in cassandra.yaml being incorrect, I didn't see any other yaml configuration problems, so I tried to test logging in to cqlsh with a different ldap user on the same server, user 'thanh' .  User 'thanh' is in a different ldap group from the ldap group 'jane' is in, but that's ok..all we have to do is to create the matching Role in Cassandra for thanh's ldap group  <- that worked!  'thanh' was able to log in.  That proved that the dse configs in dse.yaml and cassandra.yaml were good.

But, what about the 'jane' user was problematic?

'jane' was added to the 'Jane-Test-Admin' ldap group, and a matching Role was created for 'Jane-Test-Admin' in Cassandra.  It looked like it had been configured correctly.  So..

3.  I used the Apache Directory Studio to confirm jane's ldap group membership.  When I checked, I saw that 'jane' was not listed as a member of 'Jane-Test-Admin', 'jane' was only a member of the 'OpscenterAdmin'’ ldap group:

and , looking at the 'Jane-Test-Admin' ldap group, I saw only 2 members, 'tony' and 'thanh', not 'jane':

and that was the problem!  Ldap user 'jane' wasn’t a member of the ldap group 'Jane-Test-Admin'...Only, it wasn't, really.  Jane confirmed by checking directly on the LDAP server (using the "Active Directory Users and Groups" utility) that 'jane' was indeed a member of 'Jane-Test-Admin', it's just that this wasn't seen by either Apache Directory Studio OR DSE.  It was only seen when something other than 'Jane-Test-Admin' was used/set as jane's "Primary Group".  After a little bit of research, we discovered this:  https://support.microsoft.com/en-us/help/275523/setting-primary-group-ex...

which explained what we saw:

"When you view Active Directory with a Lightweight Directory Access Protocol (LDAP) utility such as Ldp.exe, the Members attribute is not populated with the Primary group."

Mystery solved.

Additional Notes (These are important, you should read them):

1.  Before you actually jump in and start troubleshooting...

Make SURE that the cqlsh you are using MATCHES the version of DSE that you are using,  Otherwise, you can very easily spend hours going 'round and 'round in circles trying to resolve something that can only be resolved by using the correct version of cqlsh.  This note comes from the fact that here have been Datastax customers who have run into problems with ldap integration & Unified Authentication because they were using the Community version of cqlsh on a client, instead of the dse version of cqlsh.

2.  If you think your DSE ldap auth problem might have to do with an incorrect DN, Search Base, User Filter, Group Filter, etc that's being used in dse.yaml, but you're "not sure", it will be best to get your LDAP Admin involved to get the correct values for those.

3.  If you're using DSE v.5.1.6 (or later), have turned on TRACE logging to debug authentication issues, and see this in your logs:

TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,918  LdapUtils.java:231 - [ldap-authenticate] attempting authentication retry: 3
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,918  LdapUtils.java:279 - [ldap-authenticate] username: cassandra not found in cache
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,918  LdapUtils.java:454 - [ldap-fetch-user] username: cassandra connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,918  LdapUtils.java:469 - [ldap-fetch-user] bind to connection
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,921  LdapUtils.java:478 - [ldap-fetch-user] user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=cassandra)
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:15,924  LdapUtils.java:521 - [ldap-fetch-user] ERROR - failed to fetch username: cassandra
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:16,057  LdapUtils.java:543 - [ldap-fetch-user-groups] username: cassandra connection: org.apache.directory.ldap.client.api.LdapNetworkConnection@401d5664
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:16,057  LdapUtils.java:557 - [ldap-fetch-user-groups] bind to connection
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:16,060  LdapUtils.java:628 - [ldap-fetch-user-groups] performing member-of lookup for groups - fetching username: cassandra with user_search_base: CN=Users,DC=qawin2008,DC=datastax,DC=lan, user_search_filter: (sAMAccountName=cassandra)
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:16,061  LdapUtils.java:668 - [ldap-fetch-user-groups] FAILURE - could not find user entry for username: cassandra or user entry did not have any memberOf attribute entries
TRACE [Native-Transport-Requests-1] 2018-02-02 04:43:16,064  DseRoleManager.java:281 - [role-management] found the following roles for grantee: <role cassandra> - [<role cassandra>] using ldap mode

Don't be alarmed.  The above is most likely NOT a problem.  It just shows that the user ('cassandra' in this example) was not found on the ldap server.  That's fine because the user in question here isn't an ldap user; the user here is an internal cassandra user.  When DSE authenticates a user, it will do so trying the default authentication scheme first (the default is whatever you set it to be in dse.yaml).  Then, DSE will try other auth schemes as listed in the other_schemes setting, in the order they are listed.

4.  Please note, the discussion here is about LDAP authentication in DSE, not LDAP authentication in Opscenter.  The Opscenter LDAP authentication code is different.  Opscenter LDAP Authentication deserves its own discussion.