I joined DataStax when our European offices opened.  Over the last few years, I have visited and spoken to a large number of customers, and a handful of common questions keep coming up as an aside to the main topic of conversation.

On the basis that for every person who asks, four others wanted to, I thought I might share the common answers here!

Why does Support always ask for a diagnostic bundle, when I provided one last week?

The diagnostic bundle contains more than just the configuration of the node.  It also contains the most current version of the system logs and a snapshot of the output of certain key nodetool commands.  As an example, nodetool cfstats shows which tables have recently read a large number of tombstones.  The output of nodetool netstats will show if nodes are currently streaming data.

The diagnostic bundle is usually an engineer's first port of call when trying to diagnose problems.  It gives us a very comprehensive picture of the current state of the cluster.

Why do you ask for a diagnostic bundle rather than just jumping straight on a webex?

DSE is a distributed database. Much of the troubleshooting process is spent correlating outputs from multiple nodes and thereby tracking which part is the cause, and which is the effect.  There are a number of tools and scripts that we can use to grab this information directly from a diagnostic bundle that are much harder to use in a live environment.  
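
To give a feel for what "correlating outputs from multiple nodes" means in practice, here is a minimal Python sketch that merges per-node log lines into a single timeline.  It is not one of our actual tools: the function names, the simplified "timestamp message" line format and the sample lines are all made up for illustration, and real system logs look different.

```python
import heapq
from datetime import datetime


def parse_line(node, line):
    """Split a hypothetical 'ISO-timestamp message' line into a sortable tuple."""
    stamp, _, message = line.partition(" ")
    return datetime.fromisoformat(stamp), node, message


def merge_node_logs(logs_by_node):
    """Merge per-node logs (each already in time order) into one timeline."""
    streams = [
        [parse_line(node, line) for line in lines]
        for node, lines in logs_by_node.items()
    ]
    # heapq.merge interleaves the already-sorted streams by timestamp.
    return list(heapq.merge(*streams))


# Invented sample data, purely for illustration.
logs = {
    "node1": [
        "2024-01-01T10:00:00 compaction started",
        "2024-01-01T10:00:05 read timeout",
    ],
    "node2": [
        "2024-01-01T10:00:02 streaming from node1",
        "2024-01-01T10:00:07 read timeout",
    ],
}

timeline = merge_node_logs(logs)
for stamp, node, message in timeline:
    print(stamp.isoformat(), node, message)
```

With every node's events in one ordered list, it becomes much easier to see which event came first and is therefore more likely to be the cause rather than the effect.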

Additionally, it’s pretty common for engineers to collaborate on tricky issues - whilst one engineer is working directly with a customer, one or more colleagues can check the diagnostic bundle to confirm or deny current working theories.

Why do I talk to so many engineers when dealing with a critical situation?

DataStax uses a process called Follow the Sun in order to provide support on a 24-by-7 basis.  We have four hubs, one each on the US east and west coasts, one in Europe and one in Australia.  The European hub covers from midnight to 8 am PST.  The US hubs cover 8 am to 4 pm PST, and finally the Australian team covers 4 pm until midnight.
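
The shift pattern above can be sketched as a simple lookup.  This is only an illustration of the hours quoted in this post: the names SHIFTS and hub_on_shift are hypothetical, the two US hubs are lumped together as "US" because the split between them is not described here, and daylight saving is ignored.

```python
# Shift windows from the post, as (start_hour, end_hour, hub) in PST on a
# 24-hour clock.  Start is inclusive, end is exclusive.
SHIFTS = [
    (0, 8, "Europe"),
    (8, 16, "US"),
    (16, 24, "Australia"),
]


def hub_on_shift(pst_hour):
    """Return the hub covering the given PST hour (0-23)."""
    if not 0 <= pst_hour < 24:
        raise ValueError("pst_hour must be between 0 and 23")
    for start, end, hub in SHIFTS:
        if start <= pst_hour < end:
            return hub
    # The three windows cover the whole day, so this should never be reached.
    raise AssertionError("no shift covers this hour")


for hour in (3, 8, 15, 16, 23):
    print(hour, hub_on_shift(hour))
```

A ticket that stays open across one of the boundaries (8 am, 4 pm or midnight PST) changes hands, which is why a long-running critical issue ends up being worked by several engineers.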

If a ticket is likely to require further support, then each team will brief the next on the tickets that will need attention during their shift.  This process ensures that the engineers working are fresh and that mistakes are not caused by fatigue.

If a customer logs a non-critical ticket outside of their normal working hours, an engineer will usually check that the ticket was logged at the correct priority before re-queuing it for the local team to follow up.

Why do you suggest working with your consulting team for performance tuning issues?

It’s mostly a matter of experience.  Support engineers are like car mechanics.  Our solution architects are professional race car drivers.  You can ask the mechanic to take the car round the time trial course: he understands how the car works and will eventually get you over the line.  The driver, however, will have run the course any number of times before and is familiar with how the car behaves as a whole.  In general, the driver will get better results and will get them more quickly.

Do you have a good way to close out this blog?

Not really, no.  It kind of just ends.