February 3, 2014

Down-under Terminal Server connection woes - or "Throw another protocol error on the Barbie!"

Recently, I've had to deal with a rather bizarre terminal server issue.  At one location, no computer could connect to a specific terminal server twice.  Each computer could connect fine the first time.  But if the user logged off or disconnected and then tried to connect to the terminal server again, the following error was displayed every time:

"Your Remote Desktop session has ended.  The connection to the remote computer was lost, possibly due to network connectivity problems.  Try connecting to the remote computer again.  If the problem continues, contact your network administrator or technical support."

Of course, the obvious guess would be an orphaned session or a session limit set on the terminal server, but neither was the case.  No limits were set, and I could see the users' sessions disconnecting cleanly on the server.  The first connection worked fine until the user logged off the terminal server or was disconnected.  Once that happened, every attempt to sign in again failed with the error above.

But here's the trick:  If I rebooted the server or the firewall at the location, the users could connect again - but, again, only once.  After that, another reboot was required.


So after confirming that the usual suspects (DNS, AD account status, and the VPN tunnels) were all up and working normally, I decided the issue had to be something deeper.  I found the following error in the terminal server's System event log:



"Event 56, TermDD - The Terminal Server security layer detected an error in the protocol stream and has disconnected the client."

This little Event ID led me down a real rabbit-hole of blog posts, forum discussions, and random Microsoft KB articles.  Let me give you some of the highlights:
  • Reduce the encryption level of the terminal server to Low & use "RDP Encryption"
  • Set the RDP encryption algorithm to balance network & memory usage
  • Enable 'keep alive' on the terminal server
  • Disable TCP Chimney Offload, Receive-Side Scaling State (RSS), and NetDMA
  • Confirm RDC client version is the latest on all clients
  • Use "ERR.EXE" to analyze the last word byte of the above error (B50000D0 in this case)
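
As an aside on that last suggestion: the value is a status code stored as raw bytes, so the analysis amounts to putting the bytes in the right order and looking up the result.  Here is a rough Python sketch of that decoding.  It assumes B50000D0 is the four bytes in the order the event data lists them (little-endian) and that the fields follow the standard NTSTATUS layout.  The function name decode_event_status and the step of clearing the reserved bit before the lookup are illustrative assumptions, not something taken from ERR.EXE itself.

```python
import struct


def decode_event_status(hex_bytes: str) -> dict:
    """Decode four event-data bytes (e.g. "B50000D0") as an NTSTATUS-style value.

    Assumption: the bytes are listed in memory order, i.e. little-endian.
    """
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 4:
        raise ValueError("expected exactly four bytes (eight hex digits)")

    # Reassemble the 32-bit value from its little-endian bytes.
    (status,) = struct.unpack("<I", raw)

    return {
        "status": f"0x{status:08X}",
        "severity": status >> 30,            # bits 31-30 (3 = error)
        "customer": (status >> 29) & 1,      # bit 29
        "reserved": (status >> 28) & 1,      # bit 28
        "facility": (status >> 16) & 0xFFF,  # bits 27-16
        "code": status & 0xFFFF,             # bits 15-0
        # Assumption: clear the reserved bit to get the value to look up.
        "lookup": f"0x{status & ~(1 << 28) & 0xFFFFFFFF:08X}",
    }


if __name__ == "__main__":
    for field, value in decode_event_status("B50000D0").items():
        print(f"{field:>9}: {value}")
```

Under those assumptions, B50000D0 decodes to 0xD00000B5, and the lookup value 0xC00000B5 is listed as STATUS_IO_TIMEOUT in the NTSTATUS tables.  That at least fits a connection that is being dropped somewhere along the way, but it does not say where.
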
No one online seemed to have the final solution, and none of the suggestions helped me.  I put everything back the way it was, pulled my head away from the wall, and decided to just get down and dirty with a Wireshark trace.  Hopefully the trace would show exactly what was happening with these failed connections.  A quick trace on a client turned up some errors but nothing definitive.  Wireshark did report some checksum errors and this "dissector bug":


"Dissector bug, protocol T.124 proto.c:3478 failed assertion (guint)hfindex < gpa_hfinfo.len) unregistered hf!"

The checksum errors led me down the hardware stack to the network cards, where I turned off "checksum offload" at the IP & TCP levels on the virtual host & virtual server.  This cleared up some of the checksum errors in Wireshark, but the same terminal server error persisted.
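
For context on why checksum offload matters in a trace: with offload enabled, the network card fills in the checksum after the capture driver has already seen the outgoing packet, so Wireshark can flag a perfectly healthy packet as bad.  The sketch below shows the standard Internet checksum (RFC 1071) that IP and TCP use and how a receiver verifies it.  The sample bytes and the helper name checksum_ok are made up for illustration; real TCP checksums also cover a pseudo-header, which is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement Internet checksum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte

    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word

    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)

    return ~total & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """A block that includes a correct checksum sums to zero after complementing."""
    return internet_checksum(data_with_checksum) == 0


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")  # sample words from RFC 1071
    checksum = internet_checksum(payload)
    print(f"checksum: 0x{checksum:04X}")  # prints "checksum: 0x220D"

    # What the receiver sees once the checksum has been filled in.
    print(checksum_ok(payload + checksum.to_bytes(2, "big")))  # prints "True"

    # What a capture can see when the card has not filled it in yet.
    print(checksum_ok(payload + b"\x00\x00"))  # prints "False"
```

That last case is why turning off offload makes some of the errors disappear from the trace without changing what actually goes out on the wire.
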

I was still not convinced that the Sonicwall at the location wasn't to blame for all this.  After all, we had other network issues with a business application at that same location which had still not been fixed.


Bruised and beaten, I elected to open support tickets with both Sonicwall and Microsoft and to start working the issue from each end with them...


(To be continued...)
