This may be an oddball question, and I haven't dived too deep into this thread, but if the script is calling the keep-alive function and the connection is dropping every 5 minutes or so, is that the point in the script where the connection actually drops? I.e., does calling that function force the VPN to drop due to a change in configuration?
Let me start from the beginning, since I see I've lost a couple people.
Customer has two full installations of the product. We'll call them Working and Broken. Product consists of a VM issuing commands to a cloud provider to spin up a cloud machine. Product attempts repeated SSH logins until it connects successfully, whereupon it knows the cloud machine is alive. At that point, Product uploads a half-dozen script files that consist mainly of installation commands (OpenVPN, Java SDK, etc.). Product then runs the script remotely and streams the output back to itself, where it is logged.
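For anyone who wants to picture the "retry until the cloud machine answers" step, here is a rough sketch in Python. It is not the Product's actual code. The names `ssh_port_open` and `wait_until_alive` are made up for illustration, and a plain TCP connect to port 22 stands in for a full SSH login, which is an assumption on my part.

```python
import socket
import time


def ssh_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: a TCP connect is used here as a stand-in for a real SSH login.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_alive(probe, max_attempts=60, delay=5.0, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the number of the attempt that succeeded, or None if every
    attempt failed. The sleep function is injectable so the loop can be
    exercised without real waiting.
    """
    for attempt in range(1, max_attempts + 1):
        if probe():
            return attempt
        if attempt < max_attempts:
            sleep(delay)
    return None
```

Usage would look something like `wait_until_alive(lambda: ssh_port_open("cloud-host.example"))`, where the hostname is a placeholder. Only once that returns an attempt number would the upload and remote run begin.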
The Broken installation consists of a VM Ubuntu instance running through a pair of Cisco ASA firewalls configured in routed mode with failover support. During the run of the installation script, the ASA terminates all SSH connections to that cloud machine. This is a problem, as the Product no longer sees the completion of the installation script and does not know when to run the next script in the sequence.
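To make the failure concrete: from the Product's side, a dropped connection just looks like the output stream ending early. The sketch below shows one way to tell the two cases apart when reading back a log. It assumes the script prints a final marker line such as `INSTALL_DONE rc=0`. That marker and the function name `classify_run` are hypothetical, not something the Product does today.

```python
def classify_run(output_lines, marker="INSTALL_DONE"):
    """Classify a streamed script run from its captured output.

    Returns ("completed", rc) if a line starting with the marker was seen,
    where rc is the integer after "rc=" (or None if it is missing).
    Returns ("dropped", None) if the stream ended without the marker.
    The marker format is an assumption made for this sketch.
    """
    for line in output_lines:
        text = line.strip()
        if text.startswith(marker):
            # Expected form: "INSTALL_DONE rc=<int>"
            _, _, rc = text.partition("rc=")
            return ("completed", int(rc) if rc.isdigit() else None)
    return ("dropped", None)
```

With something like this, a log that stops in the middle of the OpenVPN install comes back as `("dropped", None)` instead of being mistaken for a script that is still running.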
The Working installation consists of a VM Ubuntu instance running through a single Cisco ASA firewall (slightly larger, more ports, same software version) configured in transparent mode, upstairs in the same building and using the same internet provider. Or at least... that's what I thought until Thursday, when I learned that they have bypassed the ASA firewall entirely. Which may explain why Working actually works.
Customer has loaned me a spare ASA firewall, which I am using for testing.