A blog by Devendra Tewari
Keep-alives are useful in scenarios where either end of a TCP connection disappears without closing the session.
The following script in Python demonstrates sending a keep-alive message when there is no data activity for 60 seconds. If there is no response, 4 additional keep-alive messages are sent at intervals of 15 seconds. If none get a response, the connection is aborted.
import sys
import socket
import traceback
import time
def do_work():
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# timeout recv every 5 seconds
sock.settimeout(5.0)
# check and turn on TCP Keepalive
x = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
if (x == 0):
print 'Socket Keepalive off, turning on'
x = sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
print 'setsockopt='+str(x)
# overrides value (in seconds) shown by sysctl net.ipv4.tcp_keepalive_time
sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
# overrides value shown by sysctl net.ipv4.tcp_keepalive_probes
sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 4)
# overrides value shown by sysctl net.ipv4.tcp_keepalive_intvl
sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 15)
else:
print 'Socket Keepalive already on'
try:
sock.connect(('192.168.0.120', 8001))
except socket.error:
print 'Socket connect failed!'
traceback.print_exc()
return
print 'Socket connect worked!'
while True:
try:
# read at most 10 bytes (or less)
req = sock.recv(10)
except socket.timeout:
print 'Socket timeout, loop and try recv() again'
continue
except:
traceback.print_exc()
print 'Other Socket err, exit and try creating socket again'
# break from loop
break
if req == '':
# connection closed by peer, exit loop
print 'Connection closed by peer'
break
print 'Received', req
try:
sock.close()
except:
pass
if __name__ == '__main__':
do_work()
# references
# http://tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
# http://www.digi.com/wiki/developer/index.php/Handling_Socket_Error_and_Keepalive
Edit the IP address 192.168.0.120
and port to whatever works on your network.
Create a test TCP listener/server using netcat (nc on OS X) on machine with IP address specified in the script
netcat -l 192.168.0.120 8001
Next, run the script
python keepalive.py
It should establish a TCP connection with the listener. Interrupt the network by enabling a firewall, or powering off a router. You’ll see the following when the connection times out
Traceback (most recent call last):
File "socket_test.py", line 39, in do_work
req = sock.recv(10)
error: [Errno 110] Connection timed out
Other Socket err, exit and try creating socket again
On OS X or Windows, you can enable keep-alive but cannot set TCP_KEEPIDLE and other parameters. You’ll get the following error message if you try to do so
Traceback (most recent call last):
File "socket_test.py", line 65, in <module>
do_work()
File "socket_test.py", line 19, in do_work
sock.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 60)
AttributeError: 'module' object has no attribute 'TCP_KEEPIDLE'
Wireshark highlights keep-alive messages if TCP sequence number analysis is enabled.