The CLD signal (SIGCHLD) is sent to the parent when a child process dies. This is great! All I need to do is trap that signal and start my process back up. My main concern is whether I'll get stuck in an infinite loop, forking new processes because some condition causes the child process to die every time, and consuming all the resources on my system.
Here's my solution so far:
require 'logger'

class UploadServer
  def initialize(options = {})
    @upload_read, @upload_write = IO.pipe
    @start_up_threads = (options[:start_up_threads] || 2)
    @max_read = (options[:max_read] || 1024)
    @logger = (options[:logger] || Logger.new(STDOUT))
  end

  # starts up the upload server
  def start
    @pid = fork do
      initialize_server
      loop do
        select
      end
    end

    # signal 0 is EXIT: this runs when the main process exits
    Signal.trap(0) do
      # tell the child process to die
      Process.kill("TERM", @pid)
    end

    Signal.trap("CLD") do
      # something extremely unexpected happened and the child process died
      @logger.error("It appears the upload background process has died... Attempting a restart...")
      # reap the dead child so no residue is left behind, e.g. avoid a defunct process
      Process.wait(@pid)
      # this is all a little risky since someone could have been in the middle of an upload
      # they'll be cut off anyway since the process died...
      # close down open pipe
      @upload_write.close
      # create a new pipe
      @upload_read, @upload_write = IO.pipe
      # start it back up
      start
    end

    @logger.debug("Upload Process started up on #{@pid}")
    # close the read end on the main process
    @upload_read.close
    @pid
  end

  # initialize_server and select are not shown in this post
end
The Process.wait(@pid) call is very important; otherwise we're left with a lot of <defunct> processes. It may also help throttle runaway forking. At the very least it means we'll never have more than one child process per server. The only other thing I can imagine adding is some kind of timer to throttle restarts in case the child process dies very quickly...
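Here's a rough sketch of what that timer might look like. This is only one possible approach, not something I've wired into the server above: RestartThrottle and all of its parameters (max_restarts, window, base_delay, max_delay) are names made up for illustration. It remembers recent restart times, doubles the wait after each restart inside the window, and gives up once too many restarts pile up.

```ruby
# Hypothetical helper, not part of UploadServer above.
# Tracks recent restarts and suggests how long to wait before forking again.
class RestartThrottle
  def initialize(max_restarts: 5, window: 60, base_delay: 1, max_delay: 30)
    @max_restarts = max_restarts # give up after this many restarts per window
    @window = window             # seconds a restart stays "recent"
    @base_delay = base_delay     # seconds to wait before the first restart
    @max_delay = max_delay       # upper bound on the wait, in seconds
    @restarts = []               # timestamps of recent restarts
  end

  # Records a restart at time `now` and returns the number of seconds to
  # wait before forking again, or nil if we should stop restarting.
  def next_delay(now = Time.now)
    # forget restarts that happened longer ago than the window
    @restarts.reject! { |t| now - t > @window }
    return nil if @restarts.size >= @max_restarts

    # exponential backoff: base, 2x base, 4x base, ... capped at max_delay
    delay = [@base_delay * (2**@restarts.size), @max_delay].min
    @restarts << now
    delay
  end
end
```

In the CLD handler, the idea would be to call next_delay before start: sleep for the returned number of seconds, or log an error and skip the restart when it returns nil. Sleeping inside a signal handler is probably not ideal, so it might be cleaner to have the handler just set a flag and let the main loop do the waiting and restarting.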
