I created the Perl script below (showing only the relevant code snippets):

    #!/appl/pm/vendor/perl/lx-x86/5.14.2/bin/perl -w

    $no_of_retries = 60;
    $sleep_time    = 60;

    use Fcntl qw(:flock SEEK_END);

    #-------------------------------------------------------------------------------
    # allow only 1 instance of the loader program to run
    #-------------------------------------------------------------------------------
    sub syncwaitforfile($$$)
    {
        ($delay_file, $no_of_retries, $sleep_time) = @_;

        # check for the delay file
        $minutes_passed = 0;
        while ( (-e $delay_file) && (--$no_of_retries) && ($no_of_retries > 0) )
        {
            # every 10 minutes, send a mail
            if ( (($minutes_passed % 10) == 0) && ($minutes_passed != 0) )
            {
                mailnotify ("loader [pid=$$;hostname=$host;user=$ENV{'LOGNAME'}] waiting for $minutes_passed minutes for delay file to be removed", "delay file => $delay_file", $maillist);
            }
            logger::info ("waiting for removal of delay_file $delay_file for $sleep_time seconds. time passed waiting => $minutes_passed minutes. number of retries remaining : $no_of_retries");
            sleep($sleep_time);
            ++$minutes_passed;
        }

        if ($no_of_retries <= 0)
        {
            mailnotify ("loader [pid=$$;hostname=$host;user=$ENV{'LOGNAME'}] has exited due to non-removal of delay file", "delay file => $delay_file", $maillist);
            errorexit("couldn't acquire exclusive lock. loader.pl cannot run in parallel. exiting", 1, 0);
        }

        while (1)
        {
            $flockfile = "${delay_file}.flockfile";
            open (my $handle, ">", $flockfile) or errorexit("not able to open file $flockfile", -1, 0);

            logger::info "acquiring lock on $flockfile";
            flock($handle, LOCK_EX) or errorexit("failed during lock", -1, 0);   # take the lock

            # check again for the delay file: by the time this process took the lock,
            # another process may have taken the lock and created the delay file,
            # so release the lock
            if (-e $delay_file)
            {
                logger::info "releasing lock on $flockfile";
                flock($handle, LOCK_UN);   # unlock the flock file

                # wait until the delay_file is removed
                logger::info "waiting for removal of delay_file $delay_file";
                while ( (-e $delay_file) && (--$no_of_retries) && ($no_of_retries > 0) )
                {
                    if ( (($minutes_passed % 10) == 0) && ($minutes_passed != 0) )
                    {
                        mailnotify ("loader [pid=$$;hostname=$host;user=$ENV{'LOGNAME'}] waiting for $minutes_passed minutes for delay file to be removed", "delay file => $delay_file", $maillist);
                    }
                    logger::info ("waiting for removal of delay_file $delay_file for $sleep_time seconds. time passed waiting => $minutes_passed. number of retries remaining : $no_of_retries");
                    sleep($sleep_time);
                    ++$minutes_passed;
                }

                if ($no_of_retries <= 0)
                {
                    mailnotify ("loader [pid=$$;hostname=$host;user=$ENV{'LOGNAME'}] has exited due to non-removal of delay file", "delay file => $delay_file", $maillist);
                    errorexit("couldn't acquire exclusive lock. loader.pl cannot run in parallel. exiting", -1, 0);
                }
                else
                {
                    next;
                }
            }

            logger::info "creating delay file $delay_file";
            open (DELAY_FILE, "> $delay_file") or errorexit("error: unable to open file:$delay_file $!", -1, 0);
            print DELAY_FILE "locked by loader [pid=$$;hostname=$host;user=$ENV{'LOGNAME'}]\n";
            close(DELAY_FILE);

            logger::info "releasing lock on $flockfile";
            flock($handle, LOCK_UN);   # unlock the flock file
            last;
        }
    }

The above code fails under a stress test. I ran 25 parallel instances and it worked fine.
But when the system reached more than 50 parallel instances, waiting for the delay_file to be freed, they got blocked. Loaders failed at the following line:

    flock($handle, LOCK_EX) or errorexit("failed during lock", -1, 0);

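The failure message at that line does not say why `flock` failed. Since `flock` sets `$!` on error, a minimal sketch of how the call could report the underlying OS error is shown here. The helper name `lock_or_report` and the temporary file are hypothetical stand-ins, not part of loader.pl:

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Hypothetical helper (not in the original script): take an exclusive lock
# and return the OS error text on failure, or an empty string on success.
sub lock_or_report {
    my ($handle) = @_;
    return '' if flock($handle, LOCK_EX);
    return "flock failed: $! (errno " . ($! + 0) . ")";
}

# A temporary file stands in for $flockfile here.
my ($fh, $path) = tempfile(UNLINK => 1);
my $err = lock_or_report($fh);
print $err eq '' ? "lock acquired on $path\n" : "$err\n";

flock($fh, LOCK_UN);
close($fh);
```

In the script above, the same `$!` text could be appended to the string passed to `errorexit`, which would show which error the failing loaders actually hit.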
I have 2 questions:
[1] What could have been the possible reason for the failure when the load increased? (It wasn't failing when the load was lower.)
[2] What is the best way to handle a mutex in Perl that works under excessive load?
I am using Perl on Linux:
Perl - 5.14.2
Linux - RHEL Server 5.11