postfix-users October 2010 archive

Re: Postfix locking up, not accepting connections / smtp not sending emails out

From: Christian Rohmann <crohmann_at_nospam>
Date: Fri Oct 29 2010 - 18:19:38 GMT
To: postfix-users@postfix.org

Hey again,

On 10/29/2010 07:23 PM, Wietse Venema wrote:
> The main loop in the master is as follows:
>
> forever {
>     set an alarm for 1000s
>     do an EPOLL_WAIT for up to 500s and handle any child process
>     events, or short-term timer requests that are implemented
>     around the EPOLL_WAIT timer.
>     respond to sighup (the sighup flag is set by a signal handler)
>     respond to sigchld (the sigchld flag is set by a signal handler)
> }
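
To make the shape of that loop concrete, here is a rough Python sketch of the same pattern: arm a long watchdog alarm, then wait for events with a shorter timeout. This is not Postfix code. The names run_master_loop and on_ready, and the iterations parameter, are made up for illustration, and the sighup/sigchld flag handling is only indicated by a comment. The alarm keeps its default action here, which terminates the process if the loop does not come back around in time.

```python
import selectors
import signal
import socket


def run_master_loop(listener, on_ready, wait_timeout=500, watchdog=1000, iterations=1):
    """Hypothetical sketch: watchdog alarm plus a bounded event wait."""
    sel = selectors.DefaultSelector()  # picks epoll on Linux
    sel.register(listener, selectors.EVENT_READ)
    handled = 0
    try:
        for _ in range(iterations):
            # Re-arm the watchdog on every pass. If the wait below never
            # returns, SIGALRM fires with its default action.
            signal.alarm(watchdog)
            # Wait for events, but never longer than wait_timeout seconds.
            for key, _mask in sel.select(timeout=wait_timeout):
                on_ready(key.fileobj)
                handled += 1
            # A real loop would now check flags set by the SIGHUP and
            # SIGCHLD handlers; that part is omitted in this sketch.
    finally:
        signal.alarm(0)  # cancel the pending watchdog alarm
        sel.close()
    return handled
```

With a healthy timer, each pass finishes within wait_timeout seconds at the latest, so the watchdog is always re-armed before it can fire.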

Just now one machine had the issue again. I checked and saw that we
were down to just two smtpd processes, and even though master was still
bound to port 25, no new connections were accepted. I tried to telnet to
it, but the connection was not accepted and timed out.

How does the timer issue relate to the master process no longer
accepting TCP/IP connections on port 25?

> It would be worthwhile to see what strace reports when you leave
> it running. If strace reports nothing in 500s then EPOLL_WAIT is
> not working. If strace reports nothing after 1000s then the alarm
> timer is also not working.
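
The rule of thumb above can be written down as a small check. This is only a sketch under assumptions: the function names are hypothetical, and the timestamps are taken to be plain seconds, already extracted from the trace by some other means (strace's -tt option prints wall-clock times that could serve as input). The 500s and 1000s thresholds are the ones quoted above.

```python
def longest_gap(timestamps):
    """Return the longest silence, in seconds, between consecutive trace lines."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps, default=0.0)


def classify_silence(silent_seconds, wait_timeout=500, watchdog=1000):
    """Apply the rule of thumb: compare the silence against both timers."""
    if silent_seconds >= watchdog:
        return "event wait and alarm timer both appear stuck"
    if silent_seconds >= wait_timeout:
        return "event wait appears stuck"
    return "no conclusion yet"
```

For example, classify_silence(longest_gap(times)) on a trace with a 600s hole would point at the event wait only, since the 1000s alarm has not yet had its chance to fire.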

I'll try to gather some strace data for you. I assume the strace should
be of the master process? Could you give me a hint as to which options
you would like?

On 10/29/2010 07:04 PM, Wietse Venema wrote:
> VMware has an entire KB article on problems with delivering timer
> interrupts to guest machines, and the hoops that they are jumping
> through to avoid poor performance. See
> http://tech.groups.yahoo.com/group/postfix-users/message/269786

Thanks for the hint, I already printed that article to read over the
weekend.

Thanks for your help,

Christian