Erlang Mailing Lists

Erlang bugs mailing list: [erlang-questions] epmd leaving ports in TIME_WAIT?

Guest
Posted: Tue Mar 23, 2010 5:21 pm
Hi,
I did as you suggested and ran epmd -d.
It ends up outputting something like:
epmd: Tue Mar 23 09:26:39 2010: ** sent PORT2_RESP (error) for "rodc10"
epmd: Tue Mar 23 09:26:40 2010: ** got PORT2_REQ

Over and over.
This is because one of my nodes periodically pings (net_adm:ping) a node that
doesn't exist, every couple of seconds or so.
Also, when epmd dies, the ports are closed properly. In any case, I find it
surprising that epmd has to open so many sockets to ask around if someone
has seen the missing node.

On Mon, Mar 22, 2010 at 11:45 AM, Michael Santos <michael.santos@gmail.com> wrote:

> On Mon, Mar 22, 2010 at 11:17:25AM -0400, Nicholas Frechette wrote:
> > Escalating to erlang-bugs.
> > I've restarted both my server and laptop over the weekend.
> > On both machines, I restarted my 2 erlang applications (4 nodes, connected
> > in pairs: A <-> B, C <-> D, with pairs on the same computer)
> >
> > This was yesterday. This morning I did another netstat -t, and indeed, I
> > have >100 sockets stuck in TIME_WAIT on both computers.
>
> Sockets in TIME_WAIT state are normal. After the socket is closed,
> the OS puts the socket into TIME_WAIT to ensure any pending packets
> queued somewhere in the network for the socket pair have time to arrive.
> Usually TIME_WAIT is 2 or 4 minutes.
>
> It looks as if there are a number of TCP connections being
> established and closed to your epmd.
>
> > Both with outgoing
> > on localhost and the other pc, in about equal proportion.
> > No node has crashed/restarted. None of the nodes does anything fancy, simply
> > net_adm:ping to connect the nodes and then data is exchanged using messages.
> >
> > The problem seems somewhat related to the fact that epmd seems to restart
> > from time to time as the OS gets confused and cannot retrieve the PID that
> > originally opened the sockets (although port shows it is epmd)
>
> What is restarting epmd?
>
> See anything in your logs? Maybe try running epmd in debug mode. Kill
> epmd if it is running and run: epmd -d
>
> > I briefly looked at the epmd code and did see a few comments in there about
> > // should probably always close and a few other potential places where it
> > might leak sockets. Unfortunately I ran out of time.
>
> Doesn't appear to be leaking fd's, but you can check with lsof.
>
> > Can anyone confirm if they see similar behavior? Note that on both
> > computers, both nodes are started manually (not automated yet) and as such
> > it isn't a race to see which node can start epmd first. Although, I wonder
> > if it might be related to the problem of the epmd 100% cpu use. I believe
> > another poster made the point that it would happen when epmd runs out of
> > file descriptors (which would happen if it leaks sockets in TIME_WAIT).
>
> That's just one error condition; for example, the connection could have
> been aborted or the socket could have been closed. Are you seeing a lot
> of CPU usage?
>
>
>


Guest
Posted: Tue Mar 23, 2010 10:44 pm
Yes, I know why my node isn't 'up'. My question is why epmd is opening a
socket to query whether a node is up, i.e. what does it attempt to connect to?
Said node was never up in epmd's lifetime, so it shouldn't be a cached
value or the like. Supposing it connects to the epmd process on my
other computer, why would it not keep that as a permanent TCP connection? It
also attempts to connect to ports on the same host, not just the other
computer, so I'm a bit curious.
Is epmd implemented as a SOAP-over-HTTP-like protocol, where queries are
made over single-use socket connections?
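
On the protocol question: as far as the Erlang distribution protocol
documentation describes it, epmd does not open these sockets itself. The node
calling net_adm:ping connects to the epmd on the target node's host (port
4369), sends one PORT_PLEASE2_REQ, reads one PORT2_RESP, and epmd then closes
the connection. Each lookup is therefore a single-use TCP session, which would
account for the PORT2_REQ / PORT2_RESP lines in the debug output and for the
TIME_WAIT entries. Below is a minimal sketch of that exchange. The module name
epmd_lookup and its function names are made up for illustration, and it
assumes the whole reply arrives in a single recv.

```erlang
-module(epmd_lookup).
-export([port_please/2, encode_req/1, decode_resp/1]).

-define(EPMD_PORT, 4369).          % default epmd listen port
-define(PORT_PLEASE2_REQ, 122).    % request tag
-define(PORT2_RESP, 119).          % response tag

%% Build a PORT_PLEASE2_REQ: 2-byte big-endian length, tag 122, then the
%% node name without the @host part (a string of bytes).
encode_req(Name) when is_list(Name) ->
    Payload = [?PORT_PLEASE2_REQ | Name],
    [<<(length(Payload)):16>>, Payload].

%% Decode the start of a PORT2_RESP. Result 0 means the node is registered
%% and its distribution port follows; any other result is an error.
decode_resp(<<?PORT2_RESP, 0, Port:16, _/binary>>) ->
    {port, Port};
decode_resp(<<?PORT2_RESP, Result, _/binary>>) when Result > 0 ->
    {error, Result};
decode_resp(_) ->
    {error, bad_response}.

%% One lookup = one TCP connection. Closing it here, on the client side,
%% is what can leave a socket in TIME_WAIT.
%% Assumption: the reply fits in one recv; a robust client would loop.
port_please(Host, Name) ->
    Opts = [binary, {packet, raw}, {active, false}],
    {ok, S} = gen_tcp:connect(Host, ?EPMD_PORT, Opts),
    ok = gen_tcp:send(S, encode_req(Name)),
    Reply = gen_tcp:recv(S, 0, 5000),
    ok = gen_tcp:close(S),
    case Reply of
        {ok, Bin} -> decode_resp(Bin);
        {error, Reason} -> {error, Reason}
    end.
```

Calling epmd_lookup:port_please("localhost", "rodc10") against a running epmd
that has no node named rodc10 should give an {error, N} tuple with N > 0,
matching the "sent PORT2_RESP (error)" line. The kernel's own
erl_epmd:port_please/2 performs the real lookup; this sketch only mirrors the
wire exchange.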

On Tue, Mar 23, 2010 at 5:04 PM, Michael Santos <michael.santos@gmail.com> wrote:

> On Tue, Mar 23, 2010 at 01:15:18PM -0400, Nicholas Frechette wrote:
> > Hi,
> > I did as you suggested and ran epmd -d.
> > It ends up outputting something like:
> > epmd: Tue Mar 23 09:26:39 2010: ** sent PORT2_RESP (error) for "rodc10"
> > epmd: Tue Mar 23 09:26:40 2010: ** got PORT2_REQ
> >
> > Over and over.
> > This is because one of my nodes periodically pings (net_adm:ping) a node that
> > doesn't exist, every couple of seconds or so.
>
> Right, so every time the node connects and disconnects the TCP session
> will go into TIME_WAIT.
>
> > Also, when epmd dies, the ports are closed properly. In any case, I find it
> > surprising that epmd has to open so many sockets to ask around if someone
> > has seen the missing node.
>
> 1> [ begin {ok,S} = gen_tcp:connect({127,0,0,1},4369,[]), ok =
> gen_tcp:close(S) end || _ <- lists:seq(1,10000) ].
>
> That will generate 10,000 sessions in TIME_WAIT :) I guess the question
> is why your nodes keep disappearing from the network.

