Erlang Mailing Lists

Author Message

<  Erlang bugs mailing list  ~  R13B04 inet_res:resolve/4 inet_udp Port leak

Guest
Posted: Wed May 26, 2010 12:01 am Reply with quote
Guest
It appears that there may be an inet_udp Port leak in
inet_res:resolve/4, our current workaround is to spawn a new process
to call this function. We've noticed this primarily for a service that
regularly does a UDP DNS query that fails (because the response is too
big) and then we retry over TCP.

This is what the state of the process looked like when it was leaking ports:

(node@host)1> length(lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(whereis(dns_gen_server), links)))).
577
(node@host)2> lists:usort([erlang:port_info(P, name) || P <-
lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(whereis(dns_gen_server), links)))]).
[{name,"udp_inet"}]

The code looked like this, before the workaround was implemented:

%% @spec dns(string()) -> [string()]
%% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
%% This may return an empty list if there no A records for this Host name.
dns(Host) when is_list(Host) ->
dns(Host, fun inet_res:resolve/4).

dns(Host, ResolveFun) ->
case ResolveFun(Host, in, a, []) of
{ok, Msg} ->
ips_for_answers(Msg);
{error, {nxdomain, _}} ->
[];
{error, timeout} ->
%% retry with TCP
case ResolveFun(Host, in, a, [{usevc, true}]) of
{ok, Msg} ->
ips_for_answers(Msg);
{error, {nxdomain, _}} ->
[];
Error = {error, _} ->
Error
end;
Error = {error, _} ->
Error
end.

ips_for_answers(Msg) ->
[inet_parse:ntoa(inet_dns:rr(Answer, data))
|| Answer <- inet_dns:msg(Msg, anlist)].

The workaround we used was to call it indirectly with this function, I
couldn't find anything in OTP that did the same thing that didn't have
local call optimizations.

%% @spec process_apply(atom(), atom(), [term()]) -> term()
%% @doc erlang:apply(M, F, A) in a temporary process and return the results.
process_apply(M,F,A) ->
%% We can't just use rpc here because there's a local call optimization.
Parent = self(),
Fun = fun () ->
try
Parent ! {self(), erlang:apply(M, F, A)}
catch
Class:Reason ->
Stacktrace = erlang:get_stacktrace(),
Parent ! {self(), Class, Reason, Stacktrace}
end
end,
{Pid, Ref} = erlang:spawn_monitor(Fun),
receive
{Pid, Res} ->
receive {'DOWN', Ref, process, Pid, _} -> ok end,
Res;
{Pid, Class, Reason, Stacktrace} ->
receive {'DOWN', Ref, process, Pid, _} -> ok end,
erlang:error(erlang:raise(Class, Reason, Stacktrace));
{'DOWN', Ref, process, Pid, Reason} ->
erlang:exit(Reason)
end.

________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist
Guest
Posted: Wed May 26, 2010 9:07 am Reply with quote
Guest
By reading the code it seems there is a bug when all nameservers
return an answer that causes decode errors, or can not be
contacted (enetunreach or econnrefused); then an
UDP port (or maybe two; one inet and one inet6) is leaked
since the inet_res:udp_close/1 is not called.

This should be fixed with:

diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
index 9b9e078..3d38a01 100644
--- a/lib/kernel/src/inet_res.erl
+++ b/lib/kernel/src/inet_res.erl
@@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
query_retries(Q, NSs, Timer, Retry, I, S0) ->
Num = length(NSs),
if Num =:= 0 ->
+ udp_close(S),
{error,timeout};
true ->
case query_nss(Q, NSs, Timer, Retry, I, S0, []) of

This "retry with TCP" trick of yours should really not be necessary
since inet_res retries with TCP if it gets a truncated UDP answer.
Have you got some other case when retrying with TCP is essential?

Or, does your DNS server produce a (valid?) result that
triggers a debug bug in inet_res, causing the decode error,
triggering the port leak bug, forcing you to retry with TCP?

On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
> It appears that there may be an inet_udp Port leak in
> inet_res:resolve/4, our current workaround is to spawn a new process
> to call this function. We've noticed this primarily for a service that
> regularly does a UDP DNS query that fails (because the response is too
> big) and then we retry over TCP.
>
> This is what the state of the process looked like when it was leaking ports:
>
> (node@host)1> length(lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(whereis(dns_gen_server), links)))).
> 577
> (node@host)2> lists:usort([erlang:port_info(P, name) || P <-
> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(whereis(dns_gen_server), links)))]).
> [{name,"udp_inet"}]
>
> The code looked like this, before the workaround was implemented:
>
> %% @spec dns(string()) -> [string()]
> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
> %% This may return an empty list if there no A records for this Host name.
> dns(Host) when is_list(Host) ->
> dns(Host, fun inet_res:resolve/4).
>
> dns(Host, ResolveFun) ->
> case ResolveFun(Host, in, a, []) of
> {ok, Msg} ->
> ips_for_answers(Msg);
> {error, {nxdomain, _}} ->
> [];
> {error, timeout} ->
> %% retry with TCP
> case ResolveFun(Host, in, a, [{usevc, true}]) of
> {ok, Msg} ->
> ips_for_answers(Msg);
> {error, {nxdomain, _}} ->
> [];
> Error = {error, _} ->
> Error
> end;
> Error = {error, _} ->
> Error
> end.
>
> ips_for_answers(Msg) ->
> [inet_parse:ntoa(inet_dns:rr(Answer, data))
> || Answer <- inet_dns:msg(Msg, anlist)].
>
> The workaround we used was to call it indirectly with this function, I
> couldn't find anything in OTP that did the same thing that didn't have
> local call optimizations.
>
> %% @spec process_apply(atom(), atom(), [term()]) -> term()
> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
> process_apply(M,F,A) ->
> %% We can't just use rpc here because there's a local call optimization.
> Parent = self(),
> Fun = fun () ->
> try
> Parent ! {self(), erlang:apply(M, F, A)}
> catch
> Class:Reason ->
> Stacktrace = erlang:get_stacktrace(),
> Parent ! {self(), Class, Reason, Stacktrace}
> end
> end,
> {Pid, Ref} = erlang:spawn_monitor(Fun),
> receive
> {Pid, Res} ->
> receive {'DOWN', Ref, process, Pid, _} -> ok end,
> Res;
> {Pid, Class, Reason, Stacktrace} ->
> receive {'DOWN', Ref, process, Pid, _} -> ok end,
> erlang:error(erlang:raise(Class, Reason, Stacktrace));
> {'DOWN', Ref, process, Pid, Reason} ->
> erlang:exit(Reason)
> end.
>
> ________________________________________________________________
> erlang-bugs (at) erlang.org mailing list.
> See http://www.erlang.org/faq.html
> To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

--

/ Raimo Niskanen, Erlang/OTP, Ericsson AB

________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist
Guest
Posted: Wed May 26, 2010 3:09 pm Reply with quote
Guest
Well, I'm not sure exactly which scenario is happening because I
haven't looked at the packets yet, but the manual TCP retry is
required.

mochi@robin:~$ /mochi/opt/erlang-R13B04/bin/erl
Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
[async-threads:4] [hipe] [kernel-poll:true]

Eshell V5.7.5 (abort with ^G)
1> lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(self(), links))).
[]
2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
{error,timeout}
3> lists:filter(fun erlang:is_port/1, element(2,
erlang:process_info(self(), links))).
[#Port<0.514>]
4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
ok


On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
<raimo+erlang-bugs@erix.ericsson.se> wrote:
> By reading the code it seems there is a bug when all nameservers
> return an answer that causes decode errors, or can not be
> contacted (enetunreach or econnrefused); then an
> UDP port (or maybe two; one inet and one inet6) is leaked
> since the inet_res:udp_close/1 is not called.
>
> This should be fixed with:
>
> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
> index 9b9e078..3d38a01 100644
> --- a/lib/kernel/src/inet_res.erl
> +++ b/lib/kernel/src/inet_res.erl
> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
>  query_retries(Q, NSs, Timer, Retry, I, S0) ->
>     Num = length(NSs),
>     if Num =:= 0 ->
> +           udp_close(S),
>            {error,timeout};
>        true ->
>            case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>
> This "retry with TCP" trick of yours should really not be necessary
> since inet_res retries with TCP if it gets a truncated UDP answer.
> Have you got some other case when retrying with TCP is essential?
>
> Or, does your DNS server produce a (valid?) result that
> triggers a debug bug in inet_res, causing the decode error,
> triggering the port leak bug, forcing you to retry with TCP?
>
> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>> It appears that there may be an inet_udp Port leak in
>> inet_res:resolve/4, our current workaround is to spawn a new process
>> to call this function. We've noticed this primarily for a service that
>> regularly does a UDP DNS query that fails (because the response is too
>> big) and then we retry over TCP.
>>
>> This is what the state of the process looked like when it was leaking ports:
>>
>> (node@host)1> length(lists:filter(fun erlang:is_port/1, element(2,
>> erlang:process_info(whereis(dns_gen_server), links)))).
>> 577
>> (node@host)2> lists:usort([erlang:port_info(P, name) || P <-
>> lists:filter(fun erlang:is_port/1, element(2,
>> erlang:process_info(whereis(dns_gen_server), links)))]).
>> [{name,"udp_inet"}]
>>
>> The code looked like this, before the workaround was implemented:
>>
>> %% @spec dns(string()) -> [string()]
>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>> %%     This may return an empty list if there no A records for this Host name.
>> dns(Host) when is_list(Host) ->
>>     dns(Host, fun inet_res:resolve/4).
>>
>> dns(Host, ResolveFun) ->
>>     case ResolveFun(Host, in, a, []) of
>>         {ok, Msg} ->
>>             ips_for_answers(Msg);
>>         {error, {nxdomain, _}} ->
>>             [];
>>         {error, timeout} ->
>>             %% retry with TCP
>>             case ResolveFun(Host, in, a, [{usevc, true}]) of
>>                 {ok, Msg} ->
>>                     ips_for_answers(Msg);
>>                 {error, {nxdomain, _}} ->
>>                     [];
>>                 Error = {error, _} ->
>>                     Error
>>             end;
>>         Error = {error, _} ->
>>             Error
>>     end.
>>
>> ips_for_answers(Msg) ->
>>     [inet_parse:ntoa(inet_dns:rr(Answer, data))
>>      || Answer <- inet_dns:msg(Msg, anlist)].
>>
>> The workaround we used was to call it indirectly with this function, I
>> couldn't find anything in OTP that did the same thing that didn't have
>> local call optimizations.
>>
>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>> process_apply(M,F,A) ->
>>     %% We can't just use rpc here because there's a local call optimization.
>>     Parent = self(),
>>     Fun = fun () ->
>>                   try
>>                       Parent ! {self(), erlang:apply(M, F, A)}
>>                   catch
>>                       Class:Reason ->
>>                           Stacktrace = erlang:get_stacktrace(),
>>                           Parent ! {self(), Class, Reason, Stacktrace}
>>                   end
>>           end,
>>     {Pid, Ref} = erlang:spawn_monitor(Fun),
>>     receive
>>         {Pid, Res} ->
>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>             Res;
>>         {Pid, Class, Reason, Stacktrace} ->
>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>             erlang:error(erlang:raise(Class, Reason, Stacktrace));
>>         {'DOWN', Ref, process, Pid, Reason} ->
>>             erlang:exit(Reason)
>>     end.
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>

________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist
Guest
Posted: Wed May 26, 2010 7:07 pm Reply with quote
Guest
Here's the DNS packet that is being received as a response to the query:

1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
101,116,0,0,1,0,1>>).
{error,fmt}

On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@redivi.com> wrote:
> Well, I'm not sure exactly which scenario is happening because I
> haven't looked at the packets yet, but the manual TCP retry is
> required.
>
> mochi@robin:~$ /mochi/opt/erlang-R13B04/bin/erl
> Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> [async-threads:4] [hipe] [kernel-poll:true]
>
> Eshell V5.7.5  (abort with ^G)
> 1> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(self(), links))).
> []
> 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
> {error,timeout}
> 3> lists:filter(fun erlang:is_port/1, element(2,
> erlang:process_info(self(), links))).
> [#Port<0.514>]
> 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
> ok
>
>
> On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
> <raimo+erlang-bugs@erix.ericsson.se> wrote:
>> By reading the code it seems there is a bug when all nameservers
>> return an answer that causes decode errors, or can not be
>> contacted (enetunreach or econnrefused); then an
>> UDP port (or maybe two; one inet and one inet6) is leaked
>> since the inet_res:udp_close/1 is not called.
>>
>> This should be fixed with:
>>
>> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
>> index 9b9e078..3d38a01 100644
>> --- a/lib/kernel/src/inet_res.erl
>> +++ b/lib/kernel/src/inet_res.erl
>> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
>>  query_retries(Q, NSs, Timer, Retry, I, S0) ->
>>     Num = length(NSs),
>>     if Num =:= 0 ->
>> +           udp_close(S),
>>            {error,timeout};
>>        true ->
>>            case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>>
>> This "retry with TCP" trick of yours should really not be necessary
>> since inet_res retries with TCP if it gets a truncated UDP answer.
>> Have you got some other case when retrying with TCP is essential?
>>
>> Or, does your DNS server produce a (valid?) result that
>> triggers a debug bug in inet_res, causing the decode error,
>> triggering the port leak bug, forcing you to retry with TCP?
>>
>> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>>> It appears that there may be an inet_udp Port leak in
>>> inet_res:resolve/4, our current workaround is to spawn a new process
>>> to call this function. We've noticed this primarily for a service that
>>> regularly does a UDP DNS query that fails (because the response is too
>>> big) and then we retry over TCP.
>>>
>>> This is what the state of the process looked like when it was leaking ports:
>>>
>>> (node@host)1> length(lists:filter(fun erlang:is_port/1, element(2,
>>> erlang:process_info(whereis(dns_gen_server), links)))).
>>> 577
>>> (node@host)2> lists:usort([erlang:port_info(P, name) || P <-
>>> lists:filter(fun erlang:is_port/1, element(2,
>>> erlang:process_info(whereis(dns_gen_server), links)))]).
>>> [{name,"udp_inet"}]
>>>
>>> The code looked like this, before the workaround was implemented:
>>>
>>> %% @spec dns(string()) -> [string()]
>>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>>> %%     This may return an empty list if there no A records for this Host name.
>>> dns(Host) when is_list(Host) ->
>>>     dns(Host, fun inet_res:resolve/4).
>>>
>>> dns(Host, ResolveFun) ->
>>>     case ResolveFun(Host, in, a, []) of
>>>         {ok, Msg} ->
>>>             ips_for_answers(Msg);
>>>         {error, {nxdomain, _}} ->
>>>             [];
>>>         {error, timeout} ->
>>>             %% retry with TCP
>>>             case ResolveFun(Host, in, a, [{usevc, true}]) of
>>>                 {ok, Msg} ->
>>>                     ips_for_answers(Msg);
>>>                 {error, {nxdomain, _}} ->
>>>                     [];
>>>                 Error = {error, _} ->
>>>                     Error
>>>             end;
>>>         Error = {error, _} ->
>>>             Error
>>>     end.
>>>
>>> ips_for_answers(Msg) ->
>>>     [inet_parse:ntoa(inet_dns:rr(Answer, data))
>>>      || Answer <- inet_dns:msg(Msg, anlist)].
>>>
>>> The workaround we used was to call it indirectly with this function, I
>>> couldn't find anything in OTP that did the same thing that didn't have
>>> local call optimizations.
>>>
>>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>>> process_apply(M,F,A) ->
>>>     %% We can't just use rpc here because there's a local call optimization.
>>>     Parent = self(),
>>>     Fun = fun () ->
>>>                   try
>>>                       Parent ! {self(), erlang:apply(M, F, A)}
>>>                   catch
>>>                       Class:Reason ->
>>>                           Stacktrace = erlang:get_stacktrace(),
>>>                           Parent ! {self(), Class, Reason, Stacktrace}
>>>                   end
>>>           end,
>>>     {Pid, Ref} = erlang:spawn_monitor(Fun),
>>>     receive
>>>         {Pid, Res} ->
>>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>>             Res;
>>>         {Pid, Class, Reason, Stacktrace} ->
>>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>>>             erlang:error(erlang:raise(Class, Reason, Stacktrace));
>>>         {'DOWN', Ref, process, Pid, Reason} ->
>>>             erlang:exit(Reason)
>>>     end.
>>>
>>> ________________________________________________________________
>>> erlang-bugs (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org
>>
>> --
>>
>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>

________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist
Guest
Posted: Thu May 27, 2010 7:44 am Reply with quote
Guest
On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> Here's the DNS packet that is being received as a response to the query:
>
> 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> 101,116,0,0,1,0,1>>).
> {error,fmt}

Thank you very much! I was about to give you detailed instructions
about how to dig that up Smile

I have spotted the problem.

The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
60 answer records, but actually contains zero. inet_dns expects to find
60 answer records if it says so. This is a hazy part of the
DNS specifications and the resolver I tested truncation on did
not do this kind of self-contradiction, but it _may_ be allowed
by the specification...

I regard it as a bug (or at least need-to-fix-problem) in inet_dns
since it should be real-world compatible not just specification compatible.
It should allow record shortage in a section if the TC bit is set.
I'll try to fix it in R14A.

Can you try my patch adding a missing udp_close(S) to see
if it stops the leaking port problem? That is a more serious bug.

>
> On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@redivi.com> wrote:
> > Well, I'm not sure exactly which scenario is happening because I
> > haven't looked at the packets yet, but the manual TCP retry is
> > required.
> >
> > mochi@robin:~$ /mochi/opt/erlang-R13B04/bin/erl
> > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > [async-threads:4] [hipe] [kernel-poll:true]
> >
> > Eshell V5.7.5
Guest
Posted: Thu May 27, 2010 10:03 am Reply with quote
Guest
I have created a fix for these problems:
git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports

It will be included in 'pu'.

Unfortunately, the second commit eliminates the bug trigger
for what the first commit fixes. So to test if the bug fix
is fixing the bug, one should apply the first commit only.

On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
> > Here's the DNS packet that is being received as a response to the query:
> >
> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
> > 101,116,0,0,1,0,1>>).
> > {error,fmt}
>
> Thank you very much! I was about to give you detailed instructions
> about how to dig that up Smile
>
> I have spotted the problem.
>
> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
> 60 answer records, but actually contains zero. inet_dns expects to find
> 60 answer records if it says so. This is a hazy part of the
> DNS specifications and the resolver I tested truncation on did
> not do this kind of self-contradiction, but it _may_ be allowed
> by the specification...
>
> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
> since it should be real-world compatible not just specification compatible.
> It should allow record shortage in a section if the TC bit is set.
> I'll try to fix it in R14A.
>
> Can you try my patch adding a missing udp_close(S) to see
> if it stops the leaking port problem? That is a more serious bug.
>
> >
> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@redivi.com> wrote:
> > > Well, I'm not sure exactly which scenario is happening because I
> > > haven't looked at the packets yet, but the manual TCP retry is
> > > required.
> > >
> > > mochi@robin:~$ /mochi/opt/erlang-R13B04/bin/erl
> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
> > > [async-threads:4] [hipe] [kernel-poll:true]
> > >
> > > Eshell V5.7.5
Guest
Posted: Thu May 27, 2010 2:11 pm Reply with quote
Guest
I can confirm that the first commit fixes the port leak bug.

On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
<raimo+erlang-bugs@erix.ericsson.se> wrote:
> I have created a fix for these problems:
>    git fetch git://github.com/RaimoNiskanen/otp.git rn/resolver-leaking-ports
>
> It will be included in 'pu'.
>
> Unfortunately, the second commit eliminates the bug trigger
> for what the first commit fixes. So to test if the bug fix
> is fixing the bug, one should apply the first commit only.
>
> On Thu, May 27, 2010 at 09:43:54AM +0200, Raimo Niskanen wrote:
>> On Wed, May 26, 2010 at 11:59:26AM -0700, Bob Ippolito wrote:
>> > Here's the DNS packet that is being received as a response to the query:
>> >
>> > 1> inet_dns:decode(<<0,1,131,128,0,1,0,60,0,0,0,0,8,109,111,99,104,105,115,118,110,
>> > 3,101,114,108,10,109,111,99,104,105,109,101,100,105,97,3,110,
>> > 101,116,0,0,1,0,1>>).
>> > {error,fmt}
>>
>> Thank you very much! I was about to give you detailed instructions
>> about how to dig that up Smile
>>
>> I have spotted the problem.
>>
>> The DNS reply packet has got the TC (TrunCation) bit set and claims to contain
>> 60 answer records, but actually contains zero. inet_dns expects to find
>> 60 answer records if it says so. This is a hazy part of the
>> DNS specifications and the resolver I tested truncation on did
>> not do this kind of self-contradiction, but it _may_ be allowed
>> by the specification...
>>
>> I regard it as a bug (or at least need-to-fix-problem) in inet_dns
>> since it should be real-world compatible not just specification compatible.
>> It should allow record shortage in a section if the TC bit is set.
>> I'll try to fix it in R14A.
>>
>> Can you try my patch adding a missing udp_close(S) to see
>> if it stops the leaking port problem? That is a more serious bug.
>>
>> >
>> > On Wed, May 26, 2010 at 8:02 AM, Bob Ippolito <bob@redivi.com> wrote:
>> > > Well, I'm not sure exactly which scenario is happening because I
>> > > haven't looked at the packets yet, but the manual TCP retry is
>> > > required.
>> > >
>> > > mochi@robin:~$ /mochi/opt/erlang-R13B04/bin/erl
>> > > Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:8:8] [rq:8]
>> > > [async-threads:4] [hipe] [kernel-poll:true]
>> > >
>> > > Eshell V5.7.5  (abort with ^G)
>> > > 1> lists:filter(fun erlang:is_port/1, element(2,
>> > > erlang:process_info(self(), links))).
>> > > []
>> > > 2> inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, []).
>> > > {error,timeout}
>> > > 3> lists:filter(fun erlang:is_port/1, element(2,
>> > > erlang:process_info(self(), links))).
>> > > [#Port<0.514>]
>> > > 4> element(1, inet_res:resolve("mochisvn.erl.mochimedia.net", in, a, [usevc])).
>> > > ok
>> > >
>> > >
>> > > On Wed, May 26, 2010 at 2:06 AM, Raimo Niskanen
>> > > <raimo+erlang-bugs@erix.ericsson.se> wrote:
>> > >> By reading the code it seems there is a bug when all nameservers
>> > >> return an answer that causes decode errors, or can not be
>> > >> contacted (enetunreach or econnrefused); then an
>> > >> UDP port (or maybe two; one inet and one inet6) is leaked
>> > >> since the inet_res:udp_close/1 is not called.
>> > >>
>> > >> This should be fixed with:
>> > >>
>> > >> diff --git a/lib/kernel/src/inet_res.erl b/lib/kernel/src/inet_res.erl
>> > >> index 9b9e078..3d38a01 100644
>> > >> --- a/lib/kernel/src/inet_res.erl
>> > >> +++ b/lib/kernel/src/inet_res.erl
>> > >> @@ -592,6 +592,7 @@ query_retries(_Q, _NSs, _Timer, Retry, Retry, S) ->
>> > >>  query_retries(Q, NSs, Timer, Retry, I, S0) ->
>> > >>     Num = length(NSs),
>> > >>     if Num =:= 0 ->
>> > >> +           udp_close(S),
>> > >>            {error,timeout};
>> > >>        true ->
>> > >>            case query_nss(Q, NSs, Timer, Retry, I, S0, []) of
>> > >>
>> > >> This "retry with TCP" trick of yours should really not be necessary
>> > >> since inet_res retries with TCP if it gets a truncated UDP answer.
>> > >> Have you got some other case when retrying with TCP is essential?
>> > >>
>> > >> Or, does your DNS server produce a (valid?) result that
>> > >> triggers a debug bug in inet_res, causing the decode error,
>> > >> triggering the port leak bug, forcing you to retry with TCP?
>> > >>
>> > >> On Tue, May 25, 2010 at 05:00:39PM -0700, Bob Ippolito wrote:
>> > >>> It appears that there may be an inet_udp Port leak in
>> > >>> inet_res:resolve/4, our current workaround is to spawn a new process
>> > >>> to call this function. We've noticed this primarily for a service that
>> > >>> regularly does a UDP DNS query that fails (because the response is too
>> > >>> big) and then we retry over TCP.
>> > >>>
>> > >>> This is what the state of the process looked like when it was leaking ports:
>> > >>>
>> > >>> (node@host)1> length(lists:filter(fun erlang:is_port/1, element(2,
>> > >>> erlang:process_info(whereis(dns_gen_server), links)))).
>> > >>> 577
>> > >>> (node@host)2> lists:usort([erlang:port_info(P, name) || P <-
>> > >>> lists:filter(fun erlang:is_port/1, element(2,
>> > >>> erlang:process_info(whereis(dns_gen_server), links)))]).
>> > >>> [{name,"udp_inet"}]
>> > >>>
>> > >>> The code looked like this, before the workaround was implemented:
>> > >>>
>> > >>> %% @spec dns(string()) -> [string()]
>> > >>> %% @doc Return the A records (IPv4 IPs) as strings for the given Host name.
>> > >>> %%     This may return an empty list if there no A records for this Host name.
>> > >>> dns(Host) when is_list(Host) ->
>> > >>>     dns(Host, fun inet_res:resolve/4).
>> > >>>
>> > >>> dns(Host, ResolveFun) ->
>> > >>>     case ResolveFun(Host, in, a, []) of
>> > >>>         {ok, Msg} ->
>> > >>>             ips_for_answers(Msg);
>> > >>>         {error, {nxdomain, _}} ->
>> > >>>             [];
>> > >>>         {error, timeout} ->
>> > >>>             %% retry with TCP
>> > >>>             case ResolveFun(Host, in, a, [{usevc, true}]) of
>> > >>>                 {ok, Msg} ->
>> > >>>                     ips_for_answers(Msg);
>> > >>>                 {error, {nxdomain, _}} ->
>> > >>>                     [];
>> > >>>                 Error = {error, _} ->
>> > >>>                     Error
>> > >>>             end;
>> > >>>         Error = {error, _} ->
>> > >>>             Error
>> > >>>     end.
>> > >>>
>> > >>> ips_for_answers(Msg) ->
>> > >>>     [inet_parse:ntoa(inet_dns:rr(Answer, data))
>> > >>>      || Answer <- inet_dns:msg(Msg, anlist)].
>> > >>>
>> > >>> The workaround we used was to call it indirectly with this function, I
>> > >>> couldn't find anything in OTP that did the same thing that didn't have
>> > >>> local call optimizations.
>> > >>>
>> > >>> %% @spec process_apply(atom(), atom(), [term()]) -> term()
>> > >>> %% @doc erlang:apply(M, F, A) in a temporary process and return the results.
>> > >>> process_apply(M,F,A) ->
>> > >>>     %% We can't just use rpc here because there's a local call optimization.
>> > >>>     Parent = self(),
>> > >>>     Fun = fun () ->
>> > >>>                   try
>> > >>>                       Parent ! {self(), erlang:apply(M, F, A)}
>> > >>>                   catch
>> > >>>                       Class:Reason ->
>> > >>>                           Stacktrace = erlang:get_stacktrace(),
>> > >>>                           Parent ! {self(), Class, Reason, Stacktrace}
>> > >>>                   end
>> > >>>           end,
>> > >>>     {Pid, Ref} = erlang:spawn_monitor(Fun),
>> > >>>     receive
>> > >>>         {Pid, Res} ->
>> > >>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> > >>>             Res;
>> > >>>         {Pid, Class, Reason, Stacktrace} ->
>> > >>>             receive {'DOWN', Ref, process, Pid, _} -> ok end,
>> > >>>             erlang:error(erlang:raise(Class, Reason, Stacktrace));
>> > >>>         {'DOWN', Ref, process, Pid, Reason} ->
>> > >>>             erlang:exit(Reason)
>> > >>>     end.
>> > >>>
>> > >>> ________________________________________________________________
>> > >>> erlang-bugs (at) erlang.org mailing list.
>> > >>> See http://www.erlang.org/faq.html
>> > >>> To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org
>> > >>
>> > >> --
>> > >>
>> > >> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>> > >>
>> > >
>> >
>> > ________________________________________________________________
>> > erlang-bugs (at) erlang.org mailing list.
>> > See http://www.erlang.org/faq.html
>> > To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org
>> >
>>
>> --
>>
>> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>>
>> ________________________________________________________________
>> erlang-bugs (at) erlang.org mailing list.
>> See http://www.erlang.org/faq.html
>> To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org
>
> --
>
> / Raimo Niskanen, Erlang/OTP, Ericsson AB
>

________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist
Guest
Posted: Thu May 27, 2010 2:29 pm Reply with quote
Guest
On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> I can confirm that the first commit fixes the port leak bug.

Great! Then that branch should be complete. The second
commit makes your DNS reply message below decode with
the TC bit set, which should make inet_res retry with
'usevc' internally, obsoleting your wrapper (hopefully).

>
> On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> <raimo+erlang-bugs@erix.ericsson.se> wrote:
> > I have created a fix for these problems:
> >
Guest
Posted: Mon May 31, 2010 12:47 pm Reply with quote
Guest
On Thu, May 27, 2010 at 04:26:20PM +0200, Raimo Niskanen wrote:
> On Thu, May 27, 2010 at 07:11:03AM -0700, Bob Ippolito wrote:
> > I can confirm that the first commit fixes the port leak bug.
>
> Great! Then that branch should be complete. The second
> commit makes your DNS reply message below decode with
> the TC bit set, which should make inet_res retry with
> 'usevc' internally, obsoleting your wrapper (hopefully).

4aa2ead3149d3727ec6ad67b653ff51c74405671
New commit. The previous only worked for your special case.
What is not tested does not work...

It will be included in 'pu'.

>
> >
> > On Thu, May 27, 2010 at 3:02 AM, Raimo Niskanen
> > <raimo+erlang-bugs@erix.ericsson.se> wrote:
> > > I have created a fix for these problems:
> > >

Display posts from previous:  

All times are GMT
Page 1 of 1
This forum is locked: you cannot post, reply to, or edit topics.

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You cannot download files in this forum