Erlang Mailing Lists

Author Message

<  Erlang bugs mailing list  ~  erts_bld_string_n returns negative characters

Guest
Posted: Thu Jun 03, 2010 10:30 pm Reply with quote
Guest
Hello,

I noticed a significant behaviour difference between ssl_imp new and ssl_imp old when using them with {packet, http} due to the fact that ssl_imp old decodes packets through inet driver (and the broker), while ssl_imp new decodes packets with erlang:decode_packet/3 and both do not generate the same data.

The (simplified) http packet is:
<<71,69,84,32,47,230,157,177,228,186,172,32,72,84,84,80,47,49,46,49,13,10,13,10>>

With {ssl_imp old}, I get:
{http_request,'GET',
{abs_path,[47,230,157,177,228,186,172]},
{1,1}}

With {ssl_imp, new}, I get:
{http_request,'GET',
{abs_path,[47,-26,-99,-79,-28,-70,-84]},
{1,1}}

One can get the same result with:
erlang:decode_packet(http, <<71,69,84,32,47,230,157,177,228,186,172,32,72,84,84,80,47,49,46,49,13,10,13,10>>, [{packet_size, 0}]).

erlang:decode_packet eventually calls erts_bld_string_n. Things happen line 513 of current dev branch on github :

http://github.com/erlang/otp/blob/dev/erts/emulator/beam/utils.c#L513

str[i] can be negative (> 0x7f) and therefore promoted to a small negative integer.

It seems to me that erts_bld_string_n is supposed to take ISO-8859-1 characters, for example when called from enif_make_string_len (which is therefore broken?). It should return small positive integers instead of negative ones for values > 0x7f. Line 513 should be replaced from:
res = CONS(*hpp, make_small(str[i]), res);
to:
res = CONS(*hpp, make_small((const unsigned char) str[i]), res);
or:
res = CONS(*hpp, make_small((byte) str[i]), res);

This change would mimic what happens with inet_drv. It encodes the string with ERL_DRV_STRING, which is then decoded in beam/io.c with buf_to_intlist, which goes like this:

tail = CONS(hp, make_small((byte)*buf), tail);

http://github.com/erlang/otp/blob/dev/erts/emulator/beam/utils.c#L2881

Regards,

Paul

PS: I realize this is not a valid HTTP packet (URIs should be encoded as ASCII 7 bits), but curl 7.20.0 sends it.


________________________________________________________________
erlang-bugs (at) erlang.org mailing list.
See http://www.erlang.org/faq.html
To unsubscribe; mailto:erlang-bugs-unsubscribe@erlang.org

Post received from mailinglist

Display posts from previous:  

All times are GMT
Page 1 of 1
This forum is locked: you cannot post, reply to, or edit topics.

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You cannot download files in this forum