Erlang Mailing Lists

Author Message

<  Erlang questions mailing list  ~  Thoughts on laziness (long)

stimuli at shadow.net
Posted: Tue Oct 17, 2000 11:41 pm Reply with quote
Guest
Hi,

Given all the recent discussion on lazy lists, I thought I'd make some
comments. The availability of true call-by-need laziness is something
that I think Erlang sorely lacks; in fact, I think it is the one
useful mechanism that it most sorely lacks.

First off, lazy lists are actually rather familiar objects to us
programmers, even if we don't know it. For example, almost every
language out there implements a rather specialized version of laziness
called a "file stream". Think about it, a file stream is really a
list of characters (or bytes) which are read lazily from the
filesystem as needed.

This "lazy list as stream" motif is well established. In the ML
community, where lists are by default strict, laziness is provided
through optional modules and called, you guessed it, "streams", and it
does seem a useful terminology in languages, such as Erlang, where
lists are by default strict.

Now, some may think that adding laziness to the language is merely a
matter of providing syntactic sugar on top of what can now be done
with Funs, but this is not so. While a standard module for lazy lists
as well as, perhaps, lazy list comprehension would be welcome, they
don't answer the fundamental problem: Erlang's Funs only allow
call-by-name style laziness, not the more powerful call-by-need.

Call-by-name refers to a style of processing where the next element of
a structure is passed as a closure (read function) instead of a data
element. When the value is needed the closure is "forced" and the
value thus obtained. For a stream, the function will return not only
the next value, but also the next closure. A problem arises, however,
when you force a closure twice, as you might if you pass the stream
around to different functions, which may iterate over it multiple
times. Each time a computation starts at the head of the list and
iterates over it, the computation of the lists values must be
performed anew. With a strict list the computation is only performed
once.

For some algorithms, of course, the data only needs to be processed
once, and call-by-name laziness is fine. If you are using an
algorithm that accesses the list multiple times (in part of whole)
then strictness would serve better.

There is another approach to laziness, however, which is used in
Haskell and available in some versions of ML (and quite a few other
systems), known as "call-by-need" laziness. In call-by-need laziness,
a lazy closure is called a "suspension", and is only evaluated once.
Then the value is remembered, and if that part of the stream is
visited again, the previously used value is returned. With
call-by-need laziness you are guaranteed that the computation will be
performed at most once. So, call-by-need guarantees the same
asomtoptic time efficiency as strict evaluation (although there is a
constant time factor slowdown compared to strict evaluation).

Call-by-need modifies values in the heap when a suspension is forced,
and this may seem to violate functional purity. However, if you are
thinking at the proper level of abstraction you will realize that this
is not so. If you think of what is immutable as being the *value* of
the suspension, the fact that is is computed later rather than sooner
in no way "modifies" the value, even though behind the scenes pointers
in the heap are being changed.

That being said, the fact that call-by-need laziness does update the
heap proves quite useful. In traditional strict languages,
amortization methods prove impossible in algorithm analysis, as spread
out modifications made by expensive operations over the course of
several cheap operations. This sort of thing is not possible in a
strict language, but there are tricks that can be done via the
call-by-need mechanism to achieve it. I won't provide much detail
here, but Chris Okasaki's _Purely Functional Data Structures_
discusses the topic at length. There exist implementations of queues,
for instance, that have much better amortized behavior than what is
possible in Erlang, particularly in cases where more than one
computation iterates across the queue.

Note that laziness does not always get better heap behavior than
strictness. Depending on how you structure your list iteration, it is
possible that pointers to the head of the list still exist during its
entire lifetime. If this is so, then the garbage collector may not
clean up the already processed nodes, even if they will not be visited
again. Special compiler optimizations are sometime required to
mitigate against this. Also, in many cases the ideal heap behavior is
achieved by combining strictness and laziness. Haskell compilers, for
instance, do extensive "strictness analysis" to find areas where
strictness can be added to a computation to improve memory usage.
This is the source of the claim that laziness can lead to bizzair heap
behavior. It can, but in languages that are by default strict it is
less of a problem with the few specifically lazy structures. And
remember, with a lazy stream the worse case scenario is the same
(within a constant factor) as a strict list: the entire list being
created on the heap.

Minus a constant factor, lazy streams will not have worse heap
behavior than strict lists, and they might be very much better.

Call-by-need laziness seems like a real winner. However, it would
require compiler support to be used in Erlang, as behind the scenes it
must perform updates on data elements, which is not possible in Erlang
(at least not efficiently).

Also, if one sees the value of lazy streams, then surely lazy trees,
queues and other such are required. Other strict languages have
provided laziness through a few simple primitives, such as Suspend and
Force. Means to use these to construct lazy streams, and other such
structures, are well covered in the literature.

-- Jeffrey Straszheim | A sufficiently advanced
-- Systems Engineer, Programmer | regular expression is
-- http://www.shadow.net/~stimuli | indistinguishable from
-- stimuli AT shadow DOT net | magic


Post generated using Mail2Forum (http://m2f.sourceforge.net)
rv at bluetail.com
Posted: Wed Oct 18, 2000 11:12 am Reply with quote
Guest
Jeffrey Straszhiem <stimuli_at_shadow.net> writes:
>Hi,
... Long discussion on call by need laziness
>

I personally think that laziness can be interesting but have yet come
across any applications in Erlang where it would have been useful.

Many of the uses of laziness in other functional languages are handled
in Erlang with processes and message communication. Implementing lazy
streams for communication does not give you very much and in many cases
is much more limiting than Erlang's message passing with selective
receives.

Another problem with call-by-need, which I tried to point out in an
earlier mail, is that it creates problems when used in conjunction with
processes and distribution. For many applicattions it IS important WHEN
and WHERE things are done.

You also have the proble that in Erlang programs there are many
functiona calls which are done for effect, send a message, so waiting
to evaluate them until you need the result won't give you much. :-)

My solution was, as mentioned, force full evalutaion for all messages
and for all calls which ignore the result. This should make the most
things work, but you will still end up with sequencing problems:

R1 = foo(...),
R2 = bar(...),

If R1 and R2 are used then both foo nad bar will be called but I may
lose control of the order, which may be important.

This is similar to the problems we had when implementing Erlang on top
of Strand, a parallel logic language, where I had to pass around a
state variable and wait for it everywhere just to be sure of sequencing.

I agree, however, that to make this work properly it would have to be
proper call-by-need with updating of the heap when a result has been
evaluated.

However, I wonder if it possible to update Erlang with such a feature
and still keep it sufficiently backwards compatible to call be able to
call it the same language. You might end up with a new language, Lazy
Erlang, with new semantics.

Robert

--
Robert Virding Tel: +46 (0)8 545 55 017
Alteon Web Systems Email: rv_at_bluetail.com
S:t Eriksgatan 44 WWW: http://www.bluetail.com/~rv
SE-112 34 Stockholm, SWEDEN
"Folk s
stimuli at shadow.net
Posted: Thu Oct 19, 2000 1:05 am Reply with quote
Guest
On Wed, Oct 18, 2000 at 01:11:44PM +0200, Robert Virding wrote:

> I personally think that laziness can be interesting but have yet
> come across any applications in Erlang where it would have been
> useful.

> Many of the uses of laziness in other functional languages are handled
> in Erlang with processes and message communication. Implementing lazy
> streams for communication does not give you very much and in many cases
> is much more limiting than Erlang's message passing with selective
> receives.

True, Erlang's rich process and message features do provide
alternatives to lazy streams for many tasks. However, I think
Okasaki's amortized data structures might alone provide a good reason
for some call-by-need laziness features.

<snip>

> Another problem with call-by-need, which I tried to point out in an
> earlier mail, is that it creates problems when used in conjunction
> with processes and distribution. For many applicattions it IS
> important WHEN and WHERE things are done.

> You also have the proble that in Erlang programs there are many
> functiona calls which are done for effect, send a message, so
> waiting to evaluate them until you need the result won't give you
> much. Smile

True, but these are areas where one would surely *not* choose a lazy
structure.

Regarding whether messages should be forced, if you are only using
laziness on side-effect-free values (as you should), it doesn't matter
when the suspension is forced. And as suspensions are just memoized
Funs, and you can pass Funs as messages, there certainly is an option
to send the suspension unforced.

That being said, you give good reasons to force them. I don't know
which would be best, and this should be left an issue of further
research (by smarter people than me). For the purposes of amortized
structures, there are arguments that could be made for both options,
and an even stronger argument for putting suspensions in shared heap
-- but that is an entirely different topic I won't address :)

> My solution was, as mentioned, force full evalutaion for all messages
> and for all calls which ignore the result. This should make the most
> things work, but you will still end up with sequencing problems:

> R1 = foo(...),
> R2 = bar(...),

> If R1 and R2 are used then both foo nad bar will be called but I may
> lose control of the order, which may be important.

This is true, and without some Byzantine structure like monads, I
don't see an easy solution. Perhaps laziness should be assigned to
the "only use this if you know what you're doing" category (which is
fortunately a small category in Erlang space, but needn't be an empty
category).

For instance, Okasaki's amortized queues wouldn't suffer because all
values place into the queue will have been evaluated before insertion
(as Erlang is still elsewhere strict). Only the internal queue
structures (the cons's and such) themselves will be modified lazily.
This would be safe and useful, and could be provided in a nice module.

<snip>

> However, I wonder if it possible to update Erlang with such a feature
> and still keep it sufficiently backwards compatible to call be able to
> call it the same language. You might end up with a new language, Lazy
> Erlang, with new semantics.

Sure it would be backward compatible, as laziness would only be a
small addition to the language -- and only used just where
specifically desired. I don't think anyone is advocating replacing
Erlang's present computational model with a lazy one, nor replacing
strict lists with lazy streams. Lazy-by-default languages are complex
beasts, often hard to learn, and with a vague feeling of
"specifications on an amorphous sea of computation". Erlang's simple
and direct model is easy to understand, and easy to learn. This must
never change.

(As usual, just my 2 monetary units.)

-- Jeffrey Straszheim | A sufficiently advanced
-- Systems Engineer, Programmer | regular expression is
-- http://www.shadow.net/~stimuli | indistinguishable from
-- stimuli AT shadow DOT net | magic



Post generated using Mail2Forum (http://m2f.sourceforge.net)
matthias at corelatus.com
Posted: Thu Oct 19, 2000 1:32 pm Reply with quote
Guest
> Robert Virding wrote:

> > I [...] have [not] yet come across any applications in Erlang
> > where it [laziness] would have been useful.

Jeremy Strazhiem wrote:

> However, I think Okasaki's amortized data structures might alone
> provide a good reason for some call-by-need laziness features.

Writing data structures in functional languages leads to embarassment:
almost no matter what you do, it's usually slower and often also
has worse big-O behaviour than the equivalent in a language which
allows destructive update.

One solution is to come up with elegant but complicated data
structures such as Okasaki's. Many of the more attractive ones require
lazy semantics. If you go down that road, you can brag on
comp.lang.functional that your functional language now also supports
monadical-recursive-slowdown-higher-order-transmogrification. You also
scare off many potential users.

Another solution is to just provide hashtables a la ETS. Nowhere near
as elegant, but (a) it doesn't require turning the language's
semantics upside down and (b) it's reasonable to assume it's faster*.

Matthias

------------------------------

* Every attempt I've seen to build a clever, general data structure in
Erlang has resulted in something that's slower than either lists or
ETS. Two examples:

ra_list

ra_list is one of Okasaki's data structures. It's a data
structure with both list-like and array-like behaviour;
you get O(log N) access to any element AND O(1) head()/tail()

gb_trees

gb_trees is on the user contributions page at erlang.org.
As far as I can tell, it's supposed to have O(log N)
access to elements. The documentation makes no claims
beyond "highly efficient".

Doing a quick benchmark doesn't give very encouraging results.

Sum: take a list of numbers 1..N and sum them (not Gauss' way ;-)

Update: change the 3rd, middle and 3rd-last elements in a list

-----list------ ------ra-list-- -------ets----- gb_tree
N sum update sum update sum update update
----------------------------------------------------------------------
100 107 197 962 84 1660 50 87
1000 921 3666 9903 127 16k 60 89
10k 9324 46k 93k 168 168k 74 96
100k 493k 660k 1357k 188 2109k 84 97

The table shows execution time, i.e. larger numbers are worse.
I can't see why anyone would want to use a gb_tree---ETS is
faster at the very operation gb_tree is supposed to be really
good at. The picture for ra_list is a little more complex, but
it doesn't make me want to jump up and down saying "yay! ra-lists".

Reasons why the above tests might be complete crap:

- I might have implemented ra-list really badly

- The above benchmark might be contrived specially to make
ETS look good. Then again, with such small tuples, I think
it's a worst-case for ETS.

- I might just be showing that the current Erlang implementation
sucks. Maybe ETOS or HIPE can show much better results for
purely functional data structures.

- I wasn't super-careful with the timing, e.g. it's all the
same Erlang machine, so garbage collection from earlier test
runs might have perturbed the results.

Source code is available on request if someone wants to investigate
this some more.

--- eot ---



Post generated using Mail2Forum (http://m2f.sourceforge.net)
richardc at it.uu.se
Posted: Thu Oct 19, 2000 2:03 pm Reply with quote
Guest
On Thu, 19 Oct 2000 matthias_at_corelatus.com wrote:

> Doing a quick benchmark doesn't give very encouraging results.
>
> Sum: take a list of numbers 1..N and sum them (not Gauss' way Wink
>
> Update: change the 3rd, middle and 3rd-last elements in a list
>
> -----list------ ------ra-list-- -------ets----- gb_tree
> N sum update sum update sum update update
> ----------------------------------------------------------------------
> 100 107 197 962 84 1660 50 87
> 1000 921 3666 9903 127 16k 60 89
> 10k 9324 46k 93k 168 168k 74 96
> 100k 493k 660k 1357k 188 2109k 84 97
>
> The table shows execution time, i.e. larger numbers are worse.
> I can't see why anyone would want to use a gb_tree---ETS is
> faster at the very operation gb_tree is supposed to be really
> good at. The picture for ra_list is a little more complex, but
> it doesn't make me want to jump up and down saying "yay! ra-lists".

Well, you couldn't expect to beat a destructive update with a tree rewrite
(but actually, I'm surprised that the ETS tables were not even faster).

Advantages of gb-trees are that they are much more lightweight (an empty
tree is a tuple {0, nil}, so you can create and throw away as many as you
like on the fly, and that they do no copying, since they reside on the
local heap - you could try the same test inserting/updating/fetching some
bigger things than integers and see the difference.

Myself, I mainly use them for representing environments in program
analysis or interpretation.

> - I might just be showing that the current Erlang implementation
> sucks. Maybe ETOS or HIPE can show much better results for
> purely functional data structures.

They should perform better in native code compared to ETS than under the
BEAM emulator, but in general the complexity of the algorithms would
remain the same.


Richard Carlsson (richardc_at_csd.uu.se) (This space intentionally left blank.)
E-mail: Richard.Carlsson_at_csd.uu.se WWW: http://www.csd.uu.se/~richardc/



Post generated using Mail2Forum (http://m2f.sourceforge.net)
etxuwig at etxb.ericsson.
Posted: Thu Oct 19, 2000 2:19 pm Reply with quote
Guest
On Thu, 19 Oct 2000 matthias_at_corelatus.com wrote:

>* Every attempt I've seen to build a clever, general data structure in
> Erlang has resulted in something that's slower than either lists or
> ETS. Two examples:

I suspect that it's hard to beat plain lists on the "sum" benchmark,
but then again, lists suck badly on the "update" benchmark, as does
ets on "sum". So the niche of alternative structures would be to
strike a good compromise between the two.

>Reasons why the above tests might be complete crap:
>
> - The above benchmark might be contrived specially to make
> ETS look good. Then again, with such small tuples, I think
> it's a worst-case for ETS.

Actually, I think it's the other way around. In my experience, having
built a linear hashing table in Erlang, for example, ets scales better
with an increasing number of objects, but loses to a heap-based
implementation when the objects get large. For small objects, it's
very difficult to beat ets, but with > 1KBytes/object, you stand a
good chance.

/Uffe
--
Ulf Wiger tfn: +46 8 719 81 95
Strategic Product & System Management mob: +46 70 519 81 95
Ericsson Telecom AB, Datacom Networks and IP Services
Varuv
bernardp at cli.di.unipi.
Posted: Thu Oct 19, 2000 2:22 pm Reply with quote
Guest
On Thu, 19 Oct 2000 matthias_at_corelatus.com wrote:

> * Every attempt I've seen to build a clever, general data structure in
> Erlang has resulted in something that's slower than either lists or
> ETS. Two examples:
>
> ra_list
>
> ra_list is one of Okasaki's data structures. It's a data
> structure with both list-like and array-like behaviour;
> you get O(log N) access to any element AND O(1) head()/tail()

And O(log N) replace.

> Doing a quick benchmark doesn't give very encouraging results.
>
> Sum: take a list of numbers 1..N and sum them (not Gauss' way Wink
>
> Update: change the 3rd, middle and 3rd-last elements in a list
>
> -----list------ ------ra-list-- -------ets----- gb_tree
> N sum update sum update sum update update
> ----------------------------------------------------------------------
> 100 107 197 962 84 1660 50 87
> 1000 921 3666 9903 127 16k 60 89
> 10k 9324 46k 93k 168 168k 74 96
> 100k 493k 660k 1357k 188 2109k 84 97
>
> The table shows execution time, i.e. larger numbers are worse.

I too implemented ralists. If I remember correctly, my implementation
was about 5/6 times slower than built-in lists in a
cons/head/tail-based benchmark.

Source code is appended below.

Worth noting is the fact that uninitialized RALs can be built
which occupy very little space. E.g. ral:create(1000000000000) does work.

PB

================================================================

%%% File : ral.erl
%%% Author : Pierpaolo Bernardi <bernardp_at_cli.di.unipi.it>
%%% Purpose : Implements Random Access Lists.
%%% Created : 14 Dec 1998 by Pierpaolo Bernardi <bernardp_at_cli.di.unipi.it>


% See Okasaki - Purely Functional Data Structures.

-module(ral).
-author('bernardp_at_cli.di.unipi.it').

%-compile(export_all).

-export([empty/0,
cons/2,
head/1,
tail/1,
is_empty/1,
nth/2,
update/3,
ralength/1,
create/2,
create/1,
drop/2,
rev_append/2,
rev/1,
append/2,
flatten/1,
to_list/1,
from_list/1,
iota/1,
map/2,
iter/2,
for_all/2,
exists/2,
mem/2,
split/1,
combine/2
]).

%% A RAL is represented either as the atom 'the_empty_ral', or a list
%% of the form: [{Size,Tree} ... | the_empty_ral]. Size is the number
%% of nodes contained in the Tree, Tree is either {X} for a leaf node
%% or {X,LeftSubtree,RighSubtree} for nonleaf nodes.


empty() -> the_empty_ral.


% cons(Item,Ral).

cons(X,[{Size1,T1}, {Size2,T2}|Rest]) when Size1 =:= Size2 ->
[{1+Size1+Size2,{X,T1,T2}}|Rest];
cons(X,[{Size1,T1}, {Size2,T2}|Rest]) when Size1 /= Size2 ->
[{1,{X}}|[{Size1,T1}, {Size2,T2}|Rest]];
cons(X,Y) ->
[{1,{X}}|Y].


% head(Ral).

head([{Size,{X}}|Rest]) -> X;
head([{Size,{X,T1,T2}}|Rest]) -> X.


% tail(Ral).

tail([{Size,{X}}|Rest]) -> Rest;
tail([{Size,{X,T1,T2}}|Rest]) ->
Size1 = Size div 2,
[{Size1,T1},{Size1,T2}|Rest].


% is_empty(Ral).

is_empty(the_empty_ral) -> true;
is_empty(X) -> false.


tree_nth(Size,{X},0) -> X;
tree_nth(Size,{X,T1,T2},0) -> X;
tree_nth(Size,{X,T1,T2},I) ->
Size1 = Size div 2,
if I =< Size1 -> tree_nth(Size1,T1,I-1);
true -> tree_nth(Size1,T2,I-1-Size1)
end.


% nth(Ral,Index).

nth([{Size,T}|Rest],I) when I < Size -> tree_nth(Size,T,I);
nth([{Size,T}|Rest],I) -> nth(Rest,I-Size).


tree_update(Size,{X},0,Y) -> {Y};
tree_update(Size,{X,T1,T2},0,Y) -> {Y,T1,T2};
tree_update(Size,{X,T1,T2},I,Y) ->
Size1 = Size div 2,
if I =< Size1 -> {X,tree_update(Size1,T1,I-1,Y),T2};
true -> {X,T1,tree_update(Size1,T2,I-1-Size1,Y)}
end.


% update(Ral,Index,NewItem).

update([{Size,T}|Rest],I,Y) when I < Size ->
[{Size,tree_update(Size,T,I,Y)}|Rest];
update([{Size,T}|Rest],I,Y) ->
[{Size,T}|update(Rest,I-Size,Y)].


% Additional efficient operations not described in paper (Okasaki)

% ralength(Ral).

ralength(the_empty_ral) -> 0;
ralength([{Size,T}|Rest]) -> Size + ralength(Rest).


make(X,N,Size,T,Rest) when Size > N -> Rest;
make(X,N,Size,T,Rest) -> make(X, N, 1+Size+Size, {X,T,T}, [{Size,T}|Rest]).

select(0,X,Xs) -> Xs;
select(M,[{Size,T}|Rest],Xs) when M < Size -> select(M,Rest,Xs);
select(M,[{Size,T}|Rest],Xs) -> select(M-Size,[{Size,T}|Rest],[{Size,T}|Xs]).


% create(Size,Elem).

create(N,X) when integer(N) ->
% make a list of all trees up to size n, then select
% those trees that form the greedy decomposition
select(N,make(X,N,1,{X},the_empty_ral),the_empty_ral).

create(N) -> create(N,not_initialized).


tree_drop(Size,T,0,Rest) -> [{Size,T}|Rest];
tree_drop(Size,{X},1,Rest) -> Rest;
tree_drop(Size,{X,T1,T2},I,Rest) ->
Size1 = Size div 2,
if I =< Size1 -> tree_drop(Size1,T1,I-1,[{Size1,T2}|Rest]);
true -> tree_drop(Size1,T2,I-1-Size1,Rest)
end.


% drop(Ral,N).

drop(Xs,0) -> Xs;
drop([{Size,T}|Rest],I) ->
if I < Size ->
tree_drop(Size,T,I,Rest);
true ->
drop(Rest,I-Size)
end.



% Pierpaolo

decompo_loop(C,Top) ->
Next = C+C+1,
if Next > Top -> {Top-C,[C]};
true -> {Rest,Sizes} = decompo_loop(Next,Top),
if Next =:= Rest -> {0,[Next|Sizes]};
C > Rest -> {Rest,Sizes};
true -> {Rest-C,[C|Sizes]}
end
end.

decompo(N) ->
if N < 0 -> invalid_argument;
N =:= 0 -> the_empty_ral;
true -> {Rest,Sizes} = decompo_loop(1,N),
case Rest of
0 -> Sizes;
1 -> [1|Sizes]
end
end.


tree_rev_append({X},L) -> cons(X,L);
tree_rev_append({X,T1,T2},L) ->
tree_rev_append(T2, tree_rev_append(T1,cons(X,L))).


% rev_append(Rovescianda,Tail).

rev_append(the_empty_ral,L2) -> L2;
rev_append([{Size,T1}|T2],L2) -> rev_append(T2, tree_rev_append(T1,L2)).


% rev(Ral).

rev(L) -> rev_append(L,the_empty_ral).


tree_append({X},L) -> cons(X,L);
tree_append({X,T1,T2},L) -> cons(X,tree_append(T1, tree_append(T2,L))).

% append(Ral1,Ral2).

append(the_empty_ral,L2) -> L2;
append([{Size,T1}|T2],L2) -> tree_append(T1,append(T2,L2)).

% flatten(RalOfRals).

flatten(the_empty_ral) -> the_empty_ral;
flatten(L) -> append(head(L),flatten(tail(L))).


tree_to_list({X},Acc) -> [X|Acc];
tree_to_list({X,T1,T2},Acc) -> [X|tree_to_list(T1,tree_to_list(T2,Acc))].


tol(the_empty_ral,Acc) -> Acc;
tol([{Size,T1}|T2],Acc) -> tree_to_list(T1,tol(T2,Acc)).

% to_list(Ral).

to_list(L) -> tol(L,[]).


from_list_loop([],Acc) -> Acc;
from_list_loop([X|Xs],Acc) -> from_list_loop(Xs,cons(X,Acc)).

% from_list(List).

from_list(L) -> from_list_loop(lists:reverse(L),the_empty_ral).


tree_iota(Da,1) -> {Da};
tree_iota(Da,Size) ->
Half = Size div 2,
{Da, tree_iota(Da+1,Half), tree_iota(Da+Half+1,Half)}.

iota_loop([],Da) -> the_empty_ral;
iota_loop([X|Xs],Da) -> [{X,tree_iota(Da,X)}|iota_loop(Xs,(Da+X))].

% iota(Int).

iota(N) -> iota_loop(decompo(N),0).


tree_map(F,{X}) -> {F(X)};
tree_map(F,{R,T1,T2}) -> {F(R), tree_map(F,T1), tree_map(F,T2)}.

% map(Fun,Ral).

map(F,the_empty_ral) -> the_empty_ral;
map(F,[{Size,T1}|T2]) -> [{Size,tree_map(F,T1)}|map(F,T2)].


tree_iter(F,{X}) -> F(X);
tree_iter(F,{X,T1,T2}) ->
F(X),
tree_iter(F,T1),
tree_iter(F,T2).


% iter(Fun,Ral).

iter(F,the_empty_ral) -> the_empty_ral;
iter(F,[{Size,T1}|T2]) ->
tree_iter(F,T1),
iter(F,T2).


tree_for_all(F,{X}) -> F(X);
tree_for_all(F,{X,T1,T2}) ->
F(X) and tree_for_all(F,T1) and tree_for_all(F,T2).


% for_all(Fun,Ral).

for_all(F,the_empty_ral) -> true;
for_all(F,[{Size,T1}|T2]) -> tree_for_all(F,T1) and for_all(F,T2).


tree_exists(F,{X}) -> F(X);
tree_exists(F,{X,T1,T2}) -> F(X) or tree_exists(F,T2) or tree_exists(F,T2).


% exists(Fun,Ral).

exists(F,the_empty_ral) -> false;
exists(F,[{Size,T1}|T2]) -> tree_exists(F,T1) or exists(F,T2).


tree_mem(X,{I}) when X =:= I -> true;
tree_mem(X,{I}) -> false;
tree_mem(X,{I,T1,T2}) when X =:= I -> true;
tree_mem(X,{I,T1,T2}) -> tree_mem(X,T1) or tree_mem(X,T2).


% mem(Item,Ral).

mem(X,the_empty_ral) -> false;
mem(X,[{Size,T1}|T2]) -> tree_mem(X,T1) or mem(X,T2).


tree_split({{X,Y}}) -> {{X},{Y}};
tree_split({{X,Y},T1,T2}) ->
{T11,T12} = tree_split(T1),
{T21,T22} = tree_split(T2),
{{X,T11,T21},{Y,T12,T22}}.


% split(RalOfPairs).

split(the_empty_ral) -> {the_empty_ral,the_empty_ral};
split([{Size,T1}|T2]) ->
{T11,T12} = tree_split(T1),
{T21,T22} = split(T2),
{[{Size,T11}|T21],[{Size,T12}|T22]}.


tree_combine({X1},{X2}) -> {{X1,X2}};
tree_combine({X1,T11,T21},{X2,T12,T22}) ->
{{X1,X2}, tree_combine(T11,T12), tree_combine(T21,T22)}.


% combine(Ral1,Ral2).

combine(the_empty_ral,the_empty_ral) -> the_empty_ral;
combine([{Size1,T11}|T21],[{Size2,T12}|T22]) when Size1 =:= Size2 ->
[{Size1,tree_combine(T11,T12)}|combine(T21,T22)].


% end



Post generated using Mail2Forum (http://m2f.sourceforge.net)
stimuli at shadow.net
Posted: Fri Oct 20, 2000 12:43 am Reply with quote
Guest
On Thu, Oct 19, 2000 at 03:33:55PM +0200, matthias_at_corelatus.com
wrote:

> Writing data structures in functional languages leads to
> embarassment: almost no matter what you do, it's usually slower and
> often also has worse big-O behaviour than the equivalent in a
> language which allows destructive update.

> One solution is to come up with elegant but complicated data
> structures such as Okasaki's. Many of the more attractive ones
> require lazy semantics. If you go down that road, you can brag on
> comp.lang.functional that your functional language now also supports
> monadical-recursive-slowdown-higher-order-transmogrification. You
> also scare off many potential users.

Well, I don't think the situation is quite as bad as that.

> Another solution is to just provide hashtables a la ETS. Nowhere
> near as elegant, but (a) it doesn't require turning the language's
> semantics upside down and (b) it's reasonable to assume it's
> faster*.

Efficient dictionaries are a very important feature, and hash tables
seem to be the only way to get them. And for that reason Erlang has
made the right choice in throwing purity to the wind and providing ets
tables and their like.

And I agree that the quest for purity, in whole or part, must be for
something more than bragging rights on c.l.f. I think what purity we
provide must be there for some good reason, and the case seems to be
the ease of understanding and maintaining functional code over
imperative code. This argument is based on intangible factors, but in
my experience it is very true.

<snip timing data>

One thing your comparison did not point out is this data is only
relevant in cases where you use the data in a single-threaded manner.
By "single-threaded" here I don't mean to imply concurrency, but
instead threads of data flow. Here is an example:

This code uses a queue in a single-threaded manner:

Queue1 = processQueue(Queue),
Queue2 = processQueueAgain(Queue1),
Queue2.

This code uses the queue in a multi-threaded manner:

Queue1 = processQueue(Queue),
Queue2 = processQueueDifferently(Queue),
{Queue1, Queue2}.

With functional data structures, the second example does what you
would expect. And, if neither function call has side effects, then
the order that they're called doesn't matter either -- not that this
is a very big deal, but when maintaining code it is nice not to have
to worry about just where to place a computation to be between the
correct side effects. This is easier with functional structures.

You get none of the advantages with ets tables.

So, except in cases where the performance cost is prohibitive, I would
prefer to stick with functional data structures. Even if in you
initial design you only use a structure in a single data flow thread,
you may later decide to add, say, multi-level undo, or some other
operation which is trivial to add to a functional structure.

As with everything, there are tradeoffs. (I'm allowed one blatant
truism per post :)

Now back to the comments on laziness. In the single-threaded example
above, the stdlib Erlang queues will do fine, giving good amortized
behavior -- every so often they'll take a bit of time to reverse a
list; the cost spreads out. However, as soon as you switch to the
multi-threaded example, this fails, and you can lose the O(N)
behavior, and in pathological cases end up with O(N^2) (this is from
memory, Okasaki provides detailed analysis). In these cases laziness
would prove useful.

-- Jeffrey Straszheim | A sufficiently advanced
-- Systems Engineer, Programmer | regular expression is
-- http://www.shadow.net/~stimuli | indistinguishable from
-- stimuli AT shadow DOT net | magic


Post generated using Mail2Forum (http://m2f.sourceforge.net)
etxuwig at etxb.ericsson.
Posted: Fri Oct 20, 2000 7:40 am Reply with quote
Guest
On Thu, 19 Oct 2000, Jeffrey Straszhiem wrote:

>Efficient dictionaries are a very important feature, and hash tables
>seem to be the only way to get them. And for that reason Erlang has
>made the right choice in throwing purity to the wind and providing ets
>tables and their like.

Actually, you could view ets tables as an analogy to a server process
holding the tables (or one process per table), updating a functional
structure, and servicing insert/lookup requests via message passing.
I wrote an Erlang-based ets implementation for the Erlang processor,
and this is how it was done.

My point is that you could well claim that the one impure feature of
Erlang's is message passing. Ets tables bring nothing more to the
table than what could be done using "classic" Erlang - it just does it
faster.

>One thing your comparison did not point out is this data is only
>relevant in cases where you use the data in a single-threaded manner.
>By "single-threaded" here I don't mean to imply concurrency, but
>instead threads of data flow. Here is an example:

Ehmm, I'm not too sure about this. If you have the case of a single
process updating a private data structure, then you can do well with a
functional data structure on the process's own heap, but as soon as
you want to support concurrency, you need to place the data structure
in a server process; then it's entirely up to that process to decide
which granularity of concurrency you want. In order to support safe
updates on ets tables, you'd funnel all updates through a server
process, while atomic reads might go directly agains the ets table
(for the functional structure, all operations must go through the same
process).

The performance comparison then stacks up the same way as for the
single-threaded case, with an equivalent copying and scheduling
overhead for both models; the one exception is that simple reads
on an ets table can be optimized and go directly to the table,
thus avoiding this overhead. As far as I can tell, this increases the
advantage of ets tables over functional structures in the concurrent
case.

I'm not knocking Okasaki, but I don't think he factored in Erlang's
concurrency model in his analysis.

/Uffe
--
Ulf Wiger tfn: +46 8 719 81 95
Strategic Product & System Management mob: +46 70 519 81 95
Ericsson Telecom AB, Datacom Networks and IP Services
Varuv
rv at bluetail.com
Posted: Fri Oct 20, 2000 8:25 am Reply with quote
Guest
Ulf Wiger <etxuwig_at_etxb.ericsson.se> writes:
>On Thu, 19 Oct 2000, Jeffrey Straszhiem wrote:
>
>>Efficient dictionaries are a very important feature, and hash tables
>>seem to be the only way to get them. And for that reason Erlang has
>>made the right choice in throwing purity to the wind and providing ets
>>tables and their like.
>
>Actually, you could view ets tables as an analogy to a server process
>holding the tables (or one process per table), updating a functional
>structure, and servicing insert/lookup requests via message passing.
>I wrote an Erlang-based ets implementation for the Erlang processor,
>and this is how it was done.
>
>My point is that you could well claim that the one impure feature of
>Erlang's is message passing. Ets tables bring nothing more to the
>table than what could be done using "classic" Erlang - it just does it
>faster.

You took the words right out pf my mouth. This has been my point for
quite a while that Erlang has really only one impure feature, and that
is processes and message passing. Anything else can be viewed in terms
of these. Like ETS tables.

What I would have preferred with ETS tables is to have more explicit
access to the primitive hash-table operations instead of bundling
everything into one package.

Robert

--
Robert Virding Tel: +46 (0)8 545 55 017
Alteon Web Systems Email: rv_at_bluetail.com
S:t Eriksgatan 44 WWW: http://www.bluetail.com/~rv
SE-112 34 Stockholm, SWEDEN
"Folk s
matthias at corelatus.com
Posted: Fri Oct 20, 2000 10:36 am Reply with quote
Guest
Pierpaolo BERNARDI writes:

> I too implemented ralists. If I remember correctly, my implementation
> was about 5/6 times slower than built-in lists in a
> cons/head/tail-based benchmark.

Thanks for the code. Your implementation is cleaner and more complete
than mine. Running it with the same test driver as I ran the other
code, I get no significant difference in performance:

N Sum Update Original ra-list
----------------------------------------
100: 981 89 (962, 84)
1000: 9731 125 (9903, 127)
10000: 98k 183 (93k, 168)
100k: 1501k 248 (1357k, 188)

Matthias


Post generated using Mail2Forum (http://m2f.sourceforge.net)
stimuli at shadow.net
Posted: Sat Oct 21, 2000 1:38 am Reply with quote
Guest
On Fri, Oct 20, 2000 at 09:40:24AM +0200, Ulf Wiger wrote:

> My point is that you could well claim that the one impure feature of
> Erlang's is message passing. Ets tables bring nothing more to the
> table than what could be done using "classic" Erlang - it just does
> it faster.

True. Actually I had assumed, without looking at the implementation,
that ets tables were indeed processes as you describe. I guess I
learn something new everyday. :)

>> One thing your comparison did not point out is this data is only
>> relevant in cases where you use the data in a single-threaded
>> manner. By "single-threaded" here I don't mean to imply
>> concurrency, but instead threads of data flow. Here is an example:

> Ehmm, I'm not too sure about this. If you have the case of a single
> process updating a private data structure, then you can do well with
> a functional data structure on the process's own heap, but as soon
> as you want to support concurrency, you need to place the data
> structure in a server process; then it's entirely up to that process
> to decide which granularity of concurrency you want. In order to
> support safe updates on ets tables, you'd funnel all updates through
> a server process, while atomic reads might go directly agains the
> ets table (for the functional structure, all operations must go
> through the same process).

Again true. In many ways Erlang server processes can end up looking
alot like objects with state, and not so much like "processes" at all.
I think your ets as a server example proves this point. However,
within an implementation of a server process there may arise a case
where a complex computation might need to be performed. If this sort
of computation is the kind that would benefit from laziness then it
would pay off.

That being said, I'm not sure how common the payoff would be. It is
certainly possible that laziness in Erlang would not be worth its
cost. On the other hand, simply providing 'suspend' and 'force'
operators to the language shouldn't be too difficult. Newcomers to
the language could produce plenty of code without them -- I really
imagine them used by experts to build data structure modules -- so
they need not impede the simplicity of learning.

Anyhow, surely everyone will agree that it is worth more thought.

-- Jeffrey Straszheim | A sufficiently advanced
-- Systems Engineer, Programmer | regular expression is
-- http://www.shadow.net/~stimuli | indistinguishable from
-- stimuli AT shadow DOT net | magic


Post generated using Mail2Forum (http://m2f.sourceforge.net)
matthias at corelatus.com
Posted: Mon Oct 23, 2000 9:33 am Reply with quote
Guest
> Advantages of gb-trees are that they are much more lightweight (an empty
> tree is a tuple {0, nil}, so you can create and throw away as many as you
> like on the fly, and that they do no copying, since they reside on the
> local heap - you could try the same test inserting/updating/fetching some
> bigger things than integers and see the difference.

Yes. If I use larger erlang structures, e.g. large tuples, large
lists, there's a marked slowdown in ETS. (If I avoid copying by using
binaries, there's no slowdown).

> They should perform better in native code compared to ETS than under the
> BEAM emulator, but in general the complexity of the algorithms would
> remain the same.

I remember reading that the 'Mercury' people hoped to be able to
perform complexity reducing source->source transforms. Does HIPE do
that, or does the potential (I read "in general" to mean "there are
some cases where we can do better". That may have been reading too
much into what you wrote.) come from 'striking it lucky' when doing
(local) optimisations on object code?

Matthias


Post generated using Mail2Forum (http://m2f.sourceforge.net)
richardc at it.uu.se
Posted: Mon Oct 23, 2000 10:00 am Reply with quote
Guest
On Mon, 23 Oct 2000 matthias_at_corelatus.com wrote:

> > They should perform better in native code compared to ETS than
> > under the BEAM emulator, but in general the complexity of the
> > algorithms would remain the same.
>
> I remember reading that the 'Mercury' people hoped to be able to
> perform complexity reducing source->source transforms. Does HIPE do
> that, or does the potential (I read "in general" to mean "there are
> some cases where we can do better". That may have been reading too
> much into what you wrote.) come from 'striking it lucky' when doing
> (local) optimisations on object code?

What I meant by that slightly opaque remark was that there are a number of
things we want to do in HiPE that are not being done today, such as type
analysis (in order to remove some runtime type tests), function inlining
and other optimizations on high-level intermediate code (Core Erlang),
using multiple return values instead of building intermediate heap objects
("{X, Y, Z} = foo(...), bar(X, Y, Z)"), and even partial evaluation.
However, these things all give linear speedup (no matter how lucky you
get). There *exist* some transformations which can give superlinear
speedup (i.e., changing the complexity), but typically, these are
difficult to control (you don't want your compiler to give unexpected code
size explosions) and are not necessarily useful for typical programs. They
are interesting, but are not a priority for us right now.

/Richard Carlsson


Richard Carlsson (richardc_at_csd.uu.se) (This space intentionally left blank.)
E-mail: Richard.Carlsson_at_csd.uu.se WWW: http://www.csd.uu.se/~richardc/



Post generated using Mail2Forum (http://m2f.sourceforge.net)
thomasl at bluetail.com
Posted: Mon Oct 23, 2000 10:06 am Reply with quote
Guest
> I remember reading that the 'Mercury' people hoped to be able to
> perform complexity reducing source->source transforms. Does HIPE do
> that, or does the potential (I read "in general" to mean "there are
> some cases where we can do better". That may have been reading too
> much into what you wrote.) come from 'striking it lucky' when doing
> (local) optimisations on object code?

As far as I am aware, the HIPE compiler does not reduce complexity
apart from removing simple dead computations. For example, it can't
remove a loop (exception-free, etc) producing an unused value. Or
at least it couldn't when I left Uppsala. Maybe there are some new
tricks I'm not aware of.

The Mercury compiler (among others) is capable of one form of
optimization that should be possible at nearly-source level:
reordering of conses (and other term constructions). This reduces
space complexity, e.g., from linear into running at constant stack
space. (Prolog implementations tend to use this by default, btw.)

Here is an illustration of the basic technique. Take simple append:

append([],Ys) -> Ys;
append([X|Xs],Ys) ->
Zs = append(Xs,Ys),
[X|Zs].

The code above pushes X on the stack in each append iteration, then
constructs the list one step at a time as the stack pops, when
append/2 returns. You can say this constructs the list back-to-front:
the last cons of the (new) list is the first to be constructed

This code can basically be rewritten into the more efficient:

append([],Ys) -> Ys;
append([X|Xs],Ys) ->
Res = [X|empty],
append_loop(Xs,Ys,Res).

append_loop([],Ys,Tail) -> setcdr(Tail,Ys); %% setcdr not really in erlang
append_loop([X|Xs],Ys,Tail) ->
Res = [X|empty],
setcdr(Tail,Res),
append_loop(Xs,Ys,Res).

Here, we construct the list 'front-to-back' instead. We keep a pointer
to the last cons of the list (which is incomplete, as indicated by 'empty')
and zap it with the next cons.

What is the gain? append_loop/3 is tail recursive. At each iteration,
append/2 saves (a) a return address and (b) X. Thus, we save 2 stores
+ 2 loads per list element, which isn't bad for append.

Thomas
--
Thomas Lindgren thomasl+junk_at_bluetail.com
Alteon Websystems Sweden http://www.bluetail.com


Post generated using Mail2Forum (http://m2f.sourceforge.net)

Display posts from previous:  

All times are GMT
Page 1 of 2
Goto page 1, 2  Next
This forum is locked: you cannot post, reply to, or edit topics.

Jump to:  

You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You cannot download files in this forum