| Author |
Message |
|
| datacompboy |
Posted: Thu Dec 07, 2006 4:56 am |
|
|
|
User
Joined: 21 Sep 2006
Posts: 69
Location: Novosibirsk, Russia
|
Hi guys.
I have a small leex scanner, and one of its rules is
. : YYlen, YYtcs, YYline, {token, {char, YYline, hd(YYtext)}}.
so I want every character that the rest of the scanner does not recognise to be passed on to yecc as a 'char' token.
But the generated state is
yystate(50, [C|Ics], Line, Tlen, _Action, _Alen) when C >= $\016, C =< $\037 ->
so when C >= $\000, C =< $\015 the scanner hangs :(
Where and what should I change so that it is compiled as C >= $\000?
For now, after every change I just edit that state by hand... |
_________________ --- suicide proc near\n call death\n suicide endp |
| rvirding |
Posted: Fri Dec 08, 2006 5:04 am |
|
|
|
User
Joined: 30 Aug 2006
Posts: 452
Location: Stockholm, Sweden
|
'.' is supposed to mean any character except newline, so what you have
there may seem wrong. But the characters for each state are not
independent of all the other states; it could depend on your other rules. Do
you have a general white space rule? What actually happens when you run the
scanner?
Robert
Post received from mailing list |
| datacompboy |
Posted: Fri Dec 08, 2006 6:14 am |
|
rvirding wrote: '.' is supposed to mean any character except newline
I have a separate rule for newline.
rvirding wrote: so what you have there may seem wrong. But the characters for each state are not independent of all the other states; it could depend on your other rules.
The whole grammar:
Code:
Definitions.
COMMENT1 = \(\*\(*([^*)]|[^*]\)|\*[^)])*\**\*\)
COMMENT2 = //([^\n]*)
STRING = "(\\\^.|\\.|[^"])*"
QUOTE = '(\\\^.|\\.|[^'])*'
Rules.
// Numbers
([0-9]+[.][0-9]+[eE][+-][0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline, erlang:list_to_float(YYtext)}}.
([0-9]+[.][0-9]+[eE][0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline, erlang:list_to_float(YYtext)}}.
([0-9]+[.][0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline, erlang:list_to_float(YYtext)}}.
([1-9][0-9]*) : YYlen, YYtcs, YYline, {token, {number, YYline, erlang:list_to_integer(YYtext)}}.
([0]*) : YYlen, YYtcs, YYline, {token, {number, YYline, 0}}.
(0x[0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline, erlang:list_to_integer(YYtext,16)}}.
% Number as char with code
\\. : YYlen, YYtcs, YYline, {token, {number, YYline, lists:nth(2,YYtext)}}.
% Any atom:
([A-Za-z_][a-z0-9A-Z_]*) : YYlen, YYtcs, YYline, {token, special(YYtext, YYline)}.
%% string
({STRING}|{QUOTE}) : %% Strip quotes.
S = lists:sublist(YYtext, 2, length(YYtext) - 2),
YYlen, YYtcs, YYline, {token,{string,YYline,string_gen(S)}}.
{COMMENT1}|{COMMENT2} : YYlen, YYtcs, YYline, skip_token.
>= : YYlen, YYtcs, YYline, {token, {'>=', YYline}}.
<= : YYlen, YYtcs, YYline, {token, {'<=', YYline}}.
<> : YYlen, YYtcs, YYline, {token, {'!=', YYline}}.
!= : YYlen, YYtcs, YYline, {token, {'!=', YYline}}.
=~ : YYlen, YYtcs, YYline, {token, {'=~', YYline}}.
-> : YYlen, YYtcs, YYline, {token, {'->', YYline}}.
\, : YYlen, YYtcs, YYline, {token, {',', YYline}}.
\+ : YYlen, YYtcs, YYline, {token, {'+', YYline}}.
\- : YYlen, YYtcs, YYline, {token, {'-', YYline}}.
\> : YYlen, YYtcs, YYline, {token, {'>', YYline}}.
\< : YYlen, YYtcs, YYline, {token, {'<', YYline}}.
= : YYlen, YYtcs, YYline, {token, {'=', YYline}}.
\* : YYlen, YYtcs, YYline, {token, {'*', YYline}}.
\: : YYlen, YYtcs, YYline, {token, {':', YYline}}.
\/ : YYlen, YYtcs, YYline, {token, {'/', YYline}}.
\$ : YYlen, YYtcs, YYline, {token, {'$', YYline}}.
\@ : YYlen, YYtcs, YYline, {token, {'@', YYline}}.
\% : YYlen, YYtcs, YYline, {token, {'%', YYline}}.
\( : YYlen, YYtcs, YYline, {token, {'(', YYline}}.
\) : YYlen, YYtcs, YYline, {token, {')', YYline}}.
%} : YYlen, YYtcs, YYline, {token, {'}', YYline}}.
%{ : YYlen, YYtcs, YYline, {token, {'{', YYline}}.
%\[ : YYlen, YYtcs, YYline, {token, {'[', YYline}}.
%\] : YYlen, YYtcs, YYline, {token, {']', YYline}}.
[\r\t\s] : YYlen, YYtcs, YYline, skip_token.
\n : YYlen, YYtcs, YYline, {token, {'\n', YYline}}.
. : YYlen, YYtcs, YYline, {token, {char, YYline, hd(YYtext)}}.
Erlang code.
% Skipped
After compiling the code above (to try it, you can just remove the unknown functions, which were defined in the Erlang code section), one of the generated states is as follows:
Code: yystate(??, [C|Ics], Line, Tlen, _Action, _Alen) when C >= $\016, C =< $\037 ->
(?? since I don't remember what number it actually has now).
So codes from $\000 to $\015 are not handled at all! And the scanner hangs on code 00 in the input.
rvirding wrote: Do you have a general white space rule? What actually happens when you run the scanner?
If I run it over a file that contains such characters (< $\016), my leex scanner hangs. |
| rvirding |
Posted: Mon Dec 11, 2006 10:28 pm |
|
Sorry I have been away for a few days.
Some mailer along the way had played havoc with your file, but I have
managed to decipher it and have included it last. Some general comments:
- you need only mention YYlen, YYtcs, YYline when you use them
- the ordering of rules defines precedence, so be careful putting short
patterns early
- the \. pattern will match the '.' character
- as '*' is a meta character, your comment definitions seem a little strange
- is the last pattern the catch-all . one?
What is the token syntax you are trying to scan? I will try to run your
spec through and see what happens. What is the real problem? Is it not
working?
Robert
-----
Definitions.
COMMENT1 = (*(*([^*)]|[^*])|*[^)])****)
COMMENT2 = //([^n]*)
STRING = "(\^.|\.|[^"])*"
QUOTE = '(\^.|\.|[^'])*'
Rules.
// Numbers
([0-9]+[.][0-9]+[eE][+-][0-9]+) : YYlen, YYtcs, YYline, {token, {number,
YYline, erlang:list_to_float(YYtext)}}.
([0-9]+[.][0-9]+[eE][0-9]+) : YYlen, YYtcs, YYline, {token, {number,
YYline, erlang:list_to_float(YYtext)}}.
([0-9]+[.][0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline,
erlang:list_to_float(YYtext)}}.
([1-9][0-9]*) : YYlen, YYtcs, YYline, {token, {number, YYline,
erlang:list_to_integer(YYtext)}}.
([0]*) : YYlen, YYtcs, YYline, {token, {number, YYline, 0}}.
(0x[0-9]+) : YYlen, YYtcs, YYline, {token, {number, YYline,
erlang:list_to_integer(YYtext,16)}}.
% Number as char with code
\. : YYlen, YYtcs, YYline, {token, {number, YYline,
lists:nth(2,YYtext)}}.
% Any atom:
([A-Za-z_][a-z0-9A-Z_]*) : YYlen, YYtcs, YYline, {token,
special(YYtext, YYline)}.
%% string
({STRING}|{QUOTE}) : %% Strip quotes.
S = lists:sublist(YYtext, 2, length(YYtext) - 2),
YYlen, YYtcs, YYline, {token,{string,YYline,string_gen(S)}}.
{COMMENT1}|{COMMENT2} : YYlen, YYtcs, YYline, skip_token.
>= : YYlen, YYtcs, YYline, {token, {'>=', YYline}}.
<= : YYlen, YYtcs, YYline, {token, {'<=', YYline}}.
<> : YYlen, YYtcs, YYline, {token, {'!=', YYline}}.
!= : YYlen, YYtcs, YYline, {token, {'!=', YYline}}.
=~ : YYlen, YYtcs, YYline, {token, {'=~', YYline}}.
-> : YYlen, YYtcs, YYline, {token, {'->', YYline}}.
, : YYlen, YYtcs, YYline, {token, {',', YYline}}.
+ : YYlen, YYtcs, YYline, {token, {'+', YYline}}.
- : YYlen, YYtcs, YYline, {token, {'-', YYline}}.
> : YYlen, YYtcs, YYline, {token, {'>', YYline}}.
< : YYlen, YYtcs, YYline, {token, {'<', YYline}}.
= : YYlen, YYtcs, YYline, {token, {'=', YYline}}.
* : YYlen, YYtcs, YYline, {token, {'*', YYline}}.
: : YYlen, YYtcs, YYline, {token, {':', YYline}}.
/ : YYlen, YYtcs, YYline, {token, {'/', YYline}}.
$ : YYlen, YYtcs, YYline, {token, {'$', YYline}}.
@ : YYlen, YYtcs, YYline, {token, {'@', YYline}}.
% : YYlen, YYtcs, YYline, {token, {'%', YYline}}.
( : YYlen, YYtcs, YYline, {token, {'(', YYline}}.
) : YYlen, YYtcs, YYline, {token, {')', YYline}}.
%} : YYlen, YYtcs, YYline, {token, {'}', YYline}}.
%{ : YYlen, YYtcs, YYline, {token, {'{', YYline}}.
%[ : YYlen, YYtcs, YYline, {token, {'[', YYline}}.
%] : YYlen, YYtcs, YYline, {token, {']', YYline}}.
[rts] : YYlen, YYtcs, YYline, skip_token.
n : YYlen, YYtcs, YYline, {token, {'n', YYline}}.
: YYlen, YYtcs, YYline, {token, {char, YYline, hd(YYtext)}}.
Erlang code.
% Skipped
Post received from mailing list |
| datacompboy |
Posted: Tue Dec 12, 2006 5:40 am |
|
rvirding wrote: Some mailer along the way had played havoc with your file, but I have managed to decipher it and have included it last.
You can see my post ungarbled at
http://forum.trapexit.org/viewtopic.php?p=23477#23477
rvirding wrote: - you need only mention YYlen, YYtcs, YYline when you use them
I mention them just to suppress the warnings about unused YYlen/YYtcs/YYline.
rvirding wrote: - the ordering of rules defines precedence, so be careful putting short patterns early
I know; I tried to put the longest matches first.
rvirding wrote: - the \. pattern will match the '.' character
I don't need a separate '.' token, so there is no such rule.
rvirding wrote: - as '*' is a meta character, your comment definitions seem a little strange
:) I just stole that rule from somewhere :)
rvirding wrote: - is the last pattern the catch-all . one?
The last pattern catches everything not caught by the other rules.
rvirding wrote: What is the token syntax you are trying to scan? I will try to run your spec through and see what happens. What is the real problem? Is it not working?
Just try to scan a file containing
2.14 \000
where \000 is a zero byte. |
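For anyone wanting to reproduce this, here is a minimal sketch of building that input in Erlang. The module name `zero_byte_input` and the choice of file path are made up for illustration; nothing here comes from the original grammar.

```erlang
-module(zero_byte_input).
-export([input/0, write/1]).

%% The test input from the post: "2.14", a space, then a single zero byte.
input() ->
    "2.14 " ++ [0].

%% Write the same characters to Path so the scanner can be run over a file.
%% file:write_file/2 returns ok on success or {error, Reason}.
write(Path) ->
    file:write_file(Path, input()).
```

Assuming the scanner generated from the grammar is compiled as a module called `my_scanner` (a hypothetical name) and exports `string/1`, as leex-generated scanners normally do, `my_scanner:string(zero_byte_input:input())` should feed it exactly these six characters.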
| rvirding |
Posted: Sun Dec 17, 2006 12:13 am |
|
OK, I have looked a little more at the problem and come up with some
answers. I converted the .xrl file to fit the latest leex on
trapexit.org and include it here.
- The first comment looks like a Pascal-like comment.
- The looping was caused by the ([0]*) rule, which would match in all cases
where a character was not explicitly the first char in another rule. It
would match a zero-length string, which leex couldn't handle; such matches
should probably be forbidden.
- For some reason the final . pattern wasn't catching everything like it
should. I don't know why; I have never had the problem before and simple tests
didn't find it.
- // Numbers was tried as a pattern for a rule, but its syntax is faulty.
- Generally it is difficult to find which clause matches a certain
rule, as it is very dependent on all the other rules.
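The zero-length match described above can be illustrated outside leex. This is only a sketch that uses the standard `re` module as a stand-in for the generated scanner (leex itself does not use `re`); the module and function names are invented for the example.

```erlang
-module(empty_match_demo).
-export([match_len/2]).

%% Length of the match of Regex anchored at the start of Input, or nomatch.
%% re:run/3 reports matches as {Offset, Length} pairs by default.
match_len(Input, Regex) ->
    case re:run(Input, Regex, [anchored]) of
        {match, [{0, Len} | _]} -> Len;
        nomatch -> nomatch
    end.
```

`empty_match_demo:match_len([0], "0*")` succeeds with length 0 even though the input starts with a zero byte rather than the digit 0, so a scanner that accepts this match consumes nothing and sees the same input again. With `"0+"` the same call returns `nomatch`, and `match_len("000", "0+")` returns 3, so writing the rule with `+` instead of `*` is one plausible way to avoid the empty match.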
Robert
Post received from mailing list |
| datacompboy |
Posted: Sun Dec 17, 2006 10:57 am |
|
rvirding wrote: - The looping was caused by the ([0]*) rule which would match all cases where a character was not explicitly the first char in another rule.
Oh!!! Really! Sorry, my mistake :)
But that still does not fix the problem with codes < $\016 :( |
All times are GMT