[4suite-dev] Problems with BisonGen Pattern
Chimezie Ogbuji
chimezie at gmail.com
Fri Feb 24 07:54:19 MST 2006
I'm having some difficulty 'compiling' a BisonGen implementation of
the SPARQL specification that I'm currently working on (attached as a
tar/bzip2). I added print statements in Lexer.py to get the specifics
on the pattern causing the problem:
.. snip (up to line 79 of Lexer.py) ...
    for state_idx, patterns in lexer.patterns.items():
        pattern_names = []
        table_base = 'lexer_%s_pattern_%%d' % state_lookup[state_idx]
        for pattern in patterns:
            print pattern.expression, pattern.token
            name = table_base % len(pattern_names)
            pattern_names.append(name)
            header = 'static const Py_UCS4 %s[] = { ' % name
            write(header)
            width = len(header)
From the command line I'm getting the following traceback:
[chimezie@Zion SPARQL]$ BisonGen --mode=c SPARQL.bgen
Generate parser SPARQLParser.c
{String_Literal} STRING_LITERAL
{String_Literal_Long} STRING_LITERAL_LONG
{Langtag} LANGTAG
{Digit}+(\.{Digit}+)?([Ee][-+]?{Digit}+)? NumericLiteral
{Nil} NIL
{Anon} ANON
{Q_IRI_Ref} Q_IRI_REF
{QName_Pattern} QNAME
{QName_Pattern} QNAME_NS
{BlankNodeLabel} BLANK_NODE_LABEL
{VarName} VARNAME
Traceback (most recent call last):
  File "/home/chimezie/bin/BisonGen", line 4, in ?
    Main.Run(sys.argv)
  File "/home/chimezie/lib/python2.4/BisonGen/Main.py", line 177, in Run
    return Generate(spec_file, options)
  File "/home/chimezie/lib/python2.4/BisonGen/Main.py", line 52, in Generate
    C.Generate(parser, bison, lexer, filename)
  File "/home/chimezie/lib/python2.4/BisonGen/C/__init__.py", line 23, in Generate
    Lexer.OutputTables(lexer, outfile)
  File "/home/chimezie/lib/python2.4/BisonGen/C/Lexer.py", line 94, in OutputTables
    out = '%d, ' % code
TypeError: not all arguments converted during string formatting
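For reference, Python raises this particular TypeError when the right-hand operand of % is a tuple holding more values than the format string has conversions. The following is only a minimal sketch, assuming (not confirmed from the BisonGen source) that `code` reaches line 94 as a tuple, for example a (low, high) pair, instead of a single integer. The helper `format_code` and the sample values are hypothetical:

```python
import sys

def format_code(code):
    """Mimic the failing statement from the traceback: out = '%d, ' % code."""
    return '%d, ' % code

# A plain integer formats without trouble.
ok = format_code(0x00B7)

# A two-element tuple (hypothetical value) is unpacked as two arguments
# for a single %d, which raises the TypeError shown in the traceback.
try:
    format_code((0x0300, 0x036F))
    message = None
except TypeError:
    message = str(sys.exc_info()[1])

print(repr(ok))
print(message)
```

If that assumption holds, printing repr(code) and type(code) just before line 94 of Lexer.py should show which value is not a plain integer.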
It looks like the problem is specific to the VARNAME symbol/pattern
(http://www.w3.org/TR/rdf-sparql-query/#rVARNAME):
[90] VARNAME ::= ( NCCHAR1 | [0-9] ) ( NCCHAR1 | [0-9] | #x00B7 | [#x0300-#x036F] | [#x203F-#x2040] )*
I have it implemented as:
<define name='VarName'>({NCChar1}|[0-9])({NCChar1}|[0-9]|\u00B7|[\u0300-\u036F]|[\u0203F-\u2040])*</define>
From my vantage point, it looks like a valid regex to me (I'm assuming
that's the problem - the traceback isn't very clear). Any insights as
to what is failing, or what additional print statements I can add to
further isolate the problem, would be greatly appreciated.
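One detail that may be worth checking: the last range in the VarName definition is written as \u0203F, with five hex digits, where the grammar has #x203F. The sketch below uses Python's own re module as a stand-in to show how a four-digit \u escape followed by a literal F changes the character class. BisonGen's pattern parser may well behave differently, so this is an assumption and not a diagnosis; the names `as_written` and `as_in_grammar` are illustrative only:

```python
import re

# Class as it appears in the definition: U+0203, then the range "F" to U+2040.
as_written = re.compile(u'[\u0203F-\u2040]')

# Class as given in the SPARQL grammar: [#x203F-#x2040].
as_in_grammar = re.compile(u'[\u203F-\u2040]')

# Compare how each class treats a few sample characters.
samples = [u'G', u'a', u'\u203F', u'\u2040']
for ch in samples:
    print('%r written=%s grammar=%s' % (
        ch,
        bool(as_written.match(ch)),
        bool(as_in_grammar.match(ch)),
    ))
```

Under Python's rules the class as written accepts ordinary letters such as G, which the grammar's range does not.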
Chimezie
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SPARQL-BisonGen.tar.bzip2
Type: application/octet-stream
Size: 7046 bytes
Desc: not available
Url : http://lists.fourthought.com/pipermail/4suite-dev/attachments/20060224/49c23516/SPARQL-BisonGen.tar.obj