Lex & Yacc



Posted by lionel319 @ Wed 20 Aug, 08, 10:57AM under Work

 

This is getting uglier and uglier.

It seems like this thing is more complicated than I thought (and of course, with complication comes more power :p)


The previous link was too hard to read. I really need something that is easier to read and easier to understand. And I believe here's one that does a better job of this. [here]







Here's the brief explanation of Lex & Yacc:



Figure 1: Compilation Sequence

You code patterns and input them to lex. It will read your patterns and generate C code for a lexical analyzer or scanner. The lexical analyzer matches strings in the input, based on your patterns, and converts the strings to tokens. Tokens are numerical representations of strings, and simplify processing. This is illustrated in Figure 1.
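To make the "strings to tokens" idea concrete for myself, here is a minimal hand-written sketch in C of the kind of work a scanner does. This is not what lex generates; the function name next_token and the token codes TOK_NUMBER and TOK_IDENT are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes, chosen above the single-character range.
 * In a real lex/yacc build the codes come from y.tab.h instead. */
enum { TOK_EOF = 0, TOK_NUMBER = 258, TOK_IDENT = 259 };

/*
 * Return the token code for the next lexeme in *src and advance *src
 * past it. Any other single character (such as '+') is returned as
 * its own character code.
 */
int next_token(const char **src)
{
    const char *p;

    assert(src != NULL && *src != NULL);
    p = *src;

    while (isspace((unsigned char)*p))
        p++;                                /* skip whitespace */

    if (*p == '\0') {                       /* end of input */
        *src = p;
        return TOK_EOF;
    }
    if (isdigit((unsigned char)*p)) {       /* pattern: [0-9]+ */
        while (isdigit((unsigned char)*p))
            p++;
        *src = p;
        return TOK_NUMBER;
    }
    if (isalpha((unsigned char)*p)) {       /* pattern: [A-Za-z][A-Za-z0-9]* */
        while (isalnum((unsigned char)*p))
            p++;
        *src = p;
        return TOK_IDENT;
    }
    *src = p + 1;                           /* single-character token */
    return (unsigned char)*p;
}
```

Calling next_token repeatedly on the string "x1 + 42" should hand back an identifier, '+', a number, and then the end-of-input code. With lex you only write the patterns and it produces the equivalent matching code for you.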

When the lexical analyzer finds identifiers in the input stream it enters them in a symbol table. The symbol table may also contain other information such as data type (integer or real) and location of the variable in memory. All subsequent references to identifiers refer to the appropriate symbol table index.
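A symbol table can be as simple as an array that you search by name. The sketch below is my own assumption of a minimal version: the names symtab, sym_lookup, and the is_real field are hypothetical, and the fixed sizes are arbitrary.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* arbitrary capacity for this sketch */
#define MAX_NAME    32   /* longest identifier, including the '\0' */

/* One entry per identifier; is_real stands in for "data type". */
struct symbol {
    char name[MAX_NAME];
    int  is_real;        /* 0 = integer, 1 = real */
};

static struct symbol symtab[MAX_SYMBOLS];
static int symcount = 0;

/*
 * Return the table index for name, adding a new entry the first time
 * the name is seen. Returns -1 if the table is full or the name is
 * too long to store.
 */
int sym_lookup(const char *name)
{
    int i;

    assert(name != NULL);

    for (i = 0; i < symcount; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return i;                       /* already known */

    if (symcount == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return -1;

    strcpy(symtab[symcount].name, name);    /* fits: length checked above */
    symtab[symcount].is_real = 0;           /* default to integer */
    return symcount++;
}
```

The first lookup of a name creates the entry and every later lookup of the same name returns the same index, which is the "all subsequent references refer to the symbol table index" part. A real compiler would more likely use a hash table, but the idea is the same.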

You code a grammar and input it to yacc. Yacc will read your grammar and generate C code for a syntax analyzer or parser. The syntax analyzer uses grammar rules that allow it to analyze tokens from the lexical analyzer and create a syntax tree. The syntax tree imposes a hierarchical structure on the tokens. For example, operator precedence and associativity are apparent in the syntax tree. The next step, code generation, does a depth-first walk of the syntax tree to generate code. Some compilers produce machine code, while others, as shown above, output assembly language.
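Here is a small sketch of the depth-first walk, assuming a tree that only knows '+' and '*' and a made-up stack-machine output (push, add, mul). The struct node layout and the gen function are my own illustration, not something yacc produces.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Interior nodes carry an operator; leaves have op == 0 and a name. */
struct node {
    char op;                    /* '+' or '*', or 0 for a leaf */
    const char *name;           /* variable name, leaves only */
    const struct node *left;
    const struct node *right;
};

/*
 * Append instructions for the tree rooted at n to out, which must be
 * a '\0'-terminated buffer of cap bytes. The walk is post-order:
 * both operands are emitted before the operator that uses them.
 */
void gen(const struct node *n, char *out, size_t cap)
{
    size_t len;

    assert(n != NULL && out != NULL);

    if (n->op == 0) {                       /* leaf: push the variable */
        len = strlen(out);
        snprintf(out + len, cap - len, "push %s\n", n->name);
        return;
    }
    gen(n->left, out, cap);                 /* left operand first */
    gen(n->right, out, cap);                /* then right operand */
    len = strlen(out);
    snprintf(out + len, cap - len, "%s\n", n->op == '+' ? "add" : "mul");
}
```

For a + b * c, precedence puts the '*' node underneath the '+' node, so the walk emits the multiply before the add without any extra logic. That is what "precedence is apparent in the syntax tree" means in practice.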


Figure 2: Building a Compiler with Lex & Yacc

Figure 2 illustrates the file naming conventions used by lex and yacc. We’ll assume our goal is to write a BASIC compiler. First, we need to specify all pattern matching rules for lex (bas.l) and grammar rules for yacc (bas.y). Commands to create our compiler, bas.exe, are listed below:

yacc -d bas.y                    # create y.tab.h, y.tab.c
lex bas.l                        # create lex.yy.c
cc lex.yy.c y.tab.c -o bas.exe   # compile/link

Yacc reads the grammar descriptions in bas.y and generates a syntax analyzer (parser), that includes function yyparse, in file y.tab.c. Included in file bas.y are token declarations. The -d option causes yacc to generate definitions for tokens and place them in file y.tab.h. Lex reads the pattern descriptions in bas.l, includes file y.tab.h, and generates a lexical analyzer, function yylex, in file lex.yy.c.

Finally, the lexer and parser are compiled and linked together to form the executable, bas.exe. From main, we call yyparse to run the compiler. Function yyparse automatically calls yylex to obtain each token.
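The calling relationship is easier to see with stand-ins. In the sketch below, yylex and yyparse are hand-written placeholders that only mimic the control flow; the token array and the tokens_seen counter are my own assumptions, and a generated parser does far more than count.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in input: a fixed token stream. The final 0 is the value
 * yylex returns to signal end of input. 258 is a made-up token code. */
static const int tokens[] = { 258, '+', 258, 0 };
static size_t pos = 0;
static int tokens_seen = 0;

/* Stand-in for the lex-generated scanner: one token per call. */
int yylex(void)
{
    assert(pos < sizeof tokens / sizeof tokens[0]);
    return tokens[pos++];
}

/*
 * Stand-in for the yacc-generated parser: keeps asking yylex for the
 * next token until it gets 0. A real yyparse also checks the tokens
 * against the grammar and returns nonzero on a syntax error.
 */
int yyparse(void)
{
    while (yylex() != 0)
        tokens_seen++;
    return 0;                               /* 0 means success */
}
```

In the real build, main can be as small as a single call that returns the result of yyparse(); the scanner never needs to be called directly.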









Blurrrrrrr ~~~~~~~~~~~~~~~~~

 

 

 

 

 

 



kokidi @ Fri 22-08-08 12:34AM
cool.... teach me ^_^
lionel319 @ Thu 21-08-08 01:22PM
popo: it's not a compiler. It's a compiler generator that lets you set your own rules/grammar to make a new compiler. ABX uses this to make a compiler and parse the Verilog file format.
popo @ Thu 21-08-08 11:28AM
wah, you writing compiler?