Example: a grammar for HTML list structures

S -> T <ul> L </ul> | T <ol> L </ol>
L -> <li> T L | <li> S L | <li> T
T -> aT | bT | ... | zT | a | ... | z

S = whole sentence, T = text part (at least one letter), L = list content (at least one <li> item). Here we require that some text always comes before the beginning of a new list. The number of nested lists is not limited.

Note that your program should also check that the original file is legal (follows the grammar); otherwise the resulting LaTeX file cannot be compiled!
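One way to carry out this check is a recursive-descent recognizer with one function per nonterminal. The following is only a minimal sketch in Python (not the Pascal program of the assignment), and it rests on several assumptions: the input contains no whitespace, text consists of lowercase letters only, and the surrounding <html><body> tags have already been removed. The names tokenize, Parser and is_legal are hypothetical.

```python
import re

# Tags of the grammar, or a run of lowercase letters (one T).
TOKEN = re.compile(r"<ul>|</ul>|<ol>|</ol>|<li>|[a-z]+")


def tokenize(text):
    """Split the input into tags and letter runs; reject anything else."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i] if i < len(self.tokens) else None

    def expect(self, tok):
        if self.peek() != tok:
            raise ValueError(f"expected {tok!r}, found {self.peek()!r}")
        self.pos += 1

    def parse_t(self):
        # T: at least one letter (a single letter-run token).
        tok = self.peek()
        if tok is None or tok.startswith("<"):
            raise ValueError(f"expected text, found {tok!r}")
        self.pos += 1

    def parse_s(self):
        # S -> T <ul> L </ul> | T <ol> L </ol>
        self.parse_t()
        opener = self.peek()
        if opener not in ("<ul>", "<ol>"):
            raise ValueError(f"expected <ul> or <ol>, found {opener!r}")
        self.pos += 1
        self.parse_l()
        self.expect("</" + opener[1:])

    def parse_l(self):
        # L -> <li> T L | <li> S L | <li> T
        while True:
            self.expect("<li>")
            if self.peek(1) in ("<ul>", "<ol>"):
                self.parse_s()  # <li> S L: another <li> must follow
            else:
                self.parse_t()  # <li> T L or <li> T
                if self.peek() != "<li>":
                    return


def is_legal(text):
    """Return True if text can be derived from S."""
    try:
        parser = Parser(tokenize(text))
        parser.parse_s()
        return parser.pos == len(parser.tokens)
    except ValueError:
        return False
```

For instance, is_legal("abc<ul><li>x<li>y</ul>") accepts the input, while is_legal("<ul><li>x</ul>") rejects it because no text precedes the list. Note that, read literally, the grammar also rejects a list whose last item is a nested list, since <li> S must be followed by more list content.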

How to test your program?

You can test the resulting LaTeX files if you replace <html><body> at the beginning with \documentclass[a4paper,12pt]{article}\begin{document} and </body></html> at the end with \end{document}.
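The replacement of the outer tags can be sketched as follows. This is only an illustration in Python, assuming the whole file is held in one string and that the tags appear exactly as <html><body> and </body></html> with nothing but whitespace outside them; the name wrap_as_latex is hypothetical.

```python
HTML_OPEN = "<html><body>"
HTML_CLOSE = "</body></html>"
LATEX_OPEN = r"\documentclass[a4paper,12pt]{article}\begin{document}"
LATEX_CLOSE = r"\end{document}"


def wrap_as_latex(text):
    """Swap the outer HTML tags for a LaTeX preamble and closing line."""
    text = text.strip()
    if not (text.startswith(HTML_OPEN) and text.endswith(HTML_CLOSE)):
        raise ValueError("input must start with <html><body> "
                         "and end with </body></html>")
    # Keep everything between the outer tags unchanged.
    body = text[len(HTML_OPEN):len(text) - len(HTML_CLOSE)]
    return LATEX_OPEN + body + LATEX_CLOSE
```

The body between the tags is passed through untouched here; translating the list tags themselves is the job of the actual program.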

If the original file was test.html, name the result test.tex and run latex test.tex. The resulting file can be viewed with xdvi test (on Linux or Unix). For printing, you can convert it to PostScript with dvips test.dvi -o test.ps.

Do we need file management?

You don't have to read and write files yourself if you redirect them in the Unix shell. E.g. compile this example with gpc transform.pas -o transform and then run transform < example.htm > example.tex. The program then reads example.htm as its standard input and its standard output is written to example.tex.
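The same idea, a program that only reads standard input and writes standard output, can be sketched in Python (this is not the Pascal program from the example). The names transform and run_filter are hypothetical, and transform is a placeholder that returns its input unchanged.

```python
import sys


def transform(text):
    """Placeholder for the real translation; returns the text unchanged."""
    return text


def run_filter(src, dst):
    """Read everything from src, translate it, and write the result to dst."""
    dst.write(transform(src.read()))

# As the last line of a real script, call:
#     run_filter(sys.stdin, sys.stdout)
```

With that call added and the script saved under a name of your choice, the shell redirection shown above supplies the input and output files, so the program never opens a file by name.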