c_pp.ttp is a TextTransformer project that imitates a C preprocessor. With c_pp, C++ files can be converted into their preprocessed form, as they are "seen" by the compiler: preprocessor directives are removed, include files are inserted, defined names are replaced by their definitions, sections excluded by conditionals are removed, and macros are expanded. In contrast to the preprocessors shipped by the various compiler vendors, c_pp does not merely create an intermediate sequence of tokens, but real text.
The name "c_pp" stands for C preprocessor. The underscore distinguishes it from a C++ parser that already exists under the name "Cpp".
The original version of this C++ preprocessor was developed to prepare the translation of a company's software from C++ into Java. The aim was therefore not to produce a general preprocessor that copes with every possible trick of preprocessor metaprogramming. The aim was rather pragmatic: the preprocessor directives in a finite number of files were to be replaced in a way that preserved their meaning.
The special treatments tailored to that company's software have been removed from the c_pp project published here. However, corresponding special treatments for other translation projects can easily be added again.
Other applications of c_pp are conceivable in addition to the task just described. For example, it could be used to test whether the preprocessor directives actually produce the expected code. There are so many pitfalls that a long section of the GNU preprocessor manual is dedicated to them.
Even correctly written directives have the disadvantage that they are difficult to debug. That is why Scott Meyers, in the first chapter of his well-known book "Effective C++", gives the advice to "prefer the compiler to the preprocessor". Another conceivable application of the c_pp project is therefore to actually transform C++ files into new files in which the preprocessor directives have been resolved.
c_pp is a nearly standards-conformant implementation of the mandated C99/C++ preprocessor functionality. The deviations are discussed in the following annotations, which follow the order of the excellent introduction to the C preprocessor, "cpp_info".
The file is read continuously, without first breaking it into lines as other preprocessors do. Spaces, tabs and comments are ignored.
c_pp doesn't handle trigraphs.
Backslashes `\' at the end of lines and following spaces are removed.
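The removal of line continuations can be sketched in plain C++ as follows. This is only an illustration, not the code of c_pp.ttp: the function name splice_lines is hypothetical, and the sketch assumes that the line break after the backslash is dropped as well, as in standard line splicing.

```cpp
#include <string>

// Removes every backslash that is followed only by spaces or tabs and a
// line break, together with those spaces and the line break itself.
// Assumption: dropping the line break mirrors standard line splicing.
std::string splice_lines(const std::string& text) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (text[i] == '\\') {
            std::size_t j = i + 1;
            // Skip spaces and tabs that follow the backslash.
            while (j < text.size() && (text[j] == ' ' || text[j] == '\t')) ++j;
            // Accept both "\n" and "\r\n" as line break.
            if (j < text.size() && text[j] == '\r') ++j;
            if (j < text.size() && text[j] == '\n') {
                i = j + 1;  // continue directly behind the line break
                continue;
            }
        }
        out += text[i];  // ordinary character, or a backslash inside a line
        ++i;
    }
    return out;
}
```

A backslash in the middle of a line, for example inside a string literal, is copied unchanged.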
All comments are replaced with single spaces in the production "comment". "comment" is set as an inclusion production in c_pp. This is a special TextTransformer feature to handle comments etc. easily.
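Outside of TextTransformer, the effect of replacing comments by single spaces might be sketched like this. The function replace_comments is a hypothetical stand-in for the "comment" production; it assumes that string and character literals have to be copied untouched, and it keeps the line break that ends a `//` comment.

```cpp
#include <string>

// Replaces each /* ... */ and // ... comment with a single space.
// String and character literals are copied verbatim, so comment-like
// text inside them survives.
std::string replace_comments(const std::string& src) {
    std::string out;
    const std::size_t n = src.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = src[i];
        if (c == '"' || c == '\'') {
            const char quote = c;
            out += src[i++];                      // opening quote
            while (i < n && src[i] != quote) {
                if (src[i] == '\\' && i + 1 < n) out += src[i++];  // escape
                out += src[i++];
            }
            if (i < n) out += src[i++];           // closing quote
        } else if (c == '/' && i + 1 < n && src[i + 1] == '*') {
            const std::size_t end = src.find("*/", i + 2);
            i = (end == std::string::npos) ? n : end + 2;
            out += ' ';
        } else if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            const std::size_t end = src.find('\n', i);
            i = (end == std::string::npos) ? n : end;  // keep the line break
            out += ' ';
        } else {
            out += src[i++];
        }
    }
    return out;
}
```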
"Extremely confusing" tricks like splitting `/*', `*/', and `//' onto multiple lines with backslash-newline, aren't handled correctly by c_pp.ttp.
In TextTransformer projects the regular expression for the ignored characters often contains line breaks, too. However, line breaks play an important role in the C preprocessor grammar, so the places where they may occur are specified explicitly.
In the 1999 C standard, identifiers may contain letters which are not part of the ASCII character set. c_pp cannot treat such identifiers.
Include directives for user and system header files, like
`#include "FILE"'
or
`#include <FILE>'
are both recognized by the expression
PD_INCLUDE ::= #\s*include\s*("([^"]+)"|<([^>]+)>)
The first sub-expression of this expression - in TextTransformer notation: 'xState.str(1)' -
gives the included file 'FILE'.
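The same expression can be tried out with std::regex. This is a sketch under assumptions: included_file is a hypothetical helper, and the group numbering shown is that of std::regex, which may differ from the numbering TextTransformer uses. In std::regex, group 1 spans the whole alternative including the quotes or angle brackets, while groups 2 and 3 hold the bare file name.

```cpp
#include <regex>
#include <string>

// Returns the file name of an include directive, or an empty string if
// the line contains no include directive.
std::string included_file(const std::string& line) {
    static const std::regex pd_include(
        R"re(#\s*include\s*("([^"]+)"|<([^>]+)>))re");
    std::smatch m;
    if (!std::regex_search(line, m, pd_include)) return "";
    // Group 2 matches the "FILE" form, group 3 the <FILE> form.
    return m[2].matched ? m[2].str() : m[3].str();
}
```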
Every time c_pp finds an include directive, the function
'scan_include_file'
is called. In this function the file is loaded with 'load_file'. Then the file is processed with the production 'header' in the same way as the original file, i.e. the preprocessed text of the included file is appended to the text which was already generated from the original file. After the processing of the included file is completed, processing of the including file continues. If the included file contains include directives too, the inclusion is carried out analogously at the next higher level. An integer parameter for the current level is incremented each time 'header' is called.
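The recursive inclusion can be illustrated with a small C++ sketch. It is not the code of 'scan_include_file': the in-memory FileTable stands in for 'load_file', the function expand_includes and the depth limit kMaxLevel are assumptions made for the example, and everything except include directives is simply copied.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Hypothetical stand-in for load_file: file name -> file content.
using FileTable = std::map<std::string, std::string>;

// Appends the text of `name` to the result; include directives are
// replaced by the recursively processed text of the included file.
std::string expand_includes(const FileTable& files, const std::string& name,
                            int level = 0) {
    static const std::regex pd_include(
        R"re(#\s*include\s*("([^"]+)"|<([^>]+)>))re");
    const int kMaxLevel = 64;  // assumed guard against endless recursion
    const auto it = files.find(name);
    if (it == files.end() || level > kMaxLevel) return "";

    std::istringstream in(it->second);
    std::string line, out;
    while (std::getline(in, line)) {
        std::smatch m;
        if (std::regex_search(line, m, pd_include)) {
            const std::string inc = m[2].matched ? m[2].str() : m[3].str();
            out += expand_includes(files, inc, level + 1);  // one level up
        } else {
            out += line + '\n';
        }
    }
    return out;  // processing of the including file continues in the caller
}
```

A file that cannot be found contributes no text in this sketch; a real tool would probably report it.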
Whether a file really is to be included is controlled by the function 'ReallyInclude'. The C++ to Java translator mentioned at the beginning, for example, only included headers belonging directly to the source file.
c_pp does not presently distinguish between system and user headers. In both cases the headers are looked up in the same list of directories. This list is kept in the vector
m_vIncludeDirs
The list can be passed as a configuration parameter to the project. Depending on the way the TextTransformer project is executed, the configuration parameter has to be put in the project options (for the working space of the IDE), in the transformation manager or as a command line parameter.
Every directory of the list has to be put on a line of its own in the configuration string, e.g.
D:\Tetra\Projects\Divers\Cpp
C:\Programme\Borland\CBuilder6\Include
This list is parsed with the production 'IncludePaths' before the start of the c_pp preprocessor to fill the m_vIncludeDirs-vector.
( SKIP {{ AddIncludeDir(trim_copy(xState.str())); }} ( EOL | EOF ) )*
The sub-parser is called in the Init function:
IncludePaths(ConfigParam());
In addition the root path of the source file is set as an include path with:
m_vIncludeDirs.push_back(SourceRoot());
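For readers who do not use TextTransformer, the filling of the directory list can be sketched in plain C++. The names parse_include_paths and trimmed are hypothetical; the sketch assumes one directory per line in the configuration string and appends the source root last, in the order described above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Returns a copy of s without leading and trailing white space.
std::string trimmed(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Builds the include directory list from a configuration string with one
// directory per line; empty lines are skipped.
std::vector<std::string> parse_include_paths(const std::string& config,
                                             const std::string& source_root) {
    std::vector<std::string> dirs;
    std::istringstream in(config);
    std::string line;
    while (std::getline(in, line)) {
        const std::string dir = trimmed(line);
        if (!dir.empty()) dirs.push_back(dir);
    }
    dirs.push_back(source_root);  // the root path of the source file
    return dirs;
}
```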
The names of preprocessed headers are stored in the map
m_mHeaderPaths
The preprocessor could be accelerated if files already contained in this map were not parsed again. This shortcut, however, would not be strictly correct, since the set of defined macros can change between two inclusions of the same file. The same file can therefore yield a different result when it is processed again.
Macros are abbreviations for code fragments, defined by the preprocessor directive '#define'. Function-like macros contain parentheses and possibly arguments, while object-like macros are simply identifiers.
Macro definitions are parsed in the production 'definition'. It starts with the token
PD_DEFINE ::= #\s*define
For object-like macros a simple identifier 'ID' follows. If, however, a token like
MACRO_DEF_BEGIN ::= (\w+)\(
follows, it is a function-like macro definition.
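The distinction hinges on whether the opening parenthesis follows the macro name immediately. A minimal sketch with std::regex, in which the enum MacroKind and the function classify_definition are hypothetical names and the two expressions are combined forms of the tokens above:

```cpp
#include <regex>
#include <string>

enum class MacroKind { None, ObjectLike, FunctionLike };

// Classifies a line by the form of its #define directive, if any.
MacroKind classify_definition(const std::string& line) {
    // PD_DEFINE followed by MACRO_DEF_BEGIN: name directly followed by '('.
    static const std::regex function_like(R"re(^\s*#\s*define\s+(\w+)\()re");
    // PD_DEFINE followed by a plain identifier.
    static const std::regex object_like(R"re(^\s*#\s*define\s+(\w+))re");
    if (std::regex_search(line, function_like)) return MacroKind::FunctionLike;
    if (std::regex_search(line, object_like)) return MacroKind::ObjectLike;
    return MacroKind::None;
}
```

With a space between the name and the parenthesis, as in `#define F (x)`, the definition is object-like and `(x)` belongs to the replacement text.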
If c_pp finds a macro call, the macro is expanded: arguments are evaluated, macro arguments preceded by '#' are stringified, and '##' concatenates tokens. Variadic macros aren't supported. Macros can be undefined or redefined. c_pp does not produce a warning if a new definition differs from the original one.
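The effect of the two operators can be observed with the preprocessor of any C++ compiler. The macro names STRINGIFY and CONCAT and the variable item42 are chosen for illustration only; the sketch shows what standard expansion yields and makes no claim about the exact text c_pp writes.

```cpp
#include <string>

#define STRINGIFY(x) #x      // '#' turns the argument into a string literal
#define CONCAT(a, b) a##b    // '##' pastes two tokens into one

// Expands to: int item42 = 7;
int CONCAT(item, 42) = 7;

// Expands to: return "a + b";
std::string stringified() { return STRINGIFY(a + b); }

// Uses the identifier that was created by token pasting.
int concatenated() { return item42; }
```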
The simplest sort of conditional is
#ifdef MACRO
CONTROLLED TEXT
#endif /* MACRO */
CONTROLLED TEXT will be included in the output of the preprocessor if and only if MACRO is defined. The CONTROLLED TEXT inside of a conditional can include preprocessing directives. They are executed only if the conditional succeeds. You can nest conditional groups inside other conditional groups.
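Nesting can be illustrated with a short example that any C++ compiler accepts. The macro names OUTER and INNER and the function selected_branch are made up for the illustration, and the `#else` branches are added here only to make the selected region visible.

```cpp
#define OUTER
// INNER is deliberately left undefined.

// Returns a number that shows which controlled text survived preprocessing.
int selected_branch() {
#ifdef OUTER
  #ifdef INNER
    return 2;   // needs both OUTER and INNER
  #else
    return 1;   // OUTER defined, INNER not defined
  #endif /* INNER */
#else
    return 0;   // OUTER not defined; the inner conditional is never evaluated
#endif /* OUTER */
}
```

After preprocessing, only the statement `return 1;` remains in the function body.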
A C preprocessor written in Pascal by Dr. Hans-Peter Diettrich
http://members.aol.com/vbdis/
A standards-conformant implementation of the mandated C99/C++ preprocessor functionality, written in C++ by Hartmut Kaiser
http://www.boost.org/libs/wave/index.html