Ack Description File Reference Manual
Ed Keizer Vakgroep Informatica Vrije Universiteit Amsterdam
1. Introduction

The program ack(I) internally maintains a table of possible transformations and a table of string variables. The transformation table contains one entry for each possible transformation of a file. Which transformations are used depends on the suffix of the source file. Each transformation table entry tells which input suffixes are allowed and what suffix/name the output file has. When the output file does not already satisfy the request of the user (indicated with the flag -c.suffix), the table is scanned, starting with the next transformation in the table, for another transformation whose input suffix is the output suffix of the previous transformation. A few special transformations are recognized. Among them is the combiner, a program that combines several files into one. When no stop suffix was specified (flag -c.suffix), ack stops after executing the combiner with the possibly transformed input files and libraries as arguments. Ack will only perform the transformations in the order in which they are presented in the table.

The string variables are used while creating the argument list and program call name for a particular transformation.

2. Which descriptions are used

Ack always uses two description files: one to define the front-end transformations and one for the machine-dependent back-end transformations. Each description has a name. First, the way of determining the names of the descriptions needed is described.

When the shell environment variable ACKFE is set, ack uses that to determine the front-end table name; otherwise it uses fe.

The way the backend table name is determined is more convoluted. First, when the last filename in the program call name is not one of ack or the front-end call-names, this filename is used as the backend description name. Second, when the -m flag is present, the -m is chopped off this flag and the rest is used as the backend description name. Third, when both failed, the shell environment variable ACKM is used.
Last, when ACKM is not present either, the default backend is used, determined by the definition of ACKM in h/local.h. The presence and value of the definition of ACKM is determined at compile time of ack.

Now we have the names, but that is only the first step. Ack stores a few descriptions at compile time. These descriptions are simply files read in at compile time. At the moment of writing this document, the descriptions included are: pdp, fe, i86, m68k2, vax2 and int. The name of a description is first searched for internally, then in lib/descr/name, then in lib/name/descr, and finally in the current directory of the user.
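The name resolution rules of this section can be sketched in a few lines of Python. This is only an illustration of the order of the rules, not ack's actual code: the function names, the tuple of front-end call names (taken from the call names acc, cc, apc and pc mentioned later in this manual) and the way the internally stored descriptions are represented are all assumptions.

```python
import os

# Assumption: the front-end call names are the ones this manual mentions.
FRONTEND_CALL_NAMES = ("acc", "cc", "apc", "pc")

def frontend_name(env):
    """Front-end table name: ACKFE when set, otherwise fe."""
    return env.get("ACKFE", "fe")

def backend_name(argv0, flags, env, compiled_in_default):
    """Apply the four backend rules in the order given in the text."""
    call_name = os.path.basename(argv0)
    # First: a call name other than ack or a front-end call name.
    if call_name != "ack" and call_name not in FRONTEND_CALL_NAMES:
        return call_name
    # Second: the -m flag with the -m chopped off.
    for flag in flags:
        if flag.startswith("-m"):
            return flag[2:]
    # Third: the shell environment variable ACKM.
    if "ACKM" in env:
        return env["ACKM"]
    # Last: the default fixed when ack itself was compiled.
    return compiled_in_default

def search_order(name):
    """Places tried for a description, in order (None = stored internally)."""
    return [None, "lib/descr/" + name, "lib/" + name + "/descr", name]
```

For example, backend_name("/usr/bin/ack", ["-mem22"], {}, "pdp") yields em22, while the same call without the -m flag falls through to the compiled-in default.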
3. Using the description file

Before starting on a narrative of the description file, a few terms must be introduced. All these terms describe the scanning of zero-terminated strings, thereby producing another string or sequence of strings.

Backslashing
All characters preceded by \ are modified to prevent recognition at further scanning. This modification is undone before a string is passed to the outside world as argument or message. When reading the description files the sequences \\, \# and \<newline> have a special meaning. \\ translates to a single \, \# translates to a single # that is not recognized as the start of comment but can be used in recognition, and finally \<newline> translates to nothing at all, thereby allowing continuation lines.

Variable replacement
The scan recognizes the sequences {{, {NAME} and {NAME?text}, where NAME can be any combination of characters excluding ? and }, and text may be anything excluding } (\} is allowed, of course). The first sequence produces an unescaped single {. The second produces the contents of NAME; definitions are done by ack and in description files. When NAME is not defined, an error message is produced on the diagnostic output. The last sequence produces the contents of NAME if it is defined and text otherwise.

Expression replacement
Syntax: (suffix sequence:suffix sequence=text)
Example: (.c.p.e:.e=tail_em)
If the two suffix sequences have a common member (.e in this case), the text is produced. When no common member is present, the empty string is produced. Thus the example given is a constant expression. Normally, one of the suffix sequences is produced by variable replacement. Ack sets three variables while performing the diverse transformations: HEAD, TAIL and RTS. All three variables depend on the properties rts and need from the transformations used. Whenever a transformation is used for the first time, the text following the need is appended to both the HEAD and TAIL variable.
The value of the variable RTS is determined by the first transformation used with a rts property. Two runtime flags have effect on the value of one or more of these variables. The flag -.suffix has the same effect on these three variables as if a file with that suffix was included in the argument list and had to be translated. The flag -r.suffix only has that effect on the TAIL variable. The program call names acc and cc have the effect of an automatic -.c flag. Apc and pc have the effect of an automatic -.p flag.

Line splitting
The string is transformed into a sequence of strings by replacing the blank space by string separators (nulls).

IO replacement
The > in the string is replaced by the output file name. The < in the string is replaced by the input file name. When multiple input files are present the string is duplicated for each input file name.

Each description is a sequence of variable definitions followed by a sequence of transformation definitions. Variable definitions use a line each; transformation definitions consist of a sequence of lines. Empty lines are discarded, as are lines with nothing but comment. Comment is started by a # character and continues to the end of the line. Three special two-character sequences exist: \#, \\ and \<newline>. Their effect is described under backslashing above. Each nonempty line starts with a keyword, possibly preceded by blank space. The keyword can be followed by a further specification. The two are separated by blank space.

Variable definitions use the keyword var and look like this:

    var NAME=text

The name can be any identifier; the text may contain any character. Blank space before the equal sign is not part of the NAME. Blank space after the equal sign is considered part of the text. The text is scanned for variable replacement before it is associated with the variable name.
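The variable replacement and expression replacement rules described in this section can be illustrated with a small Python sketch. It is not taken from the ack sources: the function names are hypothetical, an undefined variable raises an exception where ack prints an error message, and backslashed characters are simply copied through untouched.

```python
def find_unescaped(text, char, start):
    """Index of the first char at or after start that is not backslashed."""
    i = start
    while i < len(text):
        if text[i] == "\\":
            i += 2                      # skip the backslashed character
        elif text[i] == char:
            return i
        else:
            i += 1
    raise ValueError("missing " + char)

def replace_variables(text, variables):
    """Handle the sequences {{, {NAME} and {NAME?text}."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            out.append(text[i:i + 2])   # keep backslashed pairs as they are
            i += 2
        elif text.startswith("{{", i):
            out.append("{")
            i += 2
        elif text[i] == "{":
            end = find_unescaped(text, "}", i + 1)
            name, has_default, default = text[i + 1:end].partition("?")
            if name in variables:
                out.append(variables[name])
            elif has_default:
                out.append(default)
            else:
                raise KeyError("variable not defined: " + name)
            i = end + 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def replace_expression(expr):
    """Evaluate (suffix sequence:suffix sequence=text)."""
    left, _, rest = expr[1:-1].partition(":")
    right, _, text = rest.partition("=")
    common = set(filter(None, left.split("."))) & set(filter(None, right.split(".")))
    return text if common else ""
```

With these definitions, replace_expression("(.c.p.e:.e=tail_em)") produces tail_em, and replace_variables("lib/{M}/tail_", {"M": "em22"}) produces lib/em22/tail_.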
The start of a transformation definition is indicated by the keyword name. The last line of such a definition contains the keyword end. The lines in between associate properties to a transformation and may be presented in any order. The identifier after the name keyword determines the name of the transformation. This name is used for debugging and by the -R flag. The keywords are used to specify which input suffixes are recognized by that transformation, the program to run, the arguments to be handed to that program and the name or suffix of the resulting output file. Two keywords are used to indicate which run-time startoffs and libraries are needed. The possible keywords are:

from
followed by a sequence of suffixes. Each file with one of these suffixes is allowed as input file. Preprocessor transformations do not need the from keyword. All other transformations do.

to
followed by the suffix of the output file name or, in the case of a linker, the output file name.

program
followed by the name of the load file of the program; a pathname most likely starts with either a / or {EM}. This keyword must be present. The remainder of the line is subject to backslashing and variable replacement.

mapflag
The mapflags are used to grab flags given to ack and pass them on to a specific transformation. This feature uses a few simple pattern matching and replacement facilities. Multiple occurrences of this keyword are allowed. The text following the keyword is subjected to backslashing. The keyword is followed by a match expression and a variable assignment separated by blank space. As soon as both description files are read, ack looks at all transformations in these files to find a match for the flags given to ack. The flags -m, -o, -O, -r, -v, -g, -c, -t, -k, -R and -. are specific to ack and not handed down to any transformation. The matching is performed in the order in which the entries appear in the definition. The scanning stops after the first match is found. When a match is found, the variable assignment is executed.
A * in the match expression matches any sequence of characters; a * in the right-hand part of the assignment is replaced by the characters matched by the * in the expression. The right-hand part is also subject to variable replacement. The variable will probably be used in the program arguments. The -l flags are special: the order in which they are presented to ack must be preserved. The identifier LNAME is used in conjunction with the scanning of -l flags. The value assigned to LNAME is used to replace the flag. The example further on shows the use of all this.

args
The keyword is followed by the program call arguments. It is subject to backslashing, variable replacement, expression replacement, line splitting and IO replacement. The variables assigned to by mapflags will probably be used here. The flags not recognized by ack or any of the transformations are passed to the linker and inserted before all other arguments.

stdin
This keyword indicates that the transformation reads from standard input.

stdout
This keyword indicates that the transformation writes on standard output.

optimizer
The presence of this keyword indicates that this transformation is an optimizer. It can be followed by a number, indicating the "level" of the optimizer (see the description of the -O option in the ack(1ACK) manual page).

priority
This optional keyword is followed by a number. Positive priority means that the transformation is likely to be used; negative priority means that the transformation is unlikely to be used. Priorities can also be set with an ack(1ACK) command line option. Priorities come in handy when there are several implementations of a certain transformation. They can then be used to select a default one.

linker
This keyword indicates that this transformation is the linker.

combiner
This keyword indicates that this transformation is a combiner. A combiner is a program combining several files into one, but is not a linker. An example of a combiner is the global optimizer.

prep
This optional keyword is followed by an option indicating its relation to the preprocessor. The possible options are:
    always  the input files must be preprocessed
    cond    the input files must be preprocessed when starting with #
    is      this transformation is the preprocessor

rts
This optional keyword indicates that the rest of the line must be used to set the variable RTS, if it was not already set. Thus the variable RTS is set by the first transformation executed with such a property, or as a result of ack's program call name (acc, cc, apc or pc), or by the -.suffix flag.

need
This optional keyword indicates that the rest of the line must be concatenated to the HEAD and TAIL variables. This is done once for every transformation used or indicated by one of the program call names mentioned above or indicated by the -.suffix flag.

4. Conventions used in description files

Ack reads two description files. A few of the variables defined in the machine-specific file are used by the descriptions of the front-ends. Other variables, set by ack, are of use to all transformations.

Ack sets the variable EM to the home directory of the Amsterdam Compiler Kit. The variable SOURCE is set to the name of the argument that is currently being massaged; this is useful for debugging. The variable SUFFIX is set to the suffix of the argument that is currently being massaged.

The variable M indicates the directory in lib/{M}/tail_..... and NAME is the string to be defined by the preprocessor with -D{NAME}. The definitions of {w}, {s}, {l}, {d}, {f} and {p} indicate EM_WSIZE, EM_SSIZE, EM_LSIZE, EM_DSIZE, EM_FSIZE and EM_PSIZE respectively.
The variable INCLUDES is used as the last argument to cpp. It is used to add directories to the list of directories containing #include files.

The variables HEAD, TAIL and RTS are set by ack and used to compose the arguments for the linker.

5. Example

Description for front-end:
name cpp                        # the C-preprocessor
                                # no from, it's governed by the P property
    to .i                       # result files have suffix i
    program {EM}/lib/cpp        # pathname of loadfile
    mapflag -I* CPP_F={CPP_F?} -I*      # grab -I.. -U.. and
    mapflag -U* CPP_F={CPP_F?} -U*      # -D.. to use as arguments
    mapflag -D* CPP_F={CPP_F?} -D*      # in the variable CPP_F
    args {CPP_F?} {INCLUDES?} -D{NAME} -DEM_WSIZE={w} -DEM_PSIZE={p} \
        -DEM_SSIZE={s} -DEM_LSIZE={l} -DEM_FSIZE={f} -DEM_DSIZE={d} <
                                # The arguments are: first the -[IUD]...
                                # then the include dirs for this machine
                                # then the NAME and size values finally
                                # followed by the input file name
    stdout                      # Output on stdout
    prep is                     # Is preprocessor
end
name cem                        # the C-compiler proper
    from .c                     # used for files with suffix .c
    to .k                       # produces compact code files
    program {EM}/lib/em_cem     # pathname of loadfile
    mapflag -p CEM_F={CEM_F?} -Xp       # pass -p as -Xp to cem
    mapflag -L CEM_F={CEM_F?} -l        # pass -L as -l to cem
    args -Vw{w}i{w}p{p}f{f}s{s}l{l}d{d} {CEM_F?}
                                # the arguments are the object sizes in
                                # the -V... flag and possibly -l and -Xp
    stdin                       # input from stdin
    stdout                      # output on stdout
    prep always                 # use cpp
    rts .c                      # use the C run-time system
    need .c                     # use the C libraries
end
name decode                     # make human readable files from compact code
    from .k.m                   # accept files with suffix .k or .m
    to .e                       # produce .e files
    program {EM}/lib/em_decode  # pathname of loadfile
    args <                      # the input file name is the only argument
    stdout                      # the output comes on stdout
end
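How the mapflag lines of the cpp description collect flags into CPP_F can be sketched as follows. This is a hypothetical Python model, not ack's implementation: the helper names are invented, backslashing is ignored, only a single * per match expression is handled, and the * in the right-hand part is substituted before variable replacement (the manual does not state the order).

```python
import re

def expand(text, variables):
    """Minimal {NAME} and {NAME?text} replacement, without backslash handling."""
    def repl(match):
        name, default = match.group(1), match.group(2)
        if name in variables:
            return variables[name]
        if default is not None:
            return default
        raise KeyError("variable not defined: " + name)
    return re.sub(r"\{([^?}]+)(?:\?([^}]*))?\}", repl, text)

def match_flag(pattern, flag):
    """Return the text matched by * (or '' without a *), or None on mismatch."""
    if "*" not in pattern:
        return "" if flag == pattern else None
    prefix, suffix = pattern.split("*", 1)
    if (len(flag) >= len(prefix) + len(suffix)
            and flag.startswith(prefix) and flag.endswith(suffix)):
        return flag[len(prefix):len(flag) - len(suffix)]
    return None

def apply_mapflags(mapflags, flag, variables):
    """Try the entries in order; execute the assignment of the first match."""
    for pattern, assignment in mapflags:
        matched = match_flag(pattern, flag)
        if matched is None:
            continue
        name, _, value = assignment.partition("=")
        variables[name] = expand(value.replace("*", matched), variables)
        return True
    return False

# The three mapflag entries of the cpp description.
CPP_MAPFLAGS = [
    ("-I*", "CPP_F={CPP_F?} -I*"),
    ("-U*", "CPP_F={CPP_F?} -U*"),
    ("-D*", "CPP_F={CPP_F?} -D*"),
]
```

Feeding -I../h and then a hypothetical -DDEBUG through apply_mapflags leaves both flags in CPP_F, in that order, ready to be picked up by {CPP_F?} in the args line. The {CPP_F?} form with an empty default is what lets the first assignment succeed while CPP_F is still undefined.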
Example of a backend, in this case the EM assembler/loader.

var w=2                         # wordsize 2
var p=2                         # pointersize 2
var s=2                         # short size 2
var l=4                         # long size 4
var f=4                         # float size 4
var d=8                         # double size 8
var M=em22
var NAME=em22                   # for cpp (NAME=em22 results in #define em22 1)
var LIB=lib/{M}/tail_           # part of file name for libraries
var RT=lib/{M}/head_            # part of file name for run-time startoff
var SIZE_FLAG=-sm               # default internal table size flag
var INCLUDES=-I{EM}/include     # use {EM}/include for #include files
name asld                       # Assembler/loader
    from .k.m.a                 # accepts compact code and archives
    to e.out                    # output file name
    program {EM}/lib/em_ass     # load file pathname
    mapflag -l* LNAME={EM}/{LIB}*       # e.g. -ly becomes
                                        # {EM}/mach/int/lib/tail_y
    mapflag -+* ASS_F={ASS_F?} -+*      # recognize -+ and --
    mapflag --* ASS_F={ASS_F?} --*
    mapflag -s* SIZE_FLAG=-s*           # overwrite old value of SIZE_FLAG
    args {SIZE_FLAG} \
        ({RTS}:.c={EM}/{RT}cc) ({RTS}:.p={EM}/{RT}pc) -o > < \
        (.p:{TAIL}={EM}/{LIB}pc) \
        (.c:{TAIL}={EM}/{LIB}cc.1s {EM}/{LIB}cc.2g) \
        (.c.p:{TAIL}={EM}/{LIB}mon)
                                # -s[sml] must be first argument
                                # the next line contains the choice for head_cc or head_pc
                                # and the specification of in- and output.
                                # the last three args lines choose libraries
    linker
end

The command

    ack -mem22 -v -v -I../h -L -ly prog.c

would result in the following calls (with exec(II)):

1)  /lib/cpp -I../h -I/usr/em/include -Dem22 -DEM_WSIZE=2 -DEM_PSIZE=2 \
        -DEM_SSIZE=2 -DEM_LSIZE=4 -DEM_FSIZE=4 -DEM_DSIZE=8 prog.c
2)  /usr/em/lib/em_cem -Vw2i2p2f4s2l4d8 -l
3)  /usr/em/lib/em_ass -sm /usr/em/lib/em22/head_cc -o e.out prog.k \
        /usr/em/lib/em22/tail_y /usr/em/lib/em22/tail_cc.1s \
        /usr/em/lib/em22/tail_cc.2g /usr/em/lib/em22/tail_mon
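The last two steps applied to an args line, line splitting and IO replacement, can be sketched in Python as below. The function names are hypothetical and backslashed blanks are not handled. The sketch also assumes that the library named by the -ly flag is handed over as a second input name, which is how the assembler/loader call listed above reads.

```python
def split_line(text):
    """Line splitting: blank space separates the individual strings."""
    return text.split()

def replace_io(words, inputs, output):
    """IO replacement: > becomes the output name, < each input name in turn."""
    result = []
    for word in words:
        word = word.replace(">", output)
        if "<" in word:
            # The string is duplicated once for every input file name.
            result.extend(word.replace("<", name) for name in inputs)
        else:
            result.append(word)
    return result

# The start of the asld args line after variable and expression replacement.
args = "-sm /usr/em/lib/em22/head_cc -o > <"
call = replace_io(split_line(args),
                  ["prog.k", "/usr/em/lib/em22/tail_y"], "e.out")
```

Here call holds the strings -sm, /usr/em/lib/em22/head_cc, -o, e.out, prog.k and /usr/em/lib/em22/tail_y, which is the beginning of the third call in the example.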