+@end ifset
+@ifset INTERNALS
+@node Processor pipeline description
+@subsection Specifying processor pipeline description
+@cindex processor pipeline description
+@cindex processor functional units
+@cindex instruction latency time
+@cindex interlock delays
+@cindex data dependence delays
+@cindex reservation delays
+@cindex pipeline hazard recognizer
+@cindex automaton based pipeline description
+@cindex regular expressions
+@cindex deterministic finite state automaton
+@cindex automaton based scheduler
+@cindex RISC
+@cindex VLIW
+
+To achieve better performance, most modern processors
+(super-pipelined, superscalar @acronym{RISC}, and @acronym{VLIW}
+processors) have many @dfn{functional units} on which several
+instructions can be executed simultaneously. An instruction starts
+execution if its issue conditions are satisfied. If not, the
+instruction is stalled until its conditions are satisfied. Such
+@dfn{interlock (pipeline) delay} causes interruption of the fetching
+of successor instructions (or demands nop instructions, e.g.@: for some
+MIPS processors).
+
+There are two major kinds of interlock delays in modern processors.
+The first one is a data dependence delay determining @dfn{instruction
+latency time}. The instruction execution is not started until all
+source data have been evaluated by prior instructions (there are more
+complex cases when the instruction execution starts even when the data
+are not available but will be ready in given time after the
+instruction execution start). Taking the data dependence delays into
+account is simple. The data dependence (true, output, and
+anti-dependence) delay between two instructions is given by a
+constant. In most cases this approach is adequate. The second kind
+of interlock delays is a reservation delay. The reservation delay
+means that two instructions under execution will be in need of shared
+processors resources, i.e.@: buses, internal registers, and/or
+functional units, which are reserved for some time. Taking this kind
+of delay into account is complex especially for modern @acronym{RISC}
+processors.
+
+The task of exploiting more processor parallelism is solved by an
+instruction scheduler. For a better solution to this problem, the
+instruction scheduler has to have an adequate description of the
+processor parallelism (or @dfn{pipeline description}). GCC
+machine descriptions describe processor parallelism and functional
+unit reservations for groups of instructions with the aid of
+@dfn{regular expressions}.
+
+The GCC instruction scheduler uses a @dfn{pipeline hazard recognizer} to
+figure out the possibility of the instruction issue by the processor
+on a given simulated processor cycle. The pipeline hazard recognizer is
+automatically generated from the processor pipeline description. The
+pipeline hazard recognizer generated from the machine description
+is based on a deterministic finite state automaton (@acronym{DFA}):
+the instruction issue is possible if there is a transition from one
+automaton state to another one. This algorithm is very fast, and
+furthermore, its speed is not dependent on processor
+complexity@footnote{However, the size of the automaton depends on
+processor complexity. To limit this effect, machine descriptions
+can split orthogonal parts of the machine description among several
+automata: but then, since each of these must be stepped independently,
+this does cause a small decrease in the algorithm's performance.}.
+
+@cindex automaton based pipeline description
+The rest of this section describes the directives that constitute
+an automaton-based processor pipeline description. The order of
+these constructions within the machine description file is not
+important.
+
+@findex define_automaton
+@cindex pipeline hazard recognizer
+The following optional construction describes names of automata
+generated and used for the pipeline hazards recognition. Sometimes
+the generated finite state automaton used by the pipeline hazard
+recognizer is large. If we use more than one automaton and bind functional
+units to the automata, the total size of the automata is usually
+less than the size of the single automaton. If there is no one such
+construction, only one finite state automaton is generated.
+
+@smallexample
+(define_automaton @var{automata-names})
+@end smallexample
+
+@var{automata-names} is a string giving names of the automata. The
+names are separated by commas. All the automata should have unique names.
+The automaton name is used in the constructions @code{define_cpu_unit} and
+@code{define_query_cpu_unit}.
+
+@findex define_cpu_unit
+@cindex processor functional units
+Each processor functional unit used in the description of instruction
+reservations should be described by the following construction.
+
+@smallexample
+(define_cpu_unit @var{unit-names} [@var{automaton-name}])
+@end smallexample
+
+@var{unit-names} is a string giving the names of the functional units
+separated by commas. Don't use name @samp{nothing}, it is reserved
+for other goals.
+
+@var{automaton-name} is a string giving the name of the automaton with
+which the unit is bound. The automaton should be described in
+construction @code{define_automaton}. You should give
+@dfn{automaton-name}, if there is a defined automaton.
+
+The assignment of units to automata are constrained by the uses of the
+units in insn reservations. The most important constraint is: if a
+unit reservation is present on a particular cycle of an alternative
+for an insn reservation, then some unit from the same automaton must
+be present on the same cycle for the other alternatives of the insn
+reservation. The rest of the constraints are mentioned in the
+description of the subsequent constructions.
+
+@findex define_query_cpu_unit
+@cindex querying function unit reservations
+The following construction describes CPU functional units analogously
+to @code{define_cpu_unit}. The reservation of such units can be
+queried for an automaton state. The instruction scheduler never
+queries reservation of functional units for given automaton state. So
+as a rule, you don't need this construction. This construction could
+be used for future code generation goals (e.g.@: to generate
+@acronym{VLIW} insn templates).
+
+@smallexample
+(define_query_cpu_unit @var{unit-names} [@var{automaton-name}])
+@end smallexample
+
+@var{unit-names} is a string giving names of the functional units
+separated by commas.
+
+@var{automaton-name} is a string giving the name of the automaton with
+which the unit is bound.
+
+@findex define_insn_reservation
+@cindex instruction latency time
+@cindex regular expressions
+@cindex data bypass
+The following construction is the major one to describe pipeline
+characteristics of an instruction.
+
+@smallexample
+(define_insn_reservation @var{insn-name} @var{default_latency}
+ @var{condition} @var{regexp})
+@end smallexample
+
+@var{default_latency} is a number giving latency time of the
+instruction. There is an important difference between the old
+description and the automaton based pipeline description. The latency
+time is used for all dependencies when we use the old description. In
+the automaton based pipeline description, the given latency time is only
+used for true dependencies. The cost of anti-dependencies is always
+zero and the cost of output dependencies is the difference between
+latency times of the producing and consuming insns (if the difference
+is negative, the cost is considered to be zero). You can always
+change the default costs for any description by using the target hook
+@code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}).
+
+@var{insn-name} is a string giving the internal name of the insn. The
+internal names are used in constructions @code{define_bypass} and in
+the automaton description file generated for debugging. The internal
+name has nothing in common with the names in @code{define_insn}. It is a
+good practice to use insn classes described in the processor manual.
+
+@var{condition} defines what RTL insns are described by this
+construction. You should remember that you will be in trouble if
+@var{condition} for two or more different
+@code{define_insn_reservation} constructions is TRUE for an insn. In
+this case what reservation will be used for the insn is not defined.
+Such cases are not checked during generation of the pipeline hazards
+recognizer because in general recognizing that two conditions may have
+the same value is quite difficult (especially if the conditions
+contain @code{symbol_ref}). It is also not checked during the
+pipeline hazard recognizer work because it would slow down the
+recognizer considerably.
+
+@var{regexp} is a string describing the reservation of the cpu's functional
+units by the instruction. The reservations are described by a regular
+expression according to the following syntax:
+
+@smallexample
+ regexp = regexp "," oneof
+ | oneof
+
+ oneof = oneof "|" allof
+ | allof
+
+ allof = allof "+" repeat
+ | repeat
+
+ repeat = element "*" number
+ | element
+
+ element = cpu_function_unit_name
+ | reservation_name
+ | result_name
+ | "nothing"
+ | "(" regexp ")"
+@end smallexample
+
+@itemize @bullet
+@item
+@samp{,} is used for describing the start of the next cycle in
+the reservation.
+
+@item
+@samp{|} is used for describing a reservation described by the first
+regular expression @strong{or} a reservation described by the second
+regular expression @strong{or} etc.
+
+@item
+@samp{+} is used for describing a reservation described by the first
+regular expression @strong{and} a reservation described by the
+second regular expression @strong{and} etc.
+
+@item
+@samp{*} is used for convenience and simply means a sequence in which
+the regular expression are repeated @var{number} times with cycle
+advancing (see @samp{,}).
+
+@item
+@samp{cpu_function_unit_name} denotes reservation of the named
+functional unit.
+
+@item
+@samp{reservation_name} --- see description of construction
+@samp{define_reservation}.
+
+@item
+@samp{nothing} denotes no unit reservations.
+@end itemize
+
+@findex define_reservation
+Sometimes unit reservations for different insns contain common parts.
+In such case, you can simplify the pipeline description by describing
+the common part by the following construction
+
+@smallexample
+(define_reservation @var{reservation-name} @var{regexp})
+@end smallexample
+
+@var{reservation-name} is a string giving name of @var{regexp}.
+Functional unit names and reservation names are in the same name
+space. So the reservation names should be different from the
+functional unit names and can not be the reserved name @samp{nothing}.
+
+@findex define_bypass
+@cindex instruction latency time
+@cindex data bypass
+The following construction is used to describe exceptions in the
+latency time for given instruction pair. This is so called bypasses.
+
+@smallexample
+(define_bypass @var{number} @var{out_insn_names} @var{in_insn_names}
+ [@var{guard}])
+@end smallexample
+
+@var{number} defines when the result generated by the instructions
+given in string @var{out_insn_names} will be ready for the
+instructions given in string @var{in_insn_names}. The instructions in
+the string are separated by commas.
+
+@var{guard} is an optional string giving the name of a C function which
+defines an additional guard for the bypass. The function will get the
+two insns as parameters. If the function returns zero the bypass will
+be ignored for this case. The additional guard is necessary to
+recognize complicated bypasses, e.g.@: when the consumer is only an address
+of insn @samp{store} (not a stored value).
+
+@findex exclusion_set
+@findex presence_set
+@findex final_presence_set
+@findex absence_set
+@findex final_absence_set
+@cindex VLIW
+@cindex RISC
+The following five constructions are usually used to describe
+@acronym{VLIW} processors, or more precisely, to describe a placement
+of small instructions into @acronym{VLIW} instruction slots. They
+can be used for @acronym{RISC} processors, too.
+
+@smallexample
+(exclusion_set @var{unit-names} @var{unit-names})
+(presence_set @var{unit-names} @var{patterns})
+(final_presence_set @var{unit-names} @var{patterns})
+(absence_set @var{unit-names} @var{patterns})
+(final_absence_set @var{unit-names} @var{patterns})
+@end smallexample
+
+@var{unit-names} is a string giving names of functional units
+separated by commas.
+
+@var{patterns} is a string giving patterns of functional units
+separated by comma. Currently pattern is one unit or units
+separated by white-spaces.
+
+The first construction (@samp{exclusion_set}) means that each
+functional unit in the first string can not be reserved simultaneously
+with a unit whose name is in the second string and vice versa. For
+example, the construction is useful for describing processors
+(e.g.@: some SPARC processors) with a fully pipelined floating point
+functional unit which can execute simultaneously only single floating
+point insns or only double floating point insns.
+
+The second construction (@samp{presence_set}) means that each
+functional unit in the first string can not be reserved unless at
+least one of pattern of units whose names are in the second string is
+reserved. This is an asymmetric relation. For example, it is useful
+for description that @acronym{VLIW} @samp{slot1} is reserved after
+@samp{slot0} reservation. We could describe it by the following
+construction
+
+@smallexample
+(presence_set "slot1" "slot0")
+@end smallexample
+
+Or @samp{slot1} is reserved only after @samp{slot0} and unit @samp{b0}
+reservation. In this case we could write
+
+@smallexample
+(presence_set "slot1" "slot0 b0")
+@end smallexample
+
+The third construction (@samp{final_presence_set}) is analogous to
+@samp{presence_set}. The difference between them is when checking is
+done. When an instruction is issued in given automaton state
+reflecting all current and planned unit reservations, the automaton
+state is changed. The first state is a source state, the second one
+is a result state. Checking for @samp{presence_set} is done on the
+source state reservation, checking for @samp{final_presence_set} is
+done on the result reservation. This construction is useful to
+describe a reservation which is actually two subsequent reservations.
+For example, if we use
+
+@smallexample
+(presence_set "slot1" "slot0")
+@end smallexample
+
+the following insn will be never issued (because @samp{slot1} requires
+@samp{slot0} which is absent in the source state).
+
+@smallexample
+(define_reservation "insn_and_nop" "slot0 + slot1")
+@end smallexample
+
+but it can be issued if we use analogous @samp{final_presence_set}.
+
+The forth construction (@samp{absence_set}) means that each functional
+unit in the first string can be reserved only if each pattern of units
+whose names are in the second string is not reserved. This is an
+asymmetric relation (actually @samp{exclusion_set} is analogous to
+this one but it is symmetric). For example it might be useful in a
+@acronym{VLIW} description to say that @samp{slot0} cannot be reserved
+after either @samp{slot1} or @samp{slot2} have been reserved. This
+can be described as:
+
+@smallexample
+(absence_set "slot0" "slot1, slot2")
+@end smallexample
+
+Or @samp{slot2} can not be reserved if @samp{slot0} and unit @samp{b0}
+are reserved or @samp{slot1} and unit @samp{b1} are reserved. In
+this case we could write
+
+@smallexample
+(absence_set "slot2" "slot0 b0, slot1 b1")
+@end smallexample
+
+All functional units mentioned in a set should belong to the same
+automaton.
+
+The last construction (@samp{final_absence_set}) is analogous to
+@samp{absence_set} but checking is done on the result (state)
+reservation. See comments for @samp{final_presence_set}.
+
+@findex automata_option
+@cindex deterministic finite state automaton
+@cindex nondeterministic finite state automaton
+@cindex finite state automaton minimization
+You can control the generator of the pipeline hazard recognizer with
+the following construction.
+
+@smallexample
+(automata_option @var{options})
+@end smallexample
+
+@var{options} is a string giving options which affect the generated
+code. Currently there are the following options:
+
+@itemize @bullet
+@item
+@dfn{no-minimization} makes no minimization of the automaton. This is
+only worth to do when we are debugging the description and need to
+look more accurately at reservations of states.
+
+@item
+@dfn{time} means printing time statistics about the generation of
+automata.
+
+@item
+@dfn{stats} means printing statistics about the generated automata
+such as the number of DFA states, NDFA states and arcs.
+
+@item
+@dfn{v} means a generation of the file describing the result automata.
+The file has suffix @samp{.dfa} and can be used for the description
+verification and debugging.
+
+@item
+@dfn{w} means a generation of warning instead of error for
+non-critical errors.
+
+@item
+@dfn{ndfa} makes nondeterministic finite state automata. This affects
+the treatment of operator @samp{|} in the regular expressions. The
+usual treatment of the operator is to try the first alternative and,
+if the reservation is not possible, the second alternative. The
+nondeterministic treatment means trying all alternatives, some of them
+may be rejected by reservations in the subsequent insns.
+
+@item
+@dfn{progress} means output of a progress bar showing how many states
+were generated so far for automaton being processed. This is useful
+during debugging a @acronym{DFA} description. If you see too many
+generated states, you could interrupt the generator of the pipeline
+hazard recognizer and try to figure out a reason for generation of the
+huge automaton.
+@end itemize
+
+As an example, consider a superscalar @acronym{RISC} machine which can
+issue three insns (two integer insns and one floating point insn) on
+the cycle but can finish only two insns. To describe this, we define
+the following functional units.
+
+@smallexample
+(define_cpu_unit "i0_pipeline, i1_pipeline, f_pipeline")
+(define_cpu_unit "port0, port1")
+@end smallexample
+
+All simple integer insns can be executed in any integer pipeline and
+their result is ready in two cycles. The simple integer insns are
+issued into the first pipeline unless it is reserved, otherwise they
+are issued into the second pipeline. Integer division and
+multiplication insns can be executed only in the second integer
+pipeline and their results are ready correspondingly in 8 and 4
+cycles. The integer division is not pipelined, i.e.@: the subsequent
+integer division insn can not be issued until the current division
+insn finished. Floating point insns are fully pipelined and their
+results are ready in 3 cycles. Where the result of a floating point
+insn is used by an integer insn, an additional delay of one cycle is
+incurred. To describe all of this we could specify