mirror of
https://github.com/freebsd/freebsd-src.git
synced 2024-12-03 23:28:57 +00:00
dcd8284393
(to be imported soon).
1130 lines
48 KiB
Plaintext
1130 lines
48 KiB
Plaintext
\input texinfo @c -*-texinfo-*-
|
|
@settitle User's Guide to @code{gperf}
|
|
@setfilename gperf.info
|
|
|
|
@ifinfo
|
|
This file documents the features of the GNU Perfect Hash Function Generator
|
|
|
|
Copyright (C) 1989 Free Software Foundation, Inc.
|
|
|
|
Permission is granted to make and distribute verbatim copies of
|
|
this manual provided the copyright notice and this permission notice
|
|
are preserved on all copies.
|
|
|
|
@ignore
|
|
Permission is granted to process this file through @TeX{} and print the
|
|
results, provided the printed document carries copying permission
|
|
notice identical to this one except for the removal of this paragraph
|
|
(this paragraph not being relevant to the printed manual).
|
|
|
|
@end ignore
|
|
|
|
Permission is granted to copy and distribute modified versions of this
|
|
manual under the conditions for verbatim copying, provided also that the
|
|
section entitled ``GNU General Public License'' is included exactly as
|
|
in the original, and provided that the entire resulting derived work is
|
|
distributed under the terms of a permission notice identical to this one.
|
|
|
|
Permission is granted to copy and distribute translations of this manual
|
|
into another language, under the above conditions for modified versions,
|
|
except that the section entitled ``GNU @code{gperf} General Public License'' and
|
|
this permission notice may be included in translations approved by the
|
|
Free Software Foundation instead of in the original English.
|
|
@end ifinfo
|
|
|
|
@setchapternewpage odd
|
|
|
|
@titlepage
|
|
@center @titlefont{User's Guide}
|
|
@sp 2
|
|
@center @titlefont{for the}
|
|
@sp 2
|
|
@center @titlefont{GNU @code{gperf} Utility}
|
|
@sp 4
|
|
@center Douglas C. Schmidt
|
|
@sp 3
|
|
@center last updated 1 November 1989
|
|
@sp 1
|
|
@center for version 2.0
|
|
@page
|
|
@vskip 0pt plus 1filll
|
|
Copyright @copyright{} 1989 Free Software Foundation, Inc.
|
|
|
|
|
|
Permission is granted to make and distribute verbatim copies of
|
|
this manual provided the copyright notice and this permission notice
|
|
are preserved on all copies.
|
|
|
|
Permission is granted to copy and distribute modified versions of this
|
|
manual under the conditions for verbatim copying, provided also that the
|
|
section entitled ``GNU @code{gperf} General Public License'' is included exactly as
|
|
in the original, and provided that the entire resulting derived work is
|
|
distributed under the terms of a permission notice identical to this one.
|
|
|
|
Permission is granted to copy and distribute translations of this manual
|
|
into another language, under the above conditions for modified versions,
|
|
except that the section entitled ``GNU @code{gperf} General Public License'' may be
|
|
included in a translation approved by the author instead of in the original
|
|
English.
|
|
@end titlepage
|
|
|
|
@ifinfo
|
|
@node Top, Copying, , (DIR)
|
|
@ichapter Introduction
|
|
|
|
This manual documents the GNU @code{gperf} perfect hash function generator
|
|
utility, focusing on its features and how to use them, and how to report
|
|
bugs.
|
|
|
|
@end ifinfo
|
|
@menu
|
|
* Copying:: GNU @code{gperf} General Public License says
|
|
how you can copy and share @code{gperf}.
|
|
* Contributors:: People who have contributed to @code{gperf}.
|
|
* Motivation:: Introduction to @code{gperf}.
|
|
* Search Structures:: Static search structures and GNU GPERF.
|
|
* Description:: High-level discussion of how GPERF functions.
|
|
* Options:: A description of options to the program.
|
|
* Bugs:: Known bugs and limitations with GPERF.
|
|
* Projects:: Things still left to do.
|
|
* Implementation:: Implementation Details for GNU GPERF.
|
|
* Bibliography:: Material Referenced in this Report.
|
|
@end menu
|
|
|
|
@node Copying, Contributors, Top, Top
|
|
@unnumbered GNU GENERAL PUBLIC LICENSE
|
|
@center Version 1, February 1989
|
|
|
|
@display
|
|
Copyright @copyright{} 1989 Free Software Foundation, Inc.
|
|
675 Mass Ave, Cambridge, MA 02139, USA
|
|
|
|
Everyone is permitted to copy and distribute verbatim copies
|
|
of this license document, but changing it is not allowed.
|
|
@end display
|
|
|
|
@unnumberedsec Preamble
|
|
|
|
The license agreements of most software companies try to keep users
|
|
at the mercy of those companies. By contrast, our General Public
|
|
License is intended to guarantee your freedom to share and change free
|
|
software---to make sure the software is free for all its users. The
|
|
General Public License applies to the Free Software Foundation's
|
|
software and to any other program whose authors commit to using it.
|
|
You can use it for your programs, too.
|
|
|
|
When we speak of free software, we are referring to freedom, not
|
|
price. Specifically, the General Public License is designed to make
|
|
sure that you have the freedom to give away or sell copies of free
|
|
software, that you receive source code or can get it if you want it,
|
|
that you can change the software or use pieces of it in new free
|
|
programs; and that you know you can do these things.
|
|
|
|
To protect your rights, we need to make restrictions that forbid
|
|
anyone to deny you these rights or to ask you to surrender the rights.
|
|
These restrictions translate to certain responsibilities for you if you
|
|
distribute copies of the software, or if you modify it.
|
|
|
|
For example, if you distribute copies of a such a program, whether
|
|
gratis or for a fee, you must give the recipients all the rights that
|
|
you have. You must make sure that they, too, receive or can get the
|
|
source code. And you must tell them their rights.
|
|
|
|
We protect your rights with two steps: (1) copyright the software, and
|
|
(2) offer you this license which gives you legal permission to copy,
|
|
distribute and/or modify the software.
|
|
|
|
Also, for each author's protection and ours, we want to make certain
|
|
that everyone understands that there is no warranty for this free
|
|
software. If the software is modified by someone else and passed on, we
|
|
want its recipients to know that what they have is not the original, so
|
|
that any problems introduced by others will not reflect on the original
|
|
authors' reputations.
|
|
|
|
The precise terms and conditions for copying, distribution and
|
|
modification follow.
|
|
|
|
@iftex
|
|
@unnumberedsec TERMS AND CONDITIONS
|
|
@end iftex
|
|
@ifinfo
|
|
@center TERMS AND CONDITIONS
|
|
@end ifinfo
|
|
|
|
@enumerate
|
|
@item
|
|
This License Agreement applies to any program or other work which
|
|
contains a notice placed by the copyright holder saying it may be
|
|
distributed under the terms of this General Public License. The
|
|
``Program'', below, refers to any such program or work, and a ``work based
|
|
on the Program'' means either the Program or any work containing the
|
|
Program or a portion of it, either verbatim or with modifications. Each
|
|
licensee is addressed as ``you''.
|
|
|
|
@item
|
|
You may copy and distribute verbatim copies of the Program's source
|
|
code as you receive it, in any medium, provided that you conspicuously and
|
|
appropriately publish on each copy an appropriate copyright notice and
|
|
disclaimer of warranty; keep intact all the notices that refer to this
|
|
General Public License and to the absence of any warranty; and give any
|
|
other recipients of the Program a copy of this General Public License
|
|
along with the Program. You may charge a fee for the physical act of
|
|
transferring a copy.
|
|
|
|
@item
|
|
You may modify your copy or copies of the Program or any portion of
|
|
it, and copy and distribute such modifications under the terms of Paragraph
|
|
1 above, provided that you also do the following:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
cause the modified files to carry prominent notices stating that
|
|
you changed the files and the date of any change; and
|
|
|
|
@item
|
|
cause the whole of any work that you distribute or publish, that
|
|
in whole or in part contains the Program or any part thereof, either
|
|
with or without modifications, to be licensed at no charge to all
|
|
third parties under the terms of this General Public License (except
|
|
that you may choose to grant warranty protection to some or all
|
|
third parties, at your option).
|
|
|
|
@item
|
|
If the modified program normally reads commands interactively when
|
|
run, you must cause it, when started running for such interactive use
|
|
in the simplest and most usual way, to print or display an
|
|
announcement including an appropriate copyright notice and a notice
|
|
that there is no warranty (or else, saying that you provide a
|
|
warranty) and that users may redistribute the program under these
|
|
conditions, and telling the user how to view a copy of this General
|
|
Public License.
|
|
|
|
@item
|
|
You may charge a fee for the physical act of transferring a
|
|
copy, and you may at your option offer warranty protection in
|
|
exchange for a fee.
|
|
@end itemize
|
|
|
|
Mere aggregation of another independent work with the Program (or its
|
|
derivative) on a volume of a storage or distribution medium does not bring
|
|
the other work under the scope of these terms.
|
|
|
|
@item
|
|
You may copy and distribute the Program (or a portion or derivative of
|
|
it, under Paragraph 2) in object code or executable form under the terms of
|
|
Paragraphs 1 and 2 above provided that you also do one of the following:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
accompany it with the complete corresponding machine-readable
|
|
source code, which must be distributed under the terms of
|
|
Paragraphs 1 and 2 above; or,
|
|
|
|
@item
|
|
accompany it with a written offer, valid for at least three
|
|
years, to give any third party free (except for a nominal charge
|
|
for the cost of distribution) a complete machine-readable copy of the
|
|
corresponding source code, to be distributed under the terms of
|
|
Paragraphs 1 and 2 above; or,
|
|
|
|
@item
|
|
accompany it with the information you received as to where the
|
|
corresponding source code may be obtained. (This alternative is
|
|
allowed only for noncommercial distribution and only if you
|
|
received the program in object code or executable form alone.)
|
|
@end itemize
|
|
|
|
Source code for a work means the preferred form of the work for making
|
|
modifications to it. For an executable file, complete source code means
|
|
all the source code for all modules it contains; but, as a special
|
|
exception, it need not include source code for modules which are standard
|
|
libraries that accompany the operating system on which the executable
|
|
file runs, or for standard header files or definitions files that
|
|
accompany that operating system.
|
|
|
|
@item
|
|
You may not copy, modify, sublicense, distribute or transfer the
|
|
Program except as expressly provided under this General Public License.
|
|
Any attempt otherwise to copy, modify, sublicense, distribute or transfer
|
|
the Program is void, and will automatically terminate your rights to use
|
|
the Program under this License. However, parties who have received
|
|
copies, or rights to use copies, from you under this General Public
|
|
License will not have their licenses terminated so long as such parties
|
|
remain in full compliance.
|
|
|
|
@item
|
|
By copying, distributing or modifying the Program (or any work based
|
|
on the Program) you indicate your acceptance of this license to do so,
|
|
and all its terms and conditions.
|
|
|
|
@item
|
|
Each time you redistribute the Program (or any work based on the
|
|
Program), the recipient automatically receives a license from the original
|
|
licensor to copy, distribute or modify the Program subject to these
|
|
terms and conditions. You may not impose any further restrictions on the
|
|
recipients' exercise of the rights granted herein.
|
|
|
|
@item
|
|
The Free Software Foundation may publish revised and/or new versions
|
|
of the General Public License from time to time. Such new versions will
|
|
be similar in spirit to the present version, but may differ in detail to
|
|
address new problems or concerns.
|
|
|
|
Each version is given a distinguishing version number. If the Program
|
|
specifies a version number of the license which applies to it and ``any
|
|
later version'', you have the option of following the terms and conditions
|
|
either of that version or of any later version published by the Free
|
|
Software Foundation. If the Program does not specify a version number of
|
|
the license, you may choose any version ever published by the Free Software
|
|
Foundation.
|
|
|
|
@item
|
|
If you wish to incorporate parts of the Program into other free
|
|
programs whose distribution conditions are different, write to the author
|
|
to ask for permission. For software which is copyrighted by the Free
|
|
Software Foundation, write to the Free Software Foundation; we sometimes
|
|
make exceptions for this. Our decision will be guided by the two goals
|
|
of preserving the free status of all derivatives of our free software and
|
|
of promoting the sharing and reuse of software generally.
|
|
|
|
@iftex
|
|
@heading NO WARRANTY
|
|
@end iftex
|
|
@ifinfo
|
|
@center NO WARRANTY
|
|
@end ifinfo
|
|
|
|
@item
|
|
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
|
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
|
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
|
PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
|
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
|
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
|
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
|
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
|
REPAIR OR CORRECTION.
|
|
|
|
@item
|
|
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL
|
|
ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
|
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
|
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES
|
|
ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT
|
|
LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES
|
|
SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE
|
|
WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
|
|
ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
|
|
@end enumerate
|
|
|
|
@iftex
|
|
@heading END OF TERMS AND CONDITIONS
|
|
@end iftex
|
|
@ifinfo
|
|
@center END OF TERMS AND CONDITIONS
|
|
@end ifinfo
|
|
|
|
@page
|
|
@unnumberedsec Appendix: How to Apply These Terms to Your New Programs
|
|
|
|
If you develop a new program, and you want it to be of the greatest
|
|
possible use to humanity, the best way to achieve this is to make it
|
|
free software which everyone can redistribute and change under these
|
|
terms.
|
|
|
|
To do so, attach the following notices to the program. It is safest to
|
|
attach them to the start of each source file to most effectively convey
|
|
the exclusion of warranty; and each file should have at least the
|
|
``copyright'' line and a pointer to where the full notice is found.
|
|
|
|
@smallexample
|
|
@var{one line to give the program's name and a brief idea of what it does.}
|
|
Copyright (C) 19@var{yy} @var{name of author}
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
it under the terms of the GNU General Public License as published by
|
|
the Free Software Foundation; either version 1, or (at your option)
|
|
any later version.
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
GNU General Public License for more details.
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
along with this program; if not, write to the Free Software
|
|
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
|
@end smallexample
|
|
|
|
Also add information on how to contact you by electronic and paper mail.
|
|
|
|
If the program is interactive, make it output a short notice like this
|
|
when it starts in an interactive mode:
|
|
|
|
@smallexample
|
|
Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author}
|
|
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
|
This is free software, and you are welcome to redistribute it
|
|
under certain conditions; type `show c' for details.
|
|
@end smallexample
|
|
|
|
The hypothetical commands `show w' and `show c' should show the
|
|
appropriate parts of the General Public License. Of course, the
|
|
commands you use may be called something other than `show w' and `show
|
|
c'; they could even be mouse-clicks or menu items---whatever suits your
|
|
program.
|
|
|
|
You should also get your employer (if you work as a programmer) or your
|
|
school, if any, to sign a ``copyright disclaimer'' for the program, if
|
|
necessary. Here a sample; alter the names:
|
|
|
|
@example
|
|
Yoyodyne, Inc., hereby disclaims all copyright interest in the
|
|
program `Gnomovision' (a program to direct compilers to make passes
|
|
at assemblers) written by James Hacker.
|
|
|
|
@var{signature of Ty Coon}, 1 April 1989
|
|
Ty Coon, President of Vice
|
|
@end example
|
|
|
|
That's all there is to it!
|
|
|
|
@node Contributors, Motivation, Copying, Top
|
|
@unnumbered Contributors to GNU @code{gperf} Utility
|
|
|
|
@itemize @bullet
|
|
@item
|
|
The GNU @code{gperf} perfect hash function generator utility was
|
|
originally written in GNU C++ by Douglas C. Schmidt. It is now also
|
|
available in a highly-portable ``old-style'' C version. The general
|
|
idea for the perfect hash function generator was inspired by Keith
|
|
Bostic's algorithm written in C, and distributed to @code{net.sources} around
|
|
1984. The current program is a heavily modified, enhanced, and extended
|
|
implementation of Keith's basic idea, created at the University of
|
|
California, Irvine. Bugs, patches, and suggestions should be reported
|
|
to @code{schmidt@@ics.uci.edu}.
|
|
|
|
@item
|
|
Special thanks is extended to Michael Tiemann and Doug Lea, for
|
|
providing a useful compiler, and for giving me a forum to exhibit my
|
|
creation.
|
|
|
|
In addition, Adam de Boor and Nels Olson provided many tips and insights
|
|
that greatly helped improve the quality and functionality of @code{gperf}.
|
|
@end itemize
|
|
|
|
@node Motivation, Search Structures, Contributors, Top
|
|
@chapter Introduction
|
|
|
|
@code{gperf} is a perfect hash function generator written in C++. It
|
|
transforms an @emph{n} element user-specified keyword set @emph{W} into
|
|
a perfect hash function @emph{F}. @emph{F} uniquely maps keywords in
|
|
@emph{W} onto the range 0..@emph{k}, where @emph{k} >= @emph{n}. If
|
|
@emph{k = n} then @emph{F} is a @emph{minimal} perfect hash function.
|
|
@code{gperf} generates a 0..@emph{k} element static lookup table and a
|
|
pair of C functions. These functions determine whether a given
|
|
character string @emph{s} occurs in @emph{W}, using at most one probe
|
|
into the lookup table.
|
|
|
|
@code{gperf} currently generates the reserved keyword recognizer for
|
|
lexical analyzers in several production and research compilers and
|
|
language processing tools, including GNU C, GNU C++, GNU Pascal, GNU
|
|
Modula 3, and GNU indent. Complete C++ source code for @code{gperf} is
|
|
available via anonymous ftp from @code{ics.uci.edu}. @code{gperf} also is
|
|
distributed along with the GNU libg++ library. Finally, a highly
|
|
portable, functionally equivalent K&R C version of @code{gperf} is
|
|
archived in @code{comp.sources.unix}, volume 20.
|
|
|
|
@node Search Structures, Description, Motivation, Top
|
|
@chapter Static search structures and GNU @code{gperf}
|
|
|
|
A @dfn{static search structure} is an Abstract Data Type with certain
|
|
fundamental operations, @emph{e.g.}, @emph{initialize}, @emph{insert},
|
|
and @emph{retrieve}. Conceptually, all insertions occur before any
|
|
retrievals.@footnote{In practice, @code{gperf} generates a @code{static}
|
|
array containing search set keywords and any associated attributes
|
|
specified by the user. Thus, there is essentially no execution-time
|
|
cost for the insertions.} It is a useful data structure for
|
|
representing @emph{static search sets}. Static search sets occur
|
|
frequently in software system applications. Typical static search
|
|
sets include compiler reserved words, assembler instruction opcodes,
|
|
and built-in shell interpreter commands. Search set members, called
|
|
@dfn{keywords}, are inserted into the structure only once, usually
|
|
during program initialization, and are not generally modified at
|
|
run-time.
|
|
|
|
Numerous static search structure implementations exist, @emph{e.g.},
|
|
arrays, linked lists, binary search trees, digital search tries, and
|
|
hash tables. Different approaches offer trade-offs between space
|
|
utilization and search time efficiency. For example, an $n$ element
|
|
sorted array is space efficient, though the average-case time
|
|
complexity for retrieval operations using binary search is
|
|
proportional to $\log n$. Conversely, hash table implementations
|
|
often locate a table entry in constant time, but typically impose
|
|
additional memory overhead and exhibit poor worst case performance
|
|
@cite{aho, etc.}.
|
|
|
|
@emph{Minimal perfect hash functions} provide an optimal solution for a
|
|
particular class of static search sets. A minimal perfect hash
|
|
function is defined by two properties:
|
|
|
|
@itemize @bullet
|
|
@item It allows keyword recognition in a static search set using
|
|
at most @emph{one} probe into the hash table. This represents the
|
|
``perfect'' property.
|
|
@item The actual memory allocated to store the keywords is precisely
|
|
large enough for the keyword set, and @emph{no larger}. This is the
|
|
``minimal'' property.
|
|
@end itemize
|
|
|
|
For most applications it is far easier to generate @emph{perfect} hash
|
|
functions than @emph{minimal perfect} hash functions @cite{many bozos}.
|
|
Moreover, non-minimal perfect hash functions frequently execute faster
|
|
than minimal ones in practice @cite{cichelli}. This phenomena occurs
|
|
since searching a sparse keyword table increases the probability of
|
|
locating a ``null'' entry, thereby reducing string comparisons.
|
|
@code{gperf}'s default behavior generates @emph{near-minimal} perfect hash
|
|
functions for keyword sets. However, @code{gperf} provides many
|
|
options that permit user control over the degree of minimality and
|
|
perfection.
|
|
|
|
Static search sets often exhibit relative stability over time. For
|
|
example, Ada's 63 reserved words have remained constant for nearly a
|
|
decade. It is therefore frequently worthwhile to expend concerted
|
|
effort building an optimal search structure @emph{once}, if it
|
|
subsequently receives heavy use multiple times. @code{gperf} removes
|
|
the drudgery associated with constructing time- and space-efficient
|
|
search structures by hand. It has proven a useful and practical tool
|
|
for serious programming projects. Output from @code{gperf} is
|
|
currently used in several production and research compilers, including
|
|
GNU C, GNU C++, GNU Pascal, and GNU Modula 3. @footnote{The latter two
|
|
compilers are not yet part of the official GNU distribution.} Each
|
|
compiler utilizes @code{gperf} to automatically generate static search
|
|
structures that efficiently identify their respective reserved
|
|
keywords.
|
|
|
|
@node Description, Options, Search Structures, Top
|
|
@chapter High-Level Description of GNU @code{gperf}
|
|
|
|
@menu
|
|
* Input Format:: Input Format to @code{gperf}
|
|
* Output Format:: Output Format for Generated C Code with @code{gperf}
|
|
@end menu
|
|
|
|
The perfect hash function generator @code{gperf} reads a set of
|
|
``keywords'' from a @dfn{keyfile} (or from the standard input by
|
|
default). It attempts to derive a perfect hashing function that
|
|
recognizes a member of the @dfn{static keyword set} with at most a
|
|
single probe into the lookup table. If @code{gperf} succeeds in
|
|
generating such a function it produces a pair of C source code routines
|
|
that perform hashing and table lookup recognition. All generated C code
|
|
is directed to the standard output. Command-line options described
|
|
below allow you to modify the input and output format to @code{gperf}.
|
|
|
|
By default, @code{gperf} attempts to produce time-efficient code, with
|
|
less emphasis on efficient space utilization. However, several options
|
|
exist that permit trading-off execution time for storage space and vice
|
|
versa. In particular, expanding the generated table size produces a
|
|
sparse search structure, generally yielding faster searches.
|
|
Conversely, you can direct @code{gperf} to utilize a C @code{switch}
|
|
statement scheme that minimizes data space storage size. Furthermore,
|
|
using a C @code{switch} may actually speed up the keyword retrieval time
|
|
somewhat. Actual results depend on your C compiler, of course.
|
|
|
|
In general, @code{gperf} assigns values to the characters it is using
|
|
for hashing until some set of values gives each keyword a unique value.
|
|
A helpful heuristic is that the larger the hash value range, the easier
|
|
it is for @code{gperf} to find and generate a perfect hash function.
|
|
Experimentation is the key to getting the most from @code{gperf}.
|
|
|
|
@node Input Format, Declarations, Description, Description
|
|
@section Input Format to @code{gperf}
|
|
|
|
You can control the input keyfile format by varying certain command-line
|
|
arguments, in particular the @samp{-t} option. The input's appearance
|
|
is similar to GNU utilities @code{flex} and @code{bison} (or UNIX
|
|
utilities @code{lex} and @code{yacc}). Here's an outline of the general
|
|
format:
|
|
|
|
@group
|
|
@example
|
|
declarations
|
|
%%
|
|
keywords
|
|
%%
|
|
functions
|
|
@end example
|
|
@end group
|
|
|
|
@emph{Unlike} @code{flex} or @code{bison}, all sections of @code{gperf}'s input
|
|
are optional. The following sections describe the input format for each
|
|
section.
|
|
|
|
@menu
|
|
* Declarations:: @code{struct} Declarations and C Code Inclusion.
|
|
* Keywords:: Format for Keyword Entries.
|
|
* Functions:: Including Additional C Functions.
|
|
@end menu
|
|
|
|
@node Declarations, Keywords, Input Format, Input Format
|
|
@subsection @code{struct} Declarations and C Code Inclusion
|
|
|
|
The keyword input file optionally contains a section for including
|
|
arbitrary C declarations and definitions, as well as provisions for
|
|
providing a user-supplied @code{struct}. If the @samp{-t} option
|
|
@emph{is} enabled, you @emph{must} provide a C @code{struct} as the last
|
|
component in the declaration section from the keyfile file. The first
|
|
field in this struct must be a @code{char *} identifier called ``name,''
|
|
although it is possible to modify this field's name with the @samp{-K}
|
|
option described below.
|
|
|
|
Here is simple example, using months of the year and their attributes as
|
|
input:
|
|
|
|
@group
|
|
@example
|
|
struct months @{ char *name; int number; int days; int leap_days; @};
|
|
%%
|
|
january, 1, 31, 31
|
|
february, 2, 28, 29
|
|
march, 3, 31, 31
|
|
april, 4, 30, 30
|
|
may, 5, 31, 31
|
|
june, 6, 30, 30
|
|
july, 7, 31, 31
|
|
august, 8, 31, 31
|
|
september, 9, 30, 30
|
|
october, 10, 31, 31
|
|
november, 11, 30, 30
|
|
december, 12, 31, 31
|
|
@end example
|
|
@end group
|
|
|
|
Separating the @code{struct} declaration from the list of key words and
|
|
other fields are a pair of consecutive percent signs, @code{%%},
|
|
appearing left justified in the first column, as in the UNIX utility
|
|
@code{lex}.
|
|
|
|
Using a syntax similar to GNU utilities @code{flex} and @code{bison}, it
|
|
is possible to directly include C source text and comments verbatim into
|
|
the generated output file. This is accomplished by enclosing the region
|
|
inside left-justified surrounding @code{%@{}, @code{%@}} pairs. Here is
|
|
an input fragment based on the previous example that illustrates this
|
|
feature:
|
|
|
|
@group
|
|
@example
|
|
%@{
|
|
#include <assert.h>
|
|
/* This section of code is inserted directly into the output. */
|
|
int return_month_days (struct months *months, int is_leap_year);
|
|
%@}
|
|
struct months @{ char *name; int number; int days; int leap_days; @};
|
|
%%
|
|
january, 1, 31, 31
|
|
february, 2, 28, 29
|
|
march, 3, 31, 31
|
|
...
|
|
@end example
|
|
@end group
|
|
|
|
It is possible to omit the declaration section entirely. In this case
|
|
the keyfile begins directly with the first keyword line, @emph{e.g.}:
|
|
|
|
@group
|
|
@example
|
|
january, 1, 31, 31
|
|
february, 2, 28, 29
|
|
march, 3, 31, 31
|
|
april, 4, 30, 30
|
|
...
|
|
@end example
|
|
@end group
|
|
|
|
@node Keywords, Functions, Declarations, Input Format
|
|
@subsection Format for Keyword Entries
|
|
|
|
The second keyfile format section contains lines of keywords and any
|
|
associated attributes you might supply. A line beginning with @samp{#}
|
|
in the first column is considered a comment. Everything following the
|
|
@samp{#} is ignored, up to and including the following newline.
|
|
|
|
The first field of each non-comment line is always the key itself. It
|
|
should be given as a simple name, @emph{i.e.}, without surrounding
|
|
string quotation marks, and be left-justified flush against the first
|
|
column. In this context, a ``field'' is considered to extend up to, but
|
|
not include, the first blank, comma, or newline. Here is a simple
|
|
example taken from a partial list of C reserved words:
|
|
|
|
@group
|
|
@example
|
|
# These are a few C reserved words, see the c.@code{gperf} file
|
|
# for a complete list of ANSI C reserved words.
|
|
unsigned
|
|
sizeof
|
|
switch
|
|
signed
|
|
if
|
|
default
|
|
for
|
|
while
|
|
return
|
|
@end example
|
|
@end group
|
|
|
|
Note that unlike @code{flex} or @code{bison} the first @code{%%} marker
|
|
may be elided if the declaration section is empty.
|
|
|
|
Additional fields may optionally follow the leading keyword. Fields
|
|
should be separated by commas, and terminate at the end of line. What
|
|
these fields mean is entirely up to you; they are used to initialize the
|
|
elements of the user-defined @code{struct} provided by you in the
|
|
declaration section. If the @samp{-t} option is @emph{not} enabled
|
|
these fields are simply ignored. All previous examples except the last
|
|
one contain keyword attributes.
|
|
|
|
@node Functions, Output Format, Keywords, Input Format
|
|
@subsection Including Additional C Functions
|
|
|
|
The optional third section also corresponds closely with conventions
|
|
found in @code{flex} and @code{bison}. All text in this section,
|
|
starting at the final @code{%%} and extending to the end of the input
|
|
file, is included verbatim into the generated output file. Naturally,
|
|
it is your responsibility to ensure that the code contained in this
|
|
section is valid C.
|
|
|
|
@node Output Format, , Functions, Description
|
|
@section Output Format for Generated C Code with @code{gperf}
|
|
|
|
Several options control how the generated C code appears on the standard
|
|
output. Two C function are generated. They are called @code{hash} and
|
|
@code{in_word_set}, although you may modify the name for
|
|
@code{in_word_set} with a command-line option. Both functions require
|
|
two arguments, a string, @code{char *} @var{str}, and a length
|
|
parameter, @code{int} @var{len}. Their default function prototypes are
|
|
as follows:
|
|
|
|
@group
|
|
@example
|
|
static int hash (char *str, int len);
|
|
int in_word_set (char *str, int len);
|
|
@end example
|
|
@end group
|
|
|
|
By default, the generated @code{hash} function returns an integer value
|
|
created by adding @var{len} to several user-specified @var{str} key
|
|
positions indexed into an @dfn{associated values} table stored in a
|
|
local static array. The associated values table is constructed
|
|
internally by @code{gperf} and later output as a static local C array called
|
|
@var{hash_table}; its meaning and properties are described below.
|
|
@xref{Implementation}. The relevant key positions are specified via the
|
|
@samp{-k} option when running @code{gperf}, as detailed in the @emph{Options}
|
|
section below. @xref{Options}.
|
|
|
|
Two options, @samp{-g} (assume you are compiling with GNU C and its
|
|
@code{inline} feature) and @samp{-a} (assume ANSI C-style function
|
|
prototypes), alter the content of both the generated @code{hash} and
|
|
@code{in_word_set} routines. However, function @code{in_word_set} may
|
|
be modified more extensively, in response to your option settings. The
|
|
options that affect the @code{in_word_set} structure are:
|
|
|
|
@itemize @bullet
|
|
@table @samp
|
|
@item -p
|
|
Have function @code{in_word_set} return a pointer rather than a boolean.
|
|
|
|
@item -t
|
|
Make use of the user-defined @code{struct}.
|
|
|
|
@item -S @var{total switch statements}
|
|
Generate 1 or more C @code{switch} statement rather than use a large,
|
|
(and potentially sparse) static array. Although the exact time and
|
|
space savings of this approach vary according to your C compiler's
|
|
degree of optimization, this method often results in smaller and faster
|
|
code.
|
|
@end table
|
|
@end itemize
|
|
|
|
If the @samp{-t}, @samp{-S}, and @samp{-p} options are omitted the
|
|
default action is to generate a @code{char *} array containing the keys,
|
|
together with additional null strings used for padding the array. By
|
|
experimenting with the various input and output options, and timing the
|
|
resulting C code, you can determine the best option choices for
|
|
different keyword set characteristics.
|
|
|
|
@node Options, Bugs, Description, Top
|
|
@chapter Options to the @code{gperf} Utility
|
|
|
|
There are @emph{many} options to @code{gperf}. They were added to make
|
|
the program more convenient for use with real applications. ``On-line''
|
|
help is readily available via the @samp{-h} option. Other options
|
|
include:
|
|
|
|
@itemize @bullet
|
|
@table @samp
|
|
@item -a
|
|
Generate ANSI Standard C code using function prototypes. The default is
|
|
to use ``classic'' K&R C function declaration syntax.
|
|
|
|
@item -c
|
|
Generates C code that uses the @code{strncmp} function to perform
|
|
string comparisons. The default action is to use @code{strcmp}.
|
|
|
|
@item -C
|
|
Makes the contents of all generated lookup tables constant, @emph{i.e.},
|
|
``readonly.'' Many compilers can generate more efficient code for this
|
|
by putting the tables in readonly memory.
|
|
|
|
@item -d
|
|
Enables the debugging option. This produces verbose diagnostics to
|
|
``standard error'' when @code{gperf} is executing. It is useful both for
|
|
maintaining the program and for determining whether a given set of
|
|
options is actually speeding up the search for a solution. Some useful
|
|
information is dumped at the end of the program when the @samp{-d}
|
|
option is enabled.
|
|
|
|
@item -D
|
|
Handle keywords whose key position sets hash to duplicate values.
|
|
Duplicate hash values occur for two reasons:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
Since @code{gperf} does not backtrack it is possible for it to process
|
|
all your input keywords without finding a unique mapping for each word.
|
|
However, frequently only a very small number of duplicates occur, and
|
|
the majority of keys still require one probe into the table.
|
|
@item
|
|
Sometimes a set of keys may have the same names, but possess different
|
|
attributes. With the -D option @code{gperf} treats all these keys as part of
|
|
an equivalence class and generates a perfect hash function with multiple
|
|
comparisons for duplicate keys. It is up to you to completely
|
|
disambiguate the keywords by modifying the generated C code. However,
|
|
@code{gperf} helps you out by organizing the output.
|
|
@end itemize
|
|
|
|
Option @samp{-D} is extremely useful for certain large or highly
|
|
redundant keyword sets, @emph{i.e.}, assembler instruction opcodes.
|
|
Using this option usually means that the generated hash function is no
|
|
longer perfect. On the other hand, it permits @code{gperf} to work on keyword
|
|
sets that it otherwise could not handle.
|
|
|
|
@item -e @var{keyword delimiter list}
|
|
Allows the user to provide a string containing delimiters used to
|
|
separate keywords from their attributes. The default is ",\n". This
|
|
option is essential if you want to use keywords that have embedded
|
|
commas or newlines. One useful trick is to use -e'TAB', where TAB is
|
|
the literal tab character.
|
|
|
|
@item -f @var{iteration amount}
|
|
Generate the perfect hash function ``fast.'' This decreases @code{gperf}'s
|
|
running time at the cost of minimizing generated table-size. The
|
|
iteration amount represents the number of times to iterate when
|
|
resolving a collision. `0' means `iterate by the number of keywords.
|
|
This option is probably most useful when used in conjunction with options
|
|
@samp{-D} and/or @samp{-S} for @emph{large} keyword sets.
|
|
|
|
@item -g
|
|
Assume a GNU compiler, @emph{e.g.}, @code{g++} or @code{gcc}. This
|
|
makes all generated routines use the ``inline'' keyword to remove the
|
|
cost of function calls. Note that @samp{-g} does @emph{not} imply
|
|
@samp{-a}, since other non-ANSI C compilers may have provisions for a
|
|
function @code{inline} feature.
|
|
|
|
@item -G
|
|
Generate the static table of keywords as a static global variable,
|
|
rather than hiding it inside of the lookup function (which is the
|
|
default behavior).
|
|
|
|
@item -h
|
|
Prints a short summary on the meaning of each program option. Aborts
|
|
further program execution.
|
|
|
|
@item -H @var{hash function name}
|
|
Allows you to specify the name for the generated hash function. Default
|
|
name is `hash.' This option permits the use of two hash tables in the
|
|
same file.
|
|
|
|
@item -i @var{initial value}
|
|
Provides an initial @var{value} for the associate values array. Default
|
|
is 0. Increasing the initial value helps inflate the final table size,
|
|
possibly leading to more time efficient keyword lookups. Note that this
|
|
option is not particularly useful when @samp{-S} is used. Also,
|
|
@samp{-i} is overriden when the @samp{-r} option is used.
|
|
|
|
@item -j @var{jump value}
|
|
Affects the ``jump value,'' @emph{i.e.}, how far to advance the
|
|
associated character value upon collisions. @var{Jump value} is rounded
|
|
up to an odd number, the default is 5. If the @var{jump value} is 0 @code{gperf}
|
|
jumps by random amounts.
|
|
|
|
@item -k @var{keys}
|
|
Allows selection of the character key positions used in the keywords'
|
|
hash function. The allowable choices range between 1-126, inclusive.
|
|
The positions are separated by commas, @emph{e.g.}, @samp{-k 9,4,13,14};
|
|
ranges may be used, @emph{e.g.}, @samp{-k 2-7}; and positions may occur
|
|
in any order. Furthermore, the meta-character '*' causes the generated
|
|
hash function to consider @strong{all} character positions in each key,
|
|
whereas '$' instructs the hash function to use the ``final character''
|
|
of a key (this is the only way to use a character position greater than
|
|
126, incidentally).
|
|
|
|
For instance, the option @samp{-k 1,2,4,6-10,'$'} generates a hash
|
|
function that considers positions 1,2,4,6,7,8,9,10, plus the last
|
|
character in each key (which may differ for each key, obviously). Keys
|
|
with length less than the indicated key positions work properly, since
|
|
selected key positions exceeding the key length are simply not
|
|
referenced in the hash function.
|
|
|
|
@item -K @var{key name}
|
|
By default, the program assumes the structure component identifier for
|
|
the keyword is ``name.'' This option allows an arbitrary choice of
|
|
identifier for this component, although it still must occur as the first
|
|
field in your supplied @code{struct}.
|
|
|
|
@item -l
|
|
Compare key lengths before trying a string comparison. This might cut
|
|
down on the number of string comparisons made during the lookup, since
|
|
keys with different lengths are never compared via @code{strcmp}.
|
|
However, using @samp{-l} might greatly increase the size of the
|
|
generated C code if the lookup table range is large (which implies that
|
|
the switch option @samp{-S} is not enabled), since the length table
|
|
contains as many elements as there are entries in the lookup table.
|
|
|
|
@item -n
|
|
Instructs the generator not to include the length of a keyword when
|
|
computing its hash value. This may save a few assembly instructions in
|
|
the generated lookup table.
|
|
|
|
@item -N @var{lookup function name}
|
|
Allows you to specify the name for the generated lookup function.
|
|
Default name is `in_word_set.' This option permits completely automatic
|
|
generation of perfect hash functions, especially when multiple generated
|
|
hash functions are used in the same application.
|
|
|
|
@item -o
|
|
Reorders the keywords by sorting the keywords so that frequently
|
|
occuring key position set components appear first. A second reordering
|
|
pass follows so that keys with ``already determined values'' are placed
|
|
towards the front of the keylist. This may decrease the time required
|
|
to generate a perfect hash function for many keyword sets, and also
|
|
produce more minimal perfect hash functions. The reason for this is
|
|
that the reordering helps prune the search time by handling inevitable
|
|
collisions early in the search process. On the other hand, if the
|
|
number of keywords is @emph{very} large using @samp{-o} may
|
|
@emph{increase} @code{gperf}'s execution time, since collisions will begin
|
|
earlier and continue throughout the remainder of keyword processing.
|
|
See Cichelli's paper from the January 1980 Communications of the ACM for
|
|
details.
|
|
|
|
@item -p
|
|
Changes the return value of the generated function @code{in_word_set}
|
|
from boolean (@emph{i.e.}, 0 or 1), to either type ``pointer to
|
|
user-defined struct,'' (if the @samp{-t} option is enabled), or simply
|
|
to @code{char *}, if @samp{-t} is not enabled. This option is most
|
|
useful when the @samp{-t} option (allowing user-defined structs) is
|
|
used. For example, it is possible to automatically generate the GNU C
|
|
reserved word lookup routine with the options @samp{-p} and @samp{-t}.
|
|
|
|
@item -r
|
|
Utilizes randomness to initialize the associated values table. This
|
|
frequently generates solutions faster than using deterministic
|
|
initialization (which starts all associated values at 0). Furthermore,
|
|
using the randomization option generally increases the size of the
|
|
table. If @code{gperf} has difficultly with a certain keyword set try using
|
|
@samp{-r} or @samp{-D}.
|
|
|
|
@item -s @var{size-multiple}
|
|
Affects the size of the generated hash table. The numeric argument for
|
|
this option indicates ``how many times larger'' the maximum associated
|
|
value range should be, in relationship to the number of keys. For
|
|
example, a value of 3 means ``allow the maximum associated value to be
|
|
about 3 times larger than the number of input keys.'' If option
|
|
@samp{-S} is @emph{not} enabled, the maximum associated value influences
|
|
the static array table size, and a larger table should decrease the time
|
|
required for an unsuccessful search, at the expense of extra table
|
|
space.
|
|
|
|
The default value is 1, thus the default maximum associated value about
|
|
the same size as the number of keys ( for efficiency, the maximum
|
|
associated value is always rounded up to a power of 2). The actual
|
|
table size may vary somewhat, since this technique is essentially a
|
|
heuristic. In particular, setting this value too high slows down
|
|
@code{gperf}'s runtime, since it must search through a much larger range of
|
|
values. Judicious use of the @samp{-f} option helps alleviate this
|
|
overhead, however.
|
|
|
|
@item -S @var{total switch statements}
|
|
Causes the generated C code to use a @code{switch} statement scheme,
|
|
rather than an array lookup table. This can lead to a reduction in both
|
|
time and space requirements for some keyfiles. The argument to this
|
|
option determines how many @code{switch} statements are generated. A
|
|
value of 1 generates 1 @code{switch} containing all the elements, a
|
|
value of 2 generates 2 tables with 1/2 the elements in each
|
|
@code{switch}, etc. This is useful since many C compilers cannot
|
|
correctly generate code for large @code{switch} statements. This option
|
|
was inspired in part by Keith Bostic's original C program.
|
|
|
|
@item -t
|
|
Allows you to include a @code{struct} type declaration for generated
|
|
code. Any text before a pair of consecutive %% is consider part of the
|
|
type declaration. Key words and additional fields may follow this, one
|
|
group of fields per line. A set of examples for generating perfect hash
|
|
tables and functions for Ada, C, and G++, Pascal, and Modula 2 and 3
|
|
reserved words are distributed with this release.
|
|
|
|
@item -T
|
|
Prevents the transfer of the type declaration to the output file. Use
|
|
this option if the type is already defined elsewhere.
|
|
|
|
@item -v
|
|
Prints out the current version number.
|
|
@end table
|
|
@end itemize
|
|
|
|
@node Bugs, Projects, Options, Top
|
|
@chapter Known Bugs and Limitations with @code{gperf}
|
|
|
|
The following are some limitations with the current release of
|
|
@code{gperf}:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
The @code{gperf} utility is tuned to execute quickly, and works quickly
|
|
for small to medium size data sets (around 1000 keywords). It is
|
|
extremely useful for maintaining perfect hash functions for compiler
|
|
keyword sets. Several recent enhancements now enable @code{gperf} to
|
|
work efficiently on much larger keyword sets (over 15,000 keywords).
|
|
When processing large keyword sets it helps greatly to have over 8 megs
|
|
of RAM.
|
|
|
|
However, since @code{gperf} does not backtrack no guaranteed solution
|
|
occurs on every run. On the other hand, it is usually easy to obtain a
|
|
solution by varying the option parameters. In particular, try the
|
|
@samp{-r} option, and also try changing the default arguments to the
|
|
@samp{-s} and @samp{-j} options. To @emph{guarantee} a solution, use
|
|
the @samp{-D} and @samp{-S} options, although the final results are not
|
|
likely to be a @emph{perfect} hash function anymore! Finally, use the
|
|
@samp{-f} option if you want @code{gperf} to generate the perfect hash
|
|
function @emph{fast}, with less emphasis on making it minimal.
|
|
|
|
@item
|
|
The size of the generate static keyword array can get @emph{extremely}
|
|
large if the input keyword file is large or if the keywords are quite
|
|
similar. This tends to slow down the compilation of the generated C
|
|
code, and @emph{greatly} inflates the object code size. If this
|
|
situation occurs, consider using the @samp{-S} option to reduce data
|
|
size, potentially increasing keyword recognition time a negligible
|
|
amount. Since many C compilers cannot correctly generated code for
|
|
large switch statements it is important to qualify the @var{-S} option
|
|
with an appropriate numerical argument that controls the number of
|
|
switch statements generated.
|
|
|
|
@item
|
|
The maximum number of key positions selected for a given key has an
|
|
arbitrary limit of 126. This restriction should be removed, and if
|
|
anyone considers this a problem write me and let me know so I can remove
|
|
the constraint.
|
|
|
|
@item
|
|
The C++ source code only compiles correctly with GNU G++, version 1.36
|
|
(and hopefully later versions). Porting to AT&T cfront would be
|
|
tedious, but possible (and desirable). There is also a K&R C version
|
|
available now. This should compile without change on most BSD systems,
|
|
but may require a bit of work to run on SYSV, since @code{gperf} uses
|
|
@var{alloca} in several places. Send mail to @code{schmidt@@ics.uci.edu} for
|
|
information.
|
|
@end itemize
|
|
|
|
@node Projects, Implementation, Bugs, Top
|
|
@chapter Things Still Left to Do
|
|
|
|
It should be ``relatively'' easy to replace the current perfect hash
|
|
function algorithm with a more exhaustive approach; the perfect hash
|
|
module is essential independent from other program modules. Additional
|
|
worthwhile improvements include:
|
|
|
|
@itemize @bullet
|
|
@item
|
|
Make the algorithm more robust. At present, the program halts with an
|
|
error diagnostic if it can't find a direct solution and the @samp{-D}
|
|
option is not enabled. A more comprehensive, albeit computationally
|
|
expensive, approach would employ backtracking or enable alternative
|
|
options and retry. It's not clear how helpful this would be, in
|
|
general, since most search sets are rather small in practice.
|
|
|
|
@item
|
|
Another useful extension involves modifying the program to generate
|
|
``minimal'' perfect hash functions (under certain circumstances, the
|
|
current version can be rather extravagant in the generated table size).
|
|
Again, this is mostly of theoretical interest, since a sparse table
|
|
often produces faster lookups, and use of the @samp{-S} @code{switch}
|
|
option can minimize the data size, at the expense of slightly longer
|
|
lookups (note that the gcc compiler generally produces good code for
|
|
@code{switch} statements, reducing the need for more complex schemes).
|
|
|
|
@item
|
|
In addition to improving the algorithm, it would also be useful to
|
|
generate a C++ class or Ada package as the code output, in addition to
|
|
the current C routines.
|
|
@end itemize
|
|
|
|
@node Implementation, Bibliography, Projects, Top
|
|
@chapter Implementation Details of GNU @code{gperf}
|
|
|
|
A paper describing the high-level description of the data structures and
|
|
algorithms used to implement @code{gperf} will soon be available. This
|
|
paper is useful not only from a maintenance and enhancement perspective,
|
|
but also because they demonstrate several clever and useful programming
|
|
techniques, @emph{e.g.}, `Iteration Number' boolean arrays, double
|
|
hashing, a ``safe'' and efficient method for reading arbitrarily long
|
|
input from a file, and a provably optimal algorithm for simultaneously
|
|
determining both the minimum and maximum elements in a list.
|
|
|
|
@page
|
|
|
|
@node Bibliography, , Implementation, Top
|
|
@chapter Bibliography
|
|
|
|
[1] Chang, C.C.: @i{A Scheme for Constructing Ordered Minimal Perfect
|
|
Hashing Functions} Information Sciences 39(1986), 187-195.
|
|
|
|
[2] Cichelli, Richard J. @i{Author's Response to ``On Cichelli's Minimal Perfect Hash
|
|
Functions Method''} Communications of the ACM, 23, 12(December 1980), 729.
|
|
|
|
[3] Cichelli, Richard J. @i{Minimal Perfect Hash Functions Made Simple}
|
|
Communications of the ACM, 23, 1(January 1980), 17-19.
|
|
|
|
[4] Cook, C. R. and Oldehoeft, R.R. @i{A Letter Oriented Minimal
|
|
Perfect Hashing Function} SIGPLAN Notices, 17, 9(September 1982), 18-27.
|
|
|
|
[5] Cormack, G. V. and Horspool, R. N. S. and Kaiserwerth, M.
|
|
@i{Practical Perfect Hashing} Computer Journal, 28, 1(January 1985), 54-58.
|
|
|
|
[6] Jaeschke, G. @i{Reciprocal Hashing: A Method for Generating Minimal
|
|
Perfect Hashing Functions} Communications of the ACM, 24, 12(December
|
|
1981), 829-833.
|
|
|
|
[7] Jaeschke, G. and Osterburg, G. @i{On Cichelli's Minimal Perfect
|
|
Hash Functions Method} Communications of the ACM, 23, 12(December 1980),
|
|
728-729.
|
|
|
|
[8] Sager, Thomas J. @i{A Polynomial Time Generator for Minimal Perfect
|
|
Hash Functions} Communications of the ACM, 28, 5(December 1985), 523-532
|
|
|
|
[9] Sebesta, R.W. and Taylor, M.A. @i{Minimal Perfect Hash Functions
|
|
for Reserved Word Lists} SIGPLAN Notices, 20, 12(September 1985), 47-53.
|
|
@contents
|
|
|
|
[10] Sprugnoli, R. @i{Perfect Hashing Functions: A Single Probe
|
|
Retrieving Method for Static Sets} Communications of the ACM, 20
|
|
11(November 1977), 841-850.
|
|
|
|
[11] Stallman, Richard M. @i{Using and Porting GNU CC} Free Software Foundation,
|
|
1988.
|
|
|
|
[12] Stroustrup, Bjarne @i{The C++ Programming Language.} Addison-Wesley, 1986.
|
|
|
|
[13] Tiemann, Michael D. @i{User's Guide to GNU C++} Free Software
|
|
Foundation, 1989.
|
|
@bye
|