[
  {
    "path": "COPYRIGHT",
    "content": "Copyright (c) 1986, 1993, 1995 by University of Toronto.\nWritten by Henry Spencer.  Not derived from licensed software.\n\nPermission is granted to anyone to use this software for any\npurpose on any computer system, and to redistribute it in any way,\nsubject to the following restrictions:\n\n1. The author is not responsible for the consequences of use of\n\tthis software, no matter how awful, even if they arise\n\tfrom defects in it.\n\n2. The origin of this software must not be misrepresented, either\n\tby explicit claim or by omission.\n\n3. Altered versions must be plainly marked as such, and must not\n\tbe misrepresented (by explicit claim or omission) as being\n\tthe original software.\n\n4. This notice must not be removed or altered.\n"
  },
  {
    "path": "Makefile",
    "content": "# Things you might want to put in ENV:\n# -DERRAVAIL\t\thave utzoo-compatible error() function and friends\nENV=\n\n# Things you might want to put in TEST:\n# -DDEBUG\t\tdebugging hooks\n# -I.\t\t\tregexp.h from current directory, not /usr/include\nTEST=-I.\n\n# Things you might want to put in PROF:\n# -pg\t\t\tprofiler\nPROF=\n\nCFLAGS=-O $(ENV) $(TEST) $(PROF)\nLDFLAGS=$(PROF)\n\nLIB=libregexp.a\nOBJ=regexp.o regsub.o regerror.o\nTMP=dtr.tmp\n\ndefault:\tr\n\ntry:\ttry.o $(LIB)\n\tcc $(LDFLAGS) try.o $(LIB) -o try\n\n# Making timer will probably require putting stuff in $(PROF) and then\n# recompiling everything; the following is just the final stage.\ntimer:\ttimer.o $(LIB)\n\tcc $(LDFLAGS) timer.o $(LIB) -o timer\n\ntimer.o:\ttimer.c timer.t.h\n\ntimer.t.h:\ttests\n\tsed 's/\t/\",\"/g;s/\\\\/&&/g;s/.*/{\"&\"},/' tests >timer.t.h\n\n# Regression test.\nr:\ttry tests\n\t./try <tests\t\t# no news is good news...\n\n$(LIB):\t$(OBJ)\n\tar cr $(LIB) $(OBJ)\n\nregexp.o:\tregexp.c regexp.h regmagic.h\nregsub.o:\tregsub.c regexp.h regmagic.h\n\nclean:\n\trm -f *.o core mon.out gmon.out timer.t.h copy try timer r.*\n\trm -f residue rs.* re.1 rm.h re.h ch.soe ch.ps j badcom fig[012]\n\trm -f ch.sml fig[12].ps $(LIB)\n\trm -rf $(TMP) dtr.*\n\n# the rest of this is unlikely to be of use to you\n\nBITS = r.1 rs.1 re.1 rm.h re.h\nOPT=-p -ms\n\nch.soe:\tch $(BITS)\n\tsoelim ch >$@\n\nch.sml:\tch $(BITS) smlize splitfigs\n\tsplitfigs ch | soelim | smlize >$@\n\nfig0 fig1 fig2:\tch splitfigs\n\tsplitfigs ch >/dev/null\n\nf:\tfig0 fig1 fig2 figs\n\tgroff -Tps -s $(OPT) figs | lpr\n\nfig1.ps:\tfig0 fig1\n\t( cat fig0 ; echo \".LP\" ; cat fig1 ) | groff -Tps $(OPT) >$@\n\nfig2.ps:\tfig0 fig2\n\t( cat fig0 ; echo \".LP\" ; cat fig2 ) | groff -Tps $(OPT) >$@\n\nfp:\tfig1.ps fig2.ps\n\nr.1:\tregexp.c splitter\n\tsplitter regexp.c\n\nrs.1:\tregsub.c splitter\n\tsplitter regsub.c\n\nre.1:\tregerror.c splitter\n\tsplitter regerror.c\n\nrm.h:\tregmagic.h 
splitter\n\tsplitter regmagic.h\n\nre.h:\tregexp.h splitter\n\tsplitter regexp.h\n\nPLAIN=COPYRIGHT README Makefile regexp.3 try.c timer.c tests\nFIX=regexp.h regexp.c regsub.c regerror.c regmagic.h\nDTR=$(PLAIN) $(FIX)\n\ndtr:\tr $(DTR)\n\trm -rf $(TMP)\n\tmkdir $(TMP)\n\tcp $(PLAIN) $(TMP)\n\tfor f in $(FIX) ; do normalize $$f >$(TMP)/$$f ; done\n\t( cd $(TMP) ; makedtr $(DTR) ) >bookregexp.shar\n\t( cd $(TMP) ; tar -cvf ../bookregexp.tar $(DTR) )\n\trm -rf $(TMP)\n\nch.ps:\tch Makefile $(BITS)\n\tgroff -Tps $(OPT) ch >$@\n\ncopy:\tch.soe ch.sml fp\n\tmakedtr REMARKS ch.sml fig*.ps ch.soe >$@\n\ngo:\tcopy dtr\n"
  },
  {
    "path": "README",
    "content": "This is a revision of my well-known regular-expression package, regexp(3).\nIt gives C programs the ability to use egrep-style regular expressions, and\ndoes it in a much cleaner fashion than the analogous routines in SysV.\nIt is not, alas, fully POSIX.2-compliant; that is hard.  (I'm working on\na full reimplementation that will do that.)\n\nThis version is the one which is examined and explained in one chapter of\n\"Software Solutions in C\" (Dale Schumacher, ed.; AP Professional 1994;\nISBN 0-12-632360-7), plus a couple of insignificant updates, plus one\nsignificant bug fix (done 10 Nov 1995).\n\nAlthough this package was inspired by the Bell V8 regexp(3), this\nimplementation is *NOT* AT&T/Bell code, and is not derived from licensed\nsoftware.  Even though U of T is a V8 licensee.  This software is based on\na V8 manual page sent to me by Dennis Ritchie (the manual page enclosed\nhere is a complete rewrite and hence is not covered by AT&T copyright).\nI admit to some familiarity with regular-expression implementations of\nthe past, but the only one that this code traces any ancestry to is the\none published in Kernighan & Plauger's \"Software Tools\" (from which\nthis one draws ideas but not code).\n\nSimplistically:  put this stuff into a source directory, inspect Makefile\nfor compilation options that need changing to suit your local environment,\nand then do \"make\".  This compiles the regexp(3) functions, builds a\nlibrary containing them, compiles a test program, and runs a large set of\nregression tests.  If there are no complaints, then put regexp.h into\n/usr/include, add regexp.o, regsub.o, and regerror.o into your C library\n(or put libre.a into /usr/lib), and install regexp.3 (perhaps with slight\nmodifications) in your manual-pages directory. 
\n\nThe files are:\n\nCOPYRIGHT\tcopyright notice\nREADME\t\tthis text\nMakefile\tinstructions to make everything\nregexp.3\tmanual page\nregexp.h\theader file, for /usr/include\nregexp.c\tsource for regcomp() and regexec()\nregsub.c\tsource for regsub()\nregerror.c\tsource for default regerror()\nregmagic.h\tinternal header file\ntry.c\t\tsource for test program\ntimer.c\t\tsource for timing program\ntests\t\ttest list for try and timer\n\nThis implementation uses nondeterministic automata rather than the\ndeterministic ones found in some other implementations, which makes it\nsimpler, smaller, and faster at compiling regular expressions, but slower\nat executing them.  Many users have found the speed perfectly adequate,\nalthough replacing the insides of egrep with this code would be a mistake.\n\nThis stuff should be pretty portable, given an ANSI C compiler and\nappropriate option settings.  There are no \"reserved\" char values except for\nNUL, and no special significance is attached to the top bit of chars.\nThe string(3) functions are used a fair bit, on the grounds that they are\nprobably faster than coding the operations in line.  Some attempts at code\ntuning have been made, but this is invariably a bit machine-specific.\n\nThis distribution lives at ftp://ftp.zoo.toronto.edu/pub/bookregexp.{tar|shar}\nat present.\n"
  },
  {
    "path": "regerror.c",
    "content": "/*\n * regerror\n */\n#include <stdio.h>\n#include <stdlib.h>\n\nvoid\nregerror(s)\nchar *s;\n{\n#ifdef ERRAVAIL\n\terror(\"regexp: %s\", s);\n#else\n\tfprintf(stderr, \"regexp(3): %s\\n\", s);\n\texit(EXIT_FAILURE);\n#endif\n\t/* NOTREACHED */\n}\n"
  },
  {
    "path": "regexp.3",
    "content": ".TH REGEXP 3 \"5 Sept 1996\"\n.SH NAME\nregcomp, regexec, regsub, regerror \\- regular expression handler\n.SH SYNOPSIS\n.ft B\n.nf\n#include <regexp.h>\n\nregexp *regcomp(exp)\nconst char *exp;\n\nint regexec(prog, string)\nregexp *prog;\nconst char *string;\n\nvoid regsub(prog, source, dest)\nconst regexp *prog;\nconst char *source;\nchar *dest;\n\nvoid regerror(msg)\nchar *msg;\n.SH DESCRIPTION\nThese functions implement\n.IR egrep (1)-style\nregular expressions and supporting facilities.\n.PP\n.I Regcomp\ncompiles a regular expression into a structure of type\n.IR regexp ,\nand returns a pointer to it.\nThe space has been allocated using\n.IR malloc (3)\nand may be released by\n.IR free .\n.PP\n.I Regexec\nmatches a NUL-terminated \\fIstring\\fR against the compiled regular expression\nin \\fIprog\\fR.\nIt returns 1 for success and 0 for failure, and adjusts the contents of\n\\fIprog\\fR's \\fIstartp\\fR and \\fIendp\\fR (see below) accordingly.\n.PP\nThe members of a\n.I regexp\nstructure include at least the following (not necessarily in order):\n.PP\n.RS\nchar *startp[NSUBEXP];\n.br\nchar *endp[NSUBEXP];\n.RE\n.PP\nwhere\n.I NSUBEXP\nis defined (as 10) in the header file.\nOnce a successful \\fIregexec\\fR has been done using the \\fIregexp\\fR,\neach \\fIstartp\\fR-\\fIendp\\fR pair describes one substring\nwithin the \\fIstring\\fR,\nwith the \\fIstartp\\fR pointing to the first character of the substring and\nthe \\fIendp\\fR pointing to the first character following the substring.\nThe 0th substring is the substring of \\fIstring\\fR that matched the whole\nregular expression.\nThe others are those substrings that matched parenthesized expressions\nwithin the regular expression, with parenthesized expressions numbered\nin left-to-right order of their opening parentheses.\nIf a parenthesized expression does not participate in the match at all,\nits \\fIstartp\\fR and \\fIendp\\fR are NULL.\n.PP\n.I Regsub\ncopies \\fIsource\\fR to 
\\fIdest\\fR, making substitutions according to the\nmost recent \\fIregexec\\fR performed using \\fIprog\\fR.\nEach instance of `&' in \\fIsource\\fR is replaced by the substring\nindicated by \\fIstartp\\fR[\\fI0\\fR] and\n\\fIendp\\fR[\\fI0\\fR].\nEach instance of `\\e\\fIn\\fR', where \\fIn\\fR is a digit, is replaced by\nthe substring indicated by\n\\fIstartp\\fR[\\fIn\\fR] and\n\\fIendp\\fR[\\fIn\\fR].\nTo get a literal `&' or `\\e\\fIn\\fR' into \\fIdest\\fR, prefix it with `\\e';\nto get a literal `\\e' preceding `&' or `\\e\\fIn\\fR', prefix it with\nanother `\\e'.\n.PP\n.I Regerror\nis called whenever an error is detected in \\fIregcomp\\fR, \\fIregexec\\fR,\nor \\fIregsub\\fR.\nThe default \\fIregerror\\fR writes the string \\fImsg\\fR,\nwith a suitable indicator of origin,\non the standard\nerror output\nand invokes \\fIexit\\fR(2).\n.I Regerror\ncan be replaced by the user if other actions are desirable.\n.SH \"REGULAR EXPRESSION SYNTAX\"\nA regular expression is zero or more \\fIbranches\\fR, separated by `|'.\nIt matches anything that matches one of the branches.\n.PP\nA branch is zero or more \\fIpieces\\fR, concatenated.\nIt matches a match for the first, followed by a match for the second, etc.\n.PP\nA piece is an \\fIatom\\fR possibly followed by `*', `+', or `?'.\nAn atom followed by `*' matches a sequence of 0 or more matches of the atom.\nAn atom followed by `+' matches a sequence of 1 or more matches of the atom.\nAn atom followed by `?' 
matches a match of the atom, or the null string.\n.PP\nAn atom is a regular expression in parentheses (matching a match for the\nregular expression), a \\fIrange\\fR (see below), `.'\n(matching any single character), `^' (matching the null string at the\nbeginning of the input string), `$' (matching the null string at the\nend of the input string), a `\\e' followed by a single character (matching\nthat character), or a single character with no other significance\n(matching that character).\n.PP\nA \\fIrange\\fR is a sequence of characters enclosed in `[]'.\nIt normally matches any single character from the sequence.\nIf the sequence begins with `^',\nit matches any single character \\fInot\\fR from the rest of the sequence.\nIf two characters in the sequence are separated by `\\-', this is shorthand\nfor the full list of ASCII characters between them\n(e.g. `[0-9]' matches any decimal digit).\nTo include a literal `]' in the sequence, make it the first character\n(following a possible `^').\nTo include a literal `\\-', make it the first or last character.\n.SH AMBIGUITY\nIf a regular expression could match two different parts of the input string,\nit will match the one which begins earliest.\nIf both begin in the same place but match different lengths, or match\nthe same length in different ways, life gets messier, as follows.\n.PP\nIn general, the possibilities in a list of branches are considered in\nleft-to-right order, the possibilities for `*', `+', and `?' 
are\nconsidered longest-first, nested constructs are considered from the\noutermost in, and concatenated constructs are considered leftmost-first.\nThe match that will be chosen is the one that uses the earliest\npossibility in the first choice that has to be made.\nIf there is more than one choice, the next will be made in the same manner\n(earliest possibility) subject to the decision on the first choice.\nAnd so forth.\n.PP\nFor example, `(ab|a)b*c' could match `abc' in one of two ways.\nThe first choice is between `ab' and `a'; since `ab' is earlier, and does\nlead to a successful overall match, it is chosen.\nSince the `b' is already spoken for,\nthe `b*' must match its last possibility\\(emthe empty string\\(emsince\nit must respect the earlier choice.\n.PP\nIn the particular case where the regular expression does not use `|'\nand does not apply `*', `+', or `?' to parenthesized subexpressions,\nthe net effect is that the longest possible\nmatch will be chosen.\nSo `ab*', presented with `xabbbby', will match `abbbb'.\nNote that if `ab*' is tried against `xabyabbbz', it\nwill match `ab' just after `x', due to the begins-earliest rule.\n(In effect, the decision on where to start the match is the first choice\nto be made, hence subsequent choices must respect it even if this leads them\nto less-preferred alternatives.)\n.SH SEE ALSO\negrep(1), expr(1)\n.SH DIAGNOSTICS\n\\fIRegcomp\\fR returns NULL for a failure\n(\\fIregerror\\fR permitting),\nwhere failures are syntax errors, exceeding implementation limits,\nor applying `+' or `*' to a possibly-null operand.\n.SH HISTORY\nThis is a revised version.\nBoth code and manual page were\noriginally written by Henry Spencer at University of Toronto.\nThey are intended to be compatible with the Bell V8 \\fIregexp\\fR(3),\nbut are not derived from Bell code.\n.SH BUGS\nEmpty branches and empty regular expressions are not portable\nto other, otherwise-similar, implementations.\n.PP\nThe ban on\napplying `*' or `+' to a 
possibly-null operand is an artifact of the\nsimplistic implementation.\n.PP\nThe match-choice rules are complex.\nA simple ``longest match'' rule would be preferable,\nbut is harder to implement.\n.PP\nAlthough there is a general similarity to POSIX.2 ``extended'' regular\nexpressions, neither the regular-expression syntax nor the programming\ninterface is an exact match.\n.PP\nDue to emphasis on\ncompactness and simplicity,\nit's not strikingly fast.\nIt does give some attention to handling simple cases quickly.\n"
  },
  {
    "path": "regexp.c",
    "content": "/*\n * regcomp and regexec -- regsub and regerror are elsewhere\n */\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <regexp.h>\n#include \"regmagic.h\"\n\n/*\n * The \"internal use only\" fields in regexp.h are present to pass info from\n * compile to execute that permits the execute phase to run lots faster on\n * simple cases.  They are:\n *\n * regstart\tchar that must begin a match; '\\0' if none obvious\n * reganch\tis the match anchored (at beginning-of-line only)?\n * regmust\tstring (pointer into program) that match must include, or NULL\n * regmlen\tlength of regmust string\n *\n * Regstart and reganch permit very fast decisions on suitable starting points\n * for a match, cutting down the work a lot.  Regmust permits fast rejection\n * of lines that cannot possibly match.  The regmust tests are costly enough\n * that regcomp() supplies a regmust only if the r.e. contains something\n * potentially expensive (at present, the only such thing detected is * or +\n * at the start of the r.e., which can involve a lot of backup).  Regmlen is\n * supplied because the test in regexec() needs it and regcomp() is computing\n * it anyway.\n */\n\n/*\n * Structure for regexp \"program\".  This is essentially a linear encoding\n * of a nondeterministic finite-state machine (aka syntax charts or\n * \"railroad normal form\" in parsing technology).  Each node is an opcode\n * plus a \"next\" pointer, possibly plus an operand.  \"Next\" pointers of\n * all nodes except BRANCH implement concatenation; a \"next\" pointer with\n * a BRANCH on both ends of it is connecting two alternatives.  (Here we\n * have one of the subtle syntax dependencies:  an individual BRANCH (as\n * opposed to a collection of them) is never concatenated with anything\n * because of operator precedence.)  The operand of some types of node is\n * a literal string; for others, it is a node leading into a sub-FSM.  
In\n * particular, the operand of a BRANCH node is the first node of the branch.\n * (NB this is *not* a tree structure:  the tail of the branch connects\n * to the thing following the set of BRANCHes.)  The opcodes are:\n */\n\n/* definition\tnumber\topnd?\tmeaning */\n#define\tEND\t0\t/* no\tEnd of program. */\n#define\tBOL\t1\t/* no\tMatch beginning of line. */\n#define\tEOL\t2\t/* no\tMatch end of line. */\n#define\tANY\t3\t/* no\tMatch any character. */\n#define\tANYOF\t4\t/* str\tMatch any of these. */\n#define\tANYBUT\t5\t/* str\tMatch any but one of these. */\n#define\tBRANCH\t6\t/* node\tMatch this, or the next..\\&. */\n#define\tBACK\t7\t/* no\t\"next\" ptr points backward. */\n#define\tEXACTLY\t8\t/* str\tMatch this string. */\n#define\tNOTHING\t9\t/* no\tMatch empty string. */\n#define\tSTAR\t10\t/* node\tMatch this 0 or more times. */\n#define\tPLUS\t11\t/* node\tMatch this 1 or more times. */\n#define\tOPEN\t20\t/* no\tSub-RE starts here. */\n\t\t\t/*\tOPEN+1 is number 1, etc. */\n#define\tCLOSE\t30\t/* no\tAnalogous to OPEN. */\n\n/*\n * Opcode notes:\n *\n * BRANCH\tThe set of branches constituting a single choice are hooked\n *\t\ttogether with their \"next\" pointers, since precedence prevents\n *\t\tanything being concatenated to any individual branch.  The\n *\t\t\"next\" pointer of the last BRANCH in a choice points to the\n *\t\tthing following the whole choice.  This is also where the\n *\t\tfinal \"next\" pointer of each individual branch points; each\n *\t\tbranch starts with the operand node of a BRANCH node.\n *\n * BACK\t\tNormal \"next\" pointers all implicitly point forward; BACK\n *\t\texists to make loop structures possible.\n *\n * STAR,PLUS\t'?', and complex '*' and '+', are implemented as circular\n *\t\tBRANCH structures using BACK.  
Simple cases (one character\n *\t\tper match) are implemented with STAR and PLUS for speed\n *\t\tand to minimize recursive plunges.\n *\n * OPEN,CLOSE\t...are numbered at compile time.\n */\n\n/*\n * A node is one char of opcode followed by two chars of \"next\" pointer.\n * \"Next\" pointers are stored as two 8-bit pieces, high order first.  The\n * value is a positive offset from the opcode of the node containing it.\n * An operand, if any, simply follows the node.  (Note that much of the\n * code generation knows about this implicit relationship.)\n *\n * Using two bytes for the \"next\" pointer is vast overkill for most things,\n * but allows patterns to get big without disasters.\n */\n#define\tOP(p)\t\t(*(p))\n#define\tNEXT(p)\t\t(((*((p)+1)&0177)<<8) + (*((p)+2)&0377))\n#define\tOPERAND(p)\t((p) + 3)\n\n/*\n * See regmagic.h for one further detail of program structure.\n */\n\n\n/*\n * Utility definitions.\n */\n#define\tFAIL(m)\t\t{ regerror(m); return(NULL); }\n#define\tISREPN(c)\t((c) == '*' || (c) == '+' || (c) == '?')\n#define\tMETA\t\t\"^$.[()|?+*\\\\\"\n\n/*\n * Flags to be passed up and down.\n */\n#define\tHASWIDTH\t01\t/* Known never to match null string. */\n#define\tSIMPLE\t\t02\t/* Simple enough to be STAR/PLUS operand. */\n#define\tSPSTART\t\t04\t/* Starts with * or +. */\n#define\tWORST\t\t0\t/* Worst case. */\n\n/*\n * Work-variable struct for regcomp().\n */\nstruct comp {\n\tchar *regparse;\t\t/* Input-scan pointer. */\n\tint regnpar;\t\t/* () count. */\n\tchar *regcode;\t\t/* Code-emit pointer; &regdummy = don't. */\n\tchar regdummy[3];\t/* NOTHING, 0 next ptr */\n\tlong regsize;\t\t/* Code size. 
*/\n};\n#define\tEMITTING(cp)\t((cp)->regcode != (cp)->regdummy)\n\n/*\n * Forward declarations for regcomp()'s friends.\n */\nstatic char *reg(struct comp *cp, int paren, int *flagp);\nstatic char *regbranch(struct comp *cp, int *flagp);\nstatic char *regpiece(struct comp *cp, int *flagp);\nstatic char *regatom(struct comp *cp, int *flagp);\nstatic char *regnode(struct comp *cp, int op);\nstatic char *regnext(char *node);\nstatic void regc(struct comp *cp, int c);\nstatic void reginsert(struct comp *cp, int op, char *opnd);\nstatic void regtail(struct comp *cp, char *p, char *val);\nstatic void regoptail(struct comp *cp, char *p, char *val);\n\n/*\n - regcomp - compile a regular expression into internal code\n *\n * We can't allocate space until we know how big the compiled form will be,\n * but we can't compile it (and thus know how big it is) until we've got a\n * place to put the code.  So we cheat:  we compile it twice, once with code\n * generation turned off and size counting turned on, and once \"for real\".\n * This also means that we don't allocate space until we are sure that the\n * thing really will compile successfully, and we never have to move the\n * code and thus invalidate pointers into it.  (Note that it has to be in\n * one piece because free() must be able to free it all.)\n *\n * Beware that the optimization-preparation code in here knows about some\n * of the structure of the compiled regexp.\n */\nregexp *\nregcomp(exp)\nconst char *exp;\n{\n\tregister regexp *r;\n\tregister char *scan;\n\tint flags;\n\tstruct comp co;\n\n\tif (exp == NULL)\n\t\tFAIL(\"NULL argument to regcomp\");\n\n\t/* First pass: determine size, legality. */\n\tco.regparse = (char *)exp;\n\tco.regnpar = 1;\n\tco.regsize = 0L;\n\tco.regdummy[0] = NOTHING;\n\tco.regdummy[1] = co.regdummy[2] = 0;\n\tco.regcode = co.regdummy;\n\tregc(&co, MAGIC);\n\tif (reg(&co, 0, &flags) == NULL)\n\t\treturn(NULL);\n\n\t/* Small enough for pointer-storage convention? 
*/\n\tif (co.regsize >= 0x7fffL)\t/* Probably could be 0xffffL. */\n\t\tFAIL(\"regexp too big\");\n\n\t/* Allocate space. */\n\tr = (regexp *)malloc(sizeof(regexp) + (size_t)co.regsize);\n\tif (r == NULL)\n\t\tFAIL(\"out of space\");\n\n\t/* Second pass: emit code. */\n\tco.regparse = (char *)exp;\n\tco.regnpar = 1;\n\tco.regcode = r->program;\n\tregc(&co, MAGIC);\n\tif (reg(&co, 0, &flags) == NULL)\n\t\treturn(NULL);\n\n\t/* Dig out information for optimizations. */\n\tr->regstart = '\\0';\t\t/* Worst-case defaults. */\n\tr->reganch = 0;\n\tr->regmust = NULL;\n\tr->regmlen = 0;\n\tscan = r->program+1;\t\t/* First BRANCH. */\n\tif (OP(regnext(scan)) == END) {\t/* Only one top-level choice. */\n\t\tscan = OPERAND(scan);\n\n\t\t/* Starting-point info. */\n\t\tif (OP(scan) == EXACTLY)\n\t\t\tr->regstart = *OPERAND(scan);\n\t\telse if (OP(scan) == BOL)\n\t\t\tr->reganch = 1;\n\n\t\t/*\n\t\t * If there's something expensive in the r.e., find the\n\t\t * longest literal string that must appear and make it the\n\t\t * regmust.  Resolve ties in favor of later strings, since\n\t\t * the regstart check works with the beginning of the r.e.\n\t\t * and avoiding duplication strengthens checking.  Not a\n\t\t * strong reason, but sufficient in the absence of others.\n\t\t */\n\t\tif (flags&SPSTART) {\n\t\t\tregister char *longest = NULL;\n\t\t\tregister size_t len = 0;\n\n\t\t\tfor (; scan != NULL; scan = regnext(scan))\n\t\t\t\tif (OP(scan) == EXACTLY && strlen(OPERAND(scan)) >= len) {\n\t\t\t\t\tlongest = OPERAND(scan);\n\t\t\t\t\tlen = strlen(OPERAND(scan));\n\t\t\t\t}\n\t\t\tr->regmust = longest;\n\t\t\tr->regmlen = (int)len;\n\t\t}\n\t}\n\n\treturn(r);\n}\n\n/*\n - reg - regular expression, i.e. 
main body or parenthesized thing\n *\n * Caller must absorb opening parenthesis.\n *\n * Combining parenthesis handling with the base level of regular expression\n * is a trifle forced, but the need to tie the tails of the branches to what\n * follows makes it hard to avoid.\n */\nstatic char *\nreg(cp, paren, flagp)\nregister struct comp *cp;\nint paren;\t\t\t/* Parenthesized? */\nint *flagp;\n{\n\tregister char *ret;\n\tregister char *br;\n\tregister char *ender;\n\tregister int parno;\n\tint flags;\n\n\t*flagp = HASWIDTH;\t/* Tentatively. */\n\n\tif (paren) {\n\t\t/* Make an OPEN node. */\n\t\tif (cp->regnpar >= NSUBEXP)\n\t\t\tFAIL(\"too many ()\");\n\t\tparno = cp->regnpar;\n\t\tcp->regnpar++;\n\t\tret = regnode(cp, OPEN+parno);\n\t}\n\n\t/* Pick up the branches, linking them together. */\n\tbr = regbranch(cp, &flags);\n\tif (br == NULL)\n\t\treturn(NULL);\n\tif (paren)\n\t\tregtail(cp, ret, br);\t/* OPEN -> first. */\n\telse\n\t\tret = br;\n\t*flagp &= ~(~flags&HASWIDTH);\t/* Clear bit if bit 0. */\n\t*flagp |= flags&SPSTART;\n\twhile (*cp->regparse == '|') {\n\t\tcp->regparse++;\n\t\tbr = regbranch(cp, &flags);\n\t\tif (br == NULL)\n\t\t\treturn(NULL);\n\t\tregtail(cp, ret, br);\t/* BRANCH -> BRANCH. */\n\t\t*flagp &= ~(~flags&HASWIDTH);\n\t\t*flagp |= flags&SPSTART;\n\t}\n\n\t/* Make a closing node, and hook it on the end. */\n\tender = regnode(cp, (paren) ? CLOSE+parno : END);\n\tregtail(cp, ret, ender);\n\n\t/* Hook the tails of the branches to the closing node. */\n\tfor (br = ret; br != NULL; br = regnext(br))\n\t\tregoptail(cp, br, ender);\n\n\t/* Check for proper termination. 
*/\n\tif (paren && *cp->regparse++ != ')') {\n\t\tFAIL(\"unterminated ()\");\n\t} else if (!paren && *cp->regparse != '\\0') {\n\t\tif (*cp->regparse == ')') {\n\t\t\tFAIL(\"unmatched ()\");\n\t\t} else\n\t\t\tFAIL(\"internal error: junk on end\");\n\t\t/* NOTREACHED */\n\t}\n\n\treturn(ret);\n}\n\n/*\n - regbranch - one alternative of an | operator\n *\n * Implements the concatenation operator.\n */\nstatic char *\nregbranch(cp, flagp)\nregister struct comp *cp;\nint *flagp;\n{\n\tregister char *ret;\n\tregister char *chain;\n\tregister char *latest;\n\tint flags;\n\tregister int c;\n\n\t*flagp = WORST;\t\t\t\t/* Tentatively. */\n\n\tret = regnode(cp, BRANCH);\n\tchain = NULL;\n\twhile ((c = *cp->regparse) != '\\0' && c != '|' && c != ')') {\n\t\tlatest = regpiece(cp, &flags);\n\t\tif (latest == NULL)\n\t\t\treturn(NULL);\n\t\t*flagp |= flags&HASWIDTH;\n\t\tif (chain == NULL)\t\t/* First piece. */\n\t\t\t*flagp |= flags&SPSTART;\n\t\telse\n\t\t\tregtail(cp, chain, latest);\n\t\tchain = latest;\n\t}\n\tif (chain == NULL)\t\t\t/* Loop ran zero times. */\n\t\t(void) regnode(cp, NOTHING);\n\n\treturn(ret);\n}\n\n/*\n - regpiece - something followed by possible [*+?]\n *\n * Note that the branching code sequences used for ? 
and the general cases\n * of * and + are somewhat optimized:  they use the same NOTHING node as\n * both the endmarker for their branch list and the body of the last branch.\n * It might seem that this node could be dispensed with entirely, but the\n * endmarker role is not redundant.\n */\nstatic char *\nregpiece(cp, flagp)\nregister struct comp *cp;\nint *flagp;\n{\n\tregister char *ret;\n\tregister char op;\n\tregister char *next;\n\tint flags;\n\n\tret = regatom(cp, &flags);\n\tif (ret == NULL)\n\t\treturn(NULL);\n\n\top = *cp->regparse;\n\tif (!ISREPN(op)) {\n\t\t*flagp = flags;\n\t\treturn(ret);\n\t}\n\n\tif (!(flags&HASWIDTH) && op != '?')\n\t\tFAIL(\"*+ operand could be empty\");\n\tswitch (op) {\n\tcase '*':\t*flagp = WORST|SPSTART;\t\t\tbreak;\n\tcase '+':\t*flagp = WORST|SPSTART|HASWIDTH;\tbreak;\n\tcase '?':\t*flagp = WORST;\t\t\t\tbreak;\n\t}\n\n\tif (op == '*' && (flags&SIMPLE))\n\t\treginsert(cp, STAR, ret);\n\telse if (op == '*') {\n\t\t/* Emit x* as (x&|), where & means \"self\". */\n\t\treginsert(cp, BRANCH, ret);\t\t/* Either x */\n\t\tregoptail(cp, ret, regnode(cp, BACK));\t/* and loop */\n\t\tregoptail(cp, ret, ret);\t\t/* back */\n\t\tregtail(cp, ret, regnode(cp, BRANCH));\t/* or */\n\t\tregtail(cp, ret, regnode(cp, NOTHING));\t/* null. */\n\t} else if (op == '+' && (flags&SIMPLE))\n\t\treginsert(cp, PLUS, ret);\n\telse if (op == '+') {\n\t\t/* Emit x+ as x(&|), where & means \"self\". */\n\t\tnext = regnode(cp, BRANCH);\t\t/* Either */\n\t\tregtail(cp, ret, next);\n\t\tregtail(cp, regnode(cp, BACK), ret);\t/* loop back */\n\t\tregtail(cp, next, regnode(cp, BRANCH));\t/* or */\n\t\tregtail(cp, ret, regnode(cp, NOTHING));\t/* null. */\n\t} else if (op == '?') {\n\t\t/* Emit x? as (x|) */\n\t\treginsert(cp, BRANCH, ret);\t\t/* Either x */\n\t\tregtail(cp, ret, regnode(cp, BRANCH));\t/* or */\n\t\tnext = regnode(cp, NOTHING);\t\t/* null. 
*/\n\t\tregtail(cp, ret, next);\n\t\tregoptail(cp, ret, next);\n\t}\n\tcp->regparse++;\n\tif (ISREPN(*cp->regparse))\n\t\tFAIL(\"nested *?+\");\n\n\treturn(ret);\n}\n\n/*\n - regatom - the lowest level\n *\n * Optimization:  gobbles an entire sequence of ordinary characters so that\n * it can turn them into a single node, which is smaller to store and\n * faster to run.  Backslashed characters are exceptions, each becoming a\n * separate node; the code is simpler that way and it's not worth fixing.\n */\nstatic char *\nregatom(cp, flagp)\nregister struct comp *cp;\nint *flagp;\n{\n\tregister char *ret;\n\tint flags;\n\n\t*flagp = WORST;\t\t/* Tentatively. */\n\n\tswitch (*cp->regparse++) {\n\tcase '^':\n\t\tret = regnode(cp, BOL);\n\t\tbreak;\n\tcase '$':\n\t\tret = regnode(cp, EOL);\n\t\tbreak;\n\tcase '.':\n\t\tret = regnode(cp, ANY);\n\t\t*flagp |= HASWIDTH|SIMPLE;\n\t\tbreak;\n\tcase '[': {\n\t\tregister int range;\n\t\tregister int rangeend;\n\t\tregister int c;\n\n\t\tif (*cp->regparse == '^') {\t/* Complement of range. 
*/\n\t\t\tret = regnode(cp, ANYBUT);\n\t\t\tcp->regparse++;\n\t\t} else\n\t\t\tret = regnode(cp, ANYOF);\n\t\tif ((c = *cp->regparse) == ']' || c == '-') {\n\t\t\tregc(cp, c);\n\t\t\tcp->regparse++;\n\t\t}\n\t\twhile ((c = *cp->regparse++) != '\\0' && c != ']') {\n\t\t\tif (c != '-')\n\t\t\t\tregc(cp, c);\n\t\t\telse if ((c = *cp->regparse) == ']' || c == '\\0')\n\t\t\t\tregc(cp, '-');\n\t\t\telse {\n\t\t\t\trange = (unsigned char)*(cp->regparse-2);\n\t\t\t\trangeend = (unsigned char)c;\n\t\t\t\tif (range > rangeend)\n\t\t\t\t\tFAIL(\"invalid [] range\");\n\t\t\t\tfor (range++; range <= rangeend; range++)\n\t\t\t\t\tregc(cp, range);\n\t\t\t\tcp->regparse++;\n\t\t\t}\n\t\t}\n\t\tregc(cp, '\\0');\n\t\tif (c != ']')\n\t\t\tFAIL(\"unmatched []\");\n\t\t*flagp |= HASWIDTH|SIMPLE;\n\t\tbreak;\n\t\t}\n\tcase '(':\n\t\tret = reg(cp, 1, &flags);\n\t\tif (ret == NULL)\n\t\t\treturn(NULL);\n\t\t*flagp |= flags&(HASWIDTH|SPSTART);\n\t\tbreak;\n\tcase '\\0':\n\tcase '|':\n\tcase ')':\n\t\t/* supposed to be caught earlier */\n\t\tFAIL(\"internal error: \\\\0|) unexpected\");\n\t\tbreak;\n\tcase '?':\n\tcase '+':\n\tcase '*':\n\t\tFAIL(\"?+* follows nothing\");\n\t\tbreak;\n\tcase '\\\\':\n\t\tif (*cp->regparse == '\\0')\n\t\t\tFAIL(\"trailing \\\\\");\n\t\tret = regnode(cp, EXACTLY);\n\t\tregc(cp, *cp->regparse++);\n\t\tregc(cp, '\\0');\n\t\t*flagp |= HASWIDTH|SIMPLE;\n\t\tbreak;\n\tdefault: {\n\t\tregister size_t len;\n\t\tregister char ender;\n\n\t\tcp->regparse--;\n\t\tlen = strcspn(cp->regparse, META);\n\t\tif (len == 0)\n\t\t\tFAIL(\"internal error: strcspn 0\");\n\t\tender = *(cp->regparse+len);\n\t\tif (len > 1 && ISREPN(ender))\n\t\t\tlen--;\t\t/* Back off clear of ?+* operand. 
*/\n\t\t*flagp |= HASWIDTH;\n\t\tif (len == 1)\n\t\t\t*flagp |= SIMPLE;\n\t\tret = regnode(cp, EXACTLY);\n\t\tfor (; len > 0; len--)\n\t\t\tregc(cp, *cp->regparse++);\n\t\tregc(cp, '\\0');\n\t\tbreak;\n\t\t}\n\t}\n\n\treturn(ret);\n}\n\n/*\n - regnode - emit a node\n */\nstatic char *\t\t\t/* Location. */\nregnode(cp, op)\nregister struct comp *cp;\nchar op;\n{\n\tregister char *const ret = cp->regcode;\n\tregister char *ptr;\n\n\tif (!EMITTING(cp)) {\n\t\tcp->regsize += 3;\n\t\treturn(ret);\n\t}\n\n\tptr = ret;\n\t*ptr++ = op;\n\t*ptr++ = '\\0';\t\t/* Null next pointer. */\n\t*ptr++ = '\\0';\n\tcp->regcode = ptr;\n\n\treturn(ret);\n}\n\n/*\n - regc - emit (if appropriate) a byte of code\n */\nstatic void\nregc(cp, b)\nregister struct comp *cp;\nchar b;\n{\n\tif (EMITTING(cp))\n\t\t*cp->regcode++ = b;\n\telse\n\t\tcp->regsize++;\n}\n\n/*\n - reginsert - insert an operator in front of already-emitted operand\n *\n * Means relocating the operand.\n */\nstatic void\nreginsert(cp, op, opnd)\nregister struct comp *cp;\nchar op;\nchar *opnd;\n{\n\tregister char *place;\n\n\tif (!EMITTING(cp)) {\n\t\tcp->regsize += 3;\n\t\treturn;\n\t}\n\n\t(void) memmove(opnd+3, opnd, (size_t)(cp->regcode - opnd));\n\tcp->regcode += 3;\n\n\tplace = opnd;\t\t/* Op node, where operand used to be. */\n\t*place++ = op;\n\t*place++ = '\\0';\n\t*place++ = '\\0';\n}\n\n/*\n - regtail - set the next-pointer at the end of a node chain\n */\nstatic void\nregtail(cp, p, val)\nregister struct comp *cp;\nchar *p;\nchar *val;\n{\n\tregister char *scan;\n\tregister char *temp;\n\tregister int offset;\n\n\tif (!EMITTING(cp))\n\t\treturn;\n\n\t/* Find last node. */\n\tfor (scan = p; (temp = regnext(scan)) != NULL; scan = temp)\n\t\tcontinue;\n\n\toffset = (OP(scan) == BACK) ? 
scan - val : val - scan;\n\t*(scan+1) = (offset>>8)&0177;\n\t*(scan+2) = offset&0377;\n}\n\n/*\n - regoptail - regtail on operand of first argument; nop if operandless\n */\nstatic void\nregoptail(cp, p, val)\nregister struct comp *cp;\nchar *p;\nchar *val;\n{\n\t/* \"Operandless\" and \"op != BRANCH\" are synonymous in practice. */\n\tif (!EMITTING(cp) || OP(p) != BRANCH)\n\t\treturn;\n\tregtail(cp, OPERAND(p), val);\n}\n\n/*\n * regexec and friends\n */\n\n/*\n * Work-variable struct for regexec().\n */\nstruct exec {\n\tchar *reginput;\t\t/* String-input pointer. */\n\tchar *regbol;\t\t/* Beginning of input, for ^ check. */\n\tchar **regstartp;\t/* Pointer to startp array. */\n\tchar **regendp;\t\t/* Ditto for endp. */\n};\n\n/*\n * Forwards.\n */\nstatic int regtry(struct exec *ep, regexp *rp, char *string);\nstatic int regmatch(struct exec *ep, char *prog);\nstatic size_t regrepeat(struct exec *ep, char *node);\n\n#ifdef DEBUG\nint regnarrate = 0;\nvoid regdump();\nstatic char *regprop();\n#endif\n\n/*\n - regexec - match a regexp against a string\n */\nint\nregexec(prog, str)\nregister regexp *prog;\nconst char *str;\n{\n\tregister char *string = (char *)str;\t/* avert const poisoning */\n\tregister char *s;\n\tstruct exec ex;\n\n\t/* Be paranoid. */\n\tif (prog == NULL || string == NULL) {\n\t\tregerror(\"NULL argument to regexec\");\n\t\treturn(0);\n\t}\n\n\t/* Check validity of program. */\n\tif ((unsigned char)*prog->program != MAGIC) {\n\t\tregerror(\"corrupted regexp\");\n\t\treturn(0);\n\t}\n\n\t/* If there is a \"must appear\" string, look for it. */\n\tif (prog->regmust != NULL && strstr(string, prog->regmust) == NULL)\n\t\treturn(0);\n\n\t/* Mark beginning of line for ^ . */\n\tex.regbol = string;\n\tex.regstartp = prog->startp;\n\tex.regendp = prog->endp;\n\n\t/* Simplest case:  anchored match need be tried only once. */\n\tif (prog->reganch)\n\t\treturn(regtry(&ex, prog, string));\n\n\t/* Messy cases:  unanchored match. 
*/\n\tif (prog->regstart != '\\0') {\n\t\t/* We know what char it must start with. */\n\t\tfor (s = strchr(string, prog->regstart); s != NULL; s = strchr(s+1, prog->regstart))\n\t\t\tif (regtry(&ex, prog, s))\n\t\t\t\treturn(1);\n\t\treturn(0);\n\t} else {\n\t\t/* We don't -- general case. */\n\t\tfor (s = string; !regtry(&ex, prog, s); s++)\n\t\t\tif (*s == '\\0')\n\t\t\t\treturn(0);\n\t\treturn(1);\n\t}\n\t/* NOTREACHED */\n}\n\n/*\n - regtry - try match at specific point\n */\nstatic int\t\t\t/* 0 failure, 1 success */\nregtry(ep, prog, string)\nregister struct exec *ep;\nregexp *prog;\nchar *string;\n{\n\tregister int i;\n\tregister char **stp;\n\tregister char **enp;\n\n\tep->reginput = string;\n\n\tstp = prog->startp;\n\tenp = prog->endp;\n\tfor (i = NSUBEXP; i > 0; i--) {\n\t\t*stp++ = NULL;\n\t\t*enp++ = NULL;\n\t}\n\tif (regmatch(ep, prog->program + 1)) {\n\t\tprog->startp[0] = string;\n\t\tprog->endp[0] = ep->reginput;\n\t\treturn(1);\n\t} else\n\t\treturn(0);\n}\n\n/*\n - regmatch - main matching routine\n *\n * Conceptually the strategy is simple:  check to see whether the current\n * node matches, call self recursively to see whether the rest matches,\n * and then act accordingly.  In practice we make some effort to avoid\n * recursion, in particular by going through \"ordinary\" nodes (that don't\n * need to know whether the rest of the match failed) by a loop instead of\n * by recursion.\n */\nstatic int\t\t\t/* 0 failure, 1 success */\nregmatch(ep, prog)\nregister struct exec *ep;\nchar *prog;\n{\n\tregister char *scan;\t/* Current node. */\n\tchar *next;\t\t/* Next node. 
*/\n\n#ifdef DEBUG\n\tif (prog != NULL && regnarrate)\n\t\tfprintf(stderr, \"%s(\\n\", regprop(prog));\n#endif\n\tfor (scan = prog; scan != NULL; scan = next) {\n#ifdef DEBUG\n\t\tif (regnarrate)\n\t\t\tfprintf(stderr, \"%s...\\n\", regprop(scan));\n#endif\n\t\tnext = regnext(scan);\n\n\t\tswitch (OP(scan)) {\n\t\tcase BOL:\n\t\t\tif (ep->reginput != ep->regbol)\n\t\t\t\treturn(0);\n\t\t\tbreak;\n\t\tcase EOL:\n\t\t\tif (*ep->reginput != '\\0')\n\t\t\t\treturn(0);\n\t\t\tbreak;\n\t\tcase ANY:\n\t\t\tif (*ep->reginput == '\\0')\n\t\t\t\treturn(0);\n\t\t\tep->reginput++;\n\t\t\tbreak;\n\t\tcase EXACTLY: {\n\t\t\tregister size_t len;\n\t\t\tregister char *const opnd = OPERAND(scan);\n\n\t\t\t/* Inline the first character, for speed. */\n\t\t\tif (*opnd != *ep->reginput)\n\t\t\t\treturn(0);\n\t\t\tlen = strlen(opnd);\n\t\t\tif (len > 1 && strncmp(opnd, ep->reginput, len) != 0)\n\t\t\t\treturn(0);\n\t\t\tep->reginput += len;\n\t\t\tbreak;\n\t\t\t}\n\t\tcase ANYOF:\n\t\t\tif (*ep->reginput == '\\0' ||\n\t\t\t\t\tstrchr(OPERAND(scan), *ep->reginput) == NULL)\n\t\t\t\treturn(0);\n\t\t\tep->reginput++;\n\t\t\tbreak;\n\t\tcase ANYBUT:\n\t\t\tif (*ep->reginput == '\\0' ||\n\t\t\t\t\tstrchr(OPERAND(scan), *ep->reginput) != NULL)\n\t\t\t\treturn(0);\n\t\t\tep->reginput++;\n\t\t\tbreak;\n\t\tcase NOTHING:\n\t\t\tbreak;\n\t\tcase BACK:\n\t\t\tbreak;\n\t\tcase OPEN+1: case OPEN+2: case OPEN+3:\n\t\tcase OPEN+4: case OPEN+5: case OPEN+6:\n\t\tcase OPEN+7: case OPEN+8: case OPEN+9: {\n\t\t\tregister const int no = OP(scan) - OPEN;\n\t\t\tregister char *const input = ep->reginput;\n\n\t\t\tif (regmatch(ep, next)) {\n\t\t\t\t/*\n\t\t\t\t * Don't set startp if some later\n\t\t\t\t * invocation of the same parentheses\n\t\t\t\t * already has.\n\t\t\t\t */\n\t\t\t\tif (ep->regstartp[no] == NULL)\n\t\t\t\t\tep->regstartp[no] = input;\n\t\t\t\treturn(1);\n\t\t\t} else\n\t\t\t\treturn(0);\n\t\t\tbreak;\n\t\t\t}\n\t\tcase CLOSE+1: case CLOSE+2: case CLOSE+3:\n\t\tcase CLOSE+4: case CLOSE+5: 
case CLOSE+6:\n\t\tcase CLOSE+7: case CLOSE+8: case CLOSE+9: {\n\t\t\tregister const int no = OP(scan) - CLOSE;\n\t\t\tregister char *const input = ep->reginput;\n\n\t\t\tif (regmatch(ep, next)) {\n\t\t\t\t/*\n\t\t\t\t * Don't set endp if some later\n\t\t\t\t * invocation of the same parentheses\n\t\t\t\t * already has.\n\t\t\t\t */\n\t\t\t\tif (ep->regendp[no] == NULL)\n\t\t\t\t\tep->regendp[no] = input;\n\t\t\t\treturn(1);\n\t\t\t} else\n\t\t\t\treturn(0);\n\t\t\tbreak;\n\t\t\t}\n\t\tcase BRANCH: {\n\t\t\tregister char *const save = ep->reginput;\n\n\t\t\tif (OP(next) != BRANCH)\t\t/* No choice. */\n\t\t\t\tnext = OPERAND(scan);\t/* Avoid recursion. */\n\t\t\telse {\n\t\t\t\twhile (OP(scan) == BRANCH) {\n\t\t\t\t\tif (regmatch(ep, OPERAND(scan)))\n\t\t\t\t\t\treturn(1);\n\t\t\t\t\tep->reginput = save;\n\t\t\t\t\tscan = regnext(scan);\n\t\t\t\t}\n\t\t\t\treturn(0);\n\t\t\t\t/* NOTREACHED */\n\t\t\t}\n\t\t\tbreak;\n\t\t\t}\n\t\tcase STAR: case PLUS: {\n\t\t\tregister const char nextch =\n\t\t\t\t(OP(next) == EXACTLY) ? *OPERAND(next) : '\\0';\n\t\t\tregister size_t no;\n\t\t\tregister char *const save = ep->reginput;\n\t\t\tregister const size_t min = (OP(scan) == STAR) ? 0 : 1;\n\n\t\t\tfor (no = regrepeat(ep, OPERAND(scan)) + 1; no > min; no--) {\n\t\t\t\tep->reginput = save + no - 1;\n\t\t\t\t/* If it could work, try it. */\n\t\t\t\tif (nextch == '\\0' || *ep->reginput == nextch)\n\t\t\t\t\tif (regmatch(ep, next))\n\t\t\t\t\t\treturn(1);\n\t\t\t}\n\t\t\treturn(0);\n\t\t\tbreak;\n\t\t\t}\n\t\tcase END:\n\t\t\treturn(1);\t/* Success! 
*/\n\t\t\tbreak;\n\t\tdefault:\n\t\t\tregerror(\"regexp corruption\");\n\t\t\treturn(0);\n\t\t\tbreak;\n\t\t}\n\t}\n\n\t/*\n\t * We get here only if there's trouble -- normally \"case END\" is\n\t * the terminating point.\n\t */\n\tregerror(\"corrupted pointers\");\n\treturn(0);\n}\n\n/*\n - regrepeat - report how many times something simple would match\n */\nstatic size_t\nregrepeat(ep, node)\nregister struct exec *ep;\nchar *node;\n{\n\tregister size_t count;\n\tregister char *scan;\n\tregister char ch;\n\n\tswitch (OP(node)) {\n\tcase ANY:\n\t\treturn(strlen(ep->reginput));\n\t\tbreak;\n\tcase EXACTLY:\n\t\tch = *OPERAND(node);\n\t\tcount = 0;\n\t\tfor (scan = ep->reginput; *scan == ch; scan++)\n\t\t\tcount++;\n\t\treturn(count);\n\t\tbreak;\n\tcase ANYOF:\n\t\treturn(strspn(ep->reginput, OPERAND(node)));\n\t\tbreak;\n\tcase ANYBUT:\n\t\treturn(strcspn(ep->reginput, OPERAND(node)));\n\t\tbreak;\n\tdefault:\t\t/* Oh dear.  Called inappropriately. */\n\t\tregerror(\"internal error: bad call of regrepeat\");\n\t\treturn(0);\t/* Best compromise. */\n\t\tbreak;\n\t}\n\t/* NOTREACHED */\n}\n\n/*\n - regnext - dig the \"next\" pointer out of a node\n */\nstatic char *\nregnext(p)\nregister char *p;\n{\n\tregister const int offset = NEXT(p);\n\n\tif (offset == 0)\n\t\treturn(NULL);\n\n\treturn((OP(p) == BACK) ? p-offset : p+offset);\n}\n\n#ifdef DEBUG\n\nstatic char *regprop();\n\n/*\n - regdump - dump a regexp onto stdout in vaguely comprehensible form\n */\nvoid\nregdump(r)\nregexp *r;\n{\n\tregister char *s;\n\tregister char op = EXACTLY;\t/* Arbitrary non-END op. */\n\tregister char *next;\n\n\n\ts = r->program + 1;\n\twhile (op != END) {\t/* While that wasn't END last time... */\n\t\top = OP(s);\n\t\tprintf(\"%2d%s\", s-r->program, regprop(s));\t/* Where, what. */\n\t\tnext = regnext(s);\n\t\tif (next == NULL)\t\t/* Next ptr. 
*/\n\t\t\tprintf(\"(0)\");\n\t\telse \n\t\t\tprintf(\"(%d)\", (int)((s-r->program)+(next-s)));\n\t\ts += 3;\n\t\tif (op == ANYOF || op == ANYBUT || op == EXACTLY) {\n\t\t\t/* Literal string, where present. */\n\t\t\twhile (*s != '\\0') {\n\t\t\t\tputchar(*s);\n\t\t\t\ts++;\n\t\t\t}\n\t\t\ts++;\n\t\t}\n\t\tputchar('\\n');\n\t}\n\n\t/* Header fields of interest. */\n\tif (r->regstart != '\\0')\n\t\tprintf(\"start `%c' \", r->regstart);\n\tif (r->reganch)\n\t\tprintf(\"anchored \");\n\tif (r->regmust != NULL)\n\t\tprintf(\"must have \\\"%s\\\"\", r->regmust);\n\tprintf(\"\\n\");\n}\n\n/*\n - regprop - printable representation of opcode\n */\nstatic char *\nregprop(op)\nchar *op;\n{\n\tregister char *p;\n\tstatic char buf[50];\n\n\t(void) strcpy(buf, \":\");\n\n\tswitch (OP(op)) {\n\tcase BOL:\n\t\tp = \"BOL\";\n\t\tbreak;\n\tcase EOL:\n\t\tp = \"EOL\";\n\t\tbreak;\n\tcase ANY:\n\t\tp = \"ANY\";\n\t\tbreak;\n\tcase ANYOF:\n\t\tp = \"ANYOF\";\n\t\tbreak;\n\tcase ANYBUT:\n\t\tp = \"ANYBUT\";\n\t\tbreak;\n\tcase BRANCH:\n\t\tp = \"BRANCH\";\n\t\tbreak;\n\tcase EXACTLY:\n\t\tp = \"EXACTLY\";\n\t\tbreak;\n\tcase NOTHING:\n\t\tp = \"NOTHING\";\n\t\tbreak;\n\tcase BACK:\n\t\tp = \"BACK\";\n\t\tbreak;\n\tcase END:\n\t\tp = \"END\";\n\t\tbreak;\n\tcase OPEN+1:\n\tcase OPEN+2:\n\tcase OPEN+3:\n\tcase OPEN+4:\n\tcase OPEN+5:\n\tcase OPEN+6:\n\tcase OPEN+7:\n\tcase OPEN+8:\n\tcase OPEN+9:\n\t\tsprintf(buf+strlen(buf), \"OPEN%d\", OP(op)-OPEN);\n\t\tp = NULL;\n\t\tbreak;\n\tcase CLOSE+1:\n\tcase CLOSE+2:\n\tcase CLOSE+3:\n\tcase CLOSE+4:\n\tcase CLOSE+5:\n\tcase CLOSE+6:\n\tcase CLOSE+7:\n\tcase CLOSE+8:\n\tcase CLOSE+9:\n\t\tsprintf(buf+strlen(buf), \"CLOSE%d\", OP(op)-CLOSE);\n\t\tp = NULL;\n\t\tbreak;\n\tcase STAR:\n\t\tp = \"STAR\";\n\t\tbreak;\n\tcase PLUS:\n\t\tp = \"PLUS\";\n\t\tbreak;\n\tdefault:\n\t\tregerror(\"corrupted opcode\");\n\t\tp = NULL;\t/* regerror may return; nothing to append. */\n\t\tbreak;\n\t}\n\tif (p != NULL)\n\t\t(void) strcat(buf, p);\n\treturn(buf);\n}\n#endif\n"
  },
  {
    "path": "regexp.h",
    "content": "/*\n * Definitions etc. for regexp(3) routines.\n *\n * Caveat:  this is V8 regexp(3) [actually, a reimplementation thereof],\n * not the System V one.\n */\n#define NSUBEXP  10\ntypedef struct regexp {\n\tchar *startp[NSUBEXP];\n\tchar *endp[NSUBEXP];\n\tchar regstart;\t\t/* Internal use only. */\n\tchar reganch;\t\t/* Internal use only. */\n\tchar *regmust;\t\t/* Internal use only. */\n\tint regmlen;\t\t/* Internal use only. */\n\tchar program[1];\t/* Unwarranted chumminess with compiler. */\n} regexp;\n\nextern regexp *regcomp(const char *re);\nextern int regexec(regexp *rp, const char *s);\nextern void regsub(const regexp *rp, const char *src, char *dst);\nextern void regerror(char *message);\n"
  },
  {
    "path": "regmagic.h",
    "content": "/*\n * The first byte of the regexp internal \"program\" is actually this magic\n * number; the start node begins in the second byte.\n */\n#define\tMAGIC\t0234\n"
  },
  {
    "path": "regsub.c",
    "content": "/*\n * regsub\n */\n#include <stdio.h>\n#include <stdlib.h>\n#include <string.h>\n#include <ctype.h>\n#include <regexp.h>\n#include \"regmagic.h\"\n\n/*\n - regsub - perform substitutions after a regexp match\n */\nvoid\nregsub(rp, source, dest)\nconst regexp *rp;\nconst char *source;\nchar *dest;\n{\n\tregister regexp * const prog = (regexp *)rp;\n\tregister char *src = (char *)source;\n\tregister char *dst = dest;\n\tregister char c;\n\tregister int no;\n\tregister size_t len;\n\n\tif (prog == NULL || source == NULL || dest == NULL) {\n\t\tregerror(\"NULL parameter to regsub\");\n\t\treturn;\n\t}\n\tif ((unsigned char)*(prog->program) != MAGIC) {\n\t\tregerror(\"damaged regexp\");\n\t\treturn;\n\t}\n\n\twhile ((c = *src++) != '\\0') {\n\t\tif (c == '&')\n\t\t\tno = 0;\n\t\telse if (c == '\\\\' && isdigit((unsigned char)*src))\n\t\t\tno = *src++ - '0';\n\t\telse\n\t\t\tno = -1;\n\n\t\tif (no < 0) {\t/* Ordinary character. */\n\t\t\tif (c == '\\\\' && (*src == '\\\\' || *src == '&'))\n\t\t\t\tc = *src++;\n\t\t\t*dst++ = c;\n\t\t} else if (prog->startp[no] != NULL && prog->endp[no] != NULL &&\n\t\t\t\t\tprog->endp[no] > prog->startp[no]) {\n\t\t\tlen = prog->endp[no] - prog->startp[no];\n\t\t\t(void) strncpy(dst, prog->startp[no], len);\n\t\t\tdst += len;\n\t\t\tif (*(dst-1) == '\\0') {\t/* strncpy hit NUL. */\n\t\t\t\tregerror(\"damaged match string\");\n\t\t\t\treturn;\n\t\t\t}\n\t\t}\n\t}\n\t*dst++ = '\\0';\n}\n"
  },
  {
    "path": "tests",
    "content": "abc\tabc\ty\t&\tabc\nabc\txbc\tn\t-\t-\nabc\taxc\tn\t-\t-\nabc\tabx\tn\t-\t-\nabc\txabcy\ty\t&\tabc\nabc\tababc\ty\t&\tabc\nab*c\tabc\ty\t&\tabc\nab*bc\tabc\ty\t&\tabc\nab*bc\tabbc\ty\t&\tabbc\nab*bc\tabbbbc\ty\t&\tabbbbc\nab+bc\tabbc\ty\t&\tabbc\nab+bc\tabc\tn\t-\t-\nab+bc\tabq\tn\t-\t-\nab+bc\tabbbbc\ty\t&\tabbbbc\nab?bc\tabbc\ty\t&\tabbc\nab?bc\tabc\ty\t&\tabc\nab?bc\tabbbbc\tn\t-\t-\nab?c\tabc\ty\t&\tabc\n^abc$\tabc\ty\t&\tabc\n^abc$\tabcc\tn\t-\t-\n^abc\tabcc\ty\t&\tabc\n^abc$\taabc\tn\t-\t-\nabc$\taabc\ty\t&\tabc\n^\tabc\ty\t&\t\n$\tabc\ty\t&\t\na.c\tabc\ty\t&\tabc\na.c\taxc\ty\t&\taxc\na.*c\taxyzc\ty\t&\taxyzc\na.*c\taxyzd\tn\t-\t-\na[bc]d\tabc\tn\t-\t-\na[bc]d\tabd\ty\t&\tabd\na[b-d]e\tabd\tn\t-\t-\na[b-d]e\tace\ty\t&\tace\na[b-d]\taac\ty\t&\tac\na[-b]\ta-\ty\t&\ta-\na[b-]\ta-\ty\t&\ta-\n[k]\tab\tn\t-\t-\na[b-a]\t-\tc\t-\t-\na[]b\t-\tc\t-\t-\na[\t-\tc\t-\t-\na]\ta]\ty\t&\ta]\na[]]b\ta]b\ty\t&\ta]b\na[^bc]d\taed\ty\t&\taed\na[^bc]d\tabd\tn\t-\t-\na[^-b]c\tadc\ty\t&\tadc\na[^-b]c\ta-c\tn\t-\t-\na[^]b]c\ta]c\tn\t-\t-\na[^]b]c\tadc\ty\t&\tadc\nab|cd\tabc\ty\t&\tab\nab|cd\tabcd\ty\t&\tab\n()ef\tdef\ty\t&-\\1\tef-\n()*\t-\tc\t-\t-\n*a\t-\tc\t-\t-\n^*\t-\tc\t-\t-\n$*\t-\tc\t-\t-\n(*)b\t-\tc\t-\t-\n$b\tb\tn\t-\t-\na\\\t-\tc\t-\t-\na\\(b\ta(b\ty\t&-\\1\ta(b-\na\\(*b\tab\ty\t&\tab\na\\(*b\ta((b\ty\t&\ta((b\na\\\\b\ta\\b\ty\t&\ta\\b\nabc)\t-\tc\t-\t-\n(abc\t-\tc\t-\t-\n((a))\tabc\ty\t&-\\1-\\2\ta-a-a\n(a)b(c)\tabc\ty\t&-\\1-\\2\tabc-a-c\na+b+c\taabbabc\ty\t&\tabc\na**\t-\tc\t-\t-\na*?\t-\tc\t-\t-\n(a*)*\t-\tc\t-\t-\n(a*)+\t-\tc\t-\t-\n(a|)*\t-\tc\t-\t-\n(a*|b)*\t-\tc\t-\t-\n(a+|b)*\tab\ty\t&-\\1\tab-b\n(a+|b)+\tab\ty\t&-\\1\tab-b\n(a+|b)?\tab\ty\t&-\\1\ta-a\n[^ab]*\tcde\ty\t&\tcde\n(^)*\t-\tc\t-\t-\n(ab|)*\t-\tc\t-\t-\n)(\t-\tc\t-\t-\n\tabc\ty\t&\t\nabc\t\tn\t-\t-\na*\t\ty\t&\t\nabcd\tabcd\ty\t&-\\&-\\\\&\tabcd-&-\\abcd\na(bc)d\tabcd\ty\t\\1-\\\\1-\\\\\\1\tbc-\\1-\\bc\n([abc])*d\tabbbcd\ty\t&-\\1\tabbbcd-c\n([abc])*bcd\tabcd\ty\t&-\\1\tabcd-a\na|b|c|d|e
\te\ty\t&\te\n(a|b|c|d|e)f\tef\ty\t&-\\1\tef-e\n((a*|b))*\t-\tc\t-\t-\nabcd*efg\tabcdefg\ty\t&\tabcdefg\nab*\txabyabbbz\ty\t&\tab\nab*\txayabbbz\ty\t&\ta\n(ab|cd)e\tabcde\ty\t&-\\1\tcde-cd\n[abhgefdc]ij\thij\ty\t&\thij\n^(ab|cd)e\tabcde\tn\tx\\1y\txy\n(abc|)ef\tabcdef\ty\t&-\\1\tef-\n(a|b)c*d\tabcd\ty\t&-\\1\tbcd-b\n(ab|ab*)bc\tabc\ty\t&-\\1\tabc-a\na([bc]*)c*\tabc\ty\t&-\\1\tabc-bc\na([bc]*)(c*d)\tabcd\ty\t&-\\1-\\2\tabcd-bc-d\na([bc]+)(c*d)\tabcd\ty\t&-\\1-\\2\tabcd-bc-d\na([bc]*)(c+d)\tabcd\ty\t&-\\1-\\2\tabcd-b-cd\na[bcd]*dcdcde\tadcdcde\ty\t&\tadcdcde\na[bcd]+dcdcde\tadcdcde\tn\t-\t-\n(ab|a)b*c\tabc\ty\t&-\\1\tabc-ab\n((a)(b)c)(d)\tabcd\ty\t\\1-\\2-\\3-\\4\tabc-a-b-d\n[ -~]*\tabc\ty\t&\tabc\n[ -~ -~]*\tabc\ty\t&\tabc\n[ -~ -~ -~]*\tabc\ty\t&\tabc\n[ -~ -~ -~ -~]*\tabc\ty\t&\tabc\n[ -~ -~ -~ -~ -~]*\tabc\ty\t&\tabc\n[ -~ -~ -~ -~ -~ -~]*\tabc\ty\t&\tabc\n[ -~ -~ -~ -~ -~ -~ -~]*\tabc\ty\t&\tabc\n[a-zA-Z_][a-zA-Z0-9_]*\talpha\ty\t&\talpha\n^a(bc+|b[eh])g|.h$\tabh\ty\t&-\\1\tbh-\n(bc+d$|ef*g.|h?i(j|k))\teffgz\ty\t&-\\1-\\2\teffgz-effgz-\n(bc+d$|ef*g.|h?i(j|k))\tij\ty\t&-\\1-\\2\tij-ij-j\n(bc+d$|ef*g.|h?i(j|k))\teffg\tn\t-\t-\n(bc+d$|ef*g.|h?i(j|k))\tbcdd\tn\t-\t-\n(bc+d$|ef*g.|h?i(j|k))\treffgz\ty\t&-\\1-\\2\teffgz-effgz-\n((((((((((a))))))))))\t-\tc\t-\t-\n(((((((((a)))))))))\ta\ty\t&\ta\nmultiple words of text\tuh-uh\tn\t-\t-\nmultiple words\tmultiple words, yeah\ty\t&\tmultiple words\n(.*)c(.*)\tabcde\ty\t&-\\1-\\2\tabcde-ab-de\n\\((.*), (.*)\\)\t(a, b)\ty\t(\\2, \\1)\t(b, a)\n"
  },
  {
    "path": "timer.c",
    "content": "/*\n * Simple timing program for regcomp().\n * Usage: timer ncomp nexec nsub\n *\tor\n *\ttimer ncomp nexec nsub regexp string [ answer [ sub ] ]\n *\n * The second form is for timing repetitions of a single test case.\n * The first form's test data is a compiled-in copy of the \"tests\" file.\n * Ncomp, nexec, nsub are how many times to do each regcomp, regexec,\n * and regsub.  The way to time an operation individually is to do something\n * like \"timer 1 50 1\".\n */\n#include <stdio.h>\n\nstruct try {\n\tchar *re, *str, *ans, *src, *dst;\n} tests[] = {\n#include \"timer.t.h\"\n{ NULL, NULL, NULL, NULL, NULL }\n};\n\n#include <regexp.h>\n\nint errreport = 0;\t\t/* Report errors via errseen? */\nchar *errseen = NULL;\t\t/* Error message. */\n\nchar *progname;\n\n/* ARGSUSED */\nmain(argc, argv)\nint argc;\nchar *argv[];\n{\n\tint ncomp, nexec, nsub;\n\tstruct try one;\n\tchar dummy[512];\n\n\tif (argc < 4) {\n\t\tncomp = 1;\n\t\tnexec = 1;\n\t\tnsub = 1;\n\t} else {\n\t\tncomp = atoi(argv[1]);\n\t\tnexec = atoi(argv[2]);\n\t\tnsub = atoi(argv[3]);\n\t}\n\t\n\tprogname = argv[0];\n\tif (argc > 5) {\n\t\tone.re = argv[4];\n\t\tone.str = argv[5];\n\t\tif (argc > 6)\n\t\t\tone.ans = argv[6];\n\t\telse\n\t\t\tone.ans = \"y\";\n\t\tif (argc > 7) {\t\n\t\t\tone.src = argv[7];\n\t\t\tone.dst = \"xxx\";\n\t\t} else {\n\t\t\tone.src = \"x\";\n\t\t\tone.dst = \"x\";\n\t\t}\n\t\terrreport = 1;\n\t\ttry(one, ncomp, nexec, nsub);\n\t} else\n\t\tmultiple(ncomp, nexec, nsub);\n\texit(0);\n}\n\nvoid\nregerror(s)\nchar *s;\n{\n\tif (errreport)\n\t\terrseen = s;\n\telse\n\t\terror(s, \"\");\n}\n\n#ifndef ERRAVAIL\nerror(s1, s2)\nchar *s1;\nchar *s2;\n{\n\tfprintf(stderr, \"regexp: \");\n\tfprintf(stderr, s1, s2);\n\tfprintf(stderr, \"\\n\");\n\texit(1);\n}\n#endif\n\nint lineno = 0;\n\nmultiple(ncomp, nexec, nsub)\nint ncomp, nexec, nsub;\n{\n\tregister int i;\n\textern char *strchr();\n\n\terrreport = 1;\n\tfor (i = 0; tests[i].re != NULL; i++) 
{\n\t\tlineno++;\n\t\ttry(tests[i], ncomp, nexec, nsub);\n\t}\n}\n\ntry(fields, ncomp, nexec, nsub)\nstruct try fields;\nint ncomp, nexec, nsub;\n{\n\tregexp *r;\n\tchar dbuf[BUFSIZ];\n\tregister int i;\n\n\terrseen = NULL;\n\tr = regcomp(fields.re);\n\tif (r == NULL) {\n\t\tif (*fields.ans != 'c')\n\t\t\tcomplain(\"regcomp failure in `%s'\", fields.re);\n\t\treturn;\n\t}\n\tif (*fields.ans == 'c') {\n\t\tcomplain(\"unexpected regcomp success in `%s'\", fields.re);\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tfor (i = ncomp-1; i > 0; i--) {\n\t\tfree((char *)r);\n\t\tr = regcomp(fields.re);\n\t}\n\tif (!regexec(r, fields.str)) {\n\t\tif (*fields.ans != 'n')\n\t\t\tcomplain(\"regexec failure in `%s'\", fields.re);\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tif (*fields.ans == 'n') {\n\t\tcomplain(\"unexpected regexec success\", \"\");\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tfor (i = nexec-1; i > 0; i--)\n\t\t(void) regexec(r, fields.str);\n\terrseen = NULL;\n\tfor (i = nsub; i > 0; i--)\n\t\tregsub(r, fields.src, dbuf);\n\tif (errseen != NULL) {\t\n\t\tcomplain(\"regsub complaint\", \"\");\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tif (strcmp(dbuf, fields.dst) != 0)\n\t\tcomplain(\"regsub result `%s' wrong\", dbuf);\n\tfree((char *)r);\n}\n\ncomplain(s1, s2)\nchar *s1;\nchar *s2;\n{\n\tfprintf(stderr, \"try: %d: \", lineno);\n\tfprintf(stderr, s1, s2);\n\tfprintf(stderr, \" (%s)\\n\", (errseen != NULL) ? errseen : \"\");\n}\n"
  },
  {
    "path": "try.c",
    "content": "/*\n * Simple test program for regexp(3) stuff.  Knows about debugging hooks.\n * Usage: try re [string [output [-]]]\n * The re is compiled and dumped, regexeced against the string, the result\n * is applied to output using regsub().  The - triggers a running narrative\n * from regexec().  Dumping and narrative don't happen unless DEBUG.\n *\n * If there are no arguments, stdin is assumed to be a stream of lines with\n * five fields:  a r.e., a string to match it against, a result code, a\n * source string for regsub, and the proper result.  Result codes are 'c'\n * for compile failure, 'y' for match success, 'n' for match failure.\n * Field separator is tab.\n */\n#include <stdio.h>\n#include <regexp.h>\n\n#ifdef ERRAVAIL\nchar *progname;\nextern char *mkprogname();\n#endif\n\n#ifdef DEBUG\nextern int regnarrate;\n#endif\n\nchar buf[BUFSIZ];\n\nint errreport = 0;\t\t/* Report errors via errseen? */\nchar *errseen = NULL;\t\t/* Error message. */\nint status = 0;\t\t\t/* Exit status. 
*/\n\n/* ARGSUSED */\nmain(argc, argv)\nint argc;\nchar *argv[];\n{\n\tregexp *r;\n\tint i;\n\n#ifdef ERRAVAIL\n\tprogname = mkprogname(argv[0]);\n#endif\n\n\tif (argc == 1) {\n\t\tmultiple();\n\t\texit(status);\n\t}\n\n\tr = regcomp(argv[1]);\n\tif (r == NULL)\n\t\terror(\"regcomp failure\", \"\");\n#ifdef DEBUG\n\tregdump(r);\n\tif (argc > 4)\n\t\tregnarrate++;\n#endif\n\tif (argc > 2) {\n\t\ti = regexec(r, argv[2]);\n\t\tprintf(\"%d\", i);\n\t\tfor (i = 1; i < NSUBEXP; i++)\n\t\t\tif (r->startp[i] != NULL && r->endp[i] != NULL)\n\t\t\t\tprintf(\" \\\\%d\", i);\n\t\tprintf(\"\\n\");\n\t}\n\tif (argc > 3) {\n\t\tregsub(r, argv[3], buf);\n\t\tprintf(\"%s\\n\", buf);\n\t}\n\texit(status);\n}\n\nvoid\nregerror(s)\nchar *s;\n{\n\tif (errreport)\n\t\terrseen = s;\n\telse\n\t\terror(s, \"\");\n}\n\n#ifndef ERRAVAIL\nerror(s1, s2)\nchar *s1;\nchar *s2;\n{\n\tfprintf(stderr, \"regexp: \");\n\tfprintf(stderr, s1, s2);\n\tfprintf(stderr, \"\\n\");\n\texit(1);\n}\n#endif\n\nint lineno;\n\nregexp badregexp;\t\t/* Implicit init to 0. */\n\nmultiple()\n{\n\tchar rbuf[BUFSIZ];\n\tchar *field[5];\n\tchar *scan;\n\tint i;\n\tregexp *r;\n\textern char *strchr();\n\n\terrreport = 1;\n\tlineno = 0;\n\twhile (fgets(rbuf, sizeof(rbuf), stdin) != NULL) {\n\t\trbuf[strlen(rbuf)-1] = '\\0';\t/* Dispense with \\n. */\n\t\tlineno++;\n\t\tscan = rbuf;\n\t\tfor (i = 0; i < 5; i++) {\n\t\t\tfield[i] = scan;\n\t\t\tif (field[i] == NULL) {\n\t\t\t\tcomplain(\"bad testfile format\", \"\");\n\t\t\t\texit(1);\n\t\t\t}\n\t\t\tscan = strchr(scan, '\\t');\n\t\t\tif (scan != NULL)\n\t\t\t\t*scan++ = '\\0';\n\t\t}\n\t\ttry(field);\n\t}\n\n\t/* And finish up with some internal testing... */\n\tlineno = 9990;\n\terrseen = NULL;\n\tif (regcomp((char *)NULL) != NULL || errseen == NULL)\n\t\tcomplain(\"regcomp(NULL) doesn't complain\", \"\");\n\tlineno = 9991;\n\terrseen = NULL;\n\tif (regexec((regexp *)NULL, \"foo\") || errseen == NULL)\n\t\tcomplain(\"regexec(NULL, ...) 
doesn't complain\", \"\");\n\tlineno = 9992;\n\tr = regcomp(\"foo\");\n\tif (r == NULL) {\n\t\tcomplain(\"regcomp(\\\"foo\\\") fails\", \"\");\n\t\treturn;\n\t}\n\tlineno = 9993;\n\terrseen = NULL;\n\tif (regexec(r, (char *)NULL) || errseen == NULL)\n\t\tcomplain(\"regexec(..., NULL) doesn't complain\", \"\");\n\tlineno = 9994;\n\terrseen = NULL;\n\tregsub((regexp *)NULL, \"foo\", rbuf);\n\tif (errseen == NULL)\n\t\tcomplain(\"regsub(NULL, ..., ...) doesn't complain\", \"\");\n\tlineno = 9995;\n\terrseen = NULL;\n\tregsub(r, (char *)NULL, rbuf);\n\tif (errseen == NULL)\n\t\tcomplain(\"regsub(..., NULL, ...) doesn't complain\", \"\");\n\tlineno = 9996;\n\terrseen = NULL;\n\tregsub(r, \"foo\", (char *)NULL);\n\tif (errseen == NULL)\n\t\tcomplain(\"regsub(..., ..., NULL) doesn't complain\", \"\");\n\tlineno = 9997;\n\terrseen = NULL;\n\tif (regexec(&badregexp, \"foo\") || errseen == NULL)\n\t\tcomplain(\"regexec(nonsense, ...) doesn't complain\", \"\");\n\tlineno = 9998;\n\terrseen = NULL;\n\tregsub(&badregexp, \"foo\", rbuf);\n\tif (errseen == NULL)\n\t\tcomplain(\"regsub(nonsense, ..., ...) 
doesn't complain\", \"\");\n}\n\ntry(fields)\nchar **fields;\n{\n\tregexp *r;\n\tchar dbuf[BUFSIZ];\n\n\terrseen = NULL;\n\tr = regcomp(fields[0]);\n\tif (r == NULL) {\n\t\tif (*fields[2] != 'c')\n\t\t\tcomplain(\"regcomp failure in `%s'\", fields[0]);\n\t\treturn;\n\t}\n\tif (*fields[2] == 'c') {\n\t\tcomplain(\"unexpected regcomp success in `%s'\", fields[0]);\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tif (!regexec(r, fields[1])) {\n\t\tif (*fields[2] != 'n')\n\t\t\tcomplain(\"regexec failure in `%s'\", fields[0]);\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tif (*fields[2] == 'n') {\n\t\tcomplain(\"unexpected regexec success\", \"\");\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\terrseen = NULL;\n\tregsub(r, fields[3], dbuf);\n\tif (errseen != NULL) {\n\t\tcomplain(\"regsub complaint\", \"\");\n\t\tfree((char *)r);\n\t\treturn;\n\t}\n\tif (strcmp(dbuf, fields[4]) != 0)\n\t\tcomplain(\"regsub result `%s' wrong\", dbuf);\n\tfree((char *)r);\n}\n\ncomplain(s1, s2)\nchar *s1;\nchar *s2;\n{\n\tfprintf(stderr, \"try: %d: \", lineno);\n\tfprintf(stderr, s1, s2);\n\tfprintf(stderr, \" (%s)\\n\", (errseen != NULL) ? errseen : \"\");\n\tstatus = 1;\n}\n"
  }
]