[
  {
    "path": "CHANGES",
    "content": "Version  1.5.7\n\t-Added support for MP4 files\nVersion  1.5.6\n\t-Added support for Office 2007 file as well as bug fixes\nVersion  1.5.5\n\t-Added patch submitted by John K. Antonishek as well as cleaning \n\tup compiler warnings and man file installation.\nVersion  1.5.4\n\t-Added patch submitted by Milan Broz & Eamonn Saunders that \n\t fixes jpeg extraction bug.\nwarnings\n\tand an 64 bit bug. \nVersion  1.5.3\n\t-Added patches submitted by Toshio Kuratomi that fix compiler warnings\n\tand an 64 bit bug. \nVersion  1.5.2\n\t-Fixed problem with gap code thanks to Jeffry Turnmire\nVersion  1.5.1 \n\t-Fixed jpeg extraction bug thanks to Jeffry Turnmire\n\t-Fixed bug in OLE extraction thanks to Filip Van Raemdonck\nVersion  1.5\n\t-Fixed Endian errors on OSX\n\t-Fixed several bugs reported by John K. Antonishek\nVersion  1.4\n\t-Fixed realpath problems when compiling with cygwin\n\t-Fixed flaw in Zip extraction\n\t-Made indirect block detection a little more stable\nVersion  1.3\n\t- Fixed flaw in ZIP algorithm that didn't take into acct zeroized local file headers\n\t  that contain valid compressed/uncompressed info in the data descriptors\nVersion  1.2\n\t- Fixed conf file typos\nVersion  1.1\n\t- Improved Speed of extraction functions\n\t- Added NEXT option to config file\n\t- Fixed some integer overflow problems\n\t- Updated config file\n\t- Added ASCII option for the config file\nVersion  1.0 \n\t- Changed display functionq\n\t- Enhanced RaR and PE extraction\n\t- Minor bug fixes thanks to Eamon Walsh for the bug report and patch\n\t- Added support for Windows PE executables\n\t- Added support for multiple files\n\t- Thanks to Toshio Kuratomi for fixing some compiler warnings under gcc 4\n\t- Fixed bugs with respect to unique file names, and quick mode\nVersion  0.9.4\n\t- Improved speed and reliability of zip and mpeg extraction algorithms.\nVersion  0.9.3\n\t-  Added subdirectories for each output type as opposed to 1 
directory\n\t\tcontaining 90,000 files.\t\nVersion  0.9.2\n\t- Greatly improved OLE extraction capabilities.\nVersion  0.9.1\n\t- Re-wrote code to run on LINUX, OSX, BSD, and SOLARIS\n\t- Added builtin extraction functions\n\t- Changed default behavior to look for the conf file in /usr/local/etc as \n\t  well as the current dir.  Also the conf file is not required\n\t  for the program to run if the -t option is enabled.\n\t- Added a -i switch to specify an input file as opposed to using stdin\n\t- Added -k to allow the user to change the default chunk size as well \n\t  as -b to change the default block size\n\t- Changed the output dir to a time stamp of when the program was run.\n\t- Added -d for indirect block detection\nVersion 0.69 (Our thanks to Zach Kanner for these improvements...)\n\t- Corrected a bug that prevented the \"reverse footer search\" option\n\t  from working correctly.\n\t- Added a new \"NEXT\" option, specify NEXT after the footer on any\n          search specification line and foremost will search for \n\t  the last occurrence (forward only currently) of that footer in the \n          window of size length but not including that footer in the resulting \n          output file created.  This feature lets you search for files that \n          don't have good ending footers but are separated by multiple starting\n          footers or other identifiable data which you know should not be \n          included in the output.  This works really well for MS Word documents\n          where you don't know where the end is.  The start of another document\n          becomes the end.  With this feature you can specify the \"NEXT\" \n\t  or something after the end of the data we are looking for.\n\t- Updated the default foremost.conf file to use the feature for .doc\n\t  files.  
Also added tags for ScanSoft PaperPort files (.max), and\n          a Windows program called PINs (.pins), which stores encrypted \n\t  passwords.\nVersion 0.68\nVersion 0.67\n\t- Added \"reverse footer search\" option, specify REVERSE after the \n\t  footer on any search specification line and foremost will search for\n\t  the last occurrence of that footer in the window of size length.\n\nVersion 0.66\n\t- Changed normal search to Boyer-Moore algorithm. Much faster!\n \t- Added progress meter\n\t- Added ability to suppress extensions from a single file type or \n\t   from all file types.\n\t- Added \"chop\" field to show when files have been trimmed\n           based on their definitions in the configuration file\n\t- Added \"interior\" field to show when files have been found \n           somewhere other than a sector boundary\n\t- Added OpenBSD support\n\t- Added Win32 support via native compilation (MinGW)\n        - Added Win32 support via Cygwin, to include:\n                 -using %lld instead of %Ld\n                 -ignoring the fcntl line for O_LARGEFILE in Win32\n                 -redeclare strsignal as const char strsignal\n                 -write function basename for Win32 using '\\\\' as delimiter\n                 -updated Makefile\n\t- Removed unnecessary header files from foremost.h\n             \n(Version 0.65 was not published)\n\nVersion 0.64 - Audit file now records full paths of input and output files\n               Foremost now requires that the output directory is empty\n                 before running. If necessary, foremost will create the\n\t         output directory (i.e. 
if it doesn't exist)\n               Added structure to internal code of foremost.c and created \n                 dig.c file\n               Fixed bug that generated wrong line number in configuration\n                 file error messages\n\t       Fixed bug on empty wildcard definitions\n\t       Added limit for number of file types in configuration file\n\nVersion 0.63 - Increased speed by using files already loaded in memory\t\n\t         instead of going back to the disk every time.\n\t       Minor speed increase to helper functions\n               Added footers for several file formats including ZIP\n\nVersion 0.62 - Added man page and make install functionality\n               Added \"internal\" indicator to show when a file is found\n                 off the start of the sector. \n               Fixed discrepancy between audit file and screen output\n                regarding file numbers and offset locations (off by one)\n               Added more graceful error handling\n\nVersion 0.61 - Added check for \"^M\" carriage returns added by MSDOS editors\n                 while reading configuration files.\n\nVersion 0.6 - Renamed project to \"foremost\"\n\t      Added support for wildcards\n              Added -q for quick mode\n              More code clean up\n              Removed BSD porting code (oops) and added support\n               for large (>2GB) files.\n\nVersion 0.5 - Added -v for verbose mode\n              Added more intelligible output regarding file locations\n\t      Added error handling procedures\n\t      Added support for loading specification files from the disk\n\nVersion 0.4 - More code cleanup\n\t      (not actually released, used as test during investigation)\n\nVersion 0.3 - Code cleanup continues, moved all variables into the \n              state variable. The program still needs a LOT of work.\n\nVersion 0.2 - Code cleanup by Jesse Kornblum. Removed Linux-specific\n              code and ported to OpenBSD. 
Added support for handling\n              multiple images from the command line and created the\n              state variable. 07 March 2001\n\nVersion 0.1 - Proof of concept code written by Kris Kendall,\n              originally called \"snarfit\" 05 March 2001\n"
  },
  {
    "path": "Makefile",
    "content": "\nRAW_CC = gcc\nRAW_FLAGS = -Wall -O2\nLINK_OPT = \nVERSION = 1.5.7\n# Try to determine the host system\nSYS := $(shell uname -s | tr -d \"[0-9]\" | tr -d \"-\" | tr \"[A-Z]\" \"[a-z]\")\n\n\n# You can cross compile this program for Win32 using Linux and the \n# MinGW compiler. See the README for details. If you have already\n# installed MinGW, put the location ($PREFIX) here:\nCR_BASE = /usr/local/cross-tools/i386-mingw32msvc/bin\n\n# You shouldn't need to change anything below this line\n#---------------------------------------------------------------------\n\n# This should be commented out when debugging is done\n#RAW_FLAGS += -D__DEBUG -ggdb\n\nNAME = foremost\nMAN_PAGES = $(NAME).8.gz\n\nRAW_FLAGS += -DVERSION=\\\"$(VERSION)\\\"\n\n# Where we get installed\nBIN = /usr/local/bin\nMAN = /usr/share/man/man8\nCONF= /usr/local/etc\n# Setup for compiling and cross-compiling for Windows\n# The CR_ prefix refers to cross compiling from OSX to Windows\nCR_CC = $(CR_BASE)/gcc\nCR_OPT = $(RAW_FLAGS) -D__WIN32\nCR_LINK = -liberty\nCR_STRIP = $(CR_BASE)/strip\nCR_GOAL = $(NAME).exe\nWINCC = $(RAW_CC) $(RAW_FLAGS) -D__WIN32\n\n# Generic \"how to compile C files\"\nCC = $(RAW_CC) $(RAW_FLAGS) -D__UNIX\n.c.o:   \n\t$(CC) -c $<\n\n\n# Definitions we'll need later (and that should rarely change)\nHEADER_FILES = main.h ole.h extract.h\nSRC =  main.c state.c helpers.c config.c cli.c engine.c dir.c extract.c api.c\nOBJ =  main.o state.o helpers.o config.o cli.o engine.o dir.o extract.o api.o\nDOCS = Makefile README CHANGES $(MAN_PAGES) foremost.conf\nWINDOC = README.txt CHANGES.txt\n\n\n#---------------------------------------------------------------------\n# OPERATING SYSTEM DIRECTIVES\n#---------------------------------------------------------------------\n\nall: $(SYS) goals\n\ngoals: $(NAME)\n\nlinux: CC += -D__LINUX -DLARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64\nlinux: goals\n\nsunos: solaris\nsolaris: CC += -D__SOLARIS -D_LARGEFILE_SOURCE 
-D_FILE_OFFSET_BITS=64\nsolaris: goals\n\ndarwin: CC += -D__MACOSX\ndarwin: goals\n\nmac: CC += -D__MACOSX\nmac: goals\n\nnetbsd:  unix\nopenbsd: unix\nfreebsd: unix\nunix: goals\n\n# For some reason BSD variants get confused about how to build engine.o,\n# so let's make it really clear\n\nengine.o:       engine.c\n\t$(CC) -c engine.c\n\n\n# Common commands for compiling versions for Windows. \n# See cross and windows directives below.\nwin_general: LINK_OPT = $(CR_LINK)\nwin_general: GOAL = $(CR_GOAL)\nwin_general: goals\n\t$(STRIP) $(CR_GOAL)\n\n# Cross compiling from Linux to Windows. See README for more info\ncross: CC = $(CR_CC) $(CR_OPT)\ncross: STRIP = $(CR_STRIP)\ncross: win_general\n\n# See the README for information on Windows compilation\nwindows: CC = $(WINCC)\nwindows: STRIP = strip\nwindows: win_general \n\ncygwin_nt.: unix\ncygwin: unix\n\n\n#---------------------------------------------------------------------\n# COMPILE THE PROGRAMS\n#   This section must be updated each time you add an algorithm\n#---------------------------------------------------------------------\n\nforemost: $(OBJ)\n\t$(CC) $(OBJ) -o $(NAME) $(LINK_OPT)\n\n\n#---------------------------------------------------------------------\n# INSTALLATION AND REMOVAL \n#---------------------------------------------------------------------\n\ninstall: goals\n\tinstall -m 755 $(NAME) $(BIN)\n\tinstall -m 444 $(MAN_PAGES) $(MAN)\n\tinstall -m 444 foremost.conf $(CONF)\nmacinstall: BIN = /usr/local/bin/\nmacinstall: MAN = /usr/share/man/man1/\nmacinstall: CONF = /usr/local/etc/\nmacinstall: mac install\n\n\n# Remove the files that the install target copied into place\nuninstall:\n\trm -f -- $(BIN)/$(NAME)\n\trm -f -- $(MAN)/$(MAN_PAGES)\n\nmacuninstall: BIN = /usr/local/bin\nmacuninstall: MAN = /usr/share/man/man1\nmacuninstall: uninstall\n\n#---------------------------------------------------------------------\n# CLEAN UP\n#---------------------------------------------------------------------\n\n# This is used for debugging\npreflight:\n\tgrep -n RBF *.1 *.h 
*.c README CHANGES\n\nnice:\n\trm -f -- *~\n\nclean: nice\n\trm -f -- *.o\n\trm -f -- $(CR_GOAL) $(NAME) $(WINDOC)\n\trm -f -- $(TAR_FILE).gz $(DEST_DIR).zip $(DEST_DIR).zip.gpg\n\n#-------------------------------------------------------------------------\n# MAKING PACKAGES\n#-------------------------------------------------------------------------\n\nEXTRA_FILES = \nDEST_DIR = $(NAME)-$(VERSION)\nTAR_FILE = $(DEST_DIR).tar\nPKG_FILES = $(SRC) $(HEADER_FILES) $(DOCS) $(EXTRA_FILES)\n\n# This packages me up to send to somebody else\npackage: clean\n\trm -f $(TAR_FILE) $(TAR_FILE).gz\n\tmkdir $(DEST_DIR)\n\tcp $(PKG_FILES) $(DEST_DIR)\n\ttar cvf $(TAR_FILE) $(DEST_DIR)\n\trm -rf $(DEST_DIR)\n\tgzip $(TAR_FILE)\n\n\n# This Makefile is designed for Mac OSX to package the file. \n# To do this on a Linux box, the big line below starting with \"/usr/bin/tbl\"\n# should be replaced with:\n#\n#\tman ./$(MD5GOAL).1 | col -bx > README.txt\n#\n# and the \"flip -d\" command should be replaced with unix2dos\n#\n# The flip command can be found at:\n# http://ccrma-www.stanford.edu/~craig/utility/flip/#\nwin-doc:\n\t/usr/bin/tbl ./$(MD5GOAL).1 | /usr/bin/groff -S -Wall -mtty-char -mandoc -Tascii | /usr/bin/col > README.txt\n\tcp CHANGES CHANGES.txt\n\tflip -d $(WINDOC)\n\ncross-pkg: clean cross win-doc\n\trm -f $(DEST_DIR).zip\n\tzip $(DEST_DIR).zip $(CR_GOAL) $(WINDOC)\n\trm -f $(WINDOC)\n\nworld: package cross-pkg\n"
  },
  {
    "path": "README",
    "content": "\nFOREMOST \n----------------------------------------------------------------------\n\nForemost is a Linux program to recover files based on their headers and\nfooters. Foremost can work on image files, such as those generated by dd,\nSafeback, Encase, etc, or directly on a drive. The headers and footers are\nspecified by a configuration file, so you can pick and choose which\nheaders you want to look for.\n\n\n\n--------------------------------------------\nINSTALL FOREMOST\n--------------------------------------------\n\nTo run foremost, you must:\n\n- uncompress the archive\n- compile\n- install\n\nHere's how to do it:\n\nLINUX:\n$ tar zxvf foremost-xx.tar.gz\n$ cd foremost-xx\n$ make\n$ make install\n\nBSD:\n$ tar zxvf foremost-xx.tar.gz\n$ cd foremost-xx\n$ make unix\n$ make install\n\nSOLARIS:\n$ tar zxvf foremost-xx.tar.gz\n$ cd foremost-xx\n$ make solaris\n$ make install\n\nOSX:\n$ tar zxvf foremost-xx.tar.gz\n$ cd foremost-xx\n$ make mac\n$ make macinstall\n\nOn systems with older versions of glibc (earlier than 2.2.0), you will get \nsome harmless warnings about ftello and fseeko not being defined. You can \nignore these.\n\n\nIf you ever need to remove foremost from your system, you can do this:\n\n$ make uninstall\n\n\n\n--------------------------------------------\nUSING FOREMOST\n--------------------------------------------\n\nA description of the command line arguments can be found in the man page. \nTo view it:\n\n$ man foremost\n\n\n\n--------------------------------------------\nCONFIGURATION FILE FORMAT\n--------------------------------------------\n\nThe configuration file is used to control what types of files foremost\nsearches for. A sample configuration file, foremost.conf, is included with\nthis distribution. For each file type, the configuration file describes\nthe file's extension, whether the header and footer are case sensitive,\nthe maximum file size, and the header and footer for the file. 
The footer\nfield is optional, but header, size, case sensitivity, and extension are\nnot!\n\nAny line that begins with a '#' is considered a comment and ignored. Thus,\nto skip a file type just put a '#' at the beginning of that line.\n\nHeaders and footers are decoded before use. To specify a value in\nhexadecimal use \\x[0-f][0-f], and for octal use \\[0-7][0-7][0-7].  Spaces\ncan be represented by \\s. Example: \"\\x4F\\123\\I\\sCCI\" decodes to \"OSI CCI\".\n\nTo match any single character (aka a wildcard) use a '?'. If you need to\nsearch for the '?' character, you will need to change the 'wildcard' line\n*and* every occurrence of the old wildcard character in the configuration\nfile. Don't forget those hex and octal values! '?' is equal to 0x3f and\n\\077.\n\nHere's a sample set of headers and footers:\n\n# extension  case-sens  max-size   header\t\t\tfooter\t\t(option)\n#\n# GIF and JPG files (very common)\n\tgif\ty\t155000\t\\x47\\x49\\x46\\x38\\x37\\x61	\\x00\\x3b\n  \tgif\ty \t155000\t\\x47\\x49\\x46\\x38\\x39\\x61	\\x00\\x00\\x3b\n  \tjpg\ty\t200000\t\\xff\\xd8\\xff			\\xff\\xd9\n\nNote: the (option) field selects how the search is performed.  Currently the following options exist:\n\nFORWARD: Specify to search from the header to the footer (optional) up to the max-size.\nREVERSE: Specify to search from the footer to the header up to the max-size.\nNEXT: Specify to search from the header to the data just past the footer.  This allows you to specify data that you know is 'NOT' in the data you are looking for and should terminate the search, up to the max-size.\n\n--------------------------------------------\nBUG REPORTING\n--------------------------------------------\n\nPlease report ALL bugs to nick dot mikus AT gmail d0t com. 
Please include a \ndescription of the bug, how you found it, and your contact information.\n\n\n\n\n--------------------------------------------\nCREDITS AND THANKS\n--------------------------------------------\n\nForemost was written by Special Agent Kris Kendall and Special Agent Jesse\nKornblum of the United States Air Force Office of Special Investigations\nstarting in March 2001. This program would not be what it is today without\nhelp from (in no particular order): Rob Meekins, Dan Kalil, and Chet\nMaciag. This project was inspired by CarvThis, written by the Defense\nComputer Forensic Lab in 1999.\n\n\n--------------------------------------------\nLEGAL NOTICE\n--------------------------------------------\n\ndd, Safeback, and Encase are copyrighted works and any questions regarding \nthese tools should be directed to the copyright holders. The United States \nGovernment does not endorse the use of these or any other imaging tools. \n"
  },
  {
    "path": "api.c",
    "content": "/*\n\tModified API from http://chicago.sourceforge.net/devel/docs/ole/\n\tBasically the same API, added error checking and the ability\n\tto check buffers for docs, not just files.\n*/\n#include \"main.h\"\n#include \"ole.h\"\n\n/*Some ugly globals\n* This API should be re-written\n* in a modular fashion*/\nunsigned char\tbuffer[OUR_BLK_SIZE];\nchar\t\t\t*extract_name;\nint\t\t\t\textract = 0;\nint\t\t\t\tdir_count = 0;\nint\t\t\t\t*FAT;\nint\t\t\t\tverbose = TRUE;\nint\t\t\t\tFATblk;\nint\t\t\t\tcurrFATblk;\nint\t\t\t\thighblk = 0;\nint\t\t\t\tblock_list[OUR_BLK_SIZE / sizeof(int)];\nextern int\t\terrno;\n\n/*Inititialize those globals used by extract_ole*/\nvoid init_ole()\n{\n\tint i = 0;\n\textract = 0;\n\tdir_count = 0;\n\tFAT = NULL;\n\thighblk = 0;\n\tFATblk = 0;\n\tcurrFATblk = -1;\n\tdirlist = NULL;\n\tdl = NULL;\n\tfor (i = 0; i < OUR_BLK_SIZE / sizeof(int); i++)\n\t\t{\n\t\tblock_list[i] = 0;\n\t\t}\n\n\tfor (i = 0; i < OUR_BLK_SIZE; i++)\n\t\t{\n\t\tbuffer[i] = 0;\n\t\t}\n}\n\nvoid *Malloc(size_t bytes)\n{\n\tvoid\t*x;\n\n\tx = malloc(bytes);\n\tif (x)\n\t\treturn x;\n\tdie(\"Can't malloc %d bytes.\\n\", (char *)bytes);\n\treturn 0;\n}\n\nvoid die(char *fmt, void *arg)\n{\n\tfprintf(stderr, fmt, arg);\n\texit(1);\n}\n\nint get_dir_block(unsigned char *fd, int blknum, int buffersize)\n{\n\tint\t\t\t\ti;\n\tstruct OLE_DIR\t*dir;\n\tunsigned char\t*dest = NULL;\n\n\tdest = get_ole_block(fd, blknum, buffersize);\n\tif (dest == NULL)\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\tfor (i = 0; i < DIRS_PER_BLK; i++)\n\t\t{\n\t\tdir = (struct OLE_DIR *) &dest[sizeof(struct OLE_DIR) * i];\n\t\tif (dir->type == NO_ENTRY)\n\t\t\tbreak;\n\t\t}\n\n\tif (i == DIRS_PER_BLK)\n\t\t{\n\t\treturn TRUE;\n\t\t}\n\telse\n\t\t{\n\t\treturn SHORT_BLOCK;\n\t\t}\n}\n\nint get_dir_info(unsigned char *src)\n{\n\tint\t\t\t\ti, j;\n\tchar\t\t\t*p, *q;\n\tstruct OLE_DIR\t*dir;\n\tint\t\t\t\tpunctCount = 0;\n\tshort\t\t\tname_size = 0;\n\n\tfor (i = 0; i < DIRS_PER_BLK; 
i++)\n\t\t{\n\t\tdir = (struct OLE_DIR *) &src[sizeof(struct OLE_DIR) * i];\n\t\tpunctCount = 0;\n\n\t\t//if(dir->reserved!=0) return FALSE;\n\t\tif (dir->type < 0)\t//Should we check if values are > 5 ?????\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"\\n\tInvalid directory type\\n\");\n\t\t\tprintf(\"type:=%c size:=%lu \\n\", dir->type, dir->size);\n#endif\n\t\t\treturn FALSE;\n\t\t}\n\n\t\tif (dir->type == NO_ENTRY)\n\t\t\tbreak;\n\n#ifdef DEBUG\n\n\t\t//dump_dirent (i);\n#endif\n\t\tdl = &dirlist[dir_count++];\n\t\tif (dl == NULL)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"dl==NULL!!! bailing out\\n\");\n#endif\n\t\t\treturn FALSE;\n\t\t}\n\n\t\tif (dir_count > 500)\n\t\t\treturn FALSE;\t/*SANITY CHECKING*/\n\t\tq = dl->name;\n\t\tp = dir->name;\n\n\t\tname_size = htos((unsigned char *) &dir->namsiz, FOREMOST_LITTLE_ENDIAN);\n\n#ifdef DEBUG\n\t\tprintf(\" dir->namsiz:=%d\\n\", name_size);\n#endif\n\t\tif (name_size > 64 || name_size <= 0)\n\t\t\treturn FALSE;\n\n\t\tif (*p < ' ')\n\t\t\tp += 2;\t\t\t/* skip leading short */\n\t\tfor (j = 0; j < name_size; j++, p++)\n\t\t\t{\n\n\t\t\tif (p == NULL || q == NULL)\n\t\t\t\treturn FALSE;\n\t\t\tif (*p && isprint(*p))\n\t\t\t\t{\n\n\t\t\t\tif (ispunct(*p))\n\t\t\t\t\tpunctCount++;\n\t\t\t\t*q++ = *p;\n\n\t\t\t\t}\n\t\t\t}\n\n\t\tif (punctCount > 3)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"dl->name:=%s\\n\", dl->name);\n\t\t\tprintf(\"pcount > 3!!! bailing out\\n\");\n#endif\n\t\t\treturn FALSE;\n\t\t}\n\n\t\tif (dl->name == NULL)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"\t***NULL dir name. 
bailing out \\n\");\n#endif\n\t\t\treturn FALSE;\n\t\t}\n\n\t\t/* Terminate the copied name before any string function reads it */\n\t\t*q = 0;\n\n\t\t/*Ignore Catalogs*/\n\t\tif (strstr(dl->name, \"Catalog\"))\n\t\t\treturn FALSE;\n\t\tdl->type = dir->type;\n\t\tdl->size = htoi((unsigned char *) &dir->size, FOREMOST_LITTLE_ENDIAN);\n\n\t\tdl->start_block = htoi((unsigned char *) &dir->start_block, FOREMOST_LITTLE_ENDIAN);\n\t\tdl->next = htoi((unsigned char *) &dir->next_dirent, FOREMOST_LITTLE_ENDIAN);\n\t\tdl->prev = htoi((unsigned char *) &dir->prev_dirent, FOREMOST_LITTLE_ENDIAN);\n\t\tdl->dir = htoi((unsigned char *) &dir->dir_dirent, FOREMOST_LITTLE_ENDIAN);\n\t\tif (dir->type != STREAM)\n\t\t\t{\n\t\t\tdl->s1 = dir->secs1;\n\t\t\tdl->s2 = dir->secs2;\n\t\t\tdl->d1 = dir->days1;\n\t\t\tdl->d2 = dir->days2;\n\t\t\t}\n\t\t}\n\n\treturn TRUE;\n}\n\nstatic int\t*lnlv;\t\t\t/* last next link visited ! */\nint reorder_dirlist(struct DIRECTORY *dir, int level)\n{\n\n\t//printf(\"	Reordering the dirlist\\n\");\n\tdir->level = level;\n\n\t/* reorder the child subtree one level down, if any exists */\n\tif (dir->dir != -1)\n\t\t{\n\t\tif (dir->dir > dir_count)\n\t\t\treturn 0;\n\t\telse if (!reorder_dirlist(&dirlist[dir->dir], level + 1))\n\t\t\treturn 0;\n\t\t}\n\n\t/* reorder next-link subtree, saving the last next link visited */\n\tif (dir->next != -1)\n\t\t{\n\t\tif (dir->next > dir_count)\n\t\t\treturn 0;\n\t\telse if (!reorder_dirlist(&dirlist[dir->next], level))\n\t\t\treturn 0;\n\t\t}\n\telse\n\t\tlnlv = &dir->next;\n\n\t/* move the prev child to the next link and reorder it, if any exist\n */\n\tif (dir->prev != -1)\n\t\t{\n\t\tif (dir->prev > dir_count)\n\t\t\treturn 0;\n\t\telse\n\t\t\t{\n\t\t\t*lnlv = dir->prev;\n\t\t\tdir->prev = -1;\n\t\t\tif (!reorder_dirlist(&dirlist[*lnlv], level))\n\t\t\t\treturn 0;\n\t\t\t}\n\t\t}\n\n\treturn 1;\n}\n\nint get_block(unsigned char *fd, int blknum, unsigned char *dest, long long int buffersize)\n{\n\tunsigned char\t\t*temp = fd;\n\tint\t\t\t\t\ti = 0;\n\tunsigned long long\tjump = (unsigned long long)OUR_BLK_SIZE * (unsigned long long)(blknum + 1);\n\tif (blknum < -1 
|| jump < 0 || blknum > buffersize || buffersize < jump)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"\tBad blk read1 blknum:=%d  jump:=%lld buffersize=%lld\\n\", blknum, jump, buffersize);\n#endif\n\t\treturn FALSE;\n\t}\n\n\ttemp = fd + jump;\n#ifdef DEBUG\n\tprintf(\"\tJumping to %lld blknum=%d buffersize=%lld\\n\", jump, blknum, buffersize);\n#endif\n\tfor (i = 0; i < OUR_BLK_SIZE; i++)\n\t\t{\n\t\tdest[i] = temp[i];\n\t\t}\n\n\tif ((blknum + 1) > highblk)\n\t\thighblk = blknum + 1;\n\treturn TRUE;\n}\n\nunsigned char *get_ole_block(unsigned char *fd, int blknum, unsigned long long buffersize)\n{\n\tunsigned long long\tjump = (unsigned long long)OUR_BLK_SIZE * (unsigned long long)(blknum + 1);\n\tif (blknum < -1 || jump < 0 || blknum > buffersize || buffersize < jump)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"\tBad blk read1 blknum:=%d  jump:=%lld buffersize=%lld\\n\", blknum, jump, buffersize);\n#endif\n\t\treturn FALSE;\n\t}\n\n#ifdef DEBUG\n\tprintf(\"\tJumping to %lld blknum=%d buffersize=%lld\\n\", jump, blknum, buffersize);\n#endif\n\treturn (fd + jump);\n}\n\nint get_FAT_block(unsigned char *fd, int blknum, int *dest, int buffersize)\n{\n\tstatic int\tFATblk;\n\n\t//   static int currFATblk = -1;\n\tFATblk = htoi((unsigned char *) &FAT[blknum / (OUR_BLK_SIZE / sizeof(int))],\n\t\t\t\t  FOREMOST_LITTLE_ENDIAN);\n#ifdef DEBUG\n\tprintf(\"****blknum:=%d FATblk:=%d currFATblk:=%d\\n\", blknum, FATblk, currFATblk);\n#endif\n\tif (currFATblk != FATblk)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"*****blknum:=%d FATblk:=%d\\n\", blknum, FATblk);\n#endif\n\t\tif (!get_block(fd, FATblk, (unsigned char *)dest, buffersize))\n\t\t\t{\n\t\t\treturn FALSE;\n\t\t\t}\n\n\t\tcurrFATblk = FATblk;\n\t}\n\n\treturn TRUE;\n}\n\nvoid dump_header(struct OLE_HDR *h)\n{\n\tint i, *x;\n\n\t//struct OLE_HDR *h = (struct OLE_HDR *) buffer;\n\t// fprintf (stderr, \"clsid  = \");\n\t//printx(h->clsid,0,16);\n\tfprintf(stderr,\n\t\t\t\"\\nuMinorVersion  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uMinorVersion, 
FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uDllVersion  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uDllVersion, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uByteOrder  = %u\\n\",\n\t\t\thtos((unsigned char *) &h->uByteOrder, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uSectorShift  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uSectorShift, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uMiniSectorShift  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uMiniSectorShift, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved  = %u\\n\",\n\t\t\thtos((unsigned char *) &h->reserved, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved1  = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->reserved1, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved2  = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->reserved2, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"csectMiniFat = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->csectMiniFat, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"miniSectorCutoff = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->miniSectorCutoff, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"root_start_block  = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->root_start_block, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"dir flag = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->dir_flag, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"# FAT blocks = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"FAT_next_block = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->FAT_next_block, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"# extra FAT blocks = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->num_extra_FAT_blocks, FOREMOST_LITTLE_ENDIAN));\n\tx = (int *) &h[1];\n\tfprintf(stderr, \"bbd list:\");\n\tfor (i = 0; i < 109; i++, x++)\n\t\t{\n\t\tif ((i % 10) == 0)\n\t\t\tfprintf(stderr, \"\\n\");\n\t\tif (*x == 
'\\xff')\n\t\t\tbreak;\n\t\tfprintf(stderr, \"%x \", *x);\n\t\t}\n\n\tfprintf(stderr, \"\\n\t**************End of header***********\\n\");\n}\n\nstruct OLE_HDR *reverseBlock(struct OLE_HDR *dest, struct OLE_HDR *h)\n{\n\tint i, *x, *y;\n\tdest->uMinorVersion = htos((unsigned char *) &h->uMinorVersion, FOREMOST_LITTLE_ENDIAN);\n\tdest->uDllVersion = htos((unsigned char *) &h->uDllVersion, FOREMOST_LITTLE_ENDIAN);\n\tdest->uByteOrder = htos((unsigned char *) &h->uByteOrder, FOREMOST_LITTLE_ENDIAN);\t\t\t\t/*28*/\n\tdest->uSectorShift = htos((unsigned char *) &h->uSectorShift, FOREMOST_LITTLE_ENDIAN);\n\tdest->uMiniSectorShift = htos((unsigned char *) &h->uMiniSectorShift, FOREMOST_LITTLE_ENDIAN);\t/*32*/\n\tdest->reserved = htos((unsigned char *) &h->reserved, FOREMOST_LITTLE_ENDIAN);\t\t\t\t\t/*34*/\n\tdest->reserved1 = htoi((unsigned char *) &h->reserved1, FOREMOST_LITTLE_ENDIAN);\t\t\t\t/*36*/\n\tdest->reserved2 = htoi((unsigned char *) &h->reserved2, FOREMOST_LITTLE_ENDIAN);\t\t\t\t/*40*/\n\tdest->num_FAT_blocks = htoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN);\t\t/*44*/\n\tdest->root_start_block = htoi((unsigned char *) &h->root_start_block, FOREMOST_LITTLE_ENDIAN);\t/*48*/\n\tdest->dfsignature = htoi((unsigned char *) &h->dfsignature, FOREMOST_LITTLE_ENDIAN);\t\t\t/*52*/\n\tdest->miniSectorCutoff = htoi((unsigned char *) &h->miniSectorCutoff, FOREMOST_LITTLE_ENDIAN);\t/*56*/\n\tdest->dir_flag = htoi((unsigned char *) &h->dir_flag, FOREMOST_LITTLE_ENDIAN);\t\t\t\t\t/*60 first sec in the mini fat chain*/\n\tdest->csectMiniFat = htoi((unsigned char *) &h->csectMiniFat, FOREMOST_LITTLE_ENDIAN);\t\t\t/*64 number of sectors in the minifat */\n\tdest->FAT_next_block = htoi((unsigned char *) &h->FAT_next_block, FOREMOST_LITTLE_ENDIAN);\t\t/*68*/\n\tdest->num_extra_FAT_blocks = htoi((unsigned char *) &h->num_extra_FAT_blocks,\n\t\t\t\t\t\t\t\t\t  FOREMOST_LITTLE_ENDIAN);\n\n\tx = (int *) &h[1];\n\ty = (int *) &dest[1];\n\tfor (i = 0; i < 109; i++, 
x++)\n\t\t{\n\t\t*y = htoi((unsigned char *)x, FOREMOST_LITTLE_ENDIAN);\n\t\ty++;\n\t\t}\n\n\treturn dest;\n}\n\nvoid dump_ole_header(struct OLE_HDR *h)\n{\n\tint i, *x;\n\n\t//fprintf (stderr, \"clsid  = \");\n\t//printx(h->clsid,0,16);\n\tfprintf(stderr,\n\t\t\t\"\\nuMinorVersion  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uMinorVersion, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uDllVersion  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uDllVersion, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uByteOrder  = %u\\n\",\n\t\t\thtos((unsigned char *) &h->uByteOrder, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uSectorShift  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uSectorShift, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"uMiniSectorShift  = %u\\t\",\n\t\t\thtos((unsigned char *) &h->uMiniSectorShift, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved  = %u\\n\",\n\t\t\thtos((unsigned char *) &h->reserved, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved1  = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->reserved1, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"reserved2  = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->reserved2, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"csectMiniFat = %u\\t\",\n\t\t\thtoi((unsigned char *) &h->csectMiniFat, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"miniSectorCutoff = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->miniSectorCutoff, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"root_start_block  = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->root_start_block, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"dir flag = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->dir_flag, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"# FAT blocks = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"FAT_next_block = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->FAT_next_block, 
FOREMOST_LITTLE_ENDIAN));\n\tfprintf(stderr,\n\t\t\t\"# extra FAT blocks = %u\\n\",\n\t\t\thtoi((unsigned char *) &h->num_extra_FAT_blocks, FOREMOST_LITTLE_ENDIAN));\n\tx = (int *) &h[1];\n\tfprintf(stderr, \"bbd list:\");\n\tfor (i = 0; i < 109; i++, x++)\n\t\t{\n\t\tif ((i % 10) == 0)\n\t\t\tfprintf(stderr, \"\\n\");\n\t\tif (*x == '\\xff')\n\t\t\tbreak;\n\t\tfprintf(stderr, \"%x \", htoi((unsigned char *)x, FOREMOST_LITTLE_ENDIAN));\n\t\t}\n\n\tfprintf(stderr, \"\\n\t**************End of header***********\\n\");\n}\n\nint dump_dirent(int which_one)\n{\n\tint\t\t\t\ti;\n\tchar\t\t\t*p;\n\tshort\t\t\tunknown;\n\tstruct OLE_DIR\t*dir;\n\n\tdir = (struct OLE_DIR *) &buffer[which_one * sizeof(struct OLE_DIR)];\n\tif (dir->type == NO_ENTRY)\n\t\treturn TRUE;\n\tfprintf(stderr, \"DIRENT_%d :\\t\", dir_count);\n\tfprintf(stderr,\n\t\t\t\"%s\\t\",\n\t\t\t(dir->type == ROOT) ? \"root directory\" : (dir->type == STORAGE) ? \"directory\" : \"file\");\n\n\t/* get UNICODE name */\n\tp = dir->name;\n\tif (*p < ' ')\n\t\t{\n\t\tunknown = *((short *)p);\n\n\t\t//fprintf (stderr, \"%04x\\t\", unknown);\n\t\tp += 2; /* step over unknown short */\n\t\t}\n\n\tfor (i = 0; i < dir->namsiz; i++, p++)\n\t\t{\n\t\tif (*p && (*p > 0x1f))\n\t\t\t{\n\t\t\tif (isprint(*p))\n\t\t\t\t{\n\t\t\t\tfprintf(stderr, \"%c\", *p);\n\t\t\t\t}\n\t\t\telse\n\t\t\t\t{\n\t\t\t\tprintf(\"***\tInvalid char %x ***\\n\", *p);\n\t\t\t\treturn FALSE;\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\tfprintf(stderr, \"\\n\");\n\n\t//fprintf (stderr, \"prev_dirent = %lu\\t\", dir->prev_dirent);\n\t//fprintf (stderr, \"next_dirent = %lu\\t\", dir->next_dirent);\n\t//fprintf (stderr, \"dir_dirent  = %lu\\n\", dir->dir_dirent);\n\t//fprintf (stderr, \"name  = %s\\t\", dir->name);\n\tfprintf(stderr, \"namsiz  = %u\\t\", dir->namsiz);\n\tfprintf(stderr, \"type  = %d\\t\", dir->type);\n\tfprintf(stderr, \"reserved  = %u\\n\", dir->reserved);\n\n\tfprintf(stderr, \"start block  = %lu\\n\", dir->start_block);\n\tfprintf(stderr, \"size  = 
%lu\\n\", dir->size);\n\tfprintf(stderr, \"\\n\t**************End of dirent***********\\n\");\n\treturn TRUE;\n}\n"
  },
  {
    "path": "cli.c",
    "content": "\n\n#include \"main.h\"\n\nvoid fatal_error (f_state * s, char *msg)\n\t{\n\tfprintf(stderr, \"%s: %s%s\", __progname, msg, NEWLINE);\n\tif (get_audit_file_open(s))\n\t\t{\n\t\taudit_msg(s, msg);\n\t\tclose_audit_file(s);\n\t\t}\n\texit(EXIT_FAILURE);\n\t}\n\nvoid print_error(f_state *s, char *fn, char *msg)\n{\n\tif (!(get_mode(s, mode_quiet)))\n\t\tfprintf(stderr, \"%s: %s: %s%s\", __progname, fn, msg, NEWLINE);\n}\n\nvoid print_message(f_state *s, char *format, va_list argp)\n{\n\tvfprintf(stdout, format, argp);\n\tfprintf(stdout, \"%s\", NEWLINE);\n}\n"
  },
  {
    "path": "config.c",
    "content": "\n\n#include \"main.h\"\n\nint translate (char *str)\n\t{\n\tchar\tnext;\n\tchar\t*rd = str, *wr = str, *bad;\n\tchar\ttemp[1 + 3 + 1];\n\tchar\tch;\n\n\tif (!*rd)\t\t\t\t\t//If it's a null string just return\n\t\t{\n\t\treturn 0;\n\t\t}\n\n\twhile (*rd)\n\t\t{\n\n\t\t/* Is it an escaped character ? */\n\t\tif (*rd == '\\\\')\n\t\t\t{\n\t\t\trd++;\n\t\t\tswitch (*rd)\n\t\t\t\t{\n\t\t\t\tcase '\\\\':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\\\';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 'a':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\a';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 's':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = ' ';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 'n':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\n';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 'r':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\r';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 't':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\t';\n\t\t\t\t\tbreak;\n\n\t\t\t\tcase 'v':\n\t\t\t\t\t*rd++;\n\t\t\t\t\t*wr++ = '\\v';\n\t\t\t\t\tbreak;\n\n\t\t\t\t/* Hexadecimal/Octal values are treated in one place using strtoul() */\n\t\t\t\tcase 'x':\n\t\t\t\tcase '0':\n\t\t\t\tcase '1':\n\t\t\t\tcase '2':\n\t\t\t\tcase '3':\n\t\t\t\t\tnext = *(rd + 1);\n\t\t\t\t\tif (next < 48 || (57 < next && next < 65) || (70 < next && next < 97) || next > 102)\n\t\t\t\t\t\tbreak;\t//break if not a digit or a-f, A-F\n\t\t\t\t\tnext = *(rd + 2);\n\t\t\t\t\tif (next < 48 || (57 < next && next < 65) || (70 < next && next < 97) || next > 102)\n\t\t\t\t\t\tbreak;\t//break if not a digit or a-f, A-F\n\t\t\t\t\ttemp[0] = '0';\n\t\t\t\t\tbad = temp;\n\t\t\t\t\tstrncpy(temp + 1, rd, 3);\n\t\t\t\t\ttemp[4] = '\\0';\n\t\t\t\t\tch = strtoul(temp, &bad, 0);\n\t\t\t\t\tif (*bad == '\\0')\n\t\t\t\t\t\t{\n\t\t\t\t\t\t*wr++ = ch;\n\t\t\t\t\t\trd += 3;\n\t\t\t\t\t\t}\t\t/* else INVALID CHARACTER IN INPUT ('\\\\' followed by *rd) */\n\t\t\t\t\tbreak;\n\n\t\t\t\tdefault:\t\t/* INVALID CHARACTER IN INPUT (*rd)*/\n\t\t\t\t\t*wr++ = '\\\\';\n\t\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\t}\n\n\t\t/* 
Unescaped characters go directly to the output */\n\t\telse\n\t\t\t*wr++ = *rd++;\n\t\t}\n\t*wr = '\\0';\t\t\t\t\t//Null terminate the string that we just created...\n\treturn wr - str;\n\t}\n\nchar *skipWhiteSpace(char *str)\n{\n\twhile (isspace(str[0]))\n\t\tstr++;\n\treturn str;\n}\n\nint extractSearchSpecData(f_state *state, char **tokenarray)\n{\n\n\t/* Process a normal line with 3-4 tokens on it\n   token[0] = suffix\n   token[1] = case sensitive\n   token[2] = size to snarf\n   token[3] = begintag\n   token[4] = endtag (optional)\n   token[5] = search for footer from back of buffer flag and other options (whew!)\n*/\n\n\t/* Allocate the memory for these lines.... */\n\ts_spec\t*s = &search_spec[state->num_builtin];\n\n\ts->suffix = malloc(MAX_SUFFIX_LENGTH * sizeof(char));\n\ts->header = malloc(MAX_STRING_LENGTH * sizeof(char));\n\ts->footer = malloc(MAX_STRING_LENGTH * sizeof(char));\n\ts->type = CONF;\n\tif (!strncasecmp(tokenarray[0], FOREMOST_NOEXTENSION_SUFFIX, strlen(FOREMOST_NOEXTENSION_SUFFIX)\n\t\t))\n\t\t{\n\t\ts->suffix[0] = ' ';\n\t\ts->suffix[1] = 0;\n\t\t}\n\telse\n\t\t{\n\n\t\t/* Assign the current line to the SearchSpec object */\n\t\tmemcpy(s->suffix, tokenarray[0], MAX_SUFFIX_LENGTH);\n\t\t}\n\n\t/* Check for case sensitivity */\n\ts->case_sen = (!strncasecmp(tokenarray[1], \"y\", 1) || !strncasecmp(tokenarray[1], \"yes\", 3));\n\n\ts->max_len = atoi(tokenarray[2]);\n\n\t/* Determine which search type we want to use for this needle */\n\ts->searchtype = SEARCHTYPE_FORWARD;\n\tif (!strncasecmp(tokenarray[5], \"REVERSE\", strlen(\"REVERSE\")))\n\t\t{\n\n\t\ts->searchtype = SEARCHTYPE_REVERSE;\n\t\t}\n\telse if (!strncasecmp(tokenarray[5], \"NEXT\", strlen(\"NEXT\")))\n\t\t{\n\t\ts->searchtype = SEARCHTYPE_FORWARD_NEXT;\n\t\t}\n\n\t// this is the default, but just if someone wants to provide this value just to be sure\n\telse if (!strncasecmp(tokenarray[5], \"FORWARD\", strlen(\"FORWARD\")))\n\t\t{\n\t\ts->searchtype = 
SEARCHTYPE_FORWARD;\n\t\t}\n\telse if (!strncasecmp(tokenarray[5], \"ASCII\", strlen(\"ASCII\")))\n\t\t{\n\t\t\t//fprintf(stderr,\"Setting ASCII TYPE\\n\");\n\t\ts->searchtype = SEARCHTYPE_ASCII;\n\t\t}\n\n\t/* Done determining searchtype */\n\n\t/* We copy the tokens and translate them from the file format.\n   The translate() function does the translation and returns\n   the length of the argument being translated */\n\ts->header_len = translate(tokenarray[3]);\n\tmemcpy(s->header, tokenarray[3], s->header_len);\n\ts->footer_len = translate(tokenarray[4]);\n\tmemcpy(s->footer, tokenarray[4], s->footer_len);\n\n\tinit_bm_table(s->header, s->header_bm_table, s->header_len, s->case_sen, s->searchtype);\n\tinit_bm_table(s->footer, s->footer_bm_table, s->footer_len, s->case_sen, s->searchtype);\n\n\treturn TRUE;\n}\n\nint process_line(f_state *s, char *buffer, int line_number)\n{\n\n\tchar\t*buf = buffer;\n\tchar\t*token;\n\tchar\t**tokenarray = (char **)malloc(6 * sizeof(char[MAX_STRING_LENGTH]));\n\tint\t\ti = 0, len = strlen(buffer);\n\n\t/* Any line that ends with a CTRL-M (0x0d) has been processed\n   by a DOS editor. 
We will chop the CTRL-M to ignore it */\n\tif (len >= 2 && buffer[len - 2] == 0x0d && buffer[len - 1] == 0x0a)\n\t\t{\n\t\tbuffer[len - 2] = buffer[len - 1];\n\t\tbuffer[len - 1] = buffer[len];\n\t\t}\n\n\tbuf = (char *)skipWhiteSpace(buf);\n\ttoken = strtok(buf, \" \\t\\n\");\n\n\t/* Any line that starts with a '#' is a comment and can be skipped */\n\tif (token == NULL || token[0] == '#')\n\t\t{\n\t\treturn TRUE;\n\t\t}\n\n\t/* Check for the wildcard */\n\tif (!strncasecmp(token, \"wildcard\", 9))\n\t\t{\n\t\tif ((token = strtok(NULL, \" \\t\\n\")) != NULL)\n\t\t\t{\n\t\t\ttranslate(token);\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\treturn TRUE;\n\t\t\t}\n\n\t\tif (strlen(token) > 1)\n\t\t\t{\n\t\t\tfprintf(stderr,\n\t\t\t\t\t\"Warning: Wildcard can only be one character,\"\n\t\t\t\t\t\" but you specified %zu characters.\\n\"\n\t\t\t\t\"         Using the first character, \\\"%c\\\", as the wildcard.\\n\",\n\t\t\tstrlen(token),\n\t\t\t\t\ttoken[0]);\n\t\t\t}\n\n\t\twildcard = token[0];\n\t\treturn TRUE;\n\t\t}\n\n\twhile (token && (i < NUM_SEARCH_SPEC_ELEMENTS))\n\t\t{\n\t\ttokenarray[i] = token;\n\t\ti++;\n\t\ttoken = strtok(NULL, \" \\t\\n\");\n\t\t}\n\n\tswitch (NUM_SEARCH_SPEC_ELEMENTS - i)\n\t\t{\n\t\tcase 2:\n\t\t\ttokenarray[NUM_SEARCH_SPEC_ELEMENTS - 1] = \"\";\n\t\t\ttokenarray[NUM_SEARCH_SPEC_ELEMENTS - 2] = \"\";\n\t\t\tbreak;\n\n\t\tcase 1:\n\t\t\ttokenarray[NUM_SEARCH_SPEC_ELEMENTS - 1] = \"\";\n\t\t\tbreak;\n\n\t\tcase 0:\n\t\t\tbreak;\n\n\t\tdefault:\n\t\t\tfprintf(stderr, \"\\nERROR: In line %d of the configuration file.\\n\", line_number);\n\t\t\treturn FALSE;\n\n\t\t}\n\n\tif (!extractSearchSpecData(s, tokenarray))\n\t\t{\n\t\tfprintf(stderr,\n\t\t\t\t\"\\nERROR: Unknown error on line %d of the configuration file.\\n\",\n\t\t\t\tline_number);\n\t\t}\n\n\ts->num_builtin++;\n\n\treturn TRUE;\n}\n\nint load_config_file(f_state *s)\n{\n\tFILE\t*f;\n\tchar\t*buffer = (char *)malloc(MAX_STRING_LENGTH * sizeof(char));\n\toff_t\tline_number = 
0;\n\n#ifdef __DEBUG\n\tprintf(\"About to open config file %s%s\", get_config_file(s), NEWLINE);\n#endif\n\n\tif ((f = fopen(get_config_file(s), \"r\")) == NULL)\n\t{\n\n\t\t/*Can't find  a conf in the current directory\n    * So lets try the /usr/local/etc*/\n#ifdef __WIN32\n\t\tset_config_file(s, \"/Program Files/foremost/foremost.conf\");\n#else\n\t\tset_config_file(s, \"/usr/local/etc/foremost.conf\");\n#endif\n\t\tif ((f = fopen(get_config_file(s), \"r\")) == NULL)\n\t\t\t{\n\t\t\tprint_error(s, get_config_file(s), strerror(errno));\n\t\t\tfree(buffer);\n\t\t\treturn TRUE;\n\t\t\t}\n\n\t}\n\n\twhile (fgets(buffer, MAX_STRING_LENGTH, f))\n\t\t{\n\t\t++line_number;\n\t\tif (!process_line(s, buffer, line_number))\n\t\t\t{\n\t\t\tfree(buffer);\n\t\t\tfclose(f);\n\t\t\treturn TRUE;\n\n\t\t\t}\n\t\t}\n\n\tfclose(f);\n\tfree(buffer);\n\treturn FALSE;\n}\n"
  },
  {
    "path": "dir.c",
    "content": "\n\n#include \"main.h\"\n\nint is_empty_directory (DIR * temp)\n\t{\n\n\t/* Empty directories contain two entries for . and .. \n     A directory with three entries, therefore, is not empty */\n\tif (readdir(temp) && readdir(temp) && readdir(temp))\n\t\treturn FALSE;\n\n\treturn TRUE;\n\t}\n\n/*Try to cleanup the ouput directory if nothing to a sub-dir*/\nvoid cleanup_output(f_state *s)\n{\n\tchar\t\t\tdir_name[MAX_STRING_LENGTH];\n\n\tDIR\t\t\t\t*temp;\n\tDIR\t\t\t\t*outputDir;\n\tstruct dirent\t*entry;\n\n\tif ((outputDir = opendir(get_output_directory(s))) == NULL)\n\t\t{\n\n\t\t/*Error?*/\n\t\t}\n\n\twhile ((entry = readdir(outputDir)))\n\t\t{\n\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\tstrcpy(dir_name, get_output_directory(s));\n\t\tstrcat(dir_name, \"/\");\n\t\tstrcat(dir_name, entry->d_name);\n\t\ttemp = opendir(dir_name);\n\t\tif (temp != NULL)\n\t\t\t{\n\t\t\tif (is_empty_directory(temp))\n\t\t\t\t{\n\t\t\t\trmdir(dir_name);\n\t\t\t\t}\n\t\t\t}\n\n\t\t}\n\n}\n\nint make_new_directory(f_state *s, char *fn)\n{\n\n#ifdef __WIN32\n\n\t#ifndef __CYGWIN\nfprintf(stderr,\"Calling mkdir with\\n\");\t\n\tif (mkdir(fn))\n\t#endif\n\n#else\n\t\tmode_t\tnew_mode =\n\t\t\t(\n\t\t\t\tS_IRUSR |\n\t\t\t\tS_IWUSR |\n\t\t\t\tS_IXUSR |\n\t\t\t\tS_IRGRP |\n\t\t\t\tS_IWGRP |\n\t\t\t\tS_IXGRP |\n\t\t\t\tS_IROTH |\n\t\t\t\tS_IWOTH\n\t\t\t);\n\tif (mkdir(fn, new_mode))\n#endif\n\t\t{\n\t\tif (errno != EEXIST)\n\t\t\t{\n\t\t\tprint_error(s, fn, strerror(errno));\n\t\t\treturn TRUE;\n\t\t\t}\n\t\t}\n\n\treturn FALSE;\n}\n\n/*Clean the timestamped dir name to make it a little more file system friendly*/\nchar *clean_time_string(char *time)\n{\n\tint len = strlen(time);\n\tint i = 0;\n\n\tfor (i = 0; i < len; i++)\n\t{\n#ifdef __WIN32\n\t\tif (time[i] == ':' && time[i + 1] != '\\\\')\n\t\t\t{\n\t\t\ttime[i] = '_';\n\t\t\t}\n\n#else\n\t\tif (time[i] == ' ' || time[i] == ':')\n\t\t\t{\n\t\t\ttime[i] = '_';\n\t\t\t}\n#endif\n\t}\n\n\treturn time;\n}\n\nint 
create_output_directory(f_state *s)\n{\n\tDIR\t\t*d;\n\tchar\tdir_name[MAX_STRING_LENGTH];\n  \n\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\tif (s->time_stamp)\n\t\t{\n\t\tstrcpy(dir_name, get_output_directory(s));\n\t\tstrcat(dir_name, \"_\");\n\t\tstrcat(dir_name, get_start_time(s));\n\t\tclean_time_string(dir_name);\n\t\tset_output_directory(s, dir_name);\n\t\t}\n#ifdef DEBUG\n\tprintf(\"Checking output directory %s\\n\", get_output_directory(s));\n#endif\n\n\tif ((d = opendir(get_output_directory(s))) != NULL)\n\t\t{\n\n\t\t/* The directory exists already. It MUST be empty for us to continue */\n\t\tif (!is_empty_directory(d))\n\t\t\t{\n\t\t\tprintf(\"ERROR: %s is not empty\\n \\tPlease specify another directory or run with -T.\\n\",\n\t\t\t\t   get_output_directory(s));\n\n\t\t\texit(EXIT_FAILURE);\n\t\t\t}\n\n\t\t/* The directory exists and is empty. We're done! */\n\t\tclosedir(d);\n\t\treturn FALSE;\n\t\t}\n\n\t/* The error value ENOENT means that either the directory doesn't exist,\n     which is fine, or that the filename is zero-length, which is bad.\n     All other errors are, of course, bad. \n*/\n\tif (errno != ENOENT)\n\t\t{\n\t\tprint_error(s, get_output_directory(s), strerror(errno));\n\t\treturn TRUE;\n\t\t}\n\n\tif (strlen(get_output_directory(s)) == 0)\n\t\t{\n\n\t\t/* Careful! Calling print_error will try to display a filename\n       that is zero characters! In theory this should never happen \n       as our call to realpath should avoid this. But we'll play it safe. 
*/\n\t\tprint_error(s, \"(output_directory)\", \"Output directory name unknown\");\n\t\treturn TRUE;\n\t\t}\n\n\treturn (make_new_directory(s, get_output_directory(s)));\n}\n\n/*Create file type sub dirs, can get tricky when multiple types use one \n extraction algorithm (OLE)*/\nint create_sub_dirs(f_state *s)\n{\n\tint\t\ti = 0;\n\tint\t\tj = 0;\n\tchar\tdir_name[MAX_STRING_LENGTH];\n\tchar\tole_types[7][4] = { \"ppt\", \"doc\", \"xls\", \"sdw\", \"mbd\", \"vis\", \"ole\" };\n\tchar\triff_types[2][4] = { \"avi\", \"wav\" };\n\tchar\tzip_types[8][5] = { \"sxc\", \"sxw\", \"sxi\", \"sx\", \"jar\",\"docx\",\"pptx\",\"xlsx\" };\n\n\tfor (i = 0; i < s->num_builtin; i++)\n\t\t{\n\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\tstrcpy(dir_name, get_output_directory(s));\n\t\tstrcat(dir_name, \"/\");\n\t\tstrcat(dir_name, search_spec[i].suffix);\n\t\tmake_new_directory(s, dir_name);\n\n\t\tif (search_spec[i].type == OLE)\n\t\t\t{\n\t\t\tfor (j = 0; j < 7; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(ole_types[j], search_spec[i].suffix))\n\t\t\t\t\tcontinue;\n\n\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\tstrcat(dir_name, ole_types[j]);\n\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t}\n\t\t\t}\n\t\telse if (get_mode(s, mode_write_all))\n\t\t\t{\n\t\t\tfor (j = 0; j < 7; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(search_spec[i].suffix, ole_types[j]))\n\t\t\t\t\t{\n\t\t\t\t\tfor (j = 0; j < 7; j++)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tif (strstr(ole_types[j], search_spec[i].suffix))\n\t\t\t\t\t\t\tcontinue;\n\n\t\t\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\t\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\t\t\tstrcat(dir_name, ole_types[j]);\n\t\t\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t\t\t}\n\t\t\t\t\tbreak;\n\t\t\t\t\t}\n\n\t\t\t\t}\n\t\t\t}\n\n\t\tif (search_spec[i].type == EXE)\n\t\t\t{\n\t\t\tmemset(dir_name, 0, 
MAX_STRING_LENGTH - 1);\n\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\tstrcat(dir_name, \"/\");\n\t\t\tstrcat(dir_name, \"dll\");\n\t\t\tmake_new_directory(s, dir_name);\n\t\t\t}\n\n\t\tif (search_spec[i].type == RIFF)\n\t\t\t{\n\t\t\tfor (j = 0; j < 2; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(riff_types[j], search_spec[i].suffix))\n\t\t\t\t\tcontinue;\n\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\tstrcat(dir_name, riff_types[j]);\n\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t}\n\t\t\t}\n\t\telse if (get_mode(s, mode_write_all))\n\t\t\t{\n\t\t\tfor (j = 0; j < 2; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(search_spec[i].suffix, riff_types[j]))\n\t\t\t\t\t{\n\t\t\t\t\tfor (j = 0; j < 2; j++)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tif (strstr(riff_types[j], search_spec[i].suffix))\n\t\t\t\t\t\t\tcontinue;\n\n\t\t\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\t\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\t\t\tstrcat(dir_name, riff_types[j]);\n\t\t\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t\t\t}\n\t\t\t\t\tbreak;\n\t\t\t\t\t}\n\n\t\t\t\t}\n\t\t\t}\n\n\t\tif (search_spec[i].type == ZIP)\n\t\t\t{\n\t\t\tfor (j = 0; j < 8; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(zip_types[j], search_spec[i].suffix))\n\t\t\t\t\tcontinue;\n\n\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 1);\n\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\tstrcat(dir_name, zip_types[j]);\n\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t}\n\t\t\t}\n\t\telse if (get_mode(s, mode_write_all))\n\t\t\t{\n\t\t\tfor (j = 0; j < 8; j++)\n\t\t\t\t{\n\t\t\t\tif (strstr(search_spec[i].suffix, zip_types[j]))\n\t\t\t\t\t{\n\t\t\t\t\tfor (j = 0; j < 8; j++)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tif (strstr(zip_types[j], search_spec[i].suffix))\n\t\t\t\t\t\t\tcontinue;\n\n\t\t\t\t\t\tmemset(dir_name, 0, MAX_STRING_LENGTH - 
1);\n\t\t\t\t\t\tstrcpy(dir_name, get_output_directory(s));\n\t\t\t\t\t\tstrcat(dir_name, \"/\");\n\t\t\t\t\t\tstrcat(dir_name, zip_types[j]);\n\t\t\t\t\t\tmake_new_directory(s, dir_name);\n\t\t\t\t\t\t}\n\t\t\t\t\tbreak;\n\t\t\t\t\t}\n\t\t\t\t}\n\t\t\t}\n\n\t\t}\n\n\treturn TRUE;\n}\n\n/*We have found a file so write to disk*/\nint write_to_disk(f_state *s, s_spec *needle, u_int64_t len, unsigned char *buf, u_int64_t t_offset)\n{\n\n\tchar\t\tfn[MAX_STRING_LENGTH];\n\tFILE\t\t*f;\n\tFILE\t\t*test;\n\tlong\t\tbyteswritten = 0;\n\tchar\t\ttemp[32];\n\tu_int64_t\tblock = ((t_offset) / s->block_size);\n\tint\t\t\ti = 1;\n\n\t//Name files based on their block offset\n\tneedle->written = TRUE;\n\n\tif (get_mode(s, mode_write_audit))\n\t\t{\n\t\tif (needle->comment == NULL)\n\t\t\tstrcpy(needle->comment, \" \");\n\n\t\t/* block is 64 bits wide, so print it with %llu like the other audit lines */\n\t\taudit_msg(s,\n\t\t\t\t  \"%d:\\t%10llu.%s \\t %10s \\t %10llu \\t %s\",\n\t\t\t\t  s->fileswritten,\n\t\t\t\t  block,\n\t\t\t\t  needle->suffix,\n\t\t\t\t  human_readable(len, temp),\n\t\t\t\t  t_offset,\n\t\t\t\t  needle->comment);\n\t\ts->fileswritten++;\n\t\tneedle->found++;\n\t\treturn TRUE;\n\t\t}\n\n\tsnprintf(fn,\n\t\t\t MAX_STRING_LENGTH,\n\t\t\t \"%s/%s/%0*llu.%s\",\n\t\t\t s->output_directory,\n\t\t\t needle->suffix,\n\t\t\t 8,\n\t\t\t block,\n\t\t\t needle->suffix);\n\n\ttest = fopen(fn, \"rb\");\n\twhile (test)\t/*Test the files to make sure we have unique file names, some headers could be within the same block*/\n\t\t{\n\t\tmemset(fn, 0, MAX_STRING_LENGTH - 1);\n\t\tsnprintf(fn,\n\t\t\t\t MAX_STRING_LENGTH - 1,\n\t\t\t\t \"%s/%s/%0*llu_%d.%s\",\n\t\t\t\t s->output_directory,\n\t\t\t\t needle->suffix,\n\t\t\t\t 8,\n\t\t\t\t block,\n\t\t\t\t i,\n\t\t\t\t needle->suffix);\n\t\ti++;\n\t\tfclose(test);\n\t\ttest = fopen(fn, \"rb\");\n\t\t}\n\n\tif (!(f = fopen(fn, \"wb\")))\n\t\t{\n\t\tprintf(\"fn = %s  failed\\n\", fn);\n\t\tfatal_error(s, \"Can't open file for writing \\n\");\n\t\t}\n\n\tif ((byteswritten = fwrite(buf, sizeof(char), len, f)) != 
len)\n\t\t{\n\t\tfprintf(stderr, \"fn=%s bytes=%lu\\n\", fn, byteswritten);\n\t\tfatal_error(s, \"Error writing file\\n\");\n\t\t}\n\n\tif (fclose(f))\n\t\t{\n\t\tfatal_error(s, \"Error closing file\\n\");\n\t\t}\n\n\tif (needle->comment == NULL)\n\t\tstrcpy(needle->comment, \" \");\n\t\n\tif (i == 1) {\n      audit_msg(s,\"%d:\\t%08llu.%s \\t %10s \\t %10llu \\t %s\",\n         s->fileswritten,\n         block,\n         needle->suffix,\n         human_readable(len, temp),\n         t_offset,\n         needle->comment);\n         } else {\n      audit_msg(s,\"%d:\\t%08llu_%d.%s \\t %10s \\t %10llu \\t %s\",\n         s->fileswritten,\n         block,\n         i - 1,\n         needle->suffix, \n         human_readable(len, temp),\n         t_offset,\n         needle->comment);\n         }\n\n/*\n\taudit_msg(s,\"%d:\\t%10llu.%s \\t %10s \\t %10llu \\t %s\",\n\t\t\t  s->fileswritten,\n\t\t\t  block,\n\t\t\t  needle->suffix,\n\t\t\t  human_readable(len, temp),\n\t\t\t  t_offset,\n\t\t\t  needle->comment);\n\n*/\n\ts->fileswritten++;\n\tneedle->found++;\n\treturn TRUE;\n}\n"
  },
  {
    "path": "engine.c",
    "content": "\n\t /* FOREMOST\n *\n * By Jesse Kornblum, Kris Kendall, & Nick Mikus\n *\n * This is a work of the US Government. In accordance with 17 USC 105,\n * copyright protection is not available for any work of the US Government.\n *\n * This program is distributed in the hope that it will be useful, but\n * WITHOUT ANY WARRANTY; without even the implied warranty of\n * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n *\n */\n\n#include \"main.h\"\n\nint user_interrupt (f_state * s, f_info * i)\n\t{\n\taudit_msg(s, \"Interrupt received at %s\", current_time());\n\n\t/* RBF - Write user_interrupt */\n\tfclose(i->handle);\n\tfree(s);\n\tfree(i);\n\tcleanup_output(s);\n\texit(-1);\n\treturn FALSE;\n\t}\n\nunsigned char *read_from_disk(u_int64_t offset, f_info *i, u_int64_t length)\n{\n\n\tu_int64_t\t\tbytesread = 0;\n\tunsigned char\t*newbuf = (unsigned char *)malloc(length * sizeof(char));\n\tif (!newbuf) {\n           fprintf(stderr, \"Ran out of memory in read_from_disk()\\n\");\n           exit(1);\n         }\n\n\tfseeko(i->handle, offset, SEEK_SET);\n\tbytesread = fread(newbuf, 1, length, i->handle);\n\tif (bytesread != length)\n\t{\n\t\tfree(newbuf);\n\t\treturn NULL;\n\t}\n\telse\n\t{\n\t\treturn newbuf;\n\t}\n}\n\n/*\n   Perform a modified boyer-moore string search (w/ support for wildcards and case-insensitive searches)\n   and allows the starting position in the buffer to be manually set, which allows data to be skipped\n*/\nunsigned char *bm_search_skipn(unsigned char *needle, size_t needle_len, unsigned char *haystack,\n\t\t\t\t\t\t\t   size_t haystack_len, size_t table[UCHAR_MAX + 1], int casesensitive,\n\t\t\t\t\t\t\t   int searchtype, int start_pos)\n{\n\tregister size_t shift = 0;\n\tregister size_t pos = start_pos;\n\tunsigned char\t*here;\n\n\tif (needle_len == 0)\n\t\treturn haystack;\n\n\tif (searchtype == SEARCHTYPE_FORWARD || searchtype == SEARCHTYPE_FORWARD_NEXT)\n\t\t{\n\t\twhile (pos < haystack_len)\n\t\t\t{\n\t\t\twhile (pos 
< haystack_len && (shift = table[(unsigned char)haystack[pos]]) > 0)\n\t\t\t\t{\n\t\t\t\tpos += shift;\n\t\t\t\t}\n\n\t\t\tif (0 == shift)\n\t\t\t\t{\n\t\t\t\there = (unsigned char *) &haystack[pos - needle_len + 1];\n\t\t\t\tif (0 == memwildcardcmp(needle, here, needle_len, casesensitive))\n\t\t\t\t\t{\n\t\t\t\t\treturn (here);\n\t\t\t\t\t}\n\t\t\t\telse\n\t\t\t\t\tpos++;\n\t\t\t\t}\n\t\t\t}\n\n\t\treturn NULL;\n\t\t}\n\telse if (searchtype == SEARCHTYPE_REVERSE)\t//Run our search backwards\n\t\t{\n\t\twhile (pos < haystack_len)\n\t\t\t{\n\t\t\twhile\n\t\t\t(\n\t\t\t\tpos < haystack_len &&\n\t\t\t\t(shift = table[(unsigned char)haystack[haystack_len - pos - 1]]) > 0\n\t\t\t)\n\t\t\t\t{\n\t\t\t\tpos += shift;\n\t\t\t\t}\n\n\t\t\tif (0 == shift)\n\t\t\t\t{\n\t\t\t\tif (0 == memwildcardcmp(needle, here = (unsigned char *) &haystack[haystack_len - pos - 1],\n\t\t\t\t\tneedle_len, casesensitive))\n\t\t\t\t\t{\n\t\t\t\t\treturn (here);\n\t\t\t\t\t}\n\t\t\t\telse\n\t\t\t\t\tpos++;\n\t\t\t\t}\n\t\t\t}\n\n\t\treturn NULL;\n\t\t}\n\n\treturn NULL;\n}\n\n/*\n   Perform a modified boyer-moore string search (w/ support for wildcards and case-insensitive searches)\n   and allows the starting position in the buffer to be manually set, which allows data to be skipped\n*/\nunsigned char *bm_search(unsigned char *needle, size_t needle_len, unsigned char *haystack,\n\t\t\t\t\t\t size_t haystack_len, size_t table[UCHAR_MAX + 1], int case_sen,\n\t\t\t\t\t\t int searchtype)\n{\n\n\t//printf(\"The needle2 is:\\t\");\n\t//printx(needle,0,needle_len);\n\treturn bm_search_skipn(needle,\n\t\t\t\t\t\t   needle_len,\n\t\t\t\t\t\t   haystack,\n\t\t\t\t\t\t   haystack_len,\n\t\t\t\t\t\t   table,\n\t\t\t\t\t\t   case_sen,\n\t\t\t\t\t\t   searchtype,\n\t\t\t\t\t\t   needle_len - 1);\n\n}\n\nvoid setup_stream(f_state *s, f_info *i)\n{\n\tchar\tbuffer[MAX_STRING_LENGTH];\n\tu_int64_t\tskip = (((u_int64_t) s->skip) * ((u_int64_t) s->block_size));\n#ifdef DEBUG\n\tprintf(\"s->skip=%d s->block_size=%d 
total=%llu\\n\",\n\t\t   s->skip,\n\t\t   s->block_size,\n\t\t   (((u_int64_t) s->skip) * ((u_int64_t) s->block_size)));\n#endif\n\ti->bytes_read = 0;\n\ti->total_megs = i->total_bytes / ONE_MEGABYTE;\n\n\tif (i->total_bytes != 0)\n\t\t{\n\t\taudit_msg(s,\n\t\t\t\t  \"Length: %s (%llu bytes)\",\n\t\t\t\t  human_readable(i->total_bytes, buffer),\n\t\t\t\t  i->total_bytes);\n\t\t}\n\telse\n\t\taudit_msg(s, \"Length: Unknown\");\n\n\tif (s->skip != 0)\n\t\t{\n\t\taudit_msg(s, \"Skipping: %s (%llu bytes)\", human_readable(skip, buffer), skip);\n\t\tfseeko(i->handle, skip, SEEK_SET);\n\t\tif (i->total_bytes != 0)\n\t\t\ti->total_bytes -= skip;\n\t\t}\n\n\taudit_msg(s, \" \");\n\n#ifdef __WIN32\n\ti->last_read = 0;\n\ti->overflow_count = 0;\n#endif\n\n}\n\nvoid audit_layout(f_state *s)\n{\n\taudit_msg(s,\n\t\t\t  \"Num\\t %s (bs=%d)\\t %10s\\t %s\\t %s \\n\",\n\t\t\t  \"Name\",\n\t\t\t  s->block_size,\n\t\t\t  \"Size\",\n\t\t\t  \"File Offset\",\n\t\t\t  \"Comment\");\n\n}\n\nvoid dumpInd(unsigned char *ind, int bs)\n{\n\tint i = 0;\n\tprintf(\"\\n/*******************************/\\n\");\n\n\twhile (bs > 0)\n\t\t{\n\t\tif (i % 10 == 0)\n\t\t\tprintf(\"\\n\");\n\n\t\t//printx(ind,0,10);\n\t\tprintf(\"%4u \", htoi(ind, FOREMOST_LITTLE_ENDIAN));\n\n\t\tbs -= 4;\n\t\tind += 4;\n\t\ti++;\n\t\t}\n\n\tprintf(\"\\n/*******************************/\\n\");\n}\n\n/********************************************************************************\n *Function: ind_block\n *Description: check if the block foundat is pointing to looks like an indirect \n *\tblock\n *Return: TRUE/FALSE\n **********************************************************************************/\nint ind_block(unsigned char *foundat, u_int64_t buflen, int bs)\n{\n\n\tunsigned char\t*temp = foundat;\n\tint\t\t\t\tjump = 12 * bs;\n\tunsigned int\tblock = 0;\n\tunsigned int\tblock2 = 0;\n\tunsigned int\tdif = 0;\n\tint\t\t\t\ti = 0;\n\tunsigned int\tone = 1;\n\tunsigned int\tnumbers = (bs / 4) - 1;\n\n\t//int 
reconstruct=FALSE;\n\n\t/*Make sure we don't jump past the end of the buffer*/\n\tif (buflen < jump + 16)\n\t\treturn FALSE;\n\n\twhile (i < numbers)\n\t\t{\n\t\tblock = htoi(&temp[jump + (i * 4)], FOREMOST_LITTLE_ENDIAN);\n\n\t\tif (block < 0)\n\t\t\treturn FALSE;\n\n\t\tif (block == 0)\n\t\t\t{\n\t\t\tbreak;\n\t\t\t}\n\n\t\ti++;\n\t\tblock2 = htoi(&temp[jump + (i * 4)], FOREMOST_LITTLE_ENDIAN);\n\t\tif (block2 < 0)\n\t\t\treturn FALSE;\n\n\t\tif (block2 == 0)\n\t\t\t{\n\t\t\tbreak;\n\t\t\t}\n\n\t\tdif = block2 - block;\n\n\t\tif (dif == one)\n\t\t{\n\n#ifdef DEBUG\n\t\t\tprintf(\"block1:=%u, block2:=%u dif=%u\\n\", block, block2, dif);\n#endif\n\t\t}\n\t\telse\n\t\t{\n\n#ifdef DEBUG\n\t\t\tprintf(\"Failure, dif!=1\\n\");\n\t\t\tprintf(\"\\tblock1:=%u, block2:=%u dif=%u\\n\", block, block2, dif);\n#endif\n\n\t\t\treturn FALSE;\n\t\t}\n\n#ifdef DEBUG\n\t\tprintf(\"block1:=%u, block2:=%u dif=%u\\n\", block, block2, dif);\n#endif\n\t\t}\n\n\tif (i == 0)\n\t\treturn FALSE;\n\n\t/*Check if the rest of the bytes are zero'd out */\n\tfor (i = i + 1; i < numbers; i++)\n\t\t{\n\t\tblock = htoi(&temp[jump + (i * 4)], FOREMOST_LITTLE_ENDIAN);\n\t\tif (block != 0)\n\t\t\t{\n\n\t\t\t//printf(\"Failure, 0 test\\n\");\n\t\t\treturn FALSE;\n\t\t\t}\n\t\t}\n\n\treturn TRUE;\n}\n\n/********************************************************************************\n *Function: search_chunk\n *Description: Analyze the given chunk by running each defined search spec on it\n *Return: TRUE/FALSE\n **********************************************************************************/\nint search_chunk(f_state *s, unsigned char *buf, f_info *i, u_int64_t chunk_size, u_int64_t f_offset)\n{\n\n\tu_int64_t\t\tc_offset = 0;\n\t//u_int64_t               foundat_off = 0;\n\t//u_int64_t               buf_off = 0;\n\n\tunsigned char\t*foundat = buf;\n\tunsigned char\t*current_pos = NULL;\n\tunsigned char\t*header_pos = NULL;\n\tunsigned char\t*newbuf = NULL;\n\tunsigned char\t*ind_ptr = 
NULL;\n\tu_int64_t\t\tcurrent_buflen = chunk_size;\n\tint\t\t\t\ttryBS[3] = { 4096, 1024, 512 };\n\tunsigned char\t*extractbuf = NULL;\n\tu_int64_t\t\tfile_size = 0;\n\ts_spec\t\t\t*needle = NULL;\n\tint\t\t\t\tj = 0;\n\tint\t\t\t\tbs = 0;\n\tint\t\t\t\trem = 0;\n\tint\t\t\t\tx = 0;\n\tint\t\t\t\tfound_ind = FALSE;\n\t off_t saveme;\n\t//char comment[32];\n\tfor (j = 0; j < s->num_builtin; j++)\n\t\t{\n\t\tneedle = &search_spec[j];\n\t\tfoundat = buf;\t\t\t\t\t\t\t\t\t\t/*reset the buffer for the next search spec*/\n#ifdef DEBUG\n\t\tprintf(\"\tSEARCHING FOR %s's\\n\", needle->suffix);\n#endif\n\t\tbs = 0;\n\t\tcurrent_buflen = chunk_size;\n\t\twhile (foundat)\n\t\t\t{\n\t\t\tneedle->written = FALSE;\n\t\t\tfound_ind = FALSE;\n\t\t\tmemset(needle->comment, 0, COMMENT_LENGTH - 1);\n                        if (chunk_size <= (foundat - buf)) {\n#ifdef DEBUG\n\t\t\t\tprintf(\"avoided seg fault in search_chunk()\\n\");\n#endif\n\t\t\t\tfoundat = NULL;\n\t\t\t\tbreak;\n\t\t\t}\n\t\t\tcurrent_buflen = chunk_size - (foundat - buf);\n\n\t\t\t//if((foundat-buf)< 1 ) break;\t\n#ifdef DEBUG\n\t\t\t//foundat_off=foundat;\n\t\t\t//buf_off=buf;\n\t\t\t//printf(\"current buf:=%llu (foundat-buf)=%llu \\n\", current_buflen, (u_int64_t) (foundat_off - buf_off));\n#endif\n\t\t\tif (signal_caught == SIGTERM || signal_caught == SIGINT)\n\t\t\t\t{\n\t\t\t\tuser_interrupt(s, i);\n\t\t\t\tprintf(\"Cleaning up.\\n\");\n\t\t\t\tsignal_caught = 0;\n\t\t\t\t}\n\n\t\t\tif (get_mode(s, mode_quick))\t\t\t\t\t/*RUN QUICK SEARCH*/\n\t\t\t{\n#ifdef DEBUG\n\n\t\t\t\t//printf(\"quick mode is on\\n\");\n#endif\n\n\t\t\t\t/*Check if we are not on a block head, adjust if so*/\n\t\t\t\trem = (foundat - buf) % s->block_size;\n\t\t\t\tif (rem != 0)\n\t\t\t\t\t{\n\t\t\t\t\tfoundat += (s->block_size - rem);\n\t\t\t\t\t}\n\n\t\t\t\tif (memwildcardcmp(needle->header, foundat, needle->header_len, needle->case_sen\n\t\t\t\t\t) != 0)\n\t\t\t\t\t{\n\n\t\t\t\t\t/*No match, jump to the next block*/\n\t\t\t\t\tif 
(current_buflen > s->block_size)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tfoundat += s->block_size;\n\t\t\t\t\t\tcontinue;\n\t\t\t\t\t\t}\n\t\t\t\t\telse\t\t\t\t\t\t\t\t\t/*We are out of buffer lets go to the next search spec*/\n\t\t\t\t\t\t{\n\t\t\t\t\t\tfoundat = NULL;\n\t\t\t\t\t\tbreak;\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\theader_pos = foundat;\n\t\t\t}\n\t\t\telse\t\t\t\t\t\t\t\t\t\t\t/**********RUN STANDARD SEARCH********************/\n\t\t\t\t{\n\t\t\t\tfoundat = bm_search(needle->header,\n\t\t\t\t\t\t\t\t\tneedle->header_len,\n\t\t\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\t\t\tcurrent_buflen,\t\t\t//How much to search through\n\t\t\t\t\t\t\t\t\tneedle->header_bm_table,\n\t\t\t\t\t\t\t\t\tneedle->case_sen,\t\t//casesensative\n\t\t\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\t\t\t\theader_pos = foundat;\n\t\t\t\t}\n\n\t\t\tif (foundat != NULL && foundat >= 0)\t\t\t/*We got something, run the appropriate heuristic to find the EOF*/\n\t\t\t\t{\n\t\t\t\tcurrent_buflen = chunk_size - (foundat - buf);\n\n\t\t\t\tif (get_mode(s, mode_ind_blk))\n\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\t\tprintf(\"ind blk detection on\\n\");\n#endif\n\n\t\t\t\t\t//dumpInd(foundat+12*1024,1024);\n\t\t\t\t\tfor (x = 0; x < 3; x++)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tbs = tryBS[x];\n\n\t\t\t\t\t\tif (ind_block(foundat, current_buflen, bs))\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\tif (get_mode(s, mode_verbose))\n\t\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\tsprintf(needle->comment, \" (IND BLK bs:=%d)\", bs);\n\t\t\t\t\t\t\t\t}\n\n\t\t\t\t\t\t\t//dumpInd(foundat+12*bs,bs);\n#ifdef DEBUG\n\t\t\t\t\t\t\tprintf(\"performing mem move\\n\");\n#endif\n\t\t\t\t\t\t\tif(current_buflen >  13 * bs)//Make sure we have enough buffer\n\t\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\tif (!memmove(foundat + 12 * bs, foundat + 13 * bs, current_buflen - 13 * bs))\n\t\t\t\t\t\t\t\tbreak;\n\n\t\t\t\t\t\t\t\tfound_ind = TRUE;\n#ifdef DEBUG\n\t\t\t\t\t\t\t\tprintf(\"performing mem move complete\\n\");\n#endif\n\t\t\t\t\t\t\t\tind_ptr = foundat + 12 * 
bs;\n\t\t\t\t\t\t\t\tcurrent_buflen -= bs;\n\t\t\t\t\t\t\t\tchunk_size -= bs;\n\t\t\t\t\t\t\t\tbreak;\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t}\n\n\t\t\t\t\t\t}\n\n\t\t\t\t}\n\n\t\t\t\tc_offset = (foundat - buf);\n\t\t\t\tcurrent_pos = foundat;\n\n\t\t\t\t/*Now lets analyze the file and see if we can determine its size*/\n\n\t\t\t\t// printf(\"c_offset=%llu %x %x %llx\\n\", c_offset,foundat,buf,c_offset);\n\t\t\t\tfoundat = extract_file(s, c_offset, foundat, current_buflen, needle, f_offset);\n#ifdef DEBUG\n\t\t\t\tif (foundat == NULL)\n\t\t\t\t\t{\n\t\t\t\t\tprintf(\"Foundat == NULL!!!\\n\");\n\t\t\t\t\t}\n#endif\n\t\t\t\tif (get_mode(s, mode_write_all))\n\t\t\t\t\t{\n\t\t\t\t\tif (needle->written == FALSE)\n\t\t\t\t\t\t{\n\n\t\t\t\t\t\t/*write every header we find*/\n\t\t\t\t\t\tif (current_buflen >= needle->max_len)\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfile_size = needle->max_len;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\telse\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\tfile_size = current_buflen;\n\t\t\t\t\t\t\t}\n\n\t\t\t\t\t\tsprintf(needle->comment, \" (Header dump)\");\n\t\t\t\t\t\textractbuf = (unsigned char *)malloc(file_size * sizeof(char));\n\t\t\t\t\t\tmemcpy(extractbuf, header_pos, file_size);\n\t\t\t\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t\t\t\t\tfree(extractbuf);\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\telse if (!foundat)\t\t\t\t\t\t\t/*Should we search further?*/\n\t\t\t\t\t{\n\n\t\t\t\t\t/*We couldn't determine where the file ends, now lets check to see\n\t\t\t* if we should try again\n\t\t\t*/\n\t\t\t\t\tif (current_buflen < needle->max_len)\t/*We need to bridge the gap*/\n\t\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\t\t\tprintf(\"\tBridge the gap\\n\");\n#endif\n\t\t\t\t\t\tsaveme = ftello(i->handle);\n\t\t\t\t\t\t/*grow the buffer and try to extract again*/\n\t\t\t\t\t\tnewbuf = read_from_disk(c_offset + f_offset, i, needle->max_len);\n\t\t\t\t\t\tif (newbuf == NULL)\n\t\t\t\t\t\t\tbreak;\n\t\t\t\t\t\tcurrent_pos = 
extract_file(s,\n\t\t\t\t\t\t\t\t\t\t\t\t   c_offset,\n\t\t\t\t\t\t\t\t\t\t\t\t   newbuf,\n\t\t\t\t\t\t\t\t\t\t\t\t   needle->max_len,\n\t\t\t\t\t\t\t\t\t\t\t\t   needle,\n\t\t\t\t\t\t\t\t\t\t\t\t   f_offset);\n\t\t\t\t\t\t\n\t\t\t\t\t\t/*Lets put the fp back*/\n\t\t\t\t\t\tfseeko(i->handle, saveme, SEEK_SET);\n\t\t\t\t\t\t\n\n\t\t\t\t\t\tfree(newbuf);\n\t\t\t\t\t}\n\t\t\t\t\telse\n\t\t\t\t\t\t{\n\t\t\t\t\t\tfoundat = header_pos;\t\t\t\t/*reset the foundat pointer to the location of the last header*/\n\t\t\t\t\t\tfoundat += needle->header_len + 1;\t/*jump past the header*/\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\n\t\t\t\t}\n\n\t\t\tif (found_ind)\n\t\t\t\t{\n\n\t\t\t\t/*Put the ind blk back in, re-arrange the buffer so that the future blks names come out correct*/\n#ifdef DEBUG\n\t\t\t\t\t\tprintf(\"Replacing the ind block\\n\");\n#endif\n\t\t\t\t/*This is slow, should we do this??????*/\n\t\t\t\tif (!memmove(ind_ptr + 1 * bs, ind_ptr, current_buflen - 13 * bs))\n\t\t\t\t\tbreak;\n\t\t\t\tmemset(ind_ptr, 0, bs - 1);\n\t\t\t\tchunk_size += bs;\n\t\t\t\tmemset(needle->comment, 0, COMMENT_LENGTH - 1);\n\t\t\t\t}\n\t\t\t}\t//end while\n\t\t}\n\n\treturn TRUE;\n}\n\n/********************************************************************************\n *Function: search_stream\n *Description: Analyze the file by reading 1 chunk (default: 100MB) at a time and \n *passing it to\tsearch_chunk\n *Return: TRUE/FALSE\n **********************************************************************************/\nint search_stream(f_state *s, f_info *i)\n{\n\tu_int64_t\t\tbytesread = 0;\n\tu_int64_t\t\tf_offset = 0;\n\tu_int64_t\t\tchunk_size = ((u_int64_t) s->chunk_size) * MEGABYTE;\n\tunsigned char\t*buf = (unsigned char *)malloc(sizeof(char) * chunk_size);\n\n\tsetup_stream(s, i);\n\n\taudit_layout(s);\n#ifdef DEBUG\n\tprintf(\"\\n\\t READING THE FILE INTO MEMORY\\n\");\n#endif\n\n\twhile ((bytesread = fread(buf, 1, chunk_size, i->handle)) > 0)\n\t\t{\n\t\tif (signal_caught == SIGTERM || 
signal_caught == SIGINT)\n\t\t\t{\n\t\t\tuser_interrupt(s, i);\n\t\t\tprintf(\"Cleaning up.\\n\");\n\t\t\tsignal_caught = 0;\n\t\t\t}\n\n#ifdef DEBUG\n\t\tprintf(\"\\n\\tbytes_read:=%llu\\n\", bytesread);\n#endif\n\t\tsearch_chunk(s, buf, i, bytesread, f_offset);\n\t\tf_offset += bytesread;\n\t\tif (!get_mode(s, mode_quiet))\n\t\t\t{\n\t\t\tfprintf(stderr, \"*\");\n\n\t\t\t//displayPosition(s,i,f_offset);\n\t\t\t}\n\n\t\t/*FIX ME***\n\t* We should jump back and make sure we didn't miss any headers that are\n\t* bridged between chunks.  What is the best way to do this?\n\t*/\n\t\t}\n\n\tif (!get_mode(s, mode_quiet))\n\t\t{\n\t\tfprintf(stderr, \"|\\n\");\n\t\t}\n\n#ifdef DEBUG\n\tprintf(\"\\n\\tDONE READING bytes_read:=%llu\\n\", bytesread);\n#endif\n\tif (signal_caught == SIGTERM || signal_caught == SIGINT)\n\t\t{\n\t\tuser_interrupt(s, i);\n\t\tprintf(\"Cleaning up.\\n\");\n\t\tsignal_caught = 0;\n\t\t}\n\n\tfree(buf);\n\treturn FALSE;\n}\n\nvoid audit_start(f_state *s, f_info *i)\n{\n\tif (!get_mode(s, mode_quiet))\n\t\t{\n\t\tfprintf(stderr, \"Processing: %s\\n|\", i->file_name);\n\t\t}\n\n\taudit_msg(s, FOREMOST_DIVIDER);\n\taudit_msg(s, \"File: %s\", i->file_name);\n\taudit_msg(s, \"Start: %s\", current_time());\n}\n\nvoid audit_finish(f_state *s, f_info *i)\n{\n\taudit_msg(s, \"Finish: %s\", current_time());\n}\n\nint process_file(f_state *s)\n{\n\n\t//printf(\"processing file\\n\");\n\tf_info\t*i = (f_info *)malloc(sizeof(f_info));\n\tchar\ttemp[PATH_MAX];\n\n\tif ((realpath(s->input_file, temp)) == NULL)\n\t\t{\n\t\tprint_error(s, s->input_file, strerror(errno));\n\t\tfree(i);\t/*Nothing else has been allocated yet*/\n\t\treturn TRUE;\n\t\t}\n\n\ti->file_name = strdup(s->input_file);\n\ti->is_stdin = FALSE;\n\taudit_start(s, i);\n\n\t//  printf(\"opening file %s\\n\",i->file_name);\n#if defined(__LINUX)\n\t#ifdef DEBUG\n\tprintf(\"Using 64 bit fopen\\n\");\n\t#endif\n\ti->handle = fopen64(i->file_name, \"rb\");\n#elif defined(__WIN32)\n\n\t/*I would like to be able to read from\n\t* physical devices in 
Windows, have played\n\t* with different options to fopen and the\n\t* dd src says you need write access on WinXP\n\t* but nothing seems to work*/\n\ti->handle = fopen(i->file_name, \"rb\");\n#else\n\ti->handle = fopen(i->file_name, \"rb\");\n#endif\n\tif (i->handle == NULL)\n\t\t{\n\t\tprint_error(s, s->input_file, strerror(errno));\n\t\taudit_msg(s, \"Error: %s\", strerror(errno));\n\t\tfree(i->file_name);\n\t\tfree(i);\n\t\treturn TRUE;\n\t\t}\n\n\ti->total_bytes = find_file_size(i->handle);\n\tsearch_stream(s, i);\n\taudit_finish(s, i);\n\n\tfclose(i->handle);\n\tfree(i->file_name);\t/*Allocated by strdup above*/\n\tfree(i);\n\treturn FALSE;\n}\n\nint process_stdin(f_state *s)\n{\n\tf_info\t*i = (f_info *)malloc(sizeof(f_info));\n\n\ti->file_name = strdup(\"stdin\");\n\ts->input_file = \"stdin\";\n\ti->handle = stdin;\n\ti->is_stdin = TRUE;\n\n\t/* We can't compute the size of this stream, so we just ignore it*/\n\ti->total_bytes = 0;\n\taudit_start(s, i);\n\n\tsearch_stream(s, i);\n\n\tfree(i->file_name);\n\tfree(i);\n\treturn FALSE;\n}\n"
  },
  {
    "path": "extract.c",
    "content": "\n\t /* extract.c\n * Copyright (c) 2005, Nick Mikus\n * This file contains the file specific functions used to extract\n * data from an image.\n *\n * Each has a similar structure\n * f_state *s:  state of the program.\n * c_offset:\toffset that the header was recorded within the current chunk\n * foundat:\tThe location the header was \"foundat\"\n * buflen:\tHow much buffer is left until the end of the current chunk\n * needle:\tSearch specification\n * f_offset:\tOffset that the current chunk is located within the file\n */\n\n#include \"main.h\"\n#include \"extract.h\"\n#include \"ole.h\"\nextern unsigned char buffer[OUR_BLK_SIZE];\nextern int\tverbose;\nextern int\tdir_count;\nextern int\tblock_list[OUR_BLK_SIZE / sizeof(int)];\nextern int\t*FAT;\nextern char *extract_name;\nextern int\textract;\nextern int\tFATblk;\nextern int\thighblk;\n\n/********************************************************************************\n *Function: extract_zip\n *Description: Given that we have a ZIP header jump through the file headers\n    until we reach the EOF.\n *Return: A pointer to where the EOF of the ZIP is in the current buffer\n**********************************************************************************/\nunsigned char *extract_zip(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset, char *type)\n{\n\tunsigned char\t\t\t\t*currentpos = NULL;\n\tunsigned char\t\t\t\t*buf = foundat;\n\tunsigned short\t\t\t\tcomment_length = 0;\n\tunsigned char\t\t\t\t*extractbuf = NULL;\n\tstruct zipLocalFileHeader\tlocalFH;\n\tu_int64_t\t\t\t\t\tbytes_to_search = 50 * KILOBYTE;\n\tu_int64_t\t\t\t\t\tfile_size = 0;\n\tint\t\t\t\t\t\t\toOffice = FALSE;\n\tint\t\t\t\t\t\t\toffice2007 = FALSE;\n\n\tchar\t\t\t\t\t\tcomment[32];\n\tlocalFH.genFlag=0;\n\tlocalFH.compressed=0;\n\tlocalFH.uncompressed =0;\n\tif (buflen < 100)\n\t\treturn NULL;\n\n\tif (strncmp((char *) &foundat[30], 
\"mimetypeapplication/vnd.sun.xml.\", 32) == 0)\n\t\t{\n\t\toOffice = TRUE;\n\t\tif (strncmp((char *) &foundat[62], \"calc\", 4) == 0)\n\t\t\t{\n\t\t\tneedle->suffix = \"sxc\";\n\t\t\t}\n\t\telse if (strncmp((char *) &foundat[62], \"impress\", 7) == 0)\n\t\t\t{\n\t\t\tneedle->suffix = \"sxi\";\n\t\t\t}\n\t\telse if (strncmp((char *) &foundat[62], \"writer\", 6) == 0)\n\t\t\t{\n\t\t\tneedle->suffix = \"sxw\";\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tsprintf(comment, \" (OpenOffice Doc?)\");\n\t\t\tstrcat(needle->comment, comment);\n\t\t\tneedle->suffix = \"sx\";\n\t\t\t}\n\t\t}\n\telse\n\t\t{\n\t\tneedle->suffix = \"zip\";\n\t\t}\n\n\t\n\twhile (1)\t//Jump through each local file header until the central directory structure is reached, much faster than searching \n\t\t{\n\t\t\n\t\tif (foundat[2] == '\\x03' && foundat[3] == '\\x04')\t//Verfiy we are looking at a local file header//\n\t\t\t{\n\t\t\t\n\t\t\tlocalFH.compression=htos(&foundat[8], FOREMOST_LITTLE_ENDIAN);\n\t\t\tlocalFH.compressed = htoi(&foundat[18], FOREMOST_LITTLE_ENDIAN);\n\t\t\tlocalFH.uncompressed = htoi(&foundat[22], FOREMOST_LITTLE_ENDIAN);\n\t\t\tlocalFH.filename_length = htos(&foundat[26], FOREMOST_LITTLE_ENDIAN);\n\t\t\tlocalFH.extra_length = htos(&foundat[28], FOREMOST_LITTLE_ENDIAN);;\n\t\t\tlocalFH.genFlag = htos(&foundat[6], FOREMOST_LITTLE_ENDIAN);\t\n\n\t\t\t// Sanity checking\n\t\t\tif (localFH.compressed > needle->max_len)\n\t\t\t\treturn foundat + needle->header_len;\n\n\t\t\tif (localFH.filename_length > 100)\n\t\t\t\treturn foundat + needle->header_len;\n\n\t\t\t//Check if we should grab more from the disk\n\t\t\tif (localFH.compressed + 30 > buflen - (foundat - buf))\n\t\t\t\t{\n\t\t\t\treturn NULL;\t\t\t\t\t\t\t\t\n\t\t\t\t}\n\t\t\t\t\n\t\t\t//Size of the local file header data structure\n\t\t\tfoundat += 30;\t\t\t\t\t\t\t\t\t\n\n\t\t\tif (strcmp(needle->suffix,\"zip\")==0)\n\t\t\t\t{\n\t\t\t\tif (strncmp((char *)foundat, \"content.xml\", 11) == 0 && 
strcmp(needle->suffix,\"zip\")==0)\n\t\t\t\t\t{\n\t\t\t\t\toOffice = TRUE;\n\t\t\t\t\tsprintf(comment, \" (OpenOffice Doc?)\");\n\t\t\t\t\tstrcat(needle->comment, comment);\n\t\t\t\t\tneedle->suffix = \"sx\";\n\t\t\t\t\t}\n\t\t\t\telse if (strstr((char *)foundat, \".class\") || strstr((char *)foundat, \".jar\") ||\n\t\t\t\t\t\t strstr((char *)foundat, \".java\"))\n\t\t\t\t\t{\n\t\t\t\t\tneedle->suffix = \"jar\";\n\t\t\t\t\t}\n\t\t\t\telse if(strncmp((char *)foundat, \"[Content_Types].xml\",19)==0)\n\t\t\t\t\t{\n\t\t\t\t\t\toffice2007=TRUE;\n\t\t\t\t\t}\n\t\t\t\telse if(strncmp((char *)foundat, \"ppt/slides\",10)==0 && office2007==TRUE)\n\t\t\t\t\t{\n\t\t\t\t\t\tneedle->suffix = \"pptx\";\n\t\t\t\t\t}\n\t\t\t\telse if(strncmp((char *)foundat, \"word/document.xml\",17)==0 && office2007==TRUE)\n\t\t\t\t\t{\t\n\t\t\t\t\t\tneedle->suffix = \"docx\";\n\t\t\t\t\t}\n\t\t\t\telse if(strncmp((char *)foundat, \"xl/workbook.xml\",15)==0 && office2007==TRUE)\n\t\t\t\t\t{\t\n\t\t\t\t\t\tneedle->suffix = \"xlsx\";\n\t\t\t\t\t}\n\t\t\t\t\t\n\t\t\t\t\t\n\t\t\t\telse\n\t\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\t\t\t/*The entry name is not NUL-terminated, so bound the print by its length*/\n\t\t\t\t\t\tprintf(\"foundat=%.*s\\n\", (int)localFH.filename_length, foundat);\n#endif\n\t\t\t\t\t}\t\n\t\t\t\t}\n\n\t\t\tfoundat += localFH.compressed;\n\t\t\tfoundat += localFH.filename_length;\n\t\t\tfoundat += localFH.extra_length;\n\t\t\t\n\t\t\tif (localFH.genFlag == 8)\n\t\t\t\t{\n#ifdef DEBUG\t\n\t\t\t\t\tfprintf(stderr,\"We have extra stuff!!!\");\n#endif\n\t\t\t\t}\n\t\t\t\n\t\t\t\n\t\t\tif(localFH.genFlag & 1<<3 && localFH.uncompressed==0 &&  localFH.compressed==0 )\n\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\tfprintf(stderr,\"No data to jump over, just search for the next file footer (localFH.genFlag:=%d)\\n\",localFH.genFlag);\n#endif\n\t\t\t\tbreak;\n\t\t\t\t}\n\n\t#ifdef DEBUG\n\t\t\t\tprintf(\"localFH.compressed:=%d  localFH.uncompressed:=%d\\n\\t jumping %d bytes filename=%d bytes\",\n\t\t\t\t\t   localFH.compressed,\n\t\t\t\t\t   
localFH.uncompressed,localFH.filename_length+localFH.compressed+localFH.extra_length,localFH.filename_length);\n\t\t\t\tprintx(foundat, 0, 16);\n\t#endif\n\n\t\t\t}\t\n\t\telse if (oOffice && localFH.genFlag == 8)\n\t\t\t{\n\t\t\tbreak;\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tbreak;\n\t\t\t}\n\t\t\t\n\t\t\n\t}//end while loop\n\t\n\tif (oOffice)\n\t\t{\n\n\t\t//We have an OO doc, so how long should we search for?\n\t\tbytes_to_search = 1 * MEGABYTE;\n\t\t}\n\telse if (localFH.genFlag & 1<<3 && localFH.uncompressed==0 &&  localFH.compressed==0 )\n\t\t{\n\t\tbytes_to_search = needle->max_len;\n\t\t}\n\telse\n\t\t{\n\t\tbytes_to_search = (buflen < (foundat - buf) ? buflen : buflen - (foundat - buf));\n\t\t}\n\n\t//Make sure we are not searching more than what we have\n\tif (buflen <= (foundat - buf)) {\n#ifdef DEBUG\n\t\tprintf(\"avoided bug in extract_zip!\\n\");\n#endif\n\t\tbytes_to_search = 0;\n\t} else {\n\t\tif (buflen - (foundat - buf) < bytes_to_search)\n\t\t{\n\t\tbytes_to_search = buflen - (foundat - buf);\n\t\t}\n\t}\n\n\n\tcurrentpos = foundat;\n#ifdef DEBUG\n\tprintf(\"Search for the footer bytes_to_search:=%lld buflen:=%lld\\n\", bytes_to_search, buflen);\n#endif\n\n\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n#ifdef DEBUG\n\tprintf(\"Search complete \\n\");\n#endif\n\n\tif (foundat)\t\t\t\t\t\t\t\t\t\t\t/*Found the end of the central directory structure, determine the exact length and extract*/\n\t{\n\n\t\t/*Jump to the comment length field*/\n#ifdef DEBUG\n\t\tprintf(\"distance searched:=%lu\\n\", foundat - currentpos);\n#endif\n\t\tif (buflen - (foundat - buf) > 20)\n\t\t\t{\n\t\t\tfoundat += 20;\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\treturn NULL;\n\t\t\t}\n\n\t\tcomment_length = htos(foundat, FOREMOST_LITTLE_ENDIAN);\n\t\tfoundat += comment_length + 2;\n\t\tfile_size = (foundat 
- buf);\n#ifdef DEBUG\n\t\tprintf(\"File size %lld\\n\", file_size);\n\t\tprintf(\"Found a %s type:=%s\\n\", needle->suffix, type);\n#endif\n\t\textractbuf = buf;\n\t\tif (strcmp(type,\"all\")==0 || strcmp(type,needle->suffix)==0)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"Writing a %s to disk\\n\", needle->suffix);\n#endif\n\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t}\n\n#ifdef DEBUG\n\t\tprintf(\"Found a %s\\n\", needle->suffix);\n#endif\n\t\treturn foundat-2;\n\t}\n\n\tif (bytes_to_search > buflen - (currentpos - buf))\n\t\treturn NULL;\n\n#ifdef DEBUG\n\tprintf(\"I give up \\n\");\n#endif\n\treturn currentpos;\n}\n\n/********************************************************************************\n *Function: extract_pdf\n *Description: Given that we have a PDF header check if it is Linearized, if so\n    grab the file size and we are done, else search for the %%EOF\n*Return: A pointer to where the EOF of the PDF is in the current buffer\n**********************************************************************************/\nunsigned char *extract_pdf(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t\t*currentpos = NULL;\n\tunsigned char\t\t*buf = foundat;\n\tunsigned char\t\t*extractbuf = NULL;\n\tunsigned char\t\t*tempsize;\n\tunsigned long int\tsize = 0;\n\tint\t\t\t\t\tfile_size = 0;\n\tunsigned char\t\t*header = foundat;\n\tint\t\t\t\t\tbytes_to_search = 0;\n\tchar\t\t\t\tcomment[32];\n\n\tfoundat += needle->header_len;\t/* Jump Past the %PDF HEADER */\n\tcurrentpos = foundat;\n\n#ifdef DEBUG\n\tprintf(\"PDF SEARCH\\n\");\n#endif\n\n\t/*Determine when we have searched enough*/\n\tif (buflen >= needle->max_len)\n\t\t{\n\t\tbytes_to_search = needle->max_len;\n\t\t}\n\telse\n\t\t{\n\t\tbytes_to_search = buflen;\n\t\t}\n\n\t/*Check if the buffer is less than 100 bytes, if so search what we have*/\n\tif (buflen < 512)\n\t\treturn 
NULL;\n\telse\n\t\t{\n\t\tcurrentpos = foundat;\n\n\t\t/*Check for .obj in the first 100 bytes*/\n\t\tfoundat = bm_search(needle->markerlist[1].value,\n\t\t\t\t\t\t\tneedle->markerlist[1].len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\t100,\n\t\t\t\t\t\t\tneedle->markerlist[1].marker_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\t\tif (!foundat)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"no obj found\\n\");\n#endif\n\t\t\treturn currentpos + 100;\n\t\t}\n\n\t\tfoundat = currentpos;\n\n\t\t/*Search for \"./L \" to see if the file is linearized*/\n\t\tfoundat = bm_search(needle->markerlist[2].value,\n\t\t\t\t\t\t\tneedle->markerlist[2].len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\t512,\n\t\t\t\t\t\t\tneedle->markerlist[2].marker_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\t\tif (foundat)\n\t\t\t{\n\t\t\tfoundat = bm_search(needle->markerlist[0].value,\n\t\t\t\t\t\t\t\tneedle->markerlist[0].len,\n\t\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\t\t512,\n\t\t\t\t\t\t\t\tneedle->markerlist[0].marker_bm_table,\n\t\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t\t\t}\n\t\telse\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"not linearized\\n\");\n#endif\n\t\t}\n\t\t}\n\n\tif (foundat)\t\t\t\t\t/*The PDF is linearized, extract the size and we are done*/\n\t\t{\n\t\tsprintf(comment, \" (PDF is Linearized)\");\n\t\tstrcat(needle->comment, comment);\n\n\t\tfoundat += needle->markerlist[0].len;\n\t\ttempsize = (unsigned char *)malloc(9 * sizeof(char));\t/*8 digits plus a terminating NUL*/\n\t\tmemcpy(tempsize, foundat, 8);\n\t\ttempsize[8] = '\\0';\t\t/*atoi needs a NUL-terminated string*/\n\t\tsize = atoi((char *)tempsize);\n\n\t\tfree(tempsize);\n\t\tif (size <= 0)\n\t\t\treturn foundat;\n\t\tif (size > buflen)\n\t\t\t{\n\t\t\tif (size > needle->max_len)\n\t\t\t\treturn foundat;\n\t\t\telse\n\t\t\t\treturn NULL;\n\t\t\t}\n\n\t\theader += size;\n\t\tfoundat = header;\n\t\tfoundat -= needle->footer_len;\n\n\t\t/*Jump back 10 bytes and see if we actually have an EOF there*/\n\t\tfoundat -= 
10;\n\t\tcurrentpos = foundat;\n\t\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tneedle->footer_len + 9,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t\tif (foundat)\t\t\t\t/*There is an valid EOF at the end, Write to disk*/\n\t\t\t{\n\t\t\tfoundat += needle->footer_len + 1;\n\t\t\tfile_size = (foundat - buf);\n\n\t\t\textractbuf = buf;\n\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\n\t\t\treturn foundat;\n\t\t\t}\n\n\t\treturn NULL;\n\n\t\t}\n\telse\t\t\t\t\t\t\t/*Search for Linearized PDF failed, just look for %%EOF */\n\t{\n#ifdef DEBUG\n\t\tprintf(\"\tLinearized search failed, searching %d bytes, buflen:=%lld\\n\",\n\t\t\t   bytes_to_search,\n\t\t\t   buflen - (header - buf));\n#endif\n\t\tfoundat = currentpos;\n\t\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\t\tif (foundat)\t\t\t\t/*Write the non-linearized PDF to disk*/\n\t\t\t{\n\t\t\tfoundat += needle->footer_len + 1;\n\t\t\tfile_size = (foundat - buf);\n\t\t\textractbuf = buf;\n\n\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\n\t\t\treturn foundat;\n\n\t\t\t}\n\n\t\treturn NULL;\n\t}\n\n}\n\n/********************************************************************************\n *Function: extract_cpp\n *Description: Use keywords to attempt to find C/C++ source code\n*Return: A pointer to where the EOF of the CPP file is in the current buffer\n**********************************************************************************/\nunsigned char *extract_cpp(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\n\tunsigned char\t*header = 
foundat;\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tint\t\t\t\tend = 0;\n\tint\t\t\t\tstart = 0;\n\tint\t\t\t\ti = 0;\n\tint\t\t\t\tmarker_score = 0;\n\tint\t\t\t\tok = FALSE;\n\tint\t\t\t\tfile_size = 0;\n\tunsigned char\t*footer = NULL;\n\n\t/*Search for a \" or a < within 20 bytes of a #include statement*/\n\tfor (i = 0; i < 20; i++)\n\t\t{\n\t\tif (foundat[i] == '\\x22' || foundat[i] == '\\x3C')\n\t\t\t{\n\t\t\tok = TRUE;\n\t\t\t}\n\t\t}\n\n\tif (!ok)\n\t\treturn foundat + needle->header_len;\n\n\t/*Keep running through the buffer until an non printable character is reached*/\n\twhile (isprint(foundat[end]) || foundat[end] == '\\x0a' || foundat[end] == '\\x09')\n\t\t{\n\t\tend++;\n\t\t}\n\n\tfoundat += end - 1;\n\tfooter = foundat;\n\n\tif (end < 50)\n\t\treturn foundat;\n\n\t/*Now lets go the other way and grab all those comments at the begining of the file*/\n\twhile (isprint(buf[start]) || buf[start] == '\\x0a' || buf[start] == '\\x09')\n\t\t{\n\t\tstart--;\n\t\t}\n\n\theader = &buf[start + 1];\n\tfile_size = (footer - header);\n\n\tfoundat = header;\n\n\t/*Now we have an ascii file to look for keywords in*/\n\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\theader,\n\t\t\t\t\t\tfile_size,\n\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\tFALSE,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\tif (foundat)\n\t\tmarker_score += 1;\n\n\tfoundat = header;\n\tfoundat = bm_search(needle->markerlist[0].value,\n\t\t\t\t\t\tneedle->markerlist[0].len,\n\t\t\t\t\t\theader,\n\t\t\t\t\t\tfile_size,\n\t\t\t\t\t\tneedle->markerlist[0].marker_bm_table,\n\t\t\t\t\t\t1,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\tif (foundat)\n\t\tmarker_score += 1;\n\n\tif (marker_score == 0)\n\t\treturn foundat;\n\n\tif (foundat)\n\t\t{\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset + start + 1);\n\t\t\n\t\treturn footer;\n\n\t\t}\n\n\treturn 
NULL;\n}\n\n/********************************************************************************\n *Function: extract_htm\n *Description: Given that we have a HTM header\n    search for the file EOF and check that the bytes areound the header are ascii\n*Return: A pointer to where the EOF of the HTM is in the current buffer\n**********************************************************************************/\nunsigned char *extract_htm(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tunsigned char\t*currentpos = NULL;\n\n\tint\t\t\t\tbytes_to_search = 0;\n\tint\t\t\t\ti = 0;\n\tint\t\t\t\tfile_size = 0;\n\n\t/*Jump past the <HTML tag*/\n\tfoundat += needle->header_len;\n\n\t/*Check the first 16 bytes to see if they are ASCII*/\n\tfor (i = 0; i < 16; i++)\n\t\t{\n\t\tif (!isprint(foundat[i]) && foundat[i] != '\\x0a' && foundat[i] != '\\x09')\n\t\t\t{\n\t\t\treturn foundat + 16;\n\t\t\t}\n\t\t}\n\n\t/*Determine if the buffer is large enough to encompass a reasonable search*/\n\tif (buflen < needle->max_len)\n\t\t{\n\t\tbytes_to_search = buflen - (foundat - buf);\n\t\t}\n\telse\n\t\t{\n\t\tbytes_to_search = needle->max_len;\n\t\t}\n\n\t/*Store the current position and search for the HTML> tag*/\n\tcurrentpos = foundat;\n\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\tif (foundat)\t//Found the footer, write to disk\n\t\t{\n\t\tfile_size = (foundat - buf) + needle->footer_len;\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\tfoundat += needle->footer_len;\n\t\treturn foundat;\n\n\t\t}\n\telse\n\t\t{\n\t\treturn 
NULL;\n\t\t}\n\n}\n\n/********************************************************************************\n *Function: valid_ole_header\n *Description: Run various tests against an OLE header to determine whether or not\n \tit is valid.\n*Return: TRUE/FALSE\n**********************************************************************************/\nint valid_ole_header(struct OLE_HDR *h)\n{\n\n\tif (htos((unsigned char *) &h->reserved, FOREMOST_LITTLE_ENDIAN) != 0 ||\n\t\thtoi((unsigned char *) &h->reserved1, FOREMOST_LITTLE_ENDIAN) != 0 ||\n\t\thtoi((unsigned char *) &h->reserved2, FOREMOST_LITTLE_ENDIAN) != 0)\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\t/*The mini sector shift is usually 6 (2^6 = 64 bytes) and the uSectorShift is 9 (2^9 = 512 bytes)*/\n\tif (htos((unsigned char *) &h->uMiniSectorShift, FOREMOST_LITTLE_ENDIAN) != 6 ||\n\t\thtos((unsigned char *) &h->uSectorShift, FOREMOST_LITTLE_ENDIAN) != 9 ||\n\t\thtoi((unsigned char *) &h->dir_flag, FOREMOST_LITTLE_ENDIAN) < 0)\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\t/*Sanity Checking*/\n\tif (htoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN) <= 0 ||\n\t\thtoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN) > 100)\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\tif (htoi((unsigned char *) &h->num_extra_FAT_blocks, FOREMOST_LITTLE_ENDIAN) < 0 ||\n\t\thtoi((unsigned char *) &h->num_extra_FAT_blocks, FOREMOST_LITTLE_ENDIAN) > 100)\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\treturn TRUE;\n\n}\n\n/********************************************************************************\n *Function: check_ole_name\n *Description: Determine what type of file is stored in the OLE format based on the\n \tnames of the DIRENT entries in the FAT.\n*Return: A char* consisting of the suffix of the appropriate file, or NULL if unknown.\n**********************************************************************************/\nchar *check_ole_name(char *name)\n{\n\tif (strstr(name, \"WordDocument\"))\n\t\t{\n\t\treturn \"doc\";\n\t\t}\n\telse if (strstr(name, \"Worksheet\") || strstr(name, 
\"Book\") || strstr(name, \"Workbook\"))\n\t\t{\n\t\treturn \"xls\";\n\t\t}\n\telse if (strstr(name, \"Power\"))\n\t\t{\n\t\treturn \"ppt\";\n\t\t}\n\telse if (strstr(name, \"Access\") || strstr(name, \"AccessObjSiteData\"))\n\t\t{\n\t\treturn \"mbd\";\n\t\t}\n\telse if (strstr(name, \"Visio\"))\n\t\t{\n\t\treturn \"vis\";\n\t\t}\n\telse if (strstr(name, \"Sfx\"))\n\t\t{\n\t\treturn \"sdw\";\n\t\t}\n\telse\n\t\t{\n\t\treturn NULL;\n\t\t}\n\n\treturn NULL;\n\n}\n\nint adjust_bs(int size, int bs)\n{\n\tint rem = (size % bs);\n\n\tif (rem == 0)\n\t\t{\n\n\t\treturn size;\n\t\t}\n\n#ifdef DEBUG\n\tprintf(\"\\tnew size:=%d\\n\", size + (bs - rem));\n#endif\n\treturn (size + (bs - rem));\n\n}\n\n/********************************************************************************\n *Function: extract_ole\n *Description: Given that we have a OLE header, jump through the OLE structure and\n    determine what type of file it is.\n*Return: A pointer to where the EOF of the OLE is in the current buffer\n**********************************************************************************/\nunsigned char *extract_ole(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset, char *type)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tchar\t\t\t*temp = NULL;\n\tchar\t\t\t*suffix = \"ole\";\n\tint\t\t\t\ttotalsize = 0;\n\tint\t\t\t\textrasize = 0;\n\tint\t\t\t\toldblk = 0;\n\tint\t\t\t\ti, j;\n\tint\t\t\t\tsize = 0;\n\tint\t\t\t\tblknum = 0;\n\tint\t\t\t\tvalidblk = 512;\n\tint\t\t\t\tfile_size = 0;\n\tint\t\t\t\tnum_extra_FAT_blocks = 0;\n\tunsigned char\t*htoi_c = NULL;\n\tint\t\t\t\textra_dir_blocks = 0;\n\tint\t\t\t\tnum_FAT_blocks = 0;\n\tint\t\t\t\tnext_FAT_block = 0;\n\tunsigned char\t*p;\n\tint\t\t\t\tfib = 1024;\n\tstruct OLE_HDR\t*h = NULL;\n\n\tint\t\t\t\tresult = 0;\n\tint\t\t\t\thighblock = 0;\n\tunsigned long\tminiSectorCutoff = 0;\n\tunsigned long\tcsectMiniFat = 
0;\n\n\t/*Deal with globals defined in the OLE API, ugly*/\n\tif (dirlist != NULL)\n\t\tfree(dirlist);\n\tif (FAT != NULL)\n\t\tfree(FAT);\n\tinit_ole();\n\n\tif (buflen < validblk)\n\t\tvalidblk = buflen;\n\th = (struct OLE_HDR *)foundat;\t/*cast the header block to point at foundat*/\n#ifdef DEBUG\n\tdump_header(h);\n#endif\n\tnum_FAT_blocks = htoi((unsigned char *) &h->num_FAT_blocks, FOREMOST_LITTLE_ENDIAN);\n\n\tif (!valid_ole_header(h))\n\t\treturn (buf + validblk);\n\n\tminiSectorCutoff = htoi((unsigned char *) &h->miniSectorCutoff, FOREMOST_LITTLE_ENDIAN);\n\tcsectMiniFat = htoi((unsigned char *) &h->csectMiniFat, FOREMOST_LITTLE_ENDIAN);\n\tnext_FAT_block = htoi((unsigned char *) &h->FAT_next_block, FOREMOST_LITTLE_ENDIAN);\n\tnum_extra_FAT_blocks = htoi((unsigned char *) &h->num_extra_FAT_blocks, FOREMOST_LITTLE_ENDIAN);\n\n\tFAT = (int *)Malloc(OUR_BLK_SIZE * (num_FAT_blocks + 1));\n\tp = (unsigned char *)FAT;\n\tmemcpy(p, &h[1], OUR_BLK_SIZE - FAT_START);\n\tif (next_FAT_block > 0)\n\t\t{\n\t\tp += (OUR_BLK_SIZE - FAT_START);\n\t\tblknum = next_FAT_block;\n\t\tfor (i = 0; i < num_extra_FAT_blocks; i++)\n\t\t\t{\n\t\t\tif (!get_block(buf, blknum, p, buflen))\n\t\t\t\treturn buf + validblk;\n\t\t\tvalidblk = (blknum + 1) * OUR_BLK_SIZE;\n\t\t\tp += OUR_BLK_SIZE - sizeof(int);\n\t\t\tblknum = htoi(p, FOREMOST_LITTLE_ENDIAN);\n\t\t\t}\n\t\t}\n\n\tblknum = htoi((unsigned char *) &h->root_start_block, FOREMOST_LITTLE_ENDIAN);\n\n\tif(blknum < 0)\n\t{\n\t\treturn buf + 10;\n\t}\n\n\thighblock = htoi((unsigned char *) &h->dir_flag, FOREMOST_LITTLE_ENDIAN);\n#ifdef DEBUG\n\tprintf(\"getting dir block\\n\");\n#endif\n\n\t//if(!get_dir_block (buf, blknum, buflen)) return buf+validblk;\n\tif (!get_block(buf, blknum, buffer, buflen))\n\t\treturn buf + validblk;\t\t/*GET DIR BLOCK*/\n#ifdef DEBUG\n\tprintf(\"done getting dir block\\n\");\n#endif\n\tvalidblk = (blknum + 1) * OUR_BLK_SIZE;\t\n\twhile (blknum != END_OF_CHAIN)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"finding dir 
info extra_dir_blks:=%d\\n\", extra_dir_blocks);\n#endif\n\t\tif (extra_dir_blocks > 300)\n\t\t\treturn buf + validblk;\n\n\t\t/**PROBLEMA**/\n#ifdef DEBUG\n\t\tprintf(\"***blknum:=%d FATblk:=%d ourblksize=%d\\n\", blknum, FATblk,OUR_BLK_SIZE);\n#endif\n\t\toldblk = blknum;\n\t\thtoi_c = (unsigned char *) &FAT[blknum / (OUR_BLK_SIZE / sizeof(int))];\n\n\t\tFATblk = htoi(htoi_c, FOREMOST_LITTLE_ENDIAN);\n#ifdef DEBUG\n\t\tprintf(\"***blknum:=%d FATblk:=%d\\n\", blknum, FATblk);\n#endif\n\n\t\tif (!get_FAT_block(buf, blknum, block_list, buflen))\n\t\t\treturn buf + validblk;\n\t\tblknum = htoi((unsigned char *) &block_list[blknum % 128], FOREMOST_LITTLE_ENDIAN);\n#ifdef DEBUG\n\t\tprintf(\"**blknum:=%d FATblk:=%d\\n\", blknum, FATblk);\n#endif\n\t\tif (blknum == END_OF_CHAIN || oldblk == blknum)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"EOC\\n\");\n#endif\n\t\t\tbreak;\n\t\t}\n\n\t\textra_dir_blocks++;\n\t\tresult = get_dir_block(buf, blknum, buflen);\n\t\tif (result == SHORT_BLOCK)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"SHORT BLK\\n\");\n#endif\n\t\t\tbreak;\n\t\t}\n\t\telse if (!result)\n\t\t\treturn buf + validblk;\n\n\t}\n\n#ifdef DEBUG\n\tprintf(\"DONE WITH WHILE\\n\");\n#endif\n\tblknum = htoi((unsigned char *) &h->root_start_block, FOREMOST_LITTLE_ENDIAN);\n\tsize = OUR_BLK_SIZE * (extra_dir_blocks + 1);\n\tdirlist = (struct DIRECTORY *)Malloc(size);\n\tmemset(dirlist, 0, size);\n\n\tif (!get_block(buf, blknum, buffer, buflen))\n\t\treturn buf + validblk;\t\t/*GET DIR BLOCK*/\n\n\tif (!get_dir_info(buffer))\n\t\t{\n\t\treturn foundat + validblk;\n\t\t}\n\n\tfor (i = 0; i < extra_dir_blocks; i++)\n\t\t{\n\t\tif (!get_FAT_block(buf, blknum, block_list, buflen))\n\t\t\treturn buf + validblk;\n\t\tblknum = htoi((unsigned char *) &block_list[blknum % 128], FOREMOST_LITTLE_ENDIAN);\n\t\tif (blknum == END_OF_CHAIN)\n\t\t\tbreak;\n#ifdef DEBUG\n\t\tprintf(\"getting dir blk blknum=%d\\n\", blknum);\n#endif\n\t\tif (!get_block(buf, blknum, buffer, buflen))\n\t\t\treturn buf + 
validblk;\t/*GET DIR BLOCK*/\n\t\tif (!get_dir_info(buffer))\n\t\t\t{\n\t\t\treturn buf + validblk;\n\t\t\t}\n\t\t}\n\n#ifdef DEBUG\n\tprintf(\"dir count is %d\\n\", i);\n#endif\n\tfor (dl = dirlist, i = 0; i < dir_count; i++, dl++)\n\t\t{\n\t\tmemset(buffer, ' ', 75);\n\t\tj = htoi((unsigned char *) &dl->level, FOREMOST_LITTLE_ENDIAN) * 4;\n\t\tsprintf((char *) &buffer[j], \"%-s\", dl->name);\n\t\tj = strlen((char *)buffer);\n\n\t\tif (dl->name[0] == '@')\n\t\t\treturn foundat + validblk;\n\t\tif (dl->type == STREAM)\n\t\t\t{\n\t\t\tbuffer[j] = ' ';\n\t\t\tsprintf((char *) &buffer[60], \"%8d\\n\", dl->size);\n\n\t\t\tif (temp == NULL)\t\t/*check if we have alread defined the type*/\n\t\t\t\t{\n\t\t\t\ttemp = check_ole_name(dl->name);\n\t\t\t\tif (temp)\n\t\t\t\t\tsuffix = temp;\n\t\t\t\t}\n\n\t\t\tif (dl->size > miniSectorCutoff)\n\t\t\t\t{\n\t\t\t\ttotalsize += adjust_bs(dl->size, 512);\n\t\t\t\t}\n\t\t\telse\n\t\t\t\t{\n\t\t\t\ttotalsize += adjust_bs(dl->size, 64);\n\t\t\t\t}\n\n#ifdef DEBUG\n\t\t\tfprintf(stdout, buffer);\n#endif\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tsprintf((char *) &buffer[j], \"\\n\");\n#ifdef DEBUG\n\t\t\tprintf(\"\\tnot stream data \\n\");\n\t\t\tfprintf(stdout, buffer);\n#endif\n\n\t\t\textrasize += adjust_bs(dl->size, 512);\n\n\t\t\t}\n\t\t}\n\n\ttotalsize += fib;\n#ifdef DEBUG\n\tprintf(\"DIR SIZE:=%d, numFATblks:=%d MiniFat:=%d\\n\",\n\t\t   adjust_bs(((dir_count) * 128), 512),\n\t\t   (num_FAT_blocks * 512),\n\t\t   adjust_bs((64 * csectMiniFat), 512));\n#endif\n\ttotalsize += adjust_bs(((dir_count) * 128), 512);\n\ttotalsize += (num_FAT_blocks * 512);\n\ttotalsize += adjust_bs((64 * csectMiniFat), 512);\n\tif ((highblk + 5) > highblock && highblk > 0)\n\t\t{\n\t\thighblock = highblk + 5;\n\t\t}\n\n\thighblock = highblock * 512;\n\n#ifdef DEBUG\n\tprintf(\"\\t highblock:=%d\\n\", highblock);\n#endif\n\tif (highblock > totalsize)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"\tTotal size:=%d a difference of %lld\\n\", totalsize, buflen - 
totalsize);\n\t\tprintf(\"\tExtra size:=%d \\n\", extrasize);\n\t\tprintf(\"\tHighblock is greater than totalsize\\n\");\n#endif\n\t\ttotalsize = highblock;\n\t}\n\n\ttotalsize = adjust_bs(totalsize, 512);\n#ifdef DEBUG\n\tprintf(\"\tTotal size:=%d a difference of %lld\\n\", totalsize, buflen - totalsize);\n\tprintf(\"\tExtra size:=%d \\n\", extrasize);\n#endif\n\n\tif (buflen < totalsize)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"\t***Error not enough left in the buffer left:=%lld needed=%d***\\n\",\n\t\t\t   buflen,\n\t\t\t   totalsize);\n#endif\n\t\ttotalsize = buflen;\n\t}\n\n\tfoundat = buf;\n\thighblock -= 5 * 512;\n\tif (highblock > 0 && highblock < buflen)\n\t\t{\n\t\tfoundat += highblock;\n\t\t}\n\telse\n\t\t{\n\t\tfoundat += totalsize;\n\t\t}\n\n\t/*Return to the highest blknum read in the file, that way we don't miss files that are close*/\n\tfile_size = totalsize;\n\textractbuf = buf;\n\n\tif (suffix)\n\t\tneedle->suffix = suffix;\n\n\tif (!strstr(needle->suffix, type) && strcmp(type,\"all\")!=0)\n\t\t{\n\t\treturn foundat;\n\t\t}\n\n\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\treturn foundat;\n\n}\n\n//********************************************************************************/\nint check_mov(unsigned char *atom)\n{\n#ifdef DEBUG\n\tprintf(\"Atom:= %c%c%c%c\\n\", atom[0], atom[1], atom[2], atom[3]);\n#endif\n\tif (strncmp((char *)atom, \"free\", 4) == 0 || strncmp((char *)atom, \"mdat\", 4) == 0 ||\n\t\tstrncmp((char *)atom, \"free\", 4) == 0 || strncmp((char *)atom, \"wide\", 4) == 0 ||\n\t\tstrncmp((char *)atom, \"PICT\", 4) == 0)\n\t\t{\n\t\treturn TRUE;\n\t\t}\n\n\tif (strncmp((char *)atom, \"trak\", 4) == 0 || strncmp((char *)atom, \"mdat\", 4) == 0 ||\n\t\tstrncmp((char *)atom, \"mp3\", 3) == 0 || strncmp((char *)atom, \"wide\", 4) == 0 ||\n\t\tstrncmp((char *)atom, \"moov\", 4) == 0)\n\t\t{\n\t\treturn TRUE;\n\t\t}\n\n\treturn 
FALSE;\n}\n\n/********************************************************************************\n *Function: extract_mov\n *Description: Given that we have a MOV header JUMP through the mov data structures\n    until we reach EOF\n*Return: A pointer to where the EOF of the MOV is in the current buffer\n**********************************************************************************/\nunsigned char *extract_mov(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat - 4;\n\tunsigned char\t*extractbuf = NULL;\n\tunsigned int\tatomsize = 0;\n\tunsigned int\tfilesize = 0;\n\tint\t\t\t\tmdat = FALSE;\n\tfoundat -= 4;\n\tbuflen += 4;\n\twhile (1)\t\t\t\t\t\t/*Loop through all the atoms until the EOF is reached*/\n\t\t{\n\t\tatomsize = htoi(foundat, FOREMOST_BIG_ENDIAN);\n#ifdef DEBUG\n\t\tprintf(\"Atomsize:=%d\\n\", atomsize);\n#endif\n\t\tif (atomsize <= 0 || atomsize > needle->max_len)\n\t\t\t{\n\t\t\treturn foundat + needle->header_len + 4;\n\t\t\t}\n\n\t\tfilesize += atomsize;\t\t/*Add the atomsize to the total file size*/\n\t\tif (filesize > buflen)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"file size > buflen fs:=%d bf:=%lld\\n\", filesize, buflen);\n#endif\n\t\t\tif (buflen >= needle->max_len)\n\t\t\t\treturn foundat + needle->header_len + 4;\n\t\t\telse\n\t\t\t\t{\n\t\t\t\treturn NULL;\n\t\t\t\t}\n\t\t}\n\n\t\tfoundat += atomsize;\n\t\tif (buflen - (foundat - buf) < 5)\n\t\t\t{\n\t\t\tif (mdat)\n\t\t\t\t{\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\telse\n\t\t\t{\n#ifdef DEBUG\n\t\t\t\tprintf(\"No mdat found\");\n#endif\n\t\t\t\treturn foundat;\n\t\t\t}\n\t\t\t}\n\n\t\t/*Check if we have an mdat atom, these are required thus can be used to\n\t* Weed out corrupted file*/\n\t\tif (strncmp((char *)foundat + 4, \"mdat\", 4) == 0)\n\t\t\t{\n\t\t\tmdat = TRUE;\n\t\t\t}\n\n\t\tif (check_mov(foundat + 4)) /*Check to see if we are at a valid header*/\n\t\t{\n#ifdef 
DEBUG\n\t\t\tprintf(\"Checkmov succeeded\\n\");\n#endif\n\t\t}\n\t\telse\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"Checkmov failed\\n\");\n#endif\n\t\t\tif (mdat)\n\t\t\t\t{\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\telse\n\t\t\t{\n#ifdef DEBUG\n\t\t\t\tprintf(\"No mdat found\");\n#endif\n\t\t\t\treturn foundat;\n\n\t\t\t}\n\t\t}\n\t\t}\t\t\t\t\t\t\t//End loop\n\n\tif (foundat)\n\t\t{\n\n\t\tfilesize = (foundat - buf);\n#ifdef DEBUG\n\t\tprintf(\"file size:=%d\\n\", filesize);\n#endif\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, filesize, extractbuf, c_offset + f_offset - 4);\n\t\treturn foundat;\n\t\t}\n\n#ifdef DEBUG\n\tprintf(\"NULL Atomsize:=%d\\n\", atomsize);\n#endif\n\treturn NULL;\n\n}\n\n/********************************************************************************\n *Function: extract_wmv\n *Description: Given that we have a WMV header\n    search for the file header and grab the file size.\n*Return: A pointer to where the EOF of the WMV is in the current buffer\n**********************************************************************************/\nunsigned char *extract_wmv(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\n\tunsigned char\t*currentpos = NULL;\n\tunsigned char\t*header = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tunsigned char\t*buf = foundat;\n\tunsigned int\t\tsize = 0;\n\tu_int64_t\t\tfile_size = 0;\n\tu_int64_t\t\t\theaderSize = 0;\n\tu_int64_t\t\t\tfileObjHeaderSize = 0;\n\tint\t\t\t\tnumberofHeaderObjects = 0;\n\tint\t\t\t\treserved[2];\n\tint\t\t\t\tbytes_to_search = 0;\n\n\t/*If we have less than a WMV header bail out*/\n\tif (buflen < 70)\n\t\treturn NULL;\n\n\tfoundat += 16;\t\t/*Jump to the header size*/\n\theaderSize = htoll(foundat, FOREMOST_LITTLE_ENDIAN);\n\t//printx(foundat,0,8);\n\tfoundat += 8;\n\tnumberofHeaderObjects = htoi(foundat, FOREMOST_LITTLE_ENDIAN);\n\tfoundat += 4;\t\t//Jump to the begin File properties obj\n\treserved[0] = 
foundat[0];\n\treserved[1] = foundat[1];\n\tfoundat += 2;\n\t//printf(\"found WMV\\n\");\n\t//end header obj\n\t//****************************************************/\n\t//Sanity Check\n\t//printf(\"WMV num_header_objs=%d headerSize=%llu\\n\",numberofHeaderObjects,headerSize);\n\n\tif (headerSize <= 0 || numberofHeaderObjects <= 0 || reserved[0] != 1)\n\t\t{\n\t\tprintf(\"WMV err num_header_objs=%d headerSize=%llu\\n\",numberofHeaderObjects,headerSize);\n\t\treturn foundat;\n\t\t}\n\n\tcurrentpos = foundat;\n\tif (buflen - (foundat - buf) >= needle->max_len)\n\t\tbytes_to_search = needle->max_len;\n\telse\n\t\tbytes_to_search = buflen - (foundat - buf);\n\n\t/*Note we are not searching for the footer here, just the file header ID so we can get the file size*/\n\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\tif (foundat)\n\t\t{\n\t\tfoundat += 16;\t/*jump to the headersize*/\n\t\tfileObjHeaderSize = htoll(foundat, FOREMOST_LITTLE_ENDIAN);\n\t\t//printx(foundat,0,8);\n\t\tfoundat += 24;\t//Jump to the file size obj\n\t\tsize = htoi(foundat, FOREMOST_LITTLE_ENDIAN);\n\t\t//printx(foundat,0,8);\n\t\t\n#ifdef DEBUG\n\t\tprintf(\"SIZE:=%u fileObjHeaderSize=%llu\\n\", size,fileObjHeaderSize);\n#endif\n\t\t}\n\telse\n\t\t{\n\t\treturn NULL;\n\t\t}\n\n\t/*Sanity check data*/\n\tif (size > 0 && size <= needle->max_len && size <= buflen)\n\t\t{\n\t\theader += size;\n#ifdef DEBUG\n\t\tprintf(\"	Found a WMV at:=%llu,File size:=%u\\n\", c_offset, size);\n\t\tprintf(\"	Headersize:=%llu, numberofHeaderObjects:= %d ,reserved:=%d,%d\\n\",\n\t\t\t   headerSize,\n\t\t\t   numberofHeaderObjects,\n\t\t\t   reserved[0],\n\t\t\t   reserved[1]);\n#endif\n\n\t\t/*Everything seems ok, write to disk*/\n\t\tfile_size = (header - buf);\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, file_size, extractbuf, 
c_offset + f_offset);\n\t\tfoundat += file_size;\n\t\treturn header;\n\t\t}\n\n\treturn NULL;\n\n}\n\n/********************************************************************************\n *Function: extract_riff\n *Description: Given that we have a RIFF header parse header and grab the file size.\n *Return: A pointer to where the EOF of the RIFF is in the current buffer\n **********************************************************************************/\nunsigned char *extract_riff(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t\ts_spec *needle, u_int64_t f_offset, char *type)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tint\t\t\t\tsize = 0;\n\tu_int64_t\t\tfile_size = 0;\n\n\tsize = htoi(&foundat[4], FOREMOST_LITTLE_ENDIAN);\t\t/* Grab the total file size in little endian from offset 4*/\n\tif (strncmp((char *) &foundat[8], \"AVI\", 3) == 0)\t\t/*Sanity Check*/\n\t\t{\n\t\tif (strncmp((char *) &foundat[12], \"LIST\", 4) == 0) /*Sanity Check*/\n\t\t\t{\n\t\t\tif (size > 0 && size <= needle->max_len && size <= buflen)\n\t\t\t{\n#ifdef DEBUG\n\t\t\t\tprintf(\"\\n\tFound an AVI at:=%lld,File size:=%d\\n\", c_offset, size);\n#endif\n\t\t\t\tfile_size = size;\n\t\t\t\textractbuf = buf;\n\t\t\t\tneedle->suffix = \"avi\";\n\t\t\t\tif (!strstr(needle->suffix, type) && strcmp(type,\"all\")!=0)\n\t\t\t\t\treturn foundat + size;\n\t\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t\t\tfoundat += size;\n\t\t\t\treturn foundat;\n\t\t\t}\n\n\t\t\treturn buf + needle->header_len;\n\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\treturn buf + needle->header_len;\n\t\t\t}\n\t\t}\n\telse if (strncmp((char *) &foundat[8], \"WAVE\", 4) == 0) /*Sanity Check*/\n\t\t{\n\t\tif (size > 0 && size <= needle->max_len && size <= buflen)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"\\n\tFound a WAVE at:=%lld,File size:=%d\\n\", c_offset, size);\n#endif\n\n\t\t\tfile_size = size;\n\t\t\textractbuf = 
buf;\n\t\t\tneedle->suffix = \"wav\";\n\t\t\tif (!strstr(needle->suffix, type) && strcmp(type,\"all\")!=0)\n\t\t\t\treturn foundat + size;\n\n\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t\tfoundat += file_size;\n\t\t\treturn foundat;\n\t\t}\n\n\t\treturn buf + needle->header_len;\n\n\t\t}\n\telse\n\t\t{\n\t\treturn buf + needle->header_len;\n\t\t}\n\n\treturn NULL;\n\n}\n\n/********************************************************************************\n *Function: extract_bmp\n *Description: Given that we have a BMP header parse header and grab the file size.\n *Return: A pointer to where the EOF of the BMP is in the current buffer\n **********************************************************************************/\nunsigned char *extract_bmp(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tint\t\t\t\tsize = 0;\n\tint\t\t\t\theaderlength = 0;\n\tint\t\t\t\tv_size = 0;\n\tint\t\t\t\th_size = 0;\n\tunsigned char\t*extractbuf = NULL;\n\tu_int64_t\t\tfile_size = 0;\n\tchar\t\t\tcomment[32];\n\tint\t\t\t\tdataOffset = 0;\n\tint\t\t\t\tdataSize = 0;\n\n\tif (buflen < 100)\n\t\treturn buf + needle->header_len;\n\n\t/*Jump the first two bytes of the header (BM)*/\n\tsize = htoi(&foundat[2], FOREMOST_LITTLE_ENDIAN);\t/*Grab the total file size in little_endian*/\n\n\t/*Sanity Check*/\n\tif (size <= 100 || size > needle->max_len)\n\t\treturn buf + needle->header_len;\n\n\tdataOffset = htoi(&foundat[10], FOREMOST_LITTLE_ENDIAN);\n\tdataSize = htoi(&foundat[34], FOREMOST_LITTLE_ENDIAN);\n\n\theaderlength = htoi(&foundat[14], FOREMOST_LITTLE_ENDIAN);\n\n\tif (dataSize + dataOffset != size)\n\t\t{\n\n\t\t//printf(\"newtest != dataSize:=%d dataOffset:=%d\\n\",dataSize,dataOffset);\n\t\t}\n\n\t//Header length\n\tif (headerlength > 1000 || headerlength <= 0)\n\t\treturn buf + needle->header_len;\n\n\t//foundat+=4;\n\tv_size = 
htoi(&foundat[22], FOREMOST_LITTLE_ENDIAN);\n\th_size = htoi(&foundat[18], FOREMOST_LITTLE_ENDIAN);\n\n\t//Vertical length\n\tif (v_size <= 0 || v_size > 2000 || h_size <= 0)\n\t\treturn buf + needle->header_len;\n\n#ifdef DEBUG\n\tprintf(\"\\n\tThe size of the BMP is %d, Header length:=%d , Vertical Size:= %d, dataSize:=%d dataOffset:=%d\\n\",\n\t   size,\n\t\t   headerlength,\n\t\t   v_size,\n\t\t   dataSize,\n\t\t   dataOffset);\n#endif\n\tif (size <= buflen)\n\t\t{\n\n\t\tsprintf(comment, \" (%d x %d)\", h_size, v_size);\n\t\tstrcat(needle->comment, comment);\n\n\t\tfile_size = size;\n\t\textractbuf = buf;\n\t\t\n\t\twrite_to_disk(s, needle, file_size, extractbuf, (c_offset + f_offset));\n\t\tfoundat += file_size;\n\t\treturn foundat;\n\n\t\t}\n\n\treturn NULL;\n}\n\n/********************************************************************************\n *Function: extract_gif\n *Description: Given that we have a GIF header parse the given buffer to determine\n *\twhere the file ends.\n *Return: A pointer to where the EOF of the GIF is in the current buffer\n **********************************************************************************/\nunsigned char *extract_gif(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*currentpos = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tint\t\t\t\tbytes_to_search = 0;\n\tunsigned short\twidth = 0;\n\tunsigned short\theight = 0;\n\tu_int64_t\t\tfile_size = 0;\n\tchar\t\t\tcomment[32];\n\tfoundat += 4;\t\t/*Jump the first 4 bytes of the gif header (GIF8)*/\n\n\t/*Check if the GIF is type 89a or 87a*/\n\tif (strncmp((char *)foundat, \"9a\", 2) == 0 || strncmp((char *)foundat, \"7a\", 2) == 0)\n\t\t{\n\t\tfoundat += 2;\t/*Jump the length of the header*/\n\t\twidth = htos(foundat, FOREMOST_LITTLE_ENDIAN);\n\t\theight = htos(&foundat[2], FOREMOST_LITTLE_ENDIAN);\n\n\t\tsprintf(comment, \" (%d x 
%d)\", width, height);\n\t\tstrcat(needle->comment, comment);\n\n\t\tcurrentpos = foundat;\n\t\tif (buflen - (foundat - buf) >= needle->max_len)\n\t\t\tbytes_to_search = needle->max_len;\n\t\telse\n\t\t\tbytes_to_search = buflen - (foundat - buf);\n\t\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t\tif (foundat)\n\t\t{\n\n\t\t\t/*We found the EOF, write the file to disk and return*/\n#ifdef DEBUG\n\t\t\tprintx(foundat, 0, 16);\n#endif\n\t\t\tfile_size = (foundat - buf) + needle->footer_len;\n#ifdef DEBUG\n\t\t\tprintf(\"The GIF file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\t\t\textractbuf = buf;\n\t\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t\tfoundat += needle->footer_len;\n\t\t\treturn foundat;\n\t\t}\n\n\t\treturn NULL;\n\n\t\t}\n\telse\t\t\t\t/*Invalid GIF header return the current pointer*/\n\t\t{\n\t\treturn foundat;\n\t\t}\n\n}\n\n/********************************************************************************\n *Function: extract_mpg\n * Not done yet\n **********************************************************************************/\nunsigned char *extract_mpg(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*currentpos = NULL;\n\n\tunsigned char\t*extractbuf = NULL;\n\tint\t\t\t\tbytes_to_search = 0;\n\tunsigned short\tsize = 0;\n\tu_int64_t\t\tfile_size = 0;\n\n\t/*\n    size=htos(&foundat[4],FOREMOST_BIG_ENDIAN);\n    printf(\"size:=%d\\n\",size);\n\n    printx(foundat,0,16);\n    foundat+=4;\n    */\n\tint\t\t\t\tj = 0;\n\tif (foundat[15] == (unsigned char)'\\xBB')\n\t\t{\n\t\t}\n\telse\n\t\t{\n\n\t\treturn buf + needle->header_len;\n\t\t}\n\n\tif (buflen <= 2 * 
KILOBYTE)\n\t\t{\n\t\tbytes_to_search = buflen;\n\t\t}\n\telse\n\t\t{\n\t\tbytes_to_search = 2 * KILOBYTE;\n\t\t}\n\n\twhile (1)\n\t\t{\n\t\tj = 0;\n\t\tcurrentpos = foundat;\n#ifdef DEBUG\n\t\tprintf(\"Searching for marker\\n\");\n#endif\n\t\tfoundat = bm_search(needle->markerlist[0].value,\n\t\t\t\t\t\t\tneedle->markerlist[0].len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->markerlist[0].marker_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\t\tif (foundat)\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"Found after searching %d\\n\", foundat - currentpos);\n#endif\n\t\t\twhile (1)\n\t\t\t\t{\n\n\t\t\t\tif (foundat[3] >= (unsigned char)'\\xBB' && foundat[3] <= (unsigned char)'\\xEF')\n\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\t\tprintf(\"jumping %d:\\n\", j);\n#endif\n\t\t\t\t\tsize = htos(&foundat[4], FOREMOST_BIG_ENDIAN);\n#ifdef DEBUG\n\t\t\t\t\tprintf(\"\\t hit: \");\n\t\t\t\t\tprintx(foundat, 0, 16);\n\t\t\t\t\tprintf(\"size:=%d\\n\\tjump: \", size);\n#endif\n\t\t\t\t\tfile_size += (foundat - buf) + size;\n\t\t\t\t\tif (size <= 0 || size > buflen - (foundat - buf))\n\t\t\t\t\t{\n#ifdef DEBUG\n\t\t\t\t\t\tprintf(\"Not enough room in the buffer \");\n#endif\n\t\t\t\t\t\tif (size <= 50 * KILOBYTE && size > 0)\n\t\t\t\t\t\t\t{\n\n\t\t\t\t\t\t\t/*We should probably search more*/\n\t\t\t\t\t\t\tif (file_size < needle->max_len)\n\t\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\treturn NULL;\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\telse\n\t\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\t\tbreak;\n\t\t\t\t\t\t\t\t}\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\telse\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\treturn currentpos + needle->header_len;\n\t\t\t\t\t\t\t}\n\t\t\t\t\t}\n\n\t\t\t\t\tfoundat += size + 6;\n#ifdef DEBUG\n\t\t\t\t\tprintx(foundat, 0, 16);\n#endif\n\t\t\t\t\tj++;\n\t\t\t\t}\n\t\t\t\telse\n\t\t\t\t\t{\n\n\t\t\t\t\tbreak;\n\t\t\t\t\t}\n\t\t\t\t}\n\n\t\t\tif (foundat[3] == (unsigned char)'\\xB9')\n\t\t\t\t{\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\telse if (foundat[3] != 
(unsigned char)'\\xBA' && foundat[3] != (unsigned char)'\\x00')\n\t\t\t\t{\n\n\t\t\t\t/*This is the error state where this doesn't seem to be an mpg anymore*/\n\t\t\t\tsize = htos(&foundat[4], FOREMOST_BIG_ENDIAN);\n#ifdef DEBUG\n\t\t\t\tprintf(\"\\t ***TEST: %x\\n\", foundat[3]);\n\t\t\t\tprintx(foundat, 0, 16);\n\n\t\t\t\tprintf(\"size:=%d\\n\", size);\n#endif\n\t\t\t\tif ((currentpos - buf) >= 1 * MEGABYTE)\n\t\t\t\t\t{\n\t\t\t\t\tfoundat = currentpos;\n\t\t\t\t\tbreak;\n\t\t\t\t\t}\n\n\t\t\t\treturn currentpos + needle->header_len;\n\n\t\t\t\t}\n\t\t\telse if (foundat[3] == (unsigned char)'\\xB3')\n\t\t\t\t{\n\t\t\t\tfoundat += 3;\n\t\t\t\t}\n\t\t\telse\n\t\t\t\t{\n\t\t\t\tfoundat += 3;\n\t\t\t\t}\n\t\t}\n\t\telse\n\t\t\t{\n\t\t\tif ((currentpos - buf) >= 1 * MEGABYTE)\n\t\t\t\t{\n\t\t\t\tfoundat = currentpos;\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\telse\n\t\t\t{\n#ifdef DEBUG\n\t\t\t\tprintf(\"RETURNING BUF\\n\");\n#endif\n\t\t\t\treturn buf + needle->header_len;\n\t\t\t}\n\t\t\t}\n\t\t}\n\n\tif (foundat)\n\t\t{\n\t\tfile_size = (foundat - buf) + needle->footer_len;\n\t\tif (file_size < 1 * KILOBYTE)\n\t\t\treturn buf + needle->header_len;\n\t\t}\n\telse\n\t\t{\n\t\treturn buf + needle->header_len;\n\t\t}\n\n\tif (file_size > buflen)\n\t\tfile_size = buflen;\n\tfoundat = buf;\n#ifdef DEBUG\n\tprintf(\"The file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\n\textractbuf = buf;\n\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\tfoundat += file_size;\n\treturn foundat;\n}\n\n\n/********************************************************************************\n *Function: extract_mp4\n * Not done yet\n **********************************************************************************/\nunsigned char *extract_mp4(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\n\tunsigned char\t*extractbuf = NULL;\n\tunsigned 
int\tsize = 0;\n\tu_int64_t\t\tfile_size = 0;\n\n   \n\twhile(1)\n\t{\n\t \tsize=htoi(&foundat[28],FOREMOST_BIG_ENDIAN);\n\t\tif(size ==0)\n\t\t{\n\t\t\t//printf(\"size ==0\\n\");\n\t\t\tfoundat+=28;\n\t\t\tbreak;\n\t\t}\n    \t\t//printf(\"size:=%d\\n\",size);\n\t\tif(size > 0 && size < buflen)\n\t\t{\n\t\t\tif(!isprint(foundat[32]) ||  !isprint(foundat[33]))\n\t\t\t{\n\t\t\t\t//printf(\"print err\\n\");\n\t\t\t\tbreak;\n\t\t\t\t//return foundat+8;\n\t\t\t}\n\t\t\tfoundat+=size;\n\t\t\t\n\t\t}\n\t\telse\n\t\t{\n\t\t\tif (size < needle->max_len)\n\t\t\t{\n\t\t\t\t//printf(\"Searching More\\n\");\n\t\t\t\treturn NULL;\n\t\t\t}\n\t\t\telse\n\t\t\t{\n\t\t\t\t//printf(\"ERR\\n\");\n\t\t\t\t//return foundat+8;\n\t\t\t\tbreak;\n\t\t\t}\n\t\t}\t\n\t\n\t\t//printx(foundat,0,32);\n\n\t}\n\tif (foundat)\n\t{\n\t\tfile_size = (foundat - buf) + needle->footer_len;\n\t\tif (file_size < 1 * KILOBYTE)\n\t\t\treturn buf + needle->header_len;\n\t}\n\t\n\n\tif (file_size > buflen)\n\t\tfile_size = buflen;\n\tfoundat = buf;\n\n\n\textractbuf = buf;\t\n\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\tfoundat += file_size;\n\treturn foundat;\n}\n\n\n/********************************************************************************\n *Function: extract_png\n *Description: Given that we have a PNG header parse the given buffer to determine\n *\twhere the file ends.\n *Return: A pointer to where the EOF of the PNG is in the current buffer\n **********************************************************************************/\nunsigned char *extract_png(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*currentpos = NULL;\n\n\tunsigned char\t*extractbuf = NULL;\n\tint\t\t\t\tsize = 0;\n\tint\t\t\t\theight = 0;\n\tint\t\t\t\twidth = 0;\n\tu_int64_t\t\tfile_size = 0;\n\tchar\t\t\tcomment[32];\n\n\tif (buflen < 100)\n\t\treturn 
NULL;\n\tfoundat += 8;\n\twidth = htoi(&foundat[8], FOREMOST_BIG_ENDIAN);\n\theight = htoi(&foundat[12], FOREMOST_BIG_ENDIAN);\n\n\tif (width < 1 || height < 1)\n\t\treturn foundat;\n\n\tif (width > 3000 || height > 3000)\n\t\treturn foundat;\n\n\tsprintf(comment, \" (%d x %d)\", width, height);\n\tstrcat(needle->comment, comment);\n\n\twhile (1)\t/* Jump through the headers until we reach the \"data\" part of the file*/\n\t\t{\n\t\tsize = htoi(foundat, FOREMOST_BIG_ENDIAN);\n#ifdef DEBUG\n\t\tprintx(foundat, 0, 16);\n\t\tprintf(\"Size:=%d\\n\", size);\n#endif\n\n\t\tcurrentpos = foundat;\n\t\tif (size <= 0 || size > buflen - (foundat - buf))\n\t\t{\n#ifdef DEBUG\n\t\t\tprintf(\"buflen - (foundat-buf)=%lu\\n\", buflen - (foundat - buf));\n#endif\n\t\t\treturn currentpos;\n\t\t}\n\n\t\t/*12 is the length of the size, TYPE, and CRC field*/\n\t\tfoundat += size + 12;\n\n\t\tif (isprint(foundat[4]))\n\t\t\t{\n\t\t\tif (strncmp((char *) &foundat[4], \"IEND\", 4) == 0)\n\t\t\t\t{\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\t}\n\t\telse\n\t\t{\n#ifdef DEBUG\n\t\t\tprintx(foundat, 0, 16);\n\t\t\tprintf(\"Not ascii returning\\n\");\n#endif\n\t\t\treturn currentpos;\n\t\t}\n\n\t\t}\n\n\tif (foundat)\n\t\t{\n\t\tfile_size = (foundat - buf) + htoi(foundat, FOREMOST_BIG_ENDIAN) + 12;\n\n\t\tif (file_size > buflen)\n\t\t\tfile_size = buflen;\n\t\tfoundat = buf;\n#ifdef DEBUG\n\t\tprintf(\"The file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\tfoundat += file_size;\n\t\treturn foundat;\n\t\t}\n\n\treturn NULL;\n}\n\n/********************************************************************************\n *Function: extract_jpeg\n *Description: Given that we have a JPEG header parse the given buffer to determine\n *\twhere the file ends.\n *Return: A pointer to where the EOF of the JPEG is in the current buffer\n 
**********************************************************************************/\nunsigned char *extract_jpeg(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t\ts_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*currentpos = NULL;\n\n\tunsigned char\t*extractbuf = NULL;\n\tunsigned short\theadersize;\n\tint\t\t\t\tbytes_to_search = 0;\n\tint\t\t\t\thasTable = FALSE;\n\tint\t\t\t\thasHuffman = FALSE;\n\tu_int64_t\t\tfile_size = 0;\n\n\t// char comment[32];\n\n\t/*Check if we have a valid header*/\n\tif (buflen < 128)\n\t\t{\n\t\treturn NULL;\n\t\t}\n\n\tif (foundat[3] == (unsigned char)'\\xe0')\n\t\t{\n\n\t\t//JFIF header\n\t\t//sprintf(comment,\" (JFIF)\");\n\t\t//strcat(needle->comment,comment);\n\t\t}\n\telse if (foundat[3] == (unsigned char)'\\xe1')\n\t\t{\n\n\t\t//sprintf(comment,\" (EXIF)\");\n\t\t//strcat(needle->comment,comment);\n\t\t}\n\telse\n\t\treturn foundat + needle->header_len;\t//Invalid keep searching\n\twhile (1)\t\t\t\t\t\t\t\t\t/* Jump through the headers until we reach the \"data\" part of the file*/\n\t{\n#ifdef DEBUG\n\t\tprintx(foundat, 0, 16);\n#endif\n\t\tfoundat += 2;\n\t\theadersize = htos(&foundat[2], FOREMOST_BIG_ENDIAN);\n#ifdef DEBUG\n\t\tprintf(\"Headersize:=%d buflen:=%lld\\n\", headersize, buflen);\n#endif\n\n\t\t\n\t\tif (((foundat + headersize) - buf) > buflen){ return NULL; }\t\n\n\t\tfoundat += headersize;\n\t\t\n\t\tif (foundat[2] != (unsigned char)'\\xff')\n\t\t\t{\n\t\t\tbreak;\n\t\t\t}\n\n\t\t/*Ignore 2 \"0xff\" side by side*/\n\t\tif (foundat[2] == (unsigned char)'\\xff' && foundat[3] == (unsigned char)'\\xff')\n\t\t\t{\n\t\t\tfoundat++;\n\t\t\t}\n\n\t\tif (foundat[3] == (unsigned char)'\\xdb' || foundat[4] == (unsigned char)'\\xdb')\n\t\t\t{\n\t\t\thasTable = TRUE;\n\t\t\t}\n\t\telse if (foundat[3] == (unsigned char)'\\xc4')\n\t\t\t{\n\t\t\thasHuffman = TRUE;\n\t\t\t}\n\t}\n\n\t/*All jpegs must contain a Huffman marker as well as a 
quantization table*/\n\tif (!hasTable || !hasHuffman)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"No Table or Huffman \\n\");\n#endif\n\t\treturn buf + needle->header_len;\n\t}\n\n\tcurrentpos = foundat;\n\n\t//printf(\"Searching for footer\\n\");\n\tif (buflen < (foundat - buf)) {\n#ifdef DEBUG\n\t\tprintf(\"avoided bug in extract_jpeg!\\n\");\n#endif\n\t\tbytes_to_search = 0;\n\t} else {\n\t\tif (buflen - (foundat - buf) >= needle->max_len)\n\t\t\tbytes_to_search = needle->max_len;\n\t\telse\n\t\t\tbytes_to_search = buflen - (foundat - buf);\n\t}\n\n\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\n\tif (foundat)\t\t\t\t\t\t\t\t/*Found a valid JPEG*/\n\t\t{\n\n\t\t/*We found the EOF, write the file to disk and return*/\n\t\tfile_size = (foundat - buf) + needle->footer_len;\n#ifdef DEBUG\n\t\tprintf(\"The jpeg file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\n\t\t//extractbuf=(unsigned char*) malloc(file_size*sizeof(char));\n\t\t//memcpy(extractbuf,buf,file_size);\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\tfoundat += needle->footer_len;\n\n\t\t////free(extractbuf);\n\t\treturn foundat;\n\t\t}\n\telse\n\t\t{\n\t\treturn NULL;\n\t\t}\n\n}\t//End extract_jpeg\n\n/********************************************************************************\n *Function: extract_generic\n *Description: Generic extractor. Finds the end of the file by searching for the\n *\tfooter, the next header (NEXT option), or the end of the printable run (ASCII\n *\toption); falls back to max_len when no end is found.\n *Return: A pointer to where the search should resume in the current buffer\n **********************************************************************************/\nunsigned char *extract_generic(f_state *s, u_int64_t c_offset, unsigned char *foundat,\n\t\t\t\t\t\t\t   u_int64_t buflen, s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*endptr = foundat;\n\tunsigned char\t*beginptr = foundat;\n\tunsigned char\t*extractbuf 
= NULL;\n\tint\t\tbytes_to_search = 0;\n\tu_int64_t\tfile_size = 0;\n\tint begin=0;\n\tint end=0;\n\t\n\n\tif (buflen - (foundat - buf) >= needle->max_len)\n\t\tbytes_to_search = needle->max_len;\n\telse\n\t\tbytes_to_search = buflen - (foundat - buf);\n\n  \tif(needle->searchtype ==SEARCHTYPE_FORWARD_NEXT)\n\t{\n\t\t\tfoundat+=needle->header_len;\n\t\t\tfoundat = bm_search(needle->header,\n\t\t\t\t\t\t\tneedle->header_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t}\n\telse if(needle->searchtype ==SEARCHTYPE_ASCII)\n\t{\n\t\t\t\n\t\n\t\t\twhile (isprint(foundat[end]) || foundat[end] == '\\x0a' || foundat[end] == '\\x0d' || foundat[end] == '\\x09')\n\t\t\t{\n\t\t\t\tend++;\n\t\t\t}\n\t\t\t\n\t\t\tfoundat+=end;\n\t\t\tendptr=foundat;\n\t\t\tfoundat=buf;\n\t\t\t\n\t\t\twhile (isprint(foundat[begin-1]) || foundat[begin-1] == '\\x0a' || foundat[begin-1] == '\\x0d' || foundat[begin-1] == '\\x09')\n\t\t\t{\n\t\t\t\tbegin--;\n\t\t\t}\n\t\t\t\n\t\t\tfoundat+=begin;\n\t\t\tbeginptr=foundat;\n\t\t\t\n\t\t\tbuf=beginptr;\n\t\t\tfoundat=endptr;\n\t\t\t//printx(buf,0,4);\t\n\t\t\t\n\t\t\tfile_size=end-begin;\t\n\t\t\t//fprintf(stderr,\"file_size=%llu end=%d begin=%d ptrsize=%d ptrsize2=%d\\n\",file_size,end,begin,endptr-beginptr,foundat-buf);\n\t\t\tif(buf==foundat) \n\t\t\t{\n\t\t\t\t\tfprintf(stderr,\"Returning Foundat\\n\");\n\t\t\t\t\treturn foundat+needle->header_len;\n\t\t\t}\t\t\t\n\t}\n  \telse if (needle->footer == NULL || strlen((char *)needle->footer) < 1)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"footer is NULL\\n\");\n#endif\n\t\tfoundat = NULL;\n\t}\n\telse\n\t{\n#ifdef DEBUG\n\t\tprintf(\"footer is not NULL %p\\n\", needle->footer);\n#endif\n\t\tfoundat = 
bm_search(needle->footer,\n\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t}\n\n\tif (foundat)\n\t{\n#ifdef DEBUG\n\t\tprintf(\"found %s!!!\\n\", needle->footer);\n#endif\n\t\tif(needle->searchtype ==SEARCHTYPE_FORWARD_NEXT || needle->searchtype ==SEARCHTYPE_ASCII)\n\t\t{\n\t\t\t\tfile_size = (foundat - buf);\n\t\t}\n\t\telse\n\t\t{\n\t\t\t\tfile_size = (foundat - buf) + needle->footer_len;\n\t\t}\t\n\t}\n\telse\n\t{\n\t\tfile_size = needle->max_len;\n\t}\n\n\tif (file_size == 0)\n\t{\n\t\tfile_size = needle->max_len;\n\t}\n\n\tif (file_size > (buflen-begin))\n\t{\n\t\tfile_size = buflen;\n\t}\n\t\n#ifdef DEBUG\n\tprintf(\"The file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\n\textractbuf = buf;\n\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\n\tif(needle->searchtype !=SEARCHTYPE_ASCII)\n\t{\n\t\tfoundat=buf;\n\t\tfoundat += needle->header_len;\n\t}\n\treturn foundat;\t\t\n\t\n\t\n\t\n}\n\n/********************************************************************************\n *Function: extract_exe\n *Description:\n *Return: A pointer to where the EOF of the\n **********************************************************************************/\nunsigned char *extract_exe(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tu_int64_t\t\tfile_size = 0;\n\tunsigned short\tpe_offset = 0;\n\tunsigned int\tSizeOfCode = 0;\n\tunsigned int\tSizeOfInitializedData = 0;\n\tunsigned int\tSizeOfUninitializedData = 0;\n\tunsigned int\trva = 0;\n\tunsigned int\toffset = 0;\n\tunsigned short\tsections = 0;\n\tunsigned int\tsizeofimage = 0;\n\tunsigned int\traw_section_size = 0;\n\tunsigned int\tsize_of_headers = 0;\n\tunsigned 
short\tdll = 0;\n\tunsigned int\tsum = 0;\n\tunsigned short\texe_char = 0;\n\tunsigned int\talign = 0;\n\tint\t\t\t\ti = 0;\n\ttime_t\t\t\tcompile_time = 0;\n\tstruct tm\t\t*ret_time;\n\tchar\t\t\tcomment[32];\n\tchar\t\t\tascii_time[32];\n\n\tif (buflen < 100)\n\t\treturn foundat + 2;\n\tpe_offset = htos(&foundat[60], FOREMOST_LITTLE_ENDIAN);\n\tif (pe_offset < 1 || pe_offset > 1000 || pe_offset > buflen)\n\t\t{\n\t\treturn foundat + 60;\n\t\t}\n\n\tfoundat += pe_offset;\n\tif (foundat[0] != (unsigned char)'\\x50' || foundat[1] != (unsigned char)'\\x45')\n\t\t{\n\t\treturn foundat;\n\t\t}\n\n\tsections = htos(&foundat[6], FOREMOST_LITTLE_ENDIAN);\n\tif (buflen < (40 * sections + 224))\n\t\t{\n\t\treturn foundat;\n\t\t}\n\n\tcompile_time = (time_t) htoi(&foundat[8], FOREMOST_LITTLE_ENDIAN);\n\tret_time = gmtime(&compile_time);\n\tsprintf(ascii_time,\n\t\t\t\"%02d/%02d/%04d %02d:%02d:%02d\",\n\t\t\tret_time->tm_mon + 1,\n\t\t\tret_time->tm_mday,\n\t\t\tret_time->tm_year + 1900,\n\t\t\tret_time->tm_hour,\n\t\t\tret_time->tm_min,\n\t\t\tret_time->tm_sec);\n\tchop(ascii_time);\n\n\t/*Use an explicit format so ascii_time is never interpreted as a format string*/\n\tsprintf(comment, \"%s\", ascii_time);\n\tstrcat(needle->comment, comment);\n\texe_char = htos(&foundat[22], FOREMOST_LITTLE_ENDIAN);\n\tif (exe_char & 0x2000)\n\t\t{\n\t\tdll = 1;\n\t\t}\n\telse if (exe_char & 0x1000)\n\t\t{\n\n\t\t//printf(\"System File!!!\\n\");\n\t\t}\n\telse if (exe_char & 0x0002)\n\t\t{\n\n\t\t//printf(\"EXE !!!\\n\");\n\t\t}\n\telse\n\t\t{\n\t\treturn foundat;\n\t\t}\n\n\tfoundat += 0x18;\t/*Jump to opt header should be 0x0b 0x01*/\n\n\tSizeOfCode = htoi(&foundat[4], FOREMOST_LITTLE_ENDIAN);\n\tSizeOfInitializedData = htoi(&foundat[8], FOREMOST_LITTLE_ENDIAN);\n\tSizeOfUninitializedData = htoi(&foundat[12], FOREMOST_LITTLE_ENDIAN);\n\trva = htoi(&foundat[16], FOREMOST_LITTLE_ENDIAN);\n\talign = htoi(&foundat[36], FOREMOST_LITTLE_ENDIAN);\n\n\tsizeofimage = htoi(&foundat[56], FOREMOST_LITTLE_ENDIAN);\n\tsize_of_headers = htoi(&foundat[60], FOREMOST_LITTLE_ENDIAN);\n\tfoundat += 
224;\n\n\t/*Start of sections*/\n\tfor (i = 0; i < sections; i++)\n\t\t{\n\n\t\t//strncpy(name,foundat,8);\n\t\toffset = htoi(&foundat[20], FOREMOST_LITTLE_ENDIAN);\n\t\traw_section_size = htoi(&foundat[16], FOREMOST_LITTLE_ENDIAN);\n\n\t\t//printf(\"\\t%s size=%d offset=%d\\n\",name,raw_section_size,offset);\n\t\tfoundat += 40;\n\n\t\t//rem+=(raw_section_size%align);\n\t\t//sum+=raw_section_size;\n\t\tsum = offset + raw_section_size;\n\t\t}\n\n\t/*\n    printf(\"rva is %d sum= %d\\n\",rva,sum);\n    printf(\"soi is %d,soh is %d \\n\",sizeofimage,size_of_headers);\n    printf(\"we are off by %d\\n\",sum-buflen);\n    printf(\"soc=%d ,soidr=%d, souid=%d\\n\",SizeOfCode,SizeOfInitializedData,SizeOfUninitializedData);\n    printf(\"fs=%d ,extr=%d\\n\",SizeOfCode+SizeOfInitializedData,SizeOfUninitializedData);\n\t\t*/\n\tfile_size = sum;\n\tif (file_size < 512 || file_size > 4 * MEGABYTE)\n\t\t{\n\t\treturn foundat + 60;\n\t\t}\n\n\tif (file_size > buflen)\n\t\tfile_size = buflen;\n\tfoundat = buf;\n#ifdef DEBUG\n\tprintf(\"The file size is  %llu  c_offset:=%llu\\n\", file_size, c_offset);\n#endif\n\n\textractbuf = buf;\n\tif (dll == 1)\n\t\t{\n\t\tstrcpy(needle->suffix, \"dll\");\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\tstrcpy(needle->suffix, \"exe\");\n\t\t}\n\telse\n\t\t{\n\t\twrite_to_disk(s, needle, file_size, extractbuf, c_offset + f_offset);\n\t\t}\n\n\tfoundat += needle->header_len;\n\treturn (buf + file_size);\n}\n\n\n/********************************************************************************\n *Function: extract_reg\n *Description:\n *Return: A pointer to where the EOF of the\n **********************************************************************************/\nunsigned char *extract_reg(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tint sizeofreg = 
htoi(&foundat[0x28], FOREMOST_LITTLE_ENDIAN);\n\tint file_size=0;\n\tif(sizeofreg < 0 || sizeofreg > needle->max_len)\t\n\t{\n\t\treturn (foundat+4);\n\t}\t\n\tfoundat+=sizeofreg;\n\tfile_size = (foundat - buf);\n\n\textractbuf = buf;\n\n\n\twrite_to_disk(s, needle, file_size , extractbuf, c_offset + f_offset);\n\n\t\t\t\n\treturn NULL;\n}\n/********************************************************************************\n *Function: extract_rar\n *Description:\n *Return: A pointer to where the EOF of the\n **********************************************************************************/\nunsigned char *extract_rar(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t   s_spec *needle, u_int64_t f_offset)\n{\n\tunsigned char\t*buf = foundat;\n\tunsigned char\t*extractbuf = NULL;\n\tu_int64_t\t\tfile_size = 0;\n\tunsigned short\theadersize = 0;\n\tunsigned short\tflags = 0;\n\tunsigned int\tfilesize = 0;\n\tunsigned int\ttot_file_size = 0;\n\tunsigned int\tufilesize = 0;\n\tint\t\t\t\ti = 0;\n\tint\t\t\t\tscan = 0;\n\tint\t\t\t\tflag = 0;\n\tint\t\t\t\tpasswd = 0;\n\tu_int64_t\t\tbytes_to_search = 50 * KILOBYTE;\n\tchar\t\t\tcomment[32];\n\n\t/*Marker Block*/\n\theadersize = htos(&foundat[5], FOREMOST_LITTLE_ENDIAN);\n\tfoundat += headersize;\n\n\t/*Archive Block*/\n\theadersize = htos(&foundat[5], FOREMOST_LITTLE_ENDIAN);\n\tfilesize = htoi(&foundat[7], FOREMOST_LITTLE_ENDIAN);\n\n\tif (foundat[2] != '\\x73')\n\t\t{\n\t\treturn foundat; /*Error*/\n\t\t}\n\n\tflags = htos(&foundat[3], FOREMOST_LITTLE_ENDIAN);\n\tif ((flags & 0x01) != 0)\n\t\t{\n\t\tsprintf(comment, \" Multi-volume:\");\n\t\tstrcat(needle->comment, comment);\n\t\t}\n\n\tif (flags & 0x02)\n\t\t{\n\t\tsprintf(comment, \" an archive comment is present:\");\n\t\tstrcat(needle->comment, comment);\n\t\t}\n\n\tfoundat += headersize;\n\n\tif (foundat[2] != '\\x74')\n\t\t{\n\t\tfor (i = 0; i < 500; i++)\n\t\t\t{\n\t\t\tif (foundat[i] == '\\x74')\n\t\t\t\t{\n\t\t\t\tfoundat 
+= i - 2;\n\t\t\t\tscan = 1;\n\t\t\t\tbreak;\n\t\t\t\t}\n\t\t\t}\n\t\t}\n\n\tif (headersize == 13 && foundat[2] != '\\x74')\n\t\t{\n\n\t\tif (scan == 0)\n\t\t\t{\n\t\t\tsprintf(comment, \"Encrypted Headers!\");\n\t\t\tstrcat(needle->comment, comment);\n\t\t\t}\n\n\t\tif (buflen - (foundat - buf) >= needle->max_len)\n\t\t\tbytes_to_search = needle->max_len;\n\t\telse\n\t\t\tbytes_to_search = buflen - (foundat - buf);\n\n\t\t//printf(\"bytes_to_search:=%d needle->footer_len:=%d needle->header_len:=%d\\n\",bytes_to_search,needle->footer_len,needle->header_len);\n\t\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t\tif (foundat == NULL)\n\t\t\t{\n\t\t\ttot_file_size = bytes_to_search;\n\t\t\tfoundat = buf + tot_file_size;\n\t\t\t}\n\t\t}\n\telse\n\t\t{\n\n\t\t/*Loop through files*/\n\t\twhile (foundat[2] == '\\x74')\n\t\t\t{\n\n\t\t\theadersize = htos(&foundat[5], FOREMOST_LITTLE_ENDIAN);\n\t\t\tfilesize = htoi(&foundat[7], FOREMOST_LITTLE_ENDIAN);\n\t\t\tufilesize = htoi(&foundat[11], FOREMOST_LITTLE_ENDIAN);\n\n\t\t\tif (headersize < 1 || headersize > buflen)\n\t\t\t\tflag = 1;\n\t\t\tif (filesize < 0 || filesize > buflen)\n\t\t\t\tflag = 1;\n\t\t\tif ((headersize + filesize) > buflen)\n\t\t\t\tflag = 1;\n\t\t\tif (ufilesize < 0)\n\t\t\t\tflag = 1;\n\n\t\t\tflags = htos(&foundat[3], FOREMOST_LITTLE_ENDIAN);\n\t\t\tif ((flags & 0x04) != 0)\n\t\t\t\t{\n\t\t\t\tpasswd = 1;\n\t\t\t\t}\n\n\t\t\ttot_file_size = (foundat - buf);\n\t\t\tif ((tot_file_size + headersize + filesize) > buflen)\n\t\t\t\t{\n\t\t\t\tbreak;\n\t\t\t\t}\n\n\t\t\tfoundat += headersize + filesize;\n\t\t\t}\n\n\t\tif (passwd == 1)\n\t\t\t{\n\t\t\tsprintf(comment, \"Password Protected:\");\n\t\t\tstrcat(needle->comment, comment);\n\t\t\t}\n\n\t\tif (flag == 1)\n\t\t\t{\n\t\t\tsprintf(comment, \"Encrypted 
Headers!\");\n\t\t\tstrcat(needle->comment, comment);\n\t\t\tfoundat = bm_search(needle->footer,\n\t\t\t\t\t\t\t\tneedle->footer_len,\n\t\t\t\t\t\t\t\tfoundat,\n\t\t\t\t\t\t\t\tbytes_to_search,\n\t\t\t\t\t\t\t\tneedle->footer_bm_table,\n\t\t\t\t\t\t\t\tneedle->case_sen,\n\t\t\t\t\t\t\t\tSEARCHTYPE_FORWARD);\n\t\t\tif (foundat == NULL)\n\t\t\t\t{\n\t\t\t\ttot_file_size = bytes_to_search;\n\t\t\t\tfoundat = buf + tot_file_size;\n\t\t\t\t}\n\t\t\t}\n\n\t\tif (foundat[2] != '\\x7B' && tot_file_size == 0)\n\t\t\t{\n\n\t\t\t//printf(\"Error 7B!!!! %x\\n\",foundat[2]);\n\t\t\treturn foundat;\n\t\t\t}\n\n\t\tfoundat += 7;\n\n\t\t}\n\n\tif (foundat)\n\t\t{\n\n\t\t/*We found the EOF, write the file to disk and return*/\n\t\ttot_file_size = (foundat - buf);\n\t\tif (tot_file_size > buflen)\n\t\t\ttot_file_size = buflen;\t/*Clamp the write to the bytes actually in the buffer*/\n\n\t\textractbuf = buf;\n\t\twrite_to_disk(s, needle, tot_file_size, extractbuf, c_offset + f_offset);\n\t\treturn foundat;\n\t\t}\n\telse\n\t\t{\n\t\treturn NULL;\n\t\t}\n\n\treturn NULL;\n}\n\nunsigned char *extract_file(f_state *s, u_int64_t c_offset, unsigned char *foundat, u_int64_t buflen,\n\t\t\t\t\t\t\ts_spec *needle, u_int64_t f_offset)\n{\n\tif (needle->type == JPEG)\n\t\t{\n\t\treturn extract_jpeg(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == GIF)\n\t\t{\n\t\treturn extract_gif(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == PNG)\n\t\t{\n\t\treturn extract_png(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == BMP)\n\t\t{\n\t\treturn extract_bmp(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == RIFF)\n\t\t{\n\t\tneedle->suffix = \"rif\";\n\t\treturn extract_riff(s, c_offset, foundat, buflen, needle, f_offset, \"all\");\n\t\t}\n\telse if (needle->type == AVI)\n\t\t{\n\t\treturn extract_riff(s, c_offset, foundat, buflen, needle, f_offset, \"avi\");\n\t\t}\n\telse if (needle->type == WAV)\n\t\t{\n\t\tneedle->suffix = 
\"rif\";\n\t\treturn extract_riff(s, c_offset, foundat, buflen, needle, f_offset, \"wav\");\n\t\t}\n\telse if (needle->type == WMV)\n\t\t{\n\t\treturn extract_wmv(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == OLE)\n\t\t{\n\t\tneedle->suffix = \"ole\";\n\t\treturn extract_ole(s, c_offset, foundat, buflen, needle, f_offset, \"all\");\n\t\t}\n\telse if (needle->type == DOC)\n\t\t{\n\t\treturn extract_ole(s, c_offset, foundat, buflen, needle, f_offset, \"doc\");\n\t\t}\n\telse if (needle->type == PPT)\n\t\t{\n\t\treturn extract_ole(s, c_offset, foundat, buflen, needle, f_offset, \"ppt\");\n\t\t}\n\telse if (needle->type == XLS)\n\t\t{\n\t\tneedle->suffix = \"ole\";\n\t\treturn extract_ole(s, c_offset, foundat, buflen, needle, f_offset, \"xls\");\n\t\t}\n\telse if (needle->type == PDF)\n\t\t{\n\t\treturn extract_pdf(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == CPP)\n\t\t{\n\t\treturn extract_cpp(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == HTM)\n\t\t{\n\t\treturn extract_htm(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == MPG)\n\t\t{\n\t\treturn extract_mpg(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == MP4)\n\t\t{\n\t\treturn extract_mp4(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == ZIP)\n\t\t{\n\t\treturn extract_zip(s, c_offset, foundat, buflen, needle, f_offset, \"all\");\n\t\t}\n\telse if (needle->type == RAR)\n\t\t{\n\t\treturn extract_rar(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == SXW)\n\t\t{\n\t\treturn extract_zip(s, c_offset, foundat, buflen, needle, f_offset, \"sxw\");\n\t\t}\n\telse if (needle->type == SXC)\n\t\t{\n\t\treturn extract_zip(s, c_offset, foundat, buflen, needle, f_offset, \"sxc\");\n\t\t}\n\telse if (needle->type == SXI)\n\t\t{\n\t\treturn extract_zip(s, c_offset, foundat, buflen, 
needle, f_offset, \"sxi\");\n\t\t}\n\telse if (needle->type == EXE)\n\t\t{\n\t\treturn extract_exe(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == MOV || needle->type == VJPEG)\n\t\t{\n\t\treturn extract_mov(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse if (needle->type == CONF)\n\t\t{\n\t\treturn extract_generic(s, c_offset, foundat, buflen, needle, f_offset);\n\t\t}\n\telse\n\t\t{\n\t\treturn NULL;\n\t\t}\n\treturn NULL;\t\n}\n"
  },
  {
    "path": "extract.h",
    "content": "/*\n\tlocal file header signature     4 bytes  (0x04034b50)\n        version needed to extract       2 bytes\n        general purpose bit flag        2 bytes\n        compression method              2 bytes\n        last mod file time              2 bytes\n        last mod file date              2 bytes\n        crc-32                          4 bytes\n        compressed size                 4 bytes\n        uncompressed size               4 bytes\n        filename length                 2 bytes\n        extra field length              2 bytes\n*/\n\n/*\n \tcentral file header signature   4 bytes  (0x02014b50)\n        version made by                 2 bytes\n        version needed to extract       2 bytes\n        general purpose bit flag        2 bytes\n        compression method              2 bytes\n        last mod file time              2 bytes\n        last mod file date              2 bytes\n        crc-32                          4 bytes\n        compressed size                 4 bytes\n        uncompressed size               4 bytes\n        filename length                 2 bytes\n        extra field length              2 bytes\n        file comment length             2 bytes\n        disk number start               2 bytes\n        internal file attributes        2 bytes\n        external file attributes        4 bytes\n        relative offset of local header 4 bytes\n*/\n\n/* end of central dir signature    4 bytes  (0x06054b50)\n        number of this disk             2 bytes\n        number of the disk with the\n        start of the central directory  2 bytes\n        total number of entries in\n        the central dir on this disk    2 bytes\n        total number of entries in\n        the central dir                 2 bytes\n        size of the central directory   4 bytes\n        offset of start of central\n        directory with respect to\n        the starting disk number        4 bytes\n        zipfile comment length          2 
bytes\n        zipfile comment (variable size)\n\t*/\nstruct zipLocalFileHeader\n{\n\tunsigned int\tsignature;\t\t\t\t\t//0\n\tunsigned short\tversion;\t\t\t\t\t//4\n\tunsigned short\tgenFlag;\t\t\t\t\t//6\n\tsigned short\tcompression;\t\t\t\t//8\n\tunsigned short\tlast_mod_time;\t\t\t\t//10\n\tunsigned short\tlast_mod_date;\t\t\t\t//12\n\tunsigned int\tcrc;\t\t\t\t\t\t//14\n\tunsigned int\tcompressed;\t\t\t\t\t//18\n\tunsigned int\tuncompressed;\t\t\t\t//22\n\tunsigned short\tfilename_length;\t\t\t//26\n\tunsigned short\textra_length;\t\t\t\t//28\n};\nstruct zipCentralFileHeader\n{\n\tunsigned int\tsignature;\t\t\t\t\t//0\n\tunsigned char\tversion_extract[2];\t\t\t//4\n\tunsigned char\tversion_madeby[2];\t\t\t//6\n\tunsigned short\tgenFlag;\t\t\t\t\t//8\n\tunsigned short\tcompression;\t\t\t\t//10\n\tunsigned short\tlast_mod_time;\t\t\t\t//12\n\tunsigned short\tlast_mod_date;\t\t\t\t//14\n\tunsigned int\tcrc;\t\t\t\t\t\t//16\n\tunsigned int\tcompressed;\t\t\t\t\t//20\n\tunsigned int\tuncompressed;\t\t\t\t//24\n\tunsigned short\tfilename_length;\t\t\t//28\n\tunsigned short\textra_length;\t\t\t\t//30\n\tunsigned short\tfilecomment_length;\t\t\t//32\n\tunsigned short\tdisk_number_start;\t\t\t//34\n};\nstruct zipEndCentralFileHeader\n{\n\tunsigned int\tsignature;\t\t\t\t\t//0\n\tunsigned short\tnumOfdisk;\t\t\t\t\t//4\n\tunsigned short\tcompression;\t\t\t\t//6\n\tunsigned short\tstart_of_central_dir;\t\t//8\n\tunsigned short\tnum_entries_in_central_dir; //10\n\tunsigned int\tsize_of_central_dir;\t\t//12\n\tunsigned int\toffset;\t\t\t\t\t\t//16\n\tunsigned short\tcomment_length;\t\t\t\t//20\n};\n\nvoid print_zip(struct zipLocalFileHeader *fileHeader, struct zipCentralFileHeader *centralHeader)\n{\n\tprintf(\"\\n\tLocal Header Data\\n\");\n\tprintf(\"GenFlag:=%d,compressed:=%d,uncompressed:=%d\\n\",\n\t\t   fileHeader->genFlag,\n\t\t   fileHeader->compressed,\n\t\t   fileHeader->uncompressed);\n\tprintf(\"Compression:=%d, filename_len:=%d,extralen:=%d\\n\",\n\t\t   
fileHeader->compression,\n\t\t   fileHeader->filename_length,\n\t\t   fileHeader->extra_length);\n\n\tprintf(\"\tCentral Header Data\\n\");\n\tprintf(\"GenFlag:=%d,compressed:=%d,uncompressed:=%d\\n\",\n\t\t   centralHeader->genFlag,\n\t\t   centralHeader->compressed,\n\t\t   centralHeader->uncompressed);\n\tprintf(\"Compression:=%d, Version Madeby:=%x%x\\n\",\n\t\t   centralHeader->compression,\n\t\t   centralHeader->version_madeby[0],\n\t\t   centralHeader->version_madeby[1]);\n}\n"
  },
  {
    "path": "foremost.8",
    "content": ".TH FOREMOST \"8\" \"v1.5 - May 2009\"\n\n.SH NAME\nforemost \\- Recover files using their headers, footers, and data structures\n\n.SH SYNOPSIS\n.B foremost[\\fB-h\\fR][\\fB-V\\fR][\\fB-d\\fR][\\fB-vqwQT\\fR][\\fB-b\\fR<blocksize>][\\fB-o\\fR<dir>]\n[\\fB-t\\fR<type>][\\fB-s\\fR<num>][\\fB-i\\fR<file>] \n\n.SH BUILTIN FORMATS\n.PP\nRecover files from a disk image based on file types specified by the\nuser using the -t switch.\n\n.TP\n.B jpg\nSupport for the JFIF and Exif formats including implementations used \nin modern digital cameras.\n\n\n.TP\n.B gif\n.TP\n.B png\n.TP\n.B bmp\nSupport for windows bmp format.\n.TP\n.B avi\n.TP\n.B exe \nSupport for Windows PE binaries, will extract DLL and EXE files along\nwith their compile times.\n.TP\n.B mpg \nSupport for most MPEG files (must begin with 0x000001BA) \n.TP\n.B mp4\n.TP\n.B wav\n.TP\n.B riff \nThis will extract AVI and RIFF since they use the same file \nformat (RIFF). note faster than running each separately. \n.TP\n.B wmv\nNote may also extract -wma files as they have similar format.\n.TP\n.B mov\n.TP\n.B pdf\n.TP\n.B ole\nThis will grab any file using the OLE file structure.  This includes\nPowerPoint, Word, Excel, Access, and StarWriter\n.TP\n.B doc\nNote it is more efficient to run OLE as you get more bang for your buck.  \nIf you wish to ignore all other ole files then use this.\n.TP\n.B zip\nNote is will extract .jar files as well because they use a similar format.\nOpen Office docs are just zip'd XML files so they are extracted as well.  \nThese include SXW, SXC, SXI, and SX? for undetermined OpenOffice files.\nOffice 2007 files are also XML based (PPTX,DOCX,XLSX)\n.TP\n.B rar\n.TP\n.B htm\n.TP\n.B cpp\nC source code detection, note this is primitive and may \ngenerate documents other than C code.\n.TP\n.B all\nRun all pre-defined extraction methods. 
[Default if no -t is specified]\n\n.SH DESCRIPTION\n.PP\nRecover files from a disk image based on headers and footers specified by the\nuser.\n\n.TP\n\\fB\\-h\\fR\nShow a help screen and exit.\n\n.TP\n\n\\fB\\-V\\fR\nShow copyright information and exit.\n.TP\n\n\\fB\\-d\\fR\nTurn on indirect block detection; this works well for Unix file systems.\n.TP\n\\fB\\-T\\fR\nTime stamp the output directory so you don't have to delete the output\ndir when running multiple times.\n\n.TP\n\\fB\\-v\\fR\nEnables verbose mode. This causes more information regarding the current\nstate of the program to be displayed on the screen, and is highly recommended.\n\n\n.TP\n\\fB\\-q\\fR\nEnables quick mode. In quick mode, only the start of each sector is \nsearched for matching headers. That is, the header is searched only up to \nthe length of the longest header. The rest of the sector, usually about 500 \nbytes, is ignored. This mode makes foremost run considerably faster, but it \nmay cause you to miss files that are embedded in other files. For example, \nusing quick mode you will not be able to find JPEG images embedded in \nMicrosoft Word documents. \n\nQuick mode should not be used when examining NTFS file systems. Because \nNTFS will store small files inside the Master File Table, these files will \nbe missed during quick mode.\n.br\n\n.TP\n\\fB\\-Q\\fR\nEnables Quiet mode. Most error messages will be suppressed.\n.br\n\n.TP\n\\fB\\-w\\fR\nEnables write audit only mode.  Only the audit file is written; no files will be extracted. \n.br\n\n.TP\n\\fB\\-a\\fR\nEnables write all headers mode.  No error detection is performed for corrupted files.\n.br\n\n.TP\n\\fB\\-b\\fR \\fInumber\\fR\nAllows you to specify the block size used in foremost.  This is relevant for \nfile naming and quick searches.  The default is 512.\n\te.g.\tforemost -b 1024 image.dd\n.br\n.TP\n\\fB\\-k\\fR \\fInumber\\fR\nAllows you to specify the chunk size used in foremost.  This can improve \nspeed if you have enough RAM to fit the image in.  
It reduces the checking \nthat occurs between chunks of the buffer.  For example, if you have more than 500MB of RAM:\n\te.g.\tforemost -k 500 image.dd\n.br\n\n.TP\n\\fB\\-i\\fR \\fIfile\\fR\nThe \\fIfile\\fR is used as the input file.  If no input file is specified\nor the input file cannot be read then stdin is used.\n\n.TP\n\\fB-o\\fR \\fIdirectory\\fR\nRecovered files are written to the directory\n\\fIdirectory\\fR. \n\n.TP\n\\fB-c\\fR \\fIfile\\fR\nSets the configuration file to use. If none is specified, the file \n\"foremost.conf\" from the current directory is used; if that doesn't\nexist, then \"/etc/foremost.conf\" is used. The format for\nthe configuration file is described in the default configuration\nfile included with this program. See the \\fICONFIGURATION FILE\\fR\nsection below for more information.\n\n.TP\n\n\\fB-s\\fR \\fInumber\\fR\nSkips \\fInumber\\fR blocks in the input file before beginning the search\nfor headers.    \n\te.g.  foremost -s 512 -t jpeg -i /dev/hda1\n.TP\n\n\n.PP\n\n.SH CONFIGURATION FILE\nThe configuration file is used to control what types of files foremost\nsearches for. A sample configuration file, foremost.conf, is included with\nthis distribution. For each file type, the configuration file describes\nthe file's extension, whether the header and footer are case sensitive,\nthe maximum file size, and the header and footer for the file. The footer\nfield is optional, but header, size, case sensitivity, and extension are\nnot!\n\nAny line that begins with a pound sign \nis considered a comment and ignored. Thus,\nto skip a file type just put a pound sign at the beginning of that line.\n\nHeaders and footers are decoded before use. To specify a value in\nhexadecimal use \\\\x[0-f][0-f], and for octal use \\\\[0-7][0-7][0-7].  Spaces\ncan be represented by \\\\s. Example: \"\\\\x4F\\\\123\\\\I\\\\sCCI\" decodes to \"OSI CCI\".\n\nTo match any single character (aka a wildcard) use a ?. If you need to\nsearch for the ? 
character, you will need to change the wildcard line\n*and* every occurrence of the old wildcard character in the configuration\nfile. Do not forget those hex and octal values! ? is equal to \\\\x3f and\n\\\\077.\n\nThere is a sample set of headers in the README file.\n\n.SH EXAMPLES\n.TP\n.SH Search for jpeg format skipping the first 100 blocks\nforemost -s 100 -t jpg -i image.dd \n.TP\n.SH Only generate an audit file, and print to the screen (verbose mode)\nforemost -wv image.dd \n.TP\n.SH Search all defined types\nforemost -t all -i image.dd\n.TP\n.SH Search for gif and pdf files \nforemost -t gif,pdf -i image.dd\n.TP\n.SH Search for office documents and jpeg files in a Unix file system in verbose mode.  \nforemost -vd -t ole,jpeg -i image.dd\n.TP\n.SH Run the default case\nforemost image.dd\n.PP\n\n.SH AUTHORS\nOriginal Code written by Special Agent Kris Kendall and Special Agent Jesse Kornblum of \nthe United States Air Force Office of Special Investigations.\n\nModification by Nick Mikus, a Research Associate at the Naval Postgraduate \nSchool Center for Information Systems Security Studies and Research.  The modification\nof Foremost was part of a master's thesis at NPS.\n\n.SH BUGS\nWhen compiling foremost on systems with versions of glibc 2.1.x or older,\nyou will get some (harmless) compiler warnings regarding the implicit \ndeclaration of fseeko and ftello. You can safely ignore these warnings.\n.PP\n\n.SH \"REPORTING BUGS\"\nBecause Foremost could be used to obtain evidence for criminal \nprosecutions, we\ntake all bug reports \\fIvery\\fR seriously. Any bug that jeopardizes the\nforensic integrity of this program could have serious consequences. When submitting a bug report, please include a description\nof the problem, how you found it, and your contact information.\n.PP\nSend bug reports to:\n.br\nnamikus AT users d0t sf d0t net\n.PP\n.SH COPYRIGHT\nThis program is a work of the US Government. 
In accordance with 17 USC 105,\ncopyright protection is not available for any work of the US Government.\n.PP\nThis is free software; see the source for copying conditions.  There is NO\nwarranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n\n.SH \"SEE ALSO\"\nThere is more information in the README file. \n.PP\nForemost was originally designed to imitate the functionality of CarvThis, \na DOS program written by the Defense Computer Forensics Lab in 1999.\n\n\n"
  },
  {
    "path": "foremost.conf",
    "content": "#\n# Foremost configuration file\n#-------------------------------------------------------------------------\n# Note the foremost configuration file is provided to support formats which\n# don't have built-in extraction functions.  If the format is built-in to foremost\n# simply run foremost with -t <suffix> and provide the format you wish to extract. \n#\n# The configuration file is used to control what types of files foremost\n# searches for. A sample configuration file, foremost.conf, is included with\n# this distribution. For each file type, the configuration file describes\n# the file's extension, whether the header and footer are case sensitive,\n# the maximum file size, and the header and footer for the file. The footer\n# field is optional, but header, size, case sensitivity, and extension are\n# not!\n#\n# Any line that begins with a '#' is considered a comment and ignored. Thus,\n# to skip a file type just put a '#' at the beginning of that line\n#\n\n# Headers and footers are decoded before use. To specify a value in\n# hexadecimal use \\x[0-f][0-f], and for octal use \\[0-3][0-7][0-7].  Spaces\n# can be represented by \\s. Example: \"\\x4F\\123\\I\\sCCI\" decodes to \"OSI CCI\".\n#\n# To match any single character (aka a wildcard) use a '?'. If you need to\n# search for the '?' character, you will need to change the 'wildcard' line\n# *and* every occurrence of the old wildcard character in the configuration\n# file. Don't forget those hex and octal values! '?' 
is equal to 0x3f and\n# \\077.\n#\n# If you would like to extract files without an extension enter the value\n# \"NONE\" in the extension column (note: you can change the value of this\n# \"no suffix\" flag by setting the variable FOREMOST_NOEXTENSION_SUFFIX\n# in foremost.h and recompiling).\n#\n# The ASCII option will extract all ASCII printable characters before and after \n# the keyword provided.\n#\n# The NEXT keyword after a footer instructs foremost to search forwards for data \n# that starts with the header provided and terminates or is followed by data in \n# the footer -- the footer data is not included in the output.  The data in the \n# footer, when used with the NEXT keyword effectively allows you to search for \n# data that you know for sure should not be in the output file.  This method for \n# example, lets you search for two 'starting' headers in a document that doesn't \n# have a good ending footer and you can't say exactly what the footer is, but \n# you know if you see another header, that should end the search and an output\n# file should be written.\n\n# To redefine the wildcard character, change the setting below and all\n# occurrences in the foremost.conf file.\n#\n#wildcard  ?\n#\n#\t\tcase\tsize\theader\t\t\tfooter\n#extension   sensitive\t\n#\n#---------------------------------------------------------------------\n# EXAMPLE WITH NO SUFFIX\n#---------------------------------------------------------------------\n#\n# Here is an example of how to use the no extension option. 
Any files \n# containing the string \"FOREMOST\" would be extracted to a file without \n# an extension (eg: 00000000,00000001)\n#      NONE     y      1000     FOREMOST\n#\n#---------------------------------------------------------------------\n# GRAPHICS FILES\n#---------------------------------------------------------------------\t\n#\n#\n# AOL ART files\n#\tart\ty\t150000\t\\x4a\\x47\\x04\\x0e\t\\xcf\\xc7\\xcb\n#  \tart\ty \t150000\t\\x4a\\x47\\x03\\x0e\t\\xd0\\xcb\\x00\\x00\n#\n# GIF and JPG files (very common)\n#\t(NOTE THESE FORMATS HAVE BUILTIN EXTRACTION FUNCTION)\n#\tgif\ty\t155000000\t\\x47\\x49\\x46\\x38\\x37\\x61\t\\x00\\x3b\n#  \tgif\ty \t155000000\t\\x47\\x49\\x46\\x38\\x39\\x61\t\\x00\\x00\\x3b\n#  \tjpg\ty\t20000000\t\\xff\\xd8\\xff\\xe0\\x00\\x10\t\\xff\\xd9\n#  \tjpg\ty\t20000000\t\\xff\\xd8\\xff\\xe1 \\xff\\xd9 \n#  \tjpg\ty\t20000000\t\\xff\\xd8\t\\xff\\xd9\n#\n# PNG   (used in web pages)\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#  \tpng\ty\t200000\t\\x50\\x4e\\x47?\t\\xff\\xfc\\xfd\\xfe\n#\n#\n# BMP \t\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#\tbmp\ty\t100000\tBM??\\x00\\x00\\x00\n#\n# TIF\n#  \ttif\ty\t200000000\t\\x49\\x49\\x2a\\x00\n#\n#---------------------------------------------------------------------\t\n# ANIMATION FILES\n#---------------------------------------------------------------------\t\n#\n# AVI (Windows animation and DiVX/MPEG-4 movies)\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#  \tavi\ty\t4000000 RIFF????AVI\n#\n# Apple Quicktime\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#\tmov\ty\t4000000\t????????\\x6d\\x6f\\x6f\\x76\n#\tmov\ty\t4000000\t????????\\x6d\\x64\\x61\\x74\n#\n# MPEG Video\n#\tmpg\ty\t4000000\tmpg\teof\n#\tmpg\ty\t20000000 \\x00\\x00\\x01\\xba      \\x00\\x00\\x01\\xb9\n#\tmpg     y \t20000000 \\x00\\x00\\x01\\xb3 \t\\x00\\x00\\x01\\xb7\n#\n# Macromedia 
Flash\n#\tfws\ty\t4000000\tFWS\n#\n#---------------------------------------------------------------------\t\n# MICROSOFT OFFICE \n#---------------------------------------------------------------------\t\n#\n# Word documents\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#\tdoc\ty\t12500000  \\xd0\\xcf\\x11\\xe0\\xa1\\xb1\n#\n# Outlook files\n#\tpst\ty\t400000000 \\x21\\x42\\x4e\\xa5\\x6f\\xb5\\xa6\n#\tost\ty\t400000000 \\x21\\x42\\x44\\x4e\n#\n# Outlook Express\n#\tdbx\ty\t4000000\t\\xcf\\xad\\x12\\xfe\\xc5\\xfd\\x74\\x6f\n#\tidx\ty\t4000000\t\\x4a\\x4d\\x46\\x39\n#\tmbx\ty\t4000000\t\\x4a\\x4d\\x46\\x36\n#\n#---------------------------------------------------------------------\t\n# WORDPERFECT\n#---------------------------------------------------------------------\n#\n#\twpc\ty\t100000\t?WPC\n#\n#---------------------------------------------------------------------\t\n# HTML\t\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#---------------------------------------------------------------------\t\n#\n#\thtm\tn\t50000   <html\t\t\t</html>\n#\n#---------------------------------------------------------------------\t\n# ADOBE PDF\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#---------------------------------------------------------------------\t\n#\n#\tpdf\ty\t5000000\t%PDF-  %EOF \n#\n#\n#---------------------------------------------------------------------\t\n# AOL (AMERICA ONLINE)\n#---------------------------------------------------------------------\t\n#\n# AOL Mailbox\n#\tmail\ty\t500000\t \\x41\\x4f\\x4c\\x56\\x4d\n#\n#\n#\t\n#---------------------------------------------------------------------\t\n# PGP (PRETTY GOOD PRIVACY)\n#---------------------------------------------------------------------\t\n#\n# PGP Disk Files\n#\tpgd\ty\t500000\t\\x50\\x47\\x50\\x64\\x4d\\x41\\x49\\x4e\\x60\\x01\n#\n# Public Key Ring\n#\tpgp\ty\t100000\t\\x99\\x00\n# Security Ring\n#\tpgp\ty\t100000\t\\x95\\x01\n#\tpgp\ty\t100000\t\\x95\\x00\n# Encrypted 
Data or ASCII armored keys\n#\tpgp\ty\t100000\t\\xa6\\x00\n# (there should be a trailer for this...)\n#\ttxt\ty\t100000\t-----BEGIN\\040PGP\n#\n#\n#---------------------------------------------------------------------\t\n# RPM (Linux package format)\n#---------------------------------------------------------------------\t\n#\trpm\ty\t1000000\t\\xed\\xab\n#\n#\n#---------------------------------------------------------------------\t\n# SOUND FILES\n#---------------------------------------------------------------------\t\n#\t(NOTE THIS FORMAT HAS A BUILTIN EXTRACTION FUNCTION)\n#\twav     y\t200000\tRIFF????WAVE\n#\n# Real Audio Files\n#\tra\ty\t1000000\t\\x2e\\x72\\x61\\xfd\n#\tra\ty\t1000000\t.RMF\n#\n#\tasf     y       8000000\t \\x30\\x26\\xB2\\x75\\x8E\\x66\\xCF\\x11\\xA6\\xD9\\x00\\xAA\\x00\\x62\\xCE\\x6C\n#\n#\twmv     y       20000000 \\x30\\x26\\xB2\\x75\\x8E\\x66\\xCF\\x11\\xA6\\xD9\\x00\\xAA\\x00\\x62\\xCE\\x6C\n#\n#\twma     y       8000000  \\x30\\x26\\xB2\\x75    \\x00\\x00\\x00\\xFF\n#\n#\twma     y       8000000  \\x30\\x26\\xB2\\x75    \\x52\\x9A\\x12\\x46\n#\n#\tmp3     y    \t8000000 \\xFF\\xFB??\\x44\\x00\\x00\n#\tmp3     y    \t8000000 \\x57\\x41\\x56\\45            \\x00\\x00\\xFF\\\n#\tmp3     y    \t8000000 \\xFF\\xFB\\xD0\\            \\xD1\\x35\\x51\\xCC\\\n#\tmp3     y    \t8000000 \\x49\\x44\\x33\\\n#\tmp3     y    \t8000000 \\x4C\\x41\\x4D\\x45\\\n#---------------------------------------------------------------------\t\n# WINDOWS REGISTRY FILES\n#---------------------------------------------------------------------\t\n# \n# Windows NT registry\n#\tdat\ty\t4000000\tregf\n# Windows 95 registry\n#\tdat\ty\t4000000\tCREG\n#\n#    \tlnk     y    \t5000\t\\x4C\\x00\\x00\\x00\\x01\\x14\\x02\\x00\\x00\\x00\\x00\\x00\\xC0\\x00\\x00\n#    \tchm     y    \t100000\t\\x49\\x54\\x53\\x46\\x03\\x00\\x00\\x00\\x60\\x00\\x00\\x00\\x01\\x00\\x00\n#    \tcookie  n    \t4096    id=\n#    \trdp     y    
\t4096\t\\xFF\\xFE\\x73\\x00\\x63\\x00\\x72\\x00\\x65\\x00\\x65\\x00\\x6E\\x00\\x20\\x00\\x6D\n#\n#---------------------------------------------------------------------\t\n# MISCELLANEOUS\n#---------------------------------------------------------------------\t\n#\t(NOTE THIS FORMAT HAS BUILTIN EXTRACTION FUNCTION)\n#\tzip\ty\t10000000\tPK\\x03\\x04\t\\x3c\\xac\n#\t(NOTE THIS FORMAT HAS BUILTIN EXTRACTION FUNCTION)\n#\trar\ty\t10000000\tRar!\n#\n#\tjava\ty\t1000000\t\\xca\\xfe\\xba\\xbe\n#\n#\tcpp\ty\t20000\t#include\t#include\tASCII\n#---------------------------------------------------------------------\t\n# ScanSoft PaperPort \"Max\" files\n#---------------------------------------------------------------------\t\n#      max   y     1000000    \\x56\\x69\\x47\\x46\\x6b\\x1a\\x00\\x00\\x00\\x00   \\x00\\x00\\x05\\x80\\x00\\x00 \n#---------------------------------------------------------------------\t\n# PINs Password Manager program\n#---------------------------------------------------------------------\t\n#      pins  y     8000     \\x50\\x49\\x4e\\x53\\x20\\x34\\x2e\\x32\\x30\\x0d\n"
  },
  {
    "path": "helpers.c",
    "content": "\n\t /* MD5DEEP - helpers.c\n *\n * By Jesse Kornblum\n *\n * This is a work of the US Government. In accordance with 17 USC 105,\n * copyright protection is not available for any work of the US Government.\n *\n * This program is distributed in the hope that it will be useful, but\n * WITHOUT ANY WARRANTY; without even the implied warranty of\n * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n *\n */\n\n#include \"main.h\"\n\n/* Removes any newlines at the end of the string buf.\n   Works for both *nix and Windows styles of newlines.\n   Returns the new length of the string. */\nunsigned int chop (char *buf)\n\t{\n\n\t/* Windows newlines are 0x0d 0x0a, *nix are 0x0a */\n\tunsigned int\tlen = strlen(buf);\n\tif (buf[len - 1] == 0x0a)\n\t\t{\n\t\tif (buf[len - 2] == 0x0d)\n\t\t\t{\n\t\t\tbuf[len - 2] = buf[len - 1];\n\t\t\t}\n\t\tbuf[len - 1] = buf[len];\n\t\t}\n\treturn strlen(buf);\n\t}\n\nchar *units(unsigned int c)\n{\n\tswitch (c)\n\t\t{\n\t\tcase 0:\t\treturn \"B\";\n\t\tcase 1:\t\treturn \"KB\";\n\t\tcase 2:\t\treturn \"MB\";\n\t\tcase 3:\t\treturn \"GB\";\n\t\tcase 4:\t\treturn \"TB\";\n\t\tcase 5:\t\treturn \"PB\";\n\t\tcase 6:\t\treturn \"EB\";\n\t\t/* Steinbach's Guideline for Systems Programming:\n       Never test for an error condition you don't know how to handle.\n\n       Granted, given that no existing system can handle anything \n       more than 18 exabytes, this shouldn't be an issue. But how do we\n       communicate that 'this shouldn't happen' to the user? */\n\t\tdefault:\treturn \"??\";\n\t\t}\n}\n\nchar *human_readable(off_t size, char *buffer)\n{\n\tunsigned int\tcount = 0;\n\twhile (size > 1024)\n\t\t{\n\t\tsize /= 1024;\n\t\t++count;\n\t\t}\n\n\t/* The size will be, at most, 1023, and the units will be\n     two characters no matter what. Thus, the maximum length of\n     this string is six characters. e.g. 
strlen(\"1023 EB\") = 6 */\n\tif (sizeof(off_t) == 4)\n\t\t{\n\t\tsnprintf(buffer, 8, \"%u %s\", (unsigned int)size, units(count));\n\t\t}\n\telse if (sizeof(off_t) == 8)\n\t\t{\n\t\tsnprintf(buffer, 8, \"%llu %s\", (u_int64_t) size, units(count));\n\t\t}\n\n\treturn buffer;\n}\n\nchar *current_time(void)\n{\n\ttime_t\tnow = time(NULL);\n\tchar\t*ascii_time = ctime(&now);\n\tchop(ascii_time);\n\treturn ascii_time;\n}\n\n/* Shift the contents of a string so that the values after 'new_start'\n   will now begin at location 'start' */\nvoid shift_string(char *fn, int start, int new_start)\n{\n\tif (start < 0 || start > strlen(fn) || new_start < 0 || new_start < start)\n\t\treturn;\n\n\twhile (new_start < strlen(fn))\n\t\t{\n\t\tfn[start] = fn[new_start];\n\t\tnew_start++;\n\t\tstart++;\n\t\t}\n\n\tfn[start] = 0;\n}\n\nvoid make_magic(void)\n{\n\tprintf(\"%s%s\",\n\t\t   \"\\x53\\x41\\x4E\\x20\\x44\\x49\\x4D\\x41\\x53\\x20\\x48\\x49\\x47\\x48\\x20\\x53\\x43\\x48\\x4F\\x4F\\x4C\\x20\\x46\\x4F\\x4F\\x54\\x42\\x41\\x4C\\x4C\\x20\\x52\\x55\\x4C\\x45\\x53\\x21\",\n\t   NEWLINE);\n}\n\n#if defined(__UNIX)\n\n/* Return the size, in bytes of an open file stream. 
On error, return 0 */\n\t#if defined(__LINUX)\n\noff_t find_file_size(FILE *f)\n{\n\toff_t\t\tnum_sectors = 0;\n\tint\t\t\tfd = fileno(f);\n\tstruct stat sb;\n\n\tif (fstat(fd, &sb))\n\t\t{\n\t\treturn 0;\n\t\t}\n\n\tif (S_ISREG(sb.st_mode) || S_ISDIR(sb.st_mode))\n\t\treturn sb.st_size;\n\telse if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode))\n\t\t{\n\t\tif (ioctl(fd, BLKGETSIZE, &num_sectors))\n\t\t{\n\t\t#if defined(__DEBUG)\n\t\t\tfprintf(stderr, \"%s: ioctl call to BLKGETSIZE failed.%s\", __progname, NEWLINE);\n\t\t#endif\n\t\t}\n\t\telse\n\t\t\treturn (num_sectors * 512);\n\t\t}\n\n\treturn 0;\n}\n\n\t#elif defined(__MACOSX)\n\n\t\t#include <stdint.h>\n\t\t#include <sys/ioctl.h>\n\t\t#include <sys/disk.h>\n\noff_t find_file_size(FILE *f)\n{\n\t\t#ifdef DEBUG\n\tprintf(\"\tFIND MAC file size\\n\");\n\t\t#endif\n\treturn 0;\t/*FIX ME this function causes strange problems on MACOSX, so for now return 0*/\n\tstruct stat info;\n\toff_t\t\ttotal = 0;\n\toff_t\t\toriginal = ftello(f);\n\tint\t\t\tok = TRUE, fd = fileno(f);\n\n\t/* I'd prefer not to use fstat as it will follow symbolic links. We don't\n     follow symbolic links. That being said, all symbolic links *should*\n     have been caught before we got here. */\n\tfstat(fd, &info);\n\n\t/* Block devices, like /dev/hda, don't return a normal filesize.\n     If we are working with a block device, we have to ask the operating\n     system to tell us the true size of the device. \n     \n     The following only works on Linux as far as I know. If you know\n     how to port this code to another operating system, please contact\n     the current maintainer of this program! 
*/\n\tif (S_ISBLK(info.st_mode))\n\t\t{\n\t\tdaddr_t blocksize = 0;\n\t\tdaddr_t blockcount = 0;\n\n\t\t/* Get the block size */\n\t\tif (ioctl(fd, DKIOCGETBLOCKSIZE, blocksize) < 0)\n\t\t\t{\n\t\t\tok = FALSE;\n\t\t#if defined(__DEBUG)\n\t\t\tperror(\"DKIOCGETBLOCKSIZE failed\");\n\t\t#endif\n\t\t\t}\n\n\t\t/* Get the number of blocks */\n\t\tif (ok)\n\t\t\t{\n\t\t\tif (ioctl(fd, DKIOCGETBLOCKCOUNT, blockcount) < 0)\n\t\t\t{\n\t\t#if defined(__DEBUG)\n\t\t\t\tperror(\"DKIOCGETBLOCKCOUNT failed\");\n\t\t#endif\n\t\t\t}\n\t\t\t}\n\n\t\ttotal = blocksize * blockcount;\n\n\t\t}\n\n\telse\n\t\t{\n\n\t\t/* I don't know why, but if you don't initialize this value you'll\n       get wildly innacurate results when you try to run this function */\n\t\tif ((fseeko(f, 0, SEEK_END)))\n\t\t\treturn 0;\n\t\ttotal = ftello(f);\n\t\tif ((fseeko(f, original, SEEK_SET)))\n\t\t\treturn 0;\n\t\t}\n\n\treturn (total - original);\n}\n\n\t#else\n\n/* This is code for general UNIX systems \n   (e.g. NetBSD, FreeBSD, OpenBSD, etc) */\nstatic off_t midpoint(off_t a, off_t b, long blksize)\n{\n\toff_t\taprime = a / blksize;\n\toff_t\tbprime = b / blksize;\n\toff_t\tc, cprime;\n\n\tcprime = (bprime - aprime) / 2 + aprime;\n\tc = cprime * blksize;\n\n\treturn c;\n}\n\noff_t find_dev_size(int fd, int blk_size)\n{\n\n\toff_t\tcurr = 0, amount = 0;\n\tvoid\t*buf;\n\n\tif (blk_size == 0)\n\t\treturn 0;\n\n\tbuf = malloc(blk_size);\n\n\tfor (;;)\n\t\t{\n\t\tssize_t nread;\n\n\t\tlseek(fd, curr, SEEK_SET);\n\t\tnread = read(fd, buf, blk_size);\n\t\tif (nread < blk_size)\n\t\t\t{\n\t\t\tif (nread <= 0)\n\t\t\t\t{\n\t\t\t\tif (curr == amount)\n\t\t\t\t\t{\n\t\t\t\t\tfree(buf);\n\t\t\t\t\tlseek(fd, 0, SEEK_SET);\n\t\t\t\t\treturn amount;\n\t\t\t\t\t}\n\n\t\t\t\tcurr = midpoint(amount, curr, blk_size);\n\t\t\t\t}\n\t\t\telse\n\t\t\t\t{\t/* 0 < nread < blk_size */\n\t\t\t\tfree(buf);\n\t\t\t\tlseek(fd, 0, SEEK_SET);\n\t\t\t\treturn amount + nread;\n\t\t\t\t}\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tamount = 
curr + blk_size;\n\t\t\tcurr = amount * 2;\n\t\t\t}\n\t\t}\n\n\tfree(buf);\n\tlseek(fd, 0, SEEK_SET);\n\treturn amount;\n}\n\noff_t find_file_size(FILE *f)\n{\n\tint\t\t\tfd = fileno(f);\n\tstruct stat sb;\n\treturn 0;\t\t/*FIX ME SOLARIS FILE SIZE CAUSES SEG FAULT, for now just return 0*/\n\n\tif (fstat(fd, &sb))\n\t\treturn 0;\n\n\tif (S_ISREG(sb.st_mode) || S_ISDIR(sb.st_mode))\n\t\treturn sb.st_size;\n\telse if (S_ISCHR(sb.st_mode) || S_ISBLK(sb.st_mode))\n\t\treturn find_dev_size(fd, sb.st_blksize);\n\n\treturn 0;\n}\n\n\t#endif /* UNIX Flavors */\n#endif /* ifdef __UNIX */\n\n#if defined(__WIN32)\noff_t find_file_size(FILE *f)\n{\n\toff_t\ttotal = 0, original = ftello(f);\n\n\tif ((fseeko(f, 0, SEEK_END)))\n\t\treturn 0;\n\n\ttotal = ftello(f);\n\tif ((fseeko(f, original, SEEK_SET)))\n\t\treturn 0;\n\n\treturn total;\n}\n\n#endif /* ifdef __WIN32 */\n\nvoid print_search_specs(f_state *s)\n{\n\tint i = 0;\n\tint j = 0;\n\tprintf(\"\\nDUMPING BUILTIN SEARCH INFO\\n\\t\");\n\tfor (i = 0; i < s->num_builtin; i++)\n\t\t{\n\n\t\tprintf(\"%s:\\n\\t footer_len:=%d, header_len:=%d, max_len:=%llu \",\n\t\t\t   search_spec[i].suffix,\n\t\t\t   search_spec[i].footer_len,\n\t\t\t   search_spec[i].header_len,\n\t\t\t   search_spec[i].max_len);\n\t\tprintf(\"\\n\\t header:\\t\");\n\t\tprintx(search_spec[i].header, 0, search_spec[i].header_len);\n\t\tprintf(\"\\t footer:\\t\");\n\t\tprintx(search_spec[i].footer, 0, search_spec[i].footer_len);\n\t\tfor (j = 0; j < search_spec[i].num_markers; j++)\n\t\t\t{\n\t\t\tprintf(\"\\tmarker: \\t\");\n\t\t\tprintx(search_spec[i].markerlist[j].value, 0, search_spec[i].markerlist[j].len);\n\t\t\t}\n\n\t\t}\n\n}\n\nvoid print_stats(f_state *s)\n{\n\tint i = 0;\n\taudit_msg(s, \"\\n%d FILES EXTRACTED\\n\\t\", s->fileswritten);\n\tfor (i = 0; i < s->num_builtin; i++)\n\t\t{\n\n\t\tif (search_spec[i].found != 0)\n\t\t\t{\n\t\t\tif (search_spec[i].type == OLE)\n\t\t\t\tsearch_spec[i].suffix = \"ole\";\n\t\t\telse if (search_spec[i].type == 
RIFF)\n\t\t\t\tsearch_spec[i].suffix = \"rif\";\n\t\t\telse if (search_spec[i].type == ZIP)\n\t\t\t\tsearch_spec[i].suffix = \"zip\";\n\t\t\taudit_msg(s, \"%s:= %d\", search_spec[i].suffix, search_spec[i].found);\n\t\t\t}\n\t\t}\n}\n\nint charactersMatch(char a, char b, int caseSensitive)\n{\n\n\t//if(a==b) return 1;\n\tif (a == wildcard || a == b)\n\t\treturn 1;\n\tif (caseSensitive || (a < 'A' || a > 'z' || b < 'A' || b > 'z'))\n\t\treturn 0;\n\n\t/* This line is equivalent to (abs(a-b)) == 'a' - 'A' */\n\treturn (abs(a - b) == 32);\n}\n\nint memwildcardcmp(const void *s1, const void *s2, size_t n, int caseSensitive)\n{\n\tif (n != 0)\n\t\t{\n\t\tregister const unsigned char\t*p1 = s1, *p2 = s2;\n\t\tdo\n\t\t\t{\n\t\t\tif (!charactersMatch(*p1++, *p2++, caseSensitive))\n\t\t\t\treturn (*--p1 -*--p2);\n\t\t\t}\n\t\twhile (--n != 0);\n\t\t}\n\n\treturn (0);\n}\n\nvoid printx(unsigned char *buf, int start, int end)\n{\n\tint i = 0;\n\tfor (i = start; i < end; i++)\n\t\t{\n\t\tprintf(\"%x \", buf[i]);\n\t\t}\n\n\tprintf(\"\\n\");\n}\n\nchar *reverse_string(char *to, char *from, int startLocation, int endLocation)\n{\n\tint i = endLocation;\n\tint j = 0;\n\tfor (j = startLocation; j < endLocation; j++)\n\t\t{\n\t\ti--;\n\t\tto[j] = from[i];\n\t\t}\n\n\treturn to;\n}\n\nunsigned short htos(unsigned char s[], int endian)\n{\n\n\tunsigned char\t*bytes = (unsigned char *)malloc(sizeof(unsigned short) * sizeof(char));\n\tunsigned short\tsize = 0;\n\tchar\t\t\ttemp = 'x';\n\tbytes = memcpy(bytes, s, sizeof(short));\n\n\tif (endian == FOREMOST_BIG_ENDIAN && BYTE_ORDER == LITTLE_ENDIAN)\n\t\t{\n\n\t\t//printf(\"switching the byte order\\n\");\n\t\ttemp = bytes[0];\n\t\tbytes[0] = bytes[1];\n\t\tbytes[1] = temp;\n\n\t\t}\n\telse if (endian == FOREMOST_LITTLE_ENDIAN && BYTE_ORDER == BIG_ENDIAN)\n\t\t{\n\t\ttemp = bytes[0];\n\t\tbytes[0] = bytes[1];\n\t\tbytes[1] = temp;\n\t\t}\n\n\tsize = *((unsigned short *)bytes);\n\tfree(bytes);\n\treturn size;\n}\n\nunsigned int 
htoi(unsigned char s[], int endian)\n{\n\n\tint\t\t\t\tlength = sizeof(int);\n\tunsigned char\t*bytes = (unsigned char *)malloc(length * sizeof(char));\n\tunsigned int\tsize = 0;\n\n\tbytes = memcpy(bytes, s, length);\n\n\tif (endian == FOREMOST_BIG_ENDIAN && BYTE_ORDER == LITTLE_ENDIAN)\n\t\t{\n\n\t\tbytes = (unsigned char *)reverse_string((char *)bytes, (char *)s, 0, length);\n\t\t}\n\telse if (endian == FOREMOST_LITTLE_ENDIAN && BYTE_ORDER == BIG_ENDIAN)\n\t\t{\n\n\t\tbytes = (unsigned char *)reverse_string((char *)bytes, (char *)s, 0, length);\n\t\t}\n\n\tsize = *((unsigned int *)bytes);\n\n\tfree(bytes);\n\treturn size;\n}\n\nu_int64_t htoll(unsigned char s[], int endian)\n{\n\tint\t\t\t\tlength = sizeof(u_int64_t);\n\tunsigned char\t*bytes = (unsigned char *)malloc(length * sizeof(char));\n\tu_int64_t\tsize = 0;\n\tbytes = memcpy(bytes, s, length);\n#ifdef DEBUG\n\tprintf(\"htoll len=%d endian=%d\\n\",length,endian);\n#endif\t\n\tif (endian == FOREMOST_BIG_ENDIAN && BYTE_ORDER == LITTLE_ENDIAN)\n\t\t{\n#ifdef DEBUG\n\t\tprintf(\"reverse0\\n\");\n#endif\n\t\tbytes = (unsigned char *)reverse_string((char *)bytes, (char *)s, 0, length);\n\t\t}\n\telse if (endian == FOREMOST_LITTLE_ENDIAN && BYTE_ORDER == BIG_ENDIAN)\n\t\t{\n#ifdef DEBUG\n\tprintf(\"reverse1\\n\");\n#endif\n\t\tbytes = (unsigned char *)reverse_string((char *)bytes, (char *)s, 0, length);\n\t\t}\n\n\tsize = *((u_int64_t *)bytes);\n#ifdef DEBUG\n\tprintf(\"htoll size=%llu\\n\",size);\n\tprintx(bytes,0,length);\n#endif\t\n\t\n\n\tfree(bytes);\n\treturn size;\n}\n\n/* display Position: Tell the user how far through the infile we are */\nint displayPosition(f_state *s, f_info *i, u_int64_t pos)\n{\n\n\tint\t\t\tpercentDone = 0;\n\tstatic int\tlast_val = 0;\n\tint\t\t\tcount;\n\tint\t\t\tflag = FALSE;\n\tint\t\t\tfactor = 4;\n\tint\t\t\tmultiplier = 25;\n\tint\t\t\tnumber_of_stars = 0;\n\tchar\t\tbuffer[256];\n\tlong double skip = s->skip * s->block_size;\n\n\tlong double tot_bytes = (long 
double)((i->total_bytes));\n\ttot_bytes -= skip;\n\tif (i->total_bytes > 0)\n\t\t{\n\t\tpercentDone = (((long double)pos) / ((long double)tot_bytes)) * 100;\n\t\tif (percentDone != last_val)\n\t\t\tflag = TRUE;\n\t\tlast_val = percentDone;\n\t\t}\n\telse\n\t\t{\n\t\tflag = TRUE;\n\t\tfactor = 4;\n\t\tmultiplier = 25;\n\t\t}\n\n\tif (flag)\n\t\t{\n\t\tnumber_of_stars = percentDone / factor;\n\n\t\tprintf(\"%s: |\", s->input_file);\n\t\tfor (count = 0; count < number_of_stars; count++)\n\t\t\t{\n\t\t\tprintf(\"*\");\n\t\t\t}\n\n\t\tfor (count = 0; count < (multiplier - number_of_stars); count++)\n\t\t\t{\n\t\t\tprintf(\" \");\n\t\t\t}\n\n\t\tif (i->total_bytes > 0)\n\t\t\t{\n\t\t\tprintf(\"|\\t %d%% done\\n\", percentDone);\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tprintf(\"|\\t %s done\\n\", human_readable(pos, buffer));\n\n\t\t\t}\n\t\t}\n\n\tif (percentDone == 100)\n\t\t{\n\t\tlast_val = 0;\n\t\t}\n\n\treturn TRUE;\n}\n"
  },
  {
    "path": "main.c",
    "content": "\n\n\n/* FOREMOST\n *\n * By Jesse Kornblum and Kris Kendall\n * \n * This is a work of the US Government. In accordance with 17 USC 105,\n * copyright protection is not available for any work of the US Government.\n *\n * This program is distributed in the hope that it will be useful, but\n * WITHOUT ANY WARRANTY; without even the implied warranty of\n * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\n *\n *\n */\n#include \"main.h\"\n\n#ifdef __WIN32\n\n/* Allows us to open standard input in binary mode by default \n   See http://gnuwin32.sourceforge.net/compile.html for more */\nint _CRT_fmode = _O_BINARY;\n#endif\n\nvoid catch_alarm(int signum)\n{\n\tsignal_caught = signum;\n\tsignal(signum, catch_alarm);\n}\n\nvoid register_signal_handler(void)\n{\n\tsignal_caught = 0;\n\n\tif (signal(SIGINT, catch_alarm) == SIG_IGN)\n\t\tsignal(SIGINT, SIG_IGN);\n\tif (signal(SIGTERM, catch_alarm) == SIG_IGN)\n\t\tsignal(SIGTERM, SIG_IGN);\n\n#ifndef __WIN32\n\n\t/* Note: I haven't found a way to get notified of\n     console resize events in Win32.  Right now the statusbar\n     will be too long or too short if the user decides to resize\n     their console window while foremost runs.. 
*/\n\n\t/* RBF - Handle TTY events  */\n\n\t// The function setttywidth is in the old helpers.c\n\t// signal(SIGWINCH, setttywidth);\n#endif\n}\n\nvoid try_msg(void)\n{\n\tfprintf(stderr, \"Try `%s -h` for more information.%s\", __progname, NEWLINE);\n}\n\n/* The usage function should, at most, display 22 lines of text to fit\n   on a single screen */\nvoid usage(void)\n{\n\tfprintf(stderr, \"%s version %s by %s.%s\", __progname, VERSION, AUTHOR, NEWLINE);\n\tfprintf(stderr,\n\t\t\t\"%s %s [-v|-V|-h|-T|-Q|-q|-a|-w|-d] [-t <type>] [-s <blocks>] [-k <size>] \\n\\t[-b <size>] [-c <file>] [-o <dir>] [-i <file>] %s%s\",\n\t\tCMD_PROMPT,\n\t\t\t__progname,\n\t\t\tNEWLINE,\n\t\t\tNEWLINE);\n\tfprintf(stderr, \"-V  - display copyright information and exit%s\", NEWLINE);\n\tfprintf(stderr, \"-t  - specify file type.  (-t jpeg,pdf ...) %s\", NEWLINE);\n\tfprintf(stderr, \"-d  - turn on indirect block detection (for UNIX file-systems) %s\", NEWLINE);\n\tfprintf(stderr, \"-i  - specify input file (default is stdin) %s\", NEWLINE);\n\tfprintf(stderr,\n\t\t\t\"-a  - Write all headers, perform no error detection (corrupted files) %s\",\n\t\t\tNEWLINE);\n\tfprintf(stderr,\n\t\t\t\"-w  - Only write the audit file, do not write any detected files to the disk %s\",\n\t\t\tNEWLINE);\n\tfprintf(stderr,\n\t\t\t\"-o  - set output directory (defaults to %s)%s\",\n\t\t\tDEFAULT_OUTPUT_DIRECTORY,\n\t\t\tNEWLINE);\n\tfprintf(stderr,\n\t\t\t\"-c  - set configuration file to use (defaults to %s)%s\",\n\t\t\tDEFAULT_CONFIG_FILE,\n\t\t\tNEWLINE);\n\tfprintf(stderr,\n\t\t\t\"-q  - enables quick mode. Searches are performed on 512-byte boundaries.%s\",\n\t\t\tNEWLINE);\n\tfprintf(stderr, \"-Q  - enables quiet mode. Suppress output messages. %s\", NEWLINE);\n\n\t/* RBF - What should verbose mode be? */\n\tfprintf(stderr, \"-v  - verbose mode. 
Logs all messages to screen%s\", NEWLINE);\n}\n\nvoid process_command_line(int argc, char **argv, f_state *s)\n{\n\n\tint\t\ti;\n\tchar\t*ptr1, *ptr2;\n\n\twhile ((i = getopt(argc, argv, \"o:b:c:t:s:i:k:hqmQTadvVw\")) != -1)\n\t\t{\n\t\tswitch (i)\n\t\t\t{\n\n\t\t\tcase 'v':\n\t\t\t\tset_mode(s, mode_verbose);\n\t\t\t\tbreak;\n\n\t\t\tcase 'd':\n\t\t\t\tset_mode(s, mode_ind_blk);\n\t\t\t\tbreak;\n\n\t\t\tcase 'w':\n\t\t\t\tset_mode(s, mode_write_audit);\t/*Only write audit*/\n\t\t\t\tbreak;\n\n\t\t\tcase 'a':\n\t\t\t\tset_mode(s, mode_write_all);\t/*Write all headers*/\n\t\t\t\tbreak;\n\n\t\t\tcase 'b':\n\t\t\t\tset_block(s, atoi(optarg));\n\t\t\t\tbreak;\n\n\t\t\tcase 'o':\n\t\t\t\tset_output_directory(s, optarg);\n\t\t\t\tbreak;\n\n\t\t\tcase 'q':\n\t\t\t\tset_mode(s, mode_quick);\n\t\t\t\tbreak;\n\n\t\t\tcase 'Q':\n\t\t\t\tset_mode(s, mode_quiet);\n\t\t\t\tbreak;\n\n\t\t\tcase 'c':\n\t\t\t\tset_config_file(s, optarg);\n\t\t\t\tbreak;\n\n\t\t\tcase 'm':\n\t\t\t\tset_mode(s, mode_multi_file);\n\t\t\t\tbreak;\t/* -m takes no argument; do not fall through to -k */\n\n\t\t\tcase 'k':\n\t\t\t\tset_chunk(s, atoi(optarg));\n\t\t\t\tbreak;\n\n\t\t\tcase 's':\n\t\t\t\tset_skip(s, atoi(optarg));\n\t\t\t\tbreak;\n\n\t\t\tcase 'i':\n\t\t\t\tset_input_file(s, optarg);\n\t\t\t\tbreak;\n\n\t\t\tcase 'T':\n\t\t\t\ts->time_stamp = TRUE;\n\t\t\t\tbreak;\n\n\t\t\tcase 't':\n\n\t\t\t\t/*See if we have multiple file types to define*/\n\t\t\t\tptr1 = ptr2 = optarg;\n\t\t\t\twhile (1)\n\t\t\t\t\t{\n\t\t\t\t\tif (!*ptr2)\n\t\t\t\t\t\t{\n\t\t\t\t\t\tif (!set_search_def(s, ptr1, 0))\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\tusage();\n\t\t\t\t\t\t\texit(EXIT_SUCCESS);\n\t\t\t\t\t\t\t}\n\t\t\t\t\t\tbreak;\n\t\t\t\t\t\t}\n\n\t\t\t\t\tif (*ptr2 == ',')\n\t\t\t\t\t\t{\n\t\t\t\t\t\t*ptr2 = '\\0';\n\t\t\t\t\t\tif (!set_search_def(s, ptr1, 0))\n\t\t\t\t\t\t\t{\n\t\t\t\t\t\t\tusage();\n\t\t\t\t\t\t\texit(EXIT_SUCCESS);\n\t\t\t\t\t\t\t}\n\n\t\t\t\t\t\t*ptr2++ = ',';\n\t\t\t\t\t\tptr1 = 
ptr2;\n\t\t\t\t\t\t}\n\t\t\t\t\telse\n\t\t\t\t\t\t{\n\t\t\t\t\t\tptr2++;\n\t\t\t\t\t\t}\n\t\t\t\t\t}\n\t\t\t\tbreak;\n\n\t\t\tcase 'h':\n\t\t\t\tusage();\n\t\t\t\texit(EXIT_SUCCESS);\n\n\t\t\tcase 'V':\n\t\t\t\tprintf(\"%s%s\", VERSION, NEWLINE);\n\n\t\t\t\t/* We could just say printf(COPYRIGHT), but that's a good way\n\t to introduce a format string vulnerability. Better to always\n\t use good programming practice... */\n\t\t\t\tprintf(\"%s\", COPYRIGHT);\n\t\t\t\texit(EXIT_SUCCESS);\n\n\t\t\tdefault:\n\t\t\t\ttry_msg();\n\t\t\t\texit(EXIT_FAILURE);\n\n\t\t\t}\n\n\t\t}\n\n#ifdef __DEBUG\n\tdump_state(s);\n#endif\n\n}\n\nint main(int argc, char **argv)\n{\n\n\tFILE\t*testFile = NULL;\n\tf_state *s = (f_state *)malloc(sizeof(f_state));\n\tint\t\tinput_files = 0;\n\tchar\t**temp = argv;\n\tDIR* \tdir;\n\n#ifndef __GLIBC__\n\t__progname = basename(argv[0]);\n#endif\n\n\t/*Initialize the global state struct*/\n\tif (initialize_state(s, argc, argv))\n\t\tfatal_error(s, \"Unable to initialize state\");\n\n\tregister_signal_handler();\n\tprocess_command_line(argc, argv, s);\n\n\tload_config_file(s);\n\n\tif (s->num_builtin == 0)\n\t\t{\n\n\t\t/*Nothing specified via the command line or the conf\n\tfile so default to all builtin search types*/\n\t\tset_search_def(s, \"all\", 0);\n\t\t}\n\t\n\tif (create_output_directory(s))\n\t\tfatal_error(s, \"Unable to open output directory\");\t\n\n\tif (!get_mode(s, mode_write_audit))\n\t\t{\n\t\tcreate_sub_dirs(s);\n\t\t}\n\n\tif (open_audit_file(s))\n\t\tfatal_error(s, \"Can't open audit file\");\n\n\t/* Scan for valid files to open */\n\twhile (*argv != NULL)\n\t{\n\t\tif(strcmp(*argv,\"-c\")==0)\n\t\t{\n\t\t\t/*jump past the conf file so we don't process it.*/\n\t\t\targv+=2;\n\t\t}\n\t\ttestFile = fopen(*argv, \"rb\");\n\t\tif (testFile)\n\t\t{\n\t\t\tfclose(testFile);\n\t\t\tdir = opendir(*argv);\n\t\t\t\n\t\t\tif(!strstr(s->config_file,*argv)!=0 && !dir)\n\t\t\t{\n\t\t\t\tinput_files++;\n\t\t\t}\n\t\t\t\n\t\t\tif(dir) 
closedir(dir);\t\t\n\t\t}\n\n\t\t++argv;\n\t}\n\n\targv = temp;\n\tif (input_files > 1)\n\t\t{\n\t\tset_mode(s, mode_multi_file);\n\t\t}\n\n\t++argv;\n\twhile (*argv != NULL)\n\t\t{\n\t\ttestFile = fopen(*argv, \"rb\");\n\n\t\tif (testFile)\n\t\t\t{\n\t\t\t\tfclose(testFile);\n\t\t\t\tdir = opendir(*argv);\n\t\t\t\tif(!strstr(s->config_file,*argv)!=0 && !dir)\n\t\t\t\t{\n\t\t\t\t\tset_input_file(s, *argv);\n\t\t\t\t\tprocess_file(s);\n\t\t\t\t}\n\t\t\t\tif(dir) closedir(dir);\t\n\t\t\t}\n\n\t\t++argv;\n\t\t}\n\n\tif (input_files == 0)\n\t\t{\n\n\t\t//printf(\"using stdin\\n\");\n\t\tprocess_stdin(s);\n\t\t}\n\n\tprint_stats(s);\n\n\t/*Let's try to clean up some of the extra sub_dirs*/\n\tcleanup_output(s);\n\n\tif (close_audit_file(s))\n\t\t{\n\n\t\t/* Hell's bells. This is bad, but really, what can we do about it? \n       Let's just report the error and try to get out of here! */\n\t\tprint_error(s, AUDIT_FILE_NAME, \"Error closing audit file\");\n\t\t}\n\n\tfree_state(s);\n\tfree(s);\n\treturn EXIT_SUCCESS;\n}\n"
  },
  {
    "path": "main.h",
    "content": "\r\n/* FOREMOST\r\n *\r\n * By Jesse Kornblum\r\n *\r\n * This is a work of the US Government. In accordance with 17 USC 105,\r\n * copyright protection is not available for any work of the US Government.\r\n *\r\n * This program is distributed in the hope that it will be useful, but\r\n * WITHOUT ANY WARRANTY; without even the implied warranty of\r\n * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\r\n *\r\n */\r\n \r\n//#define DEBUG 1\r\n   \r\n#ifndef __FOREMOST_H\r\n#define __FOREMOST_H\r\n\r\n/* Version information is defined in the Makefile */\r\n\r\n#define AUTHOR      \"Jesse Kornblum, Kris Kendall, and Nick Mikus\"\r\n\r\n/* We use \\r\\n for newlines as this has to work on Win32. It's redundant for\r\n   everybody else, but shouldn't cause any harm. */\r\n#define COPYRIGHT   \"This program is a work of the US Government. \"\\\r\n\"In accordance with 17 USC 105,\\r\\n\"\\\r\n\"copyright protection is not available for any work of the US Government.\\r\\n\"\\\r\n\"This is free software; see the source for copying conditions. 
There is NO\\r\\n\"\\\r\n\"warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.\\r\\n\"\r\n\r\n#define _GNU_SOURCE\r\n#include <stdio.h>\r\n#include <stdlib.h>\r\n#include <limits.h>\r\n#include <dirent.h>\r\n#include <errno.h>\r\n#include <string.h>\r\n#include <unistd.h>\r\n#include <time.h>\r\n#include <math.h>\r\n#include <ctype.h>\r\n#include <sys/stat.h>\r\n#include <sys/types.h>\r\n#include <signal.h>\r\n\r\n/* For va_arg */\r\n#include <stdarg.h>\r\n\r\n#ifdef __LINUX\r\n#include <sys/ioctl.h>\r\n#include <sys/mount.h>\r\n#define   u_int64_t   unsigned long long\r\n#endif \r\n\r\n\r\n#ifdef __LINUX\r\n\r\n#ifndef __USE_BSD\r\n#define __USE_BSD\r\n#endif\r\n#include <endian.h>\r\n\r\n#elif defined (__SOLARIS)\r\n\r\n#define BIG_ENDIAN    4321\r\n#define LITTLE_ENDIAN 1234\r\n\r\n#include <sys/isa_defs.h>\r\n#ifdef _BIG_ENDIAN       \r\n#define BYTE_ORDER BIG_ENDIAN\r\n#else\r\n#define BYTE_ORDER LITTLE_ENDIAN\r\n#endif\r\n\r\n#elif defined (__WIN32)\r\n#include <sys/param.h>\r\n\r\n#elif defined (__MACOSX)\r\n#include <machine/endian.h>\r\n#define __U16_TYPE unsigned short\r\n#endif\r\n\r\n\r\n#define TRUE   1\r\n#define FALSE  0\r\n#define ONE_MEGABYTE  1048576\r\n\r\n\r\n/* RBF - Do we need these type definitions? 
*/ \r\n#ifdef __SOLARIS\r\n#define   u_int32_t   unsigned int\r\n#define   u_int64_t   unsigned long long\r\n#endif \r\n\r\n\r\n/* The only time we're *not* on a UNIX system is when we're on Windows */\r\n#ifndef __WIN32\r\n#ifndef __UNIX\r\n#define __UNIX\r\n#endif  /* ifndef __UNIX */\r\n#endif  /* ifndef __WIN32 */\r\n\r\n\r\n#ifdef __UNIX\r\n\r\n#ifndef __U16_TYPE\r\n#define __U16_TYPE unsigned short\r\n#endif\r\n\r\n#include <libgen.h>\r\n\r\n#ifndef BYTE_ORDER \r\n\r\n#define BIG_ENDIAN    4321\r\n#define LITTLE_ENDIAN 1234\r\n\r\n#define BYTE_ORDER LITTLE_ENDIAN\r\n\r\n#endif\r\n/* This avoids compiler warnings on older systems */\r\nint fseeko(FILE *stream, off_t offset, int whence);\r\noff_t ftello(FILE *stream);\r\n\r\n\r\n#define CMD_PROMPT \"$\"\r\n#define DIR_SEPARATOR   '/'\r\n#define NEWLINE \"\\n\"\r\n#define LINE_LENGTH 74\r\n#define BLANK_LINE \\\r\n\"                                                                          \"\r\n\r\n#endif /* #ifdef __UNIX */\r\n\r\n/* This allows us to open standard input in binary mode by default \r\n   See http://gnuwin32.sourceforge.net/compile.html for more */\r\n#include <fcntl.h>\r\n\r\n/* Code specific to Microsoft Windows */\r\n#ifdef __WIN32\r\n\r\n/* By default, Windows uses long for off_t. This won't do. We\r\n   need an unsigned number at minimum. Windows doesn't have 64 bit\r\n   numbers though. 
*/\r\n#ifdef off_t\r\n#undef off_t\r\n#endif\r\n#define off_t unsigned long\r\n\r\n#define CMD_PROMPT \"c:\\\\>\"\r\n#define  DIR_SEPARATOR   '\\\\'\r\n#define NEWLINE \"\\r\\n\"\r\n#define LINE_LENGTH 72\r\n#define BLANK_LINE \\\r\n\"                                                                        \"\r\n\r\n\r\n/* It would be nice to use 64-bit file lengths in Windows */\r\n#define ftello   ftell\r\n#define fseeko   fseek\r\n\r\n#ifndef __CYGWIN\r\n#define  snprintf         _snprintf\r\n#endif\r\n\r\n#define  u_int32_t        unsigned long\r\n\r\n/* We create macros for the Windows equivalent UNIX functions.\r\n   No worries about lstat to stat; Windows doesn't have symbolic links */\r\n#define lstat(A,B)      stat(A,B)\r\n\r\n#define u_int64_t unsigned __int64\r\n\r\n#ifndef __CYGWIN\r\n\t#define realpath(A,B)   _fullpath(B,A,PATH_MAX) \r\n#endif\r\n/* Not used in md5deep anymore, but left in here in case I \r\n   ever need it again. Win32 documentation searches are evil.\r\n   int asprintf(char **strp, const char *fmt, ...);\r\n*/\r\n\r\nchar *basename(char *a);\r\nextern char *optarg;\r\nextern int optind;\r\nint getopt(int argc, char *const argv[], const char *optstring);\r\n\r\n#endif   /* ifdef _WIN32 */\r\n\r\n\r\n/* On non-glibc systems we have to manually set the __progname variable */\r\n#ifdef __GLIBC__\r\nextern char *__progname;\r\n#else\r\nchar *__progname;\r\n#endif /* ifdef __GLIBC__ */\r\n\r\n/* -----------------------------------------------------------------\r\n   Program Defaults\r\n   ----------------------------------------------------------------- */\r\n#define MAX_STRING_LENGTH   1024\r\n#define COMMENT_LENGTH   64\r\n\r\n/* Modes refer to options that can be set by the user. 
*/\r\n\r\n#define mode_none                0\r\n#define mode_verbose          1<<1\r\n#define mode_quiet            1<<2\r\n#define mode_ind_blk          1<<3\r\n#define mode_quick            1<<4\r\n#define mode_write_all        1<<5\r\n#define mode_write_audit      1<<6\r\n#define mode_multi_file\t      1<<7\r\n\r\n#define MAX_NEEDLES                   254\r\n#define NUM_SEARCH_SPEC_ELEMENTS        6\r\n#define MAX_SUFFIX_LENGTH               8\r\n#define MAX_FILE_TYPES                100\r\n#define FOREMOST_NOEXTENSION_SUFFIX \"NONE\"\r\n/* Modes 3 to 31 are reserved for future use. We shouldn't use\r\n   modes higher than 31 as Win32 can't go that high. */\r\n\r\n#define DEFAULT_MODE              mode_none\r\n#define DEFAULT_CONFIG_FILE       \"foremost.conf\"\r\n#define DEFAULT_OUTPUT_DIRECTORY  \"output\"\r\n#define AUDIT_FILE_NAME           \"audit.txt\"\r\n#define FOREMOST_DIVIDER          \"------------------------------------------------------------------\"\r\n\r\n#define JPEG 0\r\n#define GIF 1\r\n#define BMP 2\r\n#define MPG 3\r\n#define PDF 4\r\n#define DOC 5\r\n#define AVI 6\r\n#define WMV 7\r\n#define HTM 8\r\n#define ZIP 9\r\n#define MOV 10\r\n#define XLS 11\r\n#define PPT 12\r\n#define WPD 13\r\n#define CPP 14\r\n#define OLE 15\r\n#define GZIP 16\r\n#define RIFF 17\r\n#define WAV 18\r\n#define VJPEG 19\r\n#define SXW 20\r\n#define SXC 21\r\n#define SXI 22\r\n#define CONF 23\r\n#define PNG 24\r\n#define RAR 25\r\n#define EXE 26\r\n#define ELF 27\r\n#define REG 28\r\n#define DOCX 29\r\n#define XLSX 30\r\n#define PPTX 31\n#define MP4 32\r\n\r\n\r\n#define KILOBYTE                  1024\r\n#define MEGABYTE                  1024 * KILOBYTE\r\n#define GIGABYTE                  1024 * MEGABYTE\r\n#define TERABYTE                  1024 * GIGABYTE\r\n#define PETABYTE                  1024 * TERABYTE\r\n#define EXABYTE                   1024 * PETABYTE\r\n\r\n#define UNITS_BYTES                     0\r\n#define UNITS_KILOB                     1\r\n#define 
UNITS_MEGAB                     2\r\n#define UNITS_GIGAB                     3\r\n#define UNITS_TERAB                     4\r\n#define UNITS_PETAB                     5\r\n#define UNITS_EXAB                      6\r\n\r\n#define SEARCHTYPE_FORWARD      0\r\n#define SEARCHTYPE_REVERSE      1\r\n#define SEARCHTYPE_FORWARD_NEXT 2\r\n#define SEARCHTYPE_ASCII        3\r\n\r\n#define FOREMOST_BIG_ENDIAN 0\r\n#define FOREMOST_LITTLE_ENDIAN 1\r\n/*DEFAULT CHUNK SIZE In MB*/\r\n#define CHUNK_SIZE 100 \r\n\r\n\r\n/* Wildcard is a global variable because it's used by very simple\r\n   functions that don't need the whole state passed to them */\r\n\r\n/* -----------------------------------------------------------------\r\n   State Variable and Global Variables\r\n   ----------------------------------------------------------------- */\r\nchar wildcard;\r\ntypedef struct f_state \r\n{\r\n  off_t mode;\r\n  char *config_file;\r\n  char *input_file;\r\n  char *output_directory;\r\n  char *start_time;\r\n  char *invocation;\r\n  char *audit_file_name;\r\n  FILE *audit_file;\r\n  int audit_file_open;\r\n  int num_builtin;\r\n  int chunk_size; /*IN MB*/\r\n  int fileswritten;\r\n  int block_size;\r\n  int skip;\r\n  \r\n  int time_stamp;\r\n} f_state;\r\n\r\ntypedef struct marker\r\n{\r\n    unsigned char* value;\r\n    int len;\r\n    size_t marker_bm_table[UCHAR_MAX+1];\r\n}marker;\r\n\r\ntypedef struct s_spec\r\n{\r\n    char* suffix;\r\n    int type;\r\n    u_int64_t max_len;\r\n    unsigned char* header;\r\n    unsigned int header_len;\r\n    size_t header_bm_table[UCHAR_MAX+1];\r\n\r\n    unsigned char* footer;\r\n    unsigned int footer_len;\r\n    size_t footer_bm_table[UCHAR_MAX+1];\r\n    marker markerlist[5];\r\n    int num_markers;\r\n    int searchtype;                               \r\n\r\n    int case_sen;\r\n    \r\n    int found;\r\n    \r\n    char comment[MAX_STRING_LENGTH];/*Used for audit*/\r\n    int written; /*used for -a mode*/\r\n}s_spec;\r\n\r\ns_spec 
search_spec[50];  /*ARRAY OF BUILTIN SEARCH TYPES*/\r\n\r\ntypedef struct f_info {\r\n  char *file_name;\r\n  off_t total_bytes;\r\n\r\n  /* We never use the total number of bytes in a file, \r\n     only the number of megabytes when we display a time estimate */\r\n  off_t total_megs;\r\n  off_t bytes_read;\r\n\r\n#ifdef __WIN32\r\n  /* Win32 is a 32-bit operating system and can't handle file sizes\r\n     larger than 4GB. We use this to keep track of overflows */\r\n  off_t last_read;\r\n  off_t overflow_count;\r\n#endif\r\n\r\n  FILE *handle;\r\n  int is_stdin;\r\n} f_info;\r\n\r\n/* Set if the user hits ctrl-c */\r\nint signal_caught;\r\n\r\n/* -----------------------------------------------------------------\r\n   Function definitions\r\n   ----------------------------------------------------------------- */\r\n\r\n/* State functions */\r\n\r\nint initialize_state(f_state *s, int argc, char **argv);\r\nvoid free_state(f_state *s);\r\n\r\nchar *get_invocation(f_state *s);\r\nchar *get_start_time(f_state *s);\r\n\r\nint set_config_file(f_state *s, char *fn);\r\nchar* get_config_file(f_state *s);\r\n\r\nint set_output_directory(f_state *s, char *fn);\r\nchar* get_output_directory(f_state *s);\r\n\r\nvoid set_audit_file_open(f_state *s);\r\nint get_audit_file_open(f_state *s);\r\n\r\nvoid set_mode(f_state *s, off_t new_mode);\r\nint get_mode(f_state *s, off_t check_mode);\r\n\r\nint set_search_def(f_state *s,char* ft,u_int64_t max_file_size);\r\nvoid get_search_def(f_state s);\r\n\r\nvoid set_input_file(f_state *s,char* filename);\r\nvoid get_input_file(f_state *s);\r\n\r\nvoid set_chunk(f_state *s, int size);\r\n\r\nvoid init_bm_table(unsigned char *needle, size_t table[UCHAR_MAX + 1], size_t len, int casesensitive,int searchtype);\r\n\r\nvoid set_skip(f_state *s, int size);\r\nvoid set_block(f_state *s, int size);\r\n\r\n\r\n#ifdef __DEBUG\r\nvoid dump_state(f_state *s);\r\n#endif\r\n\r\n/* The audit file */\r\nint open_audit_file(f_state *s);\r\nvoid 
audit_msg(f_state *s, char *format, ...);\r\nint close_audit_file(f_state *s);\r\n\r\n\r\n/* Set up our output directory */\r\nint create_output_directory(f_state *s);\r\nint write_to_disk(f_state *s,s_spec * needle,u_int64_t len,unsigned char* buf,  u_int64_t t_offset);\r\nint create_sub_dirs(f_state *s);\r\nvoid cleanup_output(f_state *s);\r\n\r\n/* Configuration Files */\r\nint load_config_file(f_state *s);\r\n\r\n\r\n/* Helper functions */\r\nchar *current_time(void);\r\noff_t find_file_size(FILE *f);\r\nchar *human_readable(off_t size, char *buffer);\r\nchar *units(unsigned int c);\r\nunsigned int chop(char *buf);\r\nvoid print_search_specs(f_state *s);\r\nint memwildcardcmp(const void *s1, const void *s2,size_t n,int caseSensitive);\r\nint charactersMatch(char a, char b, int caseSensitive);\r\nvoid printx(unsigned char* buf,int start, int end);\r\nunsigned short htos(unsigned char s[],int endian);\r\nunsigned int htoi(unsigned char s[],int endian);\r\nu_int64_t htoll(unsigned char s[],int endian);\r\nint displayPosition(f_state* s,f_info* i,u_int64_t pos);\r\n\r\n\r\n/* Interface functions \r\n   These functions stay the same regardless if we're using a\r\n   command line interface or a GUI */\r\nvoid fatal_error(f_state *s, char *msg);\r\nvoid print_error(f_state *s, char *fn, char *msg);\r\nvoid print_message(f_state *s, char *format, va_list argp);\r\nvoid print_stats(f_state *s);\r\n\r\n/* Engine */\r\nint process_file(f_state *s);\r\nint process_stdin(f_state *s);\r\nunsigned char *bm_search(unsigned char *needle, size_t needle_len,unsigned char *haystack, size_t haystack_len,\r\n\tsize_t table[UCHAR_MAX + 1], int case_sen,int searchtype);\r\nunsigned char *bm_search_skipn(unsigned char *needle, size_t needle_len,unsigned char *haystack, size_t haystack_len,\r\n\tsize_t table[UCHAR_MAX + 1], int casesensitive,int searchtype, int start_pos) ;\t\r\n#endif /* __FOREMOST_H */\r\n\r\n/* BUILTIN */\r\nunsigned char* extract_file(f_state *s,  u_int64_t 
c_offset,unsigned char *foundat,  u_int64_t buflen, s_spec * needle, u_int64_t f_offset);\r\n\r\n\r\n\r\n\r\n\r\n"
  },
  {
    "path": "ole.h",
    "content": "#define TRUE\t\t\t1\n#define FALSE\t\t\t0\n#define SPECIAL_BLOCK\t- 3\n#define END_OF_CHAIN\t- 2\n#define UNUSED\t\t\t- 1\n\n#define NO_ENTRY\t\t0\n#define STORAGE\t\t\t1\n#define STREAM\t\t\t2\n#define ROOT\t\t\t5\n#define SHORT_BLOCK\t\t3\n\n#define FAT_START\t\t0x4c\n#define OUR_BLK_SIZE\t512\n#define DIRS_PER_BLK\t4\n#ifndef __CYGWIN\n\t#define MIN(x, y)\t((x) < (y) ? (x) : (y))\n#endif\n\n#include <stdarg.h>\n#include <string.h>\n#include <stdio.h>\n#include <sys/types.h>\n#include <unistd.h>\n#include <stdlib.h>\n#include <fcntl.h>\n#include <sys/stat.h>\n#include <ctype.h>\n\nstruct OLE_HDR\n{\n\tchar\t\t\tmagic[8];\t\t\t\t/*0*/\n\tchar\t\t\tclsid[16];\t\t\t\t/*8*/\n       __U16_TYPE      uMinorVersion;                  /*24*/\n       __U16_TYPE      uDllVersion;                    /*26*/\n       __U16_TYPE      uByteOrder;                             /*28*/\n       __U16_TYPE      uSectorShift;                   /*30*/\n       __U16_TYPE      uMiniSectorShift;               /*32*/\n       __U16_TYPE      reserved;                               /*34*/\n       u_int32_t       reserved1;                              /*36*/\n       u_int32_t       reserved2;                              /*40*/\n       u_int32_t       num_FAT_blocks;                 /*44*/\n       u_int32_t       root_start_block;               /*48*/\n       u_int32_t       dfsignature;                    /*52*/\n       u_int32_t       miniSectorCutoff;               /*56*/\n       u_int32_t       dir_flag;                               /*60 first sec in the mini fat chain*/\n       u_int32_t       csectMiniFat;                   /*64 number of sectors in the minifat */\n       u_int32_t       FAT_next_block;                 /*68*/\n       u_int32_t       num_extra_FAT_blocks;   /*72*/\n\t/* FAT block list starts here !! 
first 109 entries  */\n};\n\nstruct OLE_DIR\n{\n\tchar\t\t\tname[64];\n\tunsigned short\tnamsiz;\n\tchar\t\t\ttype;\n\tchar\t\t\tbflags;\t\t\t\t\t//0 or 1\n\tunsigned long\tprev_dirent;\n\tunsigned long\tnext_dirent;\n\tunsigned long\tdir_dirent;\n\tchar\t\t\tclsid[16];\n\tunsigned long\tuserFlags;\n\tint\t\t\t\tsecs1;\n\tint\t\t\t\tdays1;\n\tint\t\t\t\tsecs2;\n\tint\t\t\t\tdays2;\n\tunsigned long\tstart_block;\t\t\t//starting SECT of stream\n\tunsigned long\tsize;\n\tshort\t\t\treserved;\t\t\t\t//must be 0\n};\n\nstruct DIRECTORY\n{\n\tchar\tname[64];\n\tint\t\ttype;\n\tint\t\tlevel;\n\tint\t\tstart_block;\n\tint\t\tsize;\n\tint\t\tnext;\n\tint\t\tprev;\n\tint\t\tdir;\n\tint\t\ts1;\n\tint\t\ts2;\n\tint\t\td1;\n\tint\t\td2;\n}\n*dirlist, *dl;\n\nint\t\t\t\tget_dir_block(unsigned char *fd, int blknum, int buffersize);\nint\t\t\t\tget_dir_info(unsigned char *src);\nvoid\t\t\textract_stream(char *fd, int blknum, int size);\nvoid\t\t\tdump_header(struct OLE_HDR *h);\nint\t\t\t\tdump_dirent(int which_one);\nint\t\t\t\tget_block(unsigned char *fd, int blknum, unsigned char *dest, long long int buffersize);\nint\t\t\t\tget_FAT_block(unsigned char *fd, int blknum, int *dest, int buffersize);\nint\t\t\t\treorder_dirlist(struct DIRECTORY *dir, int level);\n\nunsigned char\t*get_ole_block(unsigned char *fd, int blknum, unsigned long long buffersize);\nstruct OLE_HDR\t*reverseBlock(struct OLE_HDR *dest, struct OLE_HDR *h);\n\nvoid\t\t\tdump_ole_header(struct OLE_HDR *h);\nvoid\t\t\t*Malloc(size_t bytes);\nvoid\t\t\tdie(char *fmt, void *arg);\nvoid\t\t\tinit_ole();\n"
  },
  {
    "path": "state.c",
    "content": "\n\n#include \"main.h\"\n\nint initialize_state (f_state * s, int argc, char **argv)\n\t{\n\tchar\t**argv_copy = argv;\n\n\t/* The routines in current_time return statically allocated memory.\n     We strdup the result so that we don't accidently free() the wrong\n     thing later on. */\n\ts->start_time = strdup(current_time());\n\twildcard = '?';\n\ts->audit_file_open = FALSE;\n\ts->mode = DEFAULT_MODE;\n\ts->input_file = NULL;\n\ts->fileswritten = 0;\n\ts->block_size = 512;\n\n\t/* We use the setter fuctions here to call realpath */\n\tset_config_file(s, DEFAULT_CONFIG_FILE);\n\tset_output_directory(s, DEFAULT_OUTPUT_DIRECTORY);\n\n\ts->invocation = (char *)malloc(sizeof(char) * MAX_STRING_LENGTH);\n\ts->invocation[0] = 0;\n\ts->chunk_size = CHUNK_SIZE;\n\ts->num_builtin = 0;\n\ts->skip = 0;\n\ts->time_stamp = FALSE;\n\tdo\n\t\t{\n\t\tstrncat(s->invocation, *argv_copy, MAX_STRING_LENGTH - strlen(s->invocation));\n\t\tstrncat(s->invocation, \" \", MAX_STRING_LENGTH - strlen(s->invocation));\n\t\t++argv_copy;\n\t\t}\n\twhile (*argv_copy);\n\n\treturn FALSE;\n\t}\n\nvoid free_state(f_state *s)\n{\n\tfree(s->start_time);\n\tfree(s->output_directory);\n\tfree(s->config_file);\n}\n\nint get_audit_file_open(f_state *s)\n{\n\treturn (s->audit_file_open);\n}\n\nchar *get_invocation(f_state *s)\n{\n\treturn (s->invocation);\n}\n\nchar *get_start_time(f_state *s)\n{\n\treturn (s->start_time);\n}\n\nchar *get_config_file(f_state *s)\n{\n\treturn (s->config_file);\n}\n\nint set_config_file(f_state *s, char *fn)\n{\n\tchar\ttemp[PATH_MAX];\n\n\t/* If the configuration file doesn't exist, this realpath will return\n     NULL. We don't error check here as the user may specify a file\n     that doesn't currently exist */\n\trealpath(fn, temp);\n\n\t/* RBF - Does this create a memory leak? What happens to the old value? 
*/\n\ts->config_file = strdup(temp);\n\treturn FALSE;\n}\n\nchar *get_output_directory(f_state *s)\n{\n\treturn (s->output_directory);\n}\n\nint set_output_directory(f_state *s, char *fn)\n{\n\tchar\ttemp[PATH_MAX];\n  int \tfullpathlen=0;\n\t/* We don't error check here as it's quite possible that the\n     output directory doesn't exist yet. If it doesn't, realpath\n     resolves the path correctly, but still returns NULL. */\n  //strncpy(s->output_directory,fn,PATH_MAX);\n  \n\trealpath(fn, temp);\n\tfullpathlen=strlen(temp);\n\n\tif(fullpathlen!=0)\n\t{\n\t\ts->output_directory = strdup(temp);\n\t}\n\telse\n\t{\n\t\t/*Realpath failed just use cwd*/\n\t\ts->output_directory = strdup(fn);\n\t}\n\treturn FALSE;\n}\n\nint get_mode(f_state *s, off_t check_mode)\n{\n\treturn (s->mode & check_mode);\n}\n\nvoid set_mode(f_state *s, off_t new_mode)\n{\n\ts->mode |= new_mode;\n}\n\nvoid set_chunk(f_state *s, int size)\n{\n\ts->chunk_size = size;\n}\n\nvoid set_skip(f_state *s, int size)\n{\n\ts->skip = size;\n}\n\nvoid set_block(f_state *s, int size)\n{\n\ts->block_size = size;\n}\n\nvoid write_audit_header(f_state *s)\n{\n\taudit_msg(s, \"Foremost version %s by %s\", VERSION, AUTHOR);\n\taudit_msg(s, \"Audit File\");\n\taudit_msg(s, \"\");\n\taudit_msg(s, \"Foremost started at %s\", get_start_time(s));\n\taudit_msg(s, \"Invocation: %s\", get_invocation(s));\n\taudit_msg(s, \"Output directory: %s\", get_output_directory(s));\n\taudit_msg(s, \"Configuration file: %s\", get_config_file(s));\n}\n\nint open_audit_file(f_state *s)\n{\n\tchar\tfn[MAX_STRING_LENGTH];\n\n\tsnprintf(fn,\n\t\t\t MAX_STRING_LENGTH,\n\t\t\t \"%s%c%s\",\n\t\t\t get_output_directory(s),\n\t\t\t DIR_SEPARATOR,\n\t\t\t AUDIT_FILE_NAME);\n\n\tif ((s->audit_file = fopen(fn, \"w\")) == NULL)\n\t\t{\n\t\tprint_error(s, fn, strerror(errno));\n\t\tfatal_error(s, \"Can't open audit file\");\n\t\t}\n\n\ts->audit_file_open = TRUE;\n\twrite_audit_header(s);\n\n\treturn FALSE;\n}\n\nint close_audit_file(f_state 
*s)\n{\n\taudit_msg(s, FOREMOST_DIVIDER);\n\taudit_msg(s, \"\");\n\taudit_msg(s, \"Foremost finished at %s\", current_time());\n\n\tif (fclose(s->audit_file))\n\t\t{\n\t\tprint_error(s, AUDIT_FILE_NAME, strerror(errno));\n\t\treturn TRUE;\n\t\t}\n\n\treturn FALSE;\n}\n\nvoid audit_msg(f_state *s, char *format, ...)\n{\n\tva_list argp;\n\tva_start(argp, format);\n\n\tif (get_mode(s, mode_verbose)) {\n\t\tprint_message(s, format, argp);\n\t\tva_end(argp);\n\t\tva_start(argp, format);\n\t}\n\n\tvfprintf(s->audit_file, format, argp);\n\tva_end(argp);\n\n\tfprintf(s->audit_file, \"%s\", NEWLINE);\n\tfflush(stdout);\n}\n\nvoid set_input_file(f_state *s, char *filename)\n{\n\ts->input_file = (char *)malloc((strlen(filename) + 1) * sizeof(char));\n\tstrncpy(s->input_file, filename, strlen(filename) + 1);\n}\n\n/*Initialize any search specs*/\nint init_builtin(f_state *s, int type, char *suffix, char *header, char *footer, int header_len,\n\t\t\t\t int footer_len, u_int64_t max_len, int case_sen)\n{\n\n\tint i = s->num_builtin;\n\n\tsearch_spec[i].type = type;\n\tsearch_spec[i].suffix = (char *)malloc((strlen(suffix)+1) * sizeof(char));\n\tsearch_spec[i].num_markers = 0;\n\tstrcpy(search_spec[i].suffix, suffix);\n\n\tsearch_spec[i].header_len = header_len;\n\tsearch_spec[i].footer_len = footer_len;\n\n\tsearch_spec[i].max_len = max_len;\n\tsearch_spec[i].found = 0;\n\tsearch_spec[i].header = (unsigned char *)malloc(search_spec[i].header_len * sizeof(unsigned char));\n\tsearch_spec[i].footer = (unsigned char *)malloc(search_spec[i].footer_len * sizeof(unsigned char));\n\tsearch_spec[i].case_sen = case_sen;\n\tmemset(search_spec[i].comment, 0, COMMENT_LENGTH - 1);\n\n\tmemcpy(search_spec[i].header, header, search_spec[i].header_len);\n\tmemcpy(search_spec[i].footer, footer, search_spec[i].footer_len);\n\n\tinit_bm_table(search_spec[i].header,\n\t\t\t\t  search_spec[i].header_bm_table,\n\t\t\t\t  search_spec[i].header_len,\n\t\t\t\t  search_spec[i].case_sen,\n\t\t\t\t  
SEARCHTYPE_FORWARD);\n\tinit_bm_table(search_spec[i].footer,\n\t\t\t\t  search_spec[i].footer_bm_table,\n\t\t\t\t  search_spec[i].footer_len,\n\t\t\t\t  search_spec[i].case_sen,\n\t\t\t\t  SEARCHTYPE_FORWARD);\n\ts->num_builtin++;\n\n\treturn i;\n}\n\n/*Markers are a method to search for any unique information besides just the header and the footer*/\nvoid add_marker(f_state *s, int index, char *marker, int markerlength)\n{\n\tint i = search_spec[index].num_markers;\n\tif (marker == NULL)\n\t\t{\n\t\tsearch_spec[index].num_markers = 0;\n\t\treturn;\n\t\t}\n\n\tsearch_spec[index].markerlist[i].len = markerlength;\n\tsearch_spec[index].markerlist[i].value = (unsigned char *)malloc(search_spec[index].markerlist[i].len * sizeof(unsigned char));\n\n\tmemcpy(search_spec[index].markerlist[i].value, marker, search_spec[index].markerlist[i].len);\n\tinit_bm_table(search_spec[index].markerlist[i].value,\n\t\t\t\t  search_spec[index].markerlist[i].marker_bm_table,\n\t\t\t\t  search_spec[index].markerlist[i].len,\n\t\t\t\t  TRUE,\n\t\t\t\t  SEARCHTYPE_FORWARD);\n\tsearch_spec[index].num_markers++;\n}\n\n/*Initialize every search spec we know about*/\nvoid init_all(f_state *state)\n{\n\tint index = 0;\n\tinit_builtin(state, JPEG, \"jpg\", \"\\xff\\xd8\\xff\", \"\\xff\\xd9\", 3, 2, 20 * MEGABYTE, TRUE);\n\tindex = init_builtin(state, GIF, \"gif\", \"\\x47\\x49\\x46\\x38\", \"\\x00\\x3b\", 4, 2, MEGABYTE, TRUE);\n\tadd_marker(state, index, \"\\x00\\x00\\x3b\", 3);\n\tinit_builtin(state, BMP, \"bmp\", \"BM\", NULL, 2, 0, 2 * MEGABYTE, TRUE);\n\tinit_builtin(state,\n\t\t\t\t WMV,\n\t\t\t\t \"wmv\",\n\t\t\t\t \"\\x30\\x26\\xB2\\x75\\x8E\\x66\\xCF\\x11\",\n\t\t\t\t \"\\xA1\\xDC\\xAB\\x8C\\x47\\xA9\",\n\t\t\t\t 8,\n\t\t\t\t 6,\n\t\t\t\t 40 * MEGABYTE,\n\t\t\t\t TRUE);\n\tinit_builtin(state, MOV, \"mov\", \"moov\", NULL, 4, 0, 40 * MEGABYTE, TRUE);\n\tinit_builtin(state, MP4, \"mp4\", \"\\x00\\x00\\x00\\x1c\\x66\\x74\\x79\\x70\", NULL, 8, 0, 600 * MEGABYTE, TRUE);\n\tinit_builtin(state, 
RIFF, \"rif\", \"RIFF\", \"INFO\", 4, 4, 20 * MEGABYTE, TRUE);\n\tinit_builtin(state, HTM, \"htm\", \"<html\", \"</html>\", 5, 7, MEGABYTE, FALSE);\n\tinit_builtin(state,\n\t\t\t\t OLE,\n\t\t\t\t \"ole\",\n\t\t\t\t \"\\xd0\\xcf\\x11\\xe0\\xa1\\xb1\\x1a\\xe1\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t NULL,\n\t\t\t\t 16,\n\t\t\t\t 0,\n\t\t\t\t 5 * MEGABYTE,\n\t\t\t\t TRUE);\n\tinit_builtin(state,\n\t\t\t\t ZIP,\n\t\t\t\t \"zip\",\n\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t 4,\n\t\t\t\t 4,\n\t\t\t\t 100 * MEGABYTE,\n\t\t\t\t TRUE);\n\tinit_builtin(state,\n\t\t\t\t RAR,\n\t\t\t\t \"rar\",\n\t\t\t\t \"\\x52\\x61\\x72\\x21\\x1A\\x07\\x00\",\n\t\t\t\t \"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t 7,\n\t\t\t\t 8,\n\t\t\t\t 100 * MEGABYTE,\n\t\t\t\t TRUE);\n\tinit_builtin(state, EXE, \"exe\", \"MZ\", NULL, 2, 0, 1 * MEGABYTE, TRUE);\n\n\tindex = init_builtin(state,\n\t\t\t\t\t\t PNG,\n\t\t\t\t\t\t \"png\",\n\t\t\t\t\t\t \"\\x89\\x50\\x4E\\x47\\x0D\\x0A\\x1A\\x0A\",\n\t\t\t\t\t\t \"IEND\",\n\t\t\t\t\t\t 8,\n\t\t\t\t\t\t 4,\n\t\t\t\t\t\t 1 * MEGABYTE,\n\t\t\t\t\t\t TRUE);\n\tindex = init_builtin(state,\n\t\t\t\t\t\t MPG,\n\t\t\t\t\t\t \"mpg\",\n\t\t\t\t\t\t \"\\x00\\x00\\x01\\xba\",\n\t\t\t\t\t\t \"\\x00\\x00\\x01\\xb9\",\n\t\t\t\t\t\t 4,\n\t\t\t\t\t\t 4,\n\t\t\t\t\t\t 50 * MEGABYTE,\n\t\t\t\t\t\t TRUE);\n\tadd_marker(state, index, \"\\x00\\x00\\x01\", 3);\n\n\tindex = init_builtin(state, PDF, \"pdf\", \"%PDF-1.\", \"%%EOF\", 7, 5, 40 * MEGABYTE, TRUE);\n\tadd_marker(state, index, \"/L \", 3);\n\tadd_marker(state, index, \"obj\", 3);\n\tadd_marker(state, index, \"/Linearized\", 11);\n\tadd_marker(state, index, \"/Length\", 7);\n}\n\n/*Process any command line args following the -t switch)*/\nint set_search_def(f_state *s, char *ft, u_int64_t max_file_size)\n{\n\tint index = 0;\n\n\tif (strcmp(ft, \"jpg\") == 0 || strcmp(ft, \"jpeg\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * 
MEGABYTE;\n\t\tinit_builtin(s, JPEG, \"jpg\", \"\\xff\\xd8\\xff\", \"\\xff\\xd9\", 3, 2, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"gif\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\t\tindex = init_builtin(s,\n\t\t\t\t\t\t\t GIF,\n\t\t\t\t\t\t\t \"gif\",\n\t\t\t\t\t\t\t \"\\x47\\x49\\x46\\x38\",\n\t\t\t\t\t\t\t \"\\x00\\x3b\",\n\t\t\t\t\t\t\t 4,\n\t\t\t\t\t\t\t 2,\n\t\t\t\t\t\t\t max_file_size,\n\t\t\t\t\t\t\t TRUE);\n\n\t\tadd_marker(s, index, \"\\x00\\x00\\x3b\", 3);\n\t\t}\n\telse if (strcmp(ft, \"bmp\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 2 * MEGABYTE;\n\n\t\tinit_builtin(s, BMP, \"bmp\", \"BM\", NULL, 2, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"mp4\") == 0)\n\t\t{\n\t\t/* Honor a caller-supplied limit like the other types; default to 600 MB */\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 600 * MEGABYTE;\n\n\t\tinit_builtin(s, MP4, \"mp4\", \"\\x00\\x00\\x00\\x1c\\x66\\x74\\x79\\x70\", NULL, 8, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"exe\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\n\t\tinit_builtin(s, EXE, \"exe\", \"MZ\", NULL, 2, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"elf\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\n\t\t/* ELF magic is the byte 0x7f followed by \"ELF\"; the literal is split so the hex escape does not absorb the 'E' */\n\t\tinit_builtin(s, ELF, \"elf\", \"\\x7f\" \"ELF\", NULL, 4, 0, max_file_size, TRUE);\n\t\t}\t\n\telse if (strcmp(ft, \"reg\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 2 * MEGABYTE;\n\n\t\tinit_builtin(s, REG, \"reg\", \"regf\", NULL, 4, 0, max_file_size, TRUE);\n\n\t\t}\t\n\telse if (strcmp(ft, \"mpg\") == 0 || strcmp(ft, \"mpeg\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 50 * MEGABYTE;\n\n\t\t//20000000 \\x00\\x00\\x01\\xb3      \\x00\\x00\\x01\\xb7 //system data\n\t\tindex = init_builtin(s,\n\t\t\t\t\t\t\t MPG,\n\t\t\t\t\t\t\t \"mpg\",\n\t\t\t\t\t\t\t \"\\x00\\x00\\x01\\xba\",\n\t\t\t\t\t\t\t \"\\x00\\x00\\x01\\xb9\",\n\t\t\t\t\t\t\t 4,\n\t\t\t\t\t\t\t 4,\n\t\t\t\t\t\t\t max_file_size,\n\t\t\t\t\t\t\t TRUE);\n\t\tadd_marker(s, 
index, \"\\x00\\x00\\x01\", 3);\n\n\t\t/*\n\t    add_marker(s,index,\"\\x00\\x00\\x01\\xBB\",4);\n\t    add_marker(s,index,\"\\x00\\x00\\x01\\xBE\",4);\n\t    add_marker(s,index,\"\\x00\\x00\\x01\\xB3\",4);\n\t    */\n\t\t}\n\telse if (strcmp(ft, \"wmv\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t WMV,\n\t\t\t\t\t \"wmv\",\n\t\t\t\t\t \"\\x30\\x26\\xB2\\x75\\x8E\\x66\\xCF\\x11\",\n\t\t\t\t\t \"\\xA1\\xDC\\xAB\\x8C\\x47\\xA9\",\n\t\t\t\t\t 8,\n\t\t\t\t\t 6,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\t\t}\n\telse if (strcmp(ft, \"avi\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\n\t\tinit_builtin(s, AVI, \"avi\", \"RIFF\", \"INFO\", 4, 4, max_file_size, TRUE);\n\t\t}\n\n\telse if (strcmp(ft, \"rif\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\t\tinit_builtin(s, RIFF, \"rif\", \"RIFF\", \"INFO\", 4, 4, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"wav\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\t\tinit_builtin(s, WAV, \"wav\", \"RIFF\", \"INFO\", 4, 4, max_file_size, TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"html\") == 0 || strcmp(ft, \"htm\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\t\tinit_builtin(s, HTM, \"htm\", \"<html\", \"</html>\", 5, 7, max_file_size, FALSE);\n\t\t}\n\n\telse if (strcmp(ft, \"ole\") == 0 || strcmp(ft, \"office\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\t\tinit_builtin(s,\n\t\t\t\t\t OLE,\n\t\t\t\t\t \"ole\",\n\t\t\t\t\t \"\\xd0\\xcf\\x11\\xe0\\xa1\\xb1\\x1a\\xe1\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t\t NULL,\n\t\t\t\t\t 16,\n\t\t\t\t\t 0,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\t\t}\n\telse if (strcmp(ft, \"doc\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\t\tinit_builtin(s,\n\t\t\t\t\t 
DOC,\n\t\t\t\t\t \"doc\",\n\t\t\t\t\t \"\\xd0\\xcf\\x11\\xe0\\xa1\\xb1\\x1a\\xe1\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t\t NULL,\n\t\t\t\t\t 16,\n\t\t\t\t\t 0,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\t\t}\n\telse if (strcmp(ft, \"xls\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t XLS,\n\t\t\t\t\t \"xls\",\n\t\t\t\t\t \"\\xd0\\xcf\\x11\\xe0\\xa1\\xb1\\x1a\\xe1\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t\t NULL,\n\t\t\t\t\t 16,\n\t\t\t\t\t 0,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"ppt\") == 0)\n\t\t{\n\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\t\tinit_builtin(s,\n\t\t\t\t\t PPT,\n\t\t\t\t\t \"ppt\",\n\t\t\t\t\t \"\\xd0\\xcf\\x11\\xe0\\xa1\\xb1\\x1a\\xe1\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t\t NULL,\n\t\t\t\t\t 16,\n\t\t\t\t\t 0,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\t\t}\n\telse if (strcmp(ft, \"zip\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 100 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t ZIP,\n\t\t\t\t\t \"zip\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x50\\x4b\\x05\\x06\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"rar\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 100 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t RAR,\n\t\t\t\t\t \"rar\",\n\t\t\t\t\t \"\\x52\\x61\\x72\\x21\\x1A\\x07\\x00\",\n\t\t\t\t\t \"\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\",\n\t\t\t\t\t 7,\n\t\t\t\t\t 8,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"sxw\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t SXW,\n\t\t\t\t\t \"sxw\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t 
TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"sxc\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t SXC,\n\t\t\t\t\t \"sxc\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"sxi\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t SXI,\n\t\t\t\t\t \"sxi\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"docx\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t DOCX,\n\t\t\t\t\t \"docx\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"pptx\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t PPTX,\n\t\t\t\t\t \"pptx\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"xlsx\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 10 * MEGABYTE;\n\n\t\tinit_builtin(s,\n\t\t\t\t\t XLSX,\n\t\t\t\t\t \"xlsx\",\n\t\t\t\t\t \"\\x50\\x4B\\x03\\x04\",\n\t\t\t\t\t \"\\x4b\\x05\\x06\\x00\",\n\t\t\t\t\t 4,\n\t\t\t\t\t 4,\n\t\t\t\t\t max_file_size,\n\t\t\t\t\t TRUE);\n\n\t\t}\n\telse if (strcmp(ft, \"gzip\") == 0 || strcmp(ft, \"gz\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 100 * MEGABYTE;\n\n\t\tinit_builtin(s, GZIP, \"gz\", \"\\x1F\\x8B\", \"\\x00\\x00\\x00\\x00\", 2, 4, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"pdf\") == 
0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 20 * MEGABYTE;\n\n\t\tindex = init_builtin(s, PDF, \"pdf\", \"%PDF-1.\", \"%%EOF\", 7, 5, max_file_size, TRUE);\n\t\tadd_marker(s, index, \"/L \", 3);\n\t\tadd_marker(s, index, \"obj\", 3);\n\t\tadd_marker(s, index, \"/Linearized\", 11);\n\t\tadd_marker(s, index, \"/Length\", 7);\n\t\t}\n\telse if (strcmp(ft, \"vjpeg\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 40 * MEGABYTE;\n\t\tinit_builtin(s, VJPEG, \"mov\", \"pnot\", NULL, 4, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"mov\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 40 * MEGABYTE;\n\n\t\tinit_builtin(s, MOV, \"mov\", \"moov\", NULL, 4, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"wpd\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\n\t\tinit_builtin(s, WPD, \"wpd\", \"\\xff\\x57\\x50\\x43\", NULL, 4, 0, max_file_size, TRUE);\n\t\t}\n\telse if (strcmp(ft, \"cpp\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\n\t\tindex = init_builtin(s, CPP, \"cpp\", \"#include\", \"char\", 8, 4, max_file_size, TRUE);\n\t\tadd_marker(s, index, \"int\", 3);\n\t\t}\n\telse if (strcmp(ft, \"png\") == 0)\n\t\t{\n\t\tif (max_file_size == 0)\n\t\t\tmax_file_size = 1 * MEGABYTE;\n\t\tindex = init_builtin(s,\n\t\t\t\t\t\t\t PNG,\n\t\t\t\t\t\t\t \"png\",\n\t\t\t\t\t\t\t \"\\x89\\x50\\x4E\\x47\\x0D\\x0A\\x1A\\x0A\",\n\t\t\t\t\t\t\t \"IEND\",\n\t\t\t\t\t\t\t 8,\n\t\t\t\t\t\t\t 4,\n\t\t\t\t\t\t\t max_file_size,\n\t\t\t\t\t\t\t TRUE);\n\t\t}\n\telse if (strcmp(ft, \"all\") == 0)\n\t\t{\n\t\tinit_all(s);\n\t\t}\n\telse\n\t\t{\n\t\treturn FALSE;\n\t\t}\n\n\treturn TRUE;\n\n}\n\nvoid init_bm_table(unsigned char *needle, size_t table[UCHAR_MAX + 1], size_t len, int casesensitive,\n\t\t\t\t   int searchtype)\n{\n\tsize_t\ti = 0, j = 0, currentindex = 0;\n\n\tfor (i = 0; i <= UCHAR_MAX; i++)\n\t\ttable[i] = len;\n\tfor (i = 0; i < len; 
i++)\n\t\t{\n\t\tif (searchtype == SEARCHTYPE_REVERSE)\n\t\t\t{\n\n\t\t\tcurrentindex = i;\t\t\t//If we are running our searches backwards\n\t\t\t//we count from the beginning of the string\n\t\t\t}\n\t\telse\n\t\t\t{\n\t\t\tcurrentindex = len - i - 1; //Count from the back of string\n\t\t\t}\n\n\t\tif (needle[i] == wildcard)\t\t//No skip entry can advance us past the last wildcard in the string\n\t\t\t{\n\t\t\tfor (j = 0; j <= UCHAR_MAX; j++)\n\t\t\t\ttable[j] = currentindex;\n\t\t\t}\n\n\t\ttable[(unsigned char)needle[i]] = currentindex;\n\t\tif (!casesensitive)\n\t\t\t{\n\n\t\t\t//RBF - this is a little kludgy but it works and this isn't the part\n\t\t\t//of the code we really need to worry about optimizing...\n\t\t\t//If we aren't case sensitive we just set both the upper and lower case\n\t\t\t//entries in the jump table.\n\t\t\ttable[tolower(needle[i])] = currentindex;\n\t\t\ttable[toupper(needle[i])] = currentindex;\n\t\t\t}\n\t\t}\n}\n\n#ifdef __DEBUG\nvoid dump_state(f_state *s)\n{\n\tprintf(\"Current state:\\n\");\n\tprintf(\"Config file: %s\\n\", s->config_file);\n\tprintf(\"Output directory: %s\\n\", s->output_directory);\n\t/* off_t width varies by platform, so cast to match the %llu conversion */\n\tprintf(\"Mode: %llu\\n\", (unsigned long long)s->mode);\n\n}\n#endif\n"
  }
]