)
- ebook ui: change the font size
- ebook ui: change font/background combinations (just 3 options, like in kindle app:
black on white, sepia, white on black)
- ebook ui: change brightness
- ebook ui: table of contents
- ebook ui: better (on-screen) ui for bookmarks
- html: hyphenation (http://www.tug.org/docs/liang/)
================================================
FILE: docs/settings/langs3.1.html
================================================
Languages supported by SumatraPDF 3.1
Languages supported by SumatraPDF
Languages supported by SumatraPDF. You can use the ISO code as the value
of the UiLanguage setting in the settings file.
Note: not all languages are fully translated. Help us translate SumatraPDF.
| Language name | ISO code |
| Afrikaans | af |
| Albanian (Shqip) | sq |
| Arabic (الْعَرَبيّة) | ar |
| Armenian (Հայերեն) | am |
| Azerbaijani (Azərbaycanca) | az |
| Basque (Euskara) | eu |
| Belarusian (Беларуская) | by |
| Bengali (বাংলা) | bn |
| Bosnian (Bosanski) | bs |
| Bulgarian (Български) | bg |
| Burmese (ဗမာ စာ) | mm |
| Catalan (Català) | ca |
| Catalan-Valencian (Català-Valencià) | ca-xv |
| Chinese Simplified (简体中文) | cn |
| Chinese Traditional (繁體中文) | tw |
| Cornish (Kernewek) | kw |
| Croatian (Hrvatski) | hr |
| Czech (Čeština) | cz |
| Danish (Dansk) | dk |
| Dutch (Nederlands) | nl |
| English | en |
| Estonian (Eesti) | et |
| Finnish (Suomi) | fi |
| French (Français) | fr |
| Frisian (Frysk) | fy-nl |
| Galician (Galego) | gl |
| Georgian (ქართული) | ka |
| German (Deutsch) | de |
| Greek (Ελληνικά) | el |
| Hebrew (עברית) | he |
| Hindi (हिंदी) | hi |
| Hungarian (Magyar) | hu |
| Indonesian (Bahasa Indonesia) | id |
| Irish (Gaeilge) | ga |
| Italian (Italiano) | it |
| Japanese (日本語) | ja |
| Javanese (ꦧꦱꦗꦮ) | jv |
| Korean (한국어) | kr |
| Kurdish (كوردی) | ku |
| Latvian (latviešu valoda) | lv |
| Lithuanian (Lietuvių) | lt |
| Macedonian (македонски) | mk |
| Malayalam (മലയാളം) | ml |
| Malaysian (Bahasa Melayu) | my |
| Nepali (नेपाली) | ne |
| Norwegian (Norsk) | no |
| Norwegian Neo-Norwegian (Norsk nynorsk) | nn |
| Persian (فارسی) | fa |
| Polish (Polski) | pl |
| Portuguese - Brazil (Português) | br |
| Portuguese - Portugal (Português) | pt |
| Punjabi (ਪੰਜਾਬੀ) | pa |
| Romanian (Română) | ro |
| Russian (Русский) | ru |
| Serbian (Cyrillic) | sr-rs |
| Serbian (Latin) | sp-rs |
| Shona (Shona) | sn |
| Sinhala (සිංහල) | si |
| Slovak (Slovenčina) | sk |
| Slovenian (Slovenščina) | sl |
| Spanish (Español) | es |
| Swedish (Svenska) | sv |
| Tagalog (Tagalog) | tl |
| Tamil (தமிழ்) | ta |
| Thai (ภาษาไทย) | th |
| Turkish (Türkçe) | tr |
| Ukrainian (Українська) | uk |
| Uzbek (O'zbek) | uz |
| Vietnamese (Việt Nam) | vn |
| Welsh (Cymraeg) | cy |
================================================
FILE: docs/settings/langs3.2.html
================================================
Languages supported by SumatraPDF 3.2
Languages supported by SumatraPDF. You can use the ISO code as the value
of the UiLanguage setting in the settings file.
Note: not all languages are fully translated. Help us translate SumatraPDF.
| Language name | ISO code |
| Afrikaans | af |
| Albanian (Shqip) | sq |
| Arabic (الْعَرَبيّة) | ar |
| Armenian (Հայերեն) | am |
| Azerbaijani (Azərbaycanca) | az |
| Basque (Euskara) | eu |
| Belarusian (Беларуская) | by |
| Bengali (বাংলা) | bn |
| Bosnian (Bosanski) | bs |
| Bulgarian (Български) | bg |
| Burmese (ဗမာ စာ) | mm |
| Catalan (Català) | ca |
| Catalan-Valencian (Català-Valencià) | ca-xv |
| Chinese Simplified (简体中文) | cn |
| Chinese Traditional (繁體中文) | tw |
| Cornish (Kernewek) | kw |
| Croatian (Hrvatski) | hr |
| Czech (Čeština) | cz |
| Danish (Dansk) | dk |
| Dutch (Nederlands) | nl |
| English | en |
| Estonian (Eesti) | et |
| Finnish (Suomi) | fi |
| French (Français) | fr |
| Frisian (Frysk) | fy-nl |
| Galician (Galego) | gl |
| Georgian (ქართული) | ka |
| German (Deutsch) | de |
| Greek (Ελληνικά) | el |
| Hebrew (עברית) | he |
| Hindi (हिंदी) | hi |
| Hungarian (Magyar) | hu |
| Indonesian (Bahasa Indonesia) | id |
| Irish (Gaeilge) | ga |
| Italian (Italiano) | it |
| Japanese (日本語) | ja |
| Javanese (ꦧꦱꦗꦮ) | jv |
| Korean (한국어) | kr |
| Kurdish (كوردی) | ku |
| Latvian (latviešu valoda) | lv |
| Lithuanian (Lietuvių) | lt |
| Macedonian (македонски) | mk |
| Malayalam (മലയാളം) | ml |
| Malaysian (Bahasa Melayu) | my |
| Nepali (नेपाली) | ne |
| Norwegian (Norsk) | no |
| Norwegian Neo-Norwegian (Norsk nynorsk) | nn |
| Persian (فارسی) | fa |
| Polish (Polski) | pl |
| Portuguese - Brazil (Português) | br |
| Portuguese - Portugal (Português) | pt |
| Punjabi (ਪੰਜਾਬੀ) | pa |
| Romanian (Română) | ro |
| Russian (Русский) | ru |
| Serbian (Cyrillic) | sr-rs |
| Serbian (Latin) | sp-rs |
| Shona (Shona) | sn |
| Sinhala (සිංහල) | si |
| Slovak (Slovenčina) | sk |
| Slovenian (Slovenščina) | sl |
| Spanish (Español) | es |
| Swedish (Svenska) | sv |
| Tagalog (Tagalog) | tl |
| Tamil (தமிழ்) | ta |
| Thai (ภาษาไทย) | th |
| Turkish (Türkçe) | tr |
| Ukrainian (Українська) | uk |
| Uzbek (O'zbek) | uz |
| Vietnamese (Việt Nam) | vn |
| Welsh (Cymraeg) | cy |
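As a rough illustration of how a code from the table above ends up in the settings file, here is a minimal Python sketch. The helper name `set_ui_language` is hypothetical and not part of SumatraPDF; it assumes the settings file contains at most one top-level line of the form `UiLanguage = <code>`.

```python
import re

# Hypothetical helper, not part of SumatraPDF.
# Assumption: the setting appears as a single line "UiLanguage = <code>".
_UI_LANGUAGE_LINE = re.compile(r"^UiLanguage[ \t]*=.*$", flags=re.MULTILINE)


def set_ui_language(settings_text: str, iso_code: str) -> str:
    """Return settings_text with the UiLanguage line set to iso_code."""
    new_line = f"UiLanguage = {iso_code}"
    if _UI_LANGUAGE_LINE.search(settings_text):
        # Replace only the first occurrence and leave the rest untouched.
        return _UI_LANGUAGE_LINE.sub(lambda _m: new_line, settings_text, count=1)
    # No UiLanguage line yet: append one at the end of the text.
    sep = "" if settings_text.endswith("\n") or not settings_text else "\n"
    return settings_text + sep + new_line + "\n"
```

For example, passing the text of SumatraPDF-settings.txt and `"de"` would yield the same text with `UiLanguage = de`. Edit the file only while SumatraPDF is closed, since the application may rewrite it on exit.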
================================================
FILE: docs/settings/settings3.1.html
================================================
Customizing SumatraPDF 3.1
You can change the look and behavior of
SumatraPDF
by editing the file SumatraPDF-settings.txt. The file is stored in the
%APPDATA%\SumatraPDF directory for the installed version or in the
same directory as the SumatraPDF.exe executable for the portable version.
Use the menu item Settings -> Advanced Settings... to open the settings file
with your default text editor.
The file is in a simple text format. Below is an explanation of
what the different settings mean and what their default values are.
Highlighted settings can't be changed from the UI. Modifying other settings
directly in this file is not recommended.
If you add or remove lines with square brackets, make sure to always add/remove
square brackets in pairs! Else you risk losing all the data following them.
background color of the non-document windows, traditionally yellow
MainWindowBackground = #fff200
if true, Esc key closes SumatraPDF
EscToExit = false
if true, we'll always open files using existing SumatraPDF process
ReuseInstance = false
if true, we use Windows system colors for background/text color. Overrides other settings
UseSysColors = false
if true and SessionData isn't empty, that session will be restored at startup
RestoreSession = true
customization options for PDF, XPS, DjVu and PostScript UI
FixedPageUI [
color value with which black (text) will be substituted
TextColor = #000000
color value with which white (background) will be substituted
BackgroundColor = #ffffff
color value for the text selection rectangle (also used to highlight found text) (introduced in
version 2.4)
SelectionColor = #f5fc0c
top, right, bottom and left margin (in that order) between window and document
WindowMargin = 2 4 2 4
horizontal and vertical distance between two pages in facing and book view modes
PageSpacing = 4 4
colors to use for the gradient from top to bottom (stops will be inserted at regular intervals
throughout the document); currently only up to three colors are supported; the idea behind this
experimental feature is that the background might let you subconsciously gauge reading
progress; suggested values: #2828aa #28aa28 #aa2828
GradientColors =
]
customization options for eBooks (EPUB, Mobi, FictionBook) UI. If UseFixedPageUI is true,
FixedPageUI settings apply instead
EbookUI [
name of the font. takes effect after re-opening the document
FontName = Georgia
size of the font. takes effect after re-opening the document
FontSize = 12.5
color for text
TextColor = #5f4b32
color of the background (page)
BackgroundColor = #fbf0d9
if true, the UI used for PDF documents will be used for ebooks as well (enables printing and
searching, disables automatic reflow)
UseFixedPageUI = false
]
customization options for Comic Book and images UI
ComicBookUI [
top, right, bottom and left margin (in that order) between window and document
WindowMargin = 0 0 0 0
horizontal and vertical distance between two pages in facing and book view modes
PageSpacing = 4 4
if true, default to displaying Comic Book files in manga mode (from right to left if showing 2
pages at a time)
CbxMangaMode = false
]
customization options for CHM UI. If UseFixedPageUI is true, FixedPageUI settings apply instead
ChmUI [
if true, the UI used for PDF documents will be used for CHM documents as well
UseFixedPageUI = false
]
list of additional external viewers for various file types (can have multiple entries for the same
format)
ExternalViewers [
[
command line with which to call the external viewer, may contain %p for page number and "%1" for
the file name (add quotation marks around paths containing spaces)
CommandLine =
name of the external viewer to be shown in the menu (implied by CommandLine if missing)
Name =
optional filter for which file types the menu item is to be shown; separate multiple entries
using ';' and don't include any spaces (e.g. *.pdf;*.xps for all PDF and XPS documents)
Filter =
]
]
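The placeholder and filter rules described for ExternalViewers can be sketched in Python as follows. Both function names are hypothetical, and two behaviors are assumptions rather than documented facts: that placeholders are substituted literally, and that an empty Filter means the menu item is shown for every file type.

```python
from fnmatch import fnmatchcase


def expand_viewer_command(command_line: str, file_path: str, page_no: int) -> str:
    """Substitute %p (page number) and %1 (file name) in a CommandLine value.

    Assumption: plain text substitution; quoting around "%1" is whatever
    the user wrote in the settings file.
    """
    # Replace %p first so a file path containing "%p" is left untouched.
    return command_line.replace("%p", str(page_no)).replace("%1", file_path)


def matches_filter(filter_value: str, file_name: str) -> bool:
    """Check a file name against a Filter value such as '*.pdf;*.xps'."""
    patterns = [p for p in filter_value.split(";") if p]
    if not patterns:
        # Assumption: a missing or empty filter applies to all file types.
        return True
    name = file_name.lower()
    return any(fnmatchcase(name, pattern.lower()) for pattern in patterns)
```

With `CommandLine = "C:\Tools\viewer.exe" -page %p "%1"`, expanding for page 7 of `C:\My Docs\a.pdf` keeps the quotation marks around the path, which is why the description above asks for them when paths contain spaces.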
ShowMenubar = true
if true, a document will be reloaded automatically whenever it's changed (currently doesn't work for
documents shown in the ebook UI) (introduced in version 2.5)
ReloadModifiedDocuments = true
if true, we show the full path to a file in the title bar (introduced in version 3.0)
FullPathInTitle = false
sequence of zoom levels when zooming in/out; all values must lie between 8.33 and 6400
ZoomLevels = 8.33 12.5 18 25 33.33 50 66.67 75 100 125 150 200 300 400 600 800 1000 1200 1600 2000 2400 3200 4800 6400
zoom step size in percents relative to the current zoom level. if zero or negative, the values from
ZoomLevels are used instead
ZoomIncrement = 0
these override the default settings in the Print dialog
PrinterDefaults [
default value for scaling (shrink, fit, none)
PrintScale = shrink
]
customization options for how we show forward search results (used from LaTeX editors)
ForwardSearch [
when set to a positive value, the forward search highlight style will be changed to a rectangle
at the left of the page (with the indicated amount of margin from the page margin)
HighlightOffset = 0
width of the highlight rectangle (if HighlightOffset is > 0)
HighlightWidth = 15
color used for the forward search highlight
HighlightColor = #6581ff
if true, highlight remains visible until the next mouse click (instead of fading away
immediately)
HighlightPermanent = false
]
a whitespace separated list of passwords to try when opening a password protected document
(passwords containing spaces must be quoted) (introduced in version 2.4)
DefaultPasswords =
actual resolution of the main screen in DPI (if this value isn't positive, the system's UI setting
is used) (introduced in version 2.5)
CustomScreenDPI = 0
if true, we store display settings for each document separately (i.e. everything after
UseDefaultState in FileStates)
RememberStatePerDocument = true
ISO code of the current UI language
UiLanguage =
if true, we show the toolbar at the top of the window
ShowToolbar = true
if true, we show the Favorites sidebar
ShowFavorites = false
a list of extensions that SumatraPDF has associated itself with and will reassociate if a different
application takes over (e.g. ".pdf .xps .epub")
AssociatedExtensions =
whether file associations should be fixed silently or only after user feedback
AssociateSilently = false
if true, we check once a day if an update is available
CheckForUpdates = true
we won't ask again to update to this version
VersionToSkip =
if true, we remember which files we opened and their display settings
RememberOpenedFiles = true
pattern used to launch the LaTeX editor when doing inverse search
InverseSearchCmdLine =
if true, we expose the SyncTeX inverse search command line in Settings -> Options
EnableTeXEnhancements = false
default layout of pages. valid values: automatic, single page, facing, book view, continuous,
continuous facing, continuous book view
DefaultDisplayMode = automatic
default zoom (in %) or one of these values: fit page, fit width, fit content
DefaultZoom = fit page
default state of the window. 1 is normal, 2 is maximized, 3 is fullscreen, 4 is minimized
WindowState = 1
default position (x, y) and size (width, height) of the window
WindowPos = 0 0 0 0
if true, we show table of contents (Bookmarks) sidebar if it's present in the document
ShowToc = true
SidebarDx = 0
if both favorites and bookmarks parts of sidebar are visible, this is the height of bookmarks (table
of contents) part
TocDy = 0
if true, we show a list of frequently read documents when no document is loaded
ShowStartPage = true
if true, documents are opened in tabs instead of new windows (introduced in version 3.0)
UseTabs = true
information about opened files (in most recently used order)
FileStates [
[
path of the document
FilePath =
Values which are persisted for bookmarks/favorites
Favorites [
[
name of this favorite as shown in the menu
Name =
number of the bookmarked page
PageNo = 0
label for this page (only present if logical and physical page numbers are not the same)
PageLabel =
]
]
a document can be "pinned" to the Frequently Read list so that it isn't displaced by recently
opened documents
IsPinned = false
if true, the file is considered missing and won't be shown in any list
IsMissing = false
number of times this document has been opened recently
OpenCount = 0
data required to open a password protected document without having to ask for the password again
DecryptionKey =
if true, we use global defaults when opening this file (instead of the values below)
UseDefaultState = false
layout of pages. valid values: automatic, single page, facing, book view, continuous, continuous
facing, continuous book view
DisplayMode = automatic
how far this document has been scrolled (in x and y direction)
ScrollPos = 0 0
number of the last read page
PageNo = 1
zoom (in %) or one of these values: fit page, fit width, fit content
Zoom = fit page
how far pages have been rotated as a multiple of 90 degrees
Rotation = 0
state of the window. 1 is normal, 2 is maximized, 3 is fullscreen, 4 is minimized
WindowState = 0
default position (can be on any monitor)
WindowPos = 0 0 0 0
if true, we show table of contents (Bookmarks) sidebar if it's present in the document
ShowToc = true
SidebarDx = 0
if true, the document is displayed right-to-left in facing and book view modes (only used for
comic book documents)
DisplayR2L = false
data required to restore the last read page in the ebook UI
ReparseIdx = 0
data required to determine which parts of the table of contents have been expanded
TocState =
]
]
state of the last session, usage depends on RestoreSession (introduced in version 3.1)
SessionData [
[
data required for restoring the view state of a single tab
TabStates [
[
path of the document
FilePath =
same as FileStates -> DisplayMode
DisplayMode = automatic
number of the last read page
PageNo = 1
same as FileStates -> Zoom
Zoom = fit page
same as FileStates -> Rotation
Rotation = 0
how far this document has been scrolled (in x and y direction)
ScrollPos = 0 0
if true, the table of contents was shown when the document was closed
ShowToc = true
same as FileStates -> TocState
TocState =
]
]
index of the currently selected tab (1-based)
TabIndex = 1
same as FileStates -> WindowState
WindowState = 0
default position (can be on any monitor)
WindowPos = 0 0 0 0
SidebarDx = 0
]
]
data required for reloading documents after an auto-update (introduced in version 3.0)
ReopenOnce =
data required to determine when SumatraPDF last checked for updates
TimeOfLastUpdateCheck = 0 0
value required to determine recency for the OpenCount value in FileStates
OpenCountWeek = 0
Syntax for color values
The syntax for colors is: #rrggbb.
The components are hex values (ranging from 00 to FF) and stand for:
rr : red component
gg : green component
bb : blue component
For example #ff0000 means red color. You can use
Color Picker or
Sphere or
ColorScheme Designer to pick a color.
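To make the #rrggbb syntax concrete, here is a small Python sketch that splits a color value into its three components. The function name `parse_color` is hypothetical and this is not how SumatraPDF itself parses colors; it only illustrates the format described above.

```python
import re

# Exactly one '#' followed by six hex digits (upper or lower case).
_COLOR_RE = re.compile(r"#[0-9a-fA-F]{6}")


def parse_color(value: str) -> tuple[int, int, int]:
    """Convert '#rrggbb' into a (red, green, blue) tuple of 0..255 integers."""
    value = value.strip()
    if not _COLOR_RE.fullmatch(value):
        raise ValueError(f"not a #rrggbb color: {value!r}")
    # Characters 1-2 are red, 3-4 are green, 5-6 are blue.
    red, green, blue = (int(value[i:i + 2], 16) for i in (1, 3, 5))
    return red, green, blue
```

For instance, the default `MainWindowBackground = #fff200` decomposes into red 255, green 242 and blue 0, which is the traditional yellow.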
================================================
FILE: docs/settings/settings3.2.html
================================================
Customizing SumatraPDF 3.2
You can change the look and behavior of
SumatraPDF
by editing the file SumatraPDF-settings.txt. The file is stored in the
%APPDATA%\SumatraPDF directory for the installed version or in the
same directory as the SumatraPDF.exe executable for the portable version.
Use the menu item Settings -> Advanced Settings... to open the settings file
with your default text editor.
The file is in a simple text format. Below is an explanation of
what the different settings mean and what their default values are.
Highlighted settings can't be changed from the UI. Modifying other settings
directly in this file is not recommended.
If you add or remove lines with square brackets, make sure to always add/remove
square brackets in pairs! Else you risk losing all the data following them.
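Before saving a hand-edited settings file, a quick pairing check can catch a missing bracket. The Python sketch below is a hypothetical helper, not a SumatraPDF tool. It assumes a section opens on a line that is either a lone `[` or a name followed by `[`, and closes on a line consisting only of `]`, which matches the layout shown on this page.

```python
import re

# Assumption: opening lines look like "Name [" or just "[".
_OPEN_RE = re.compile(r"(\w+\s*)?\[")


def check_brackets(settings_text: str) -> list[str]:
    """Return a list of human-readable problems; empty if brackets pair up."""
    problems = []
    open_lines = []  # line numbers of '[' lines that are not closed yet
    for line_no, raw in enumerate(settings_text.splitlines(), start=1):
        line = raw.strip()
        if _OPEN_RE.fullmatch(line):
            open_lines.append(line_no)
        elif line == "]":
            if open_lines:
                open_lines.pop()  # closes the most recently opened section
            else:
                problems.append(f"line {line_no}: ']' without a matching '['")
    problems.extend(f"line {n}: '[' is never closed" for n in open_lines)
    return problems
```

An empty result only means the brackets are balanced; it does not validate setting names or values.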
background color of the non-document windows, traditionally yellow
MainWindowBackground = #fff200
if true, Esc key closes SumatraPDF
EscToExit = false
if true, we'll always open files using existing SumatraPDF process
ReuseInstance = false
if true, we use Windows system colors for background/text color. Overrides other settings
UseSysColors = false
if true and SessionData isn't empty, that session will be restored at startup
RestoreSession = true
customization options for PDF, XPS, DjVu and PostScript UI
FixedPageUI [
color value with which black (text) will be substituted
TextColor = #000000
color value with which white (background) will be substituted
BackgroundColor = #ffffff
color value for the text selection rectangle (also used to highlight found text) (introduced in
version 2.4)
SelectionColor = #f5fc0c
top, right, bottom and left margin (in that order) between window and document
WindowMargin = 2 4 2 4
horizontal and vertical distance between two pages in facing and book view modes
PageSpacing = 4 4
colors to use for the gradient from top to bottom (stops will be inserted at regular intervals
throughout the document); currently only up to three colors are supported; the idea behind this
experimental feature is that the background might let you subconsciously gauge reading
progress; suggested values: #2828aa #28aa28 #aa2828
GradientColors =
]
customization options for eBooks (EPUB, Mobi, FictionBook) UI. If UseFixedPageUI is true,
FixedPageUI settings apply instead
EbookUI [
name of the font. takes effect after re-opening the document
FontName = Georgia
size of the font. takes effect after re-opening the document
FontSize = 12.5
color for text
TextColor = #5f4b32
color of the background (page)
BackgroundColor = #fbf0d9
if true, the UI used for PDF documents will be used for ebooks as well (enables printing and
searching, disables automatic reflow)
UseFixedPageUI = false
]
customization options for Comic Book and images UI
ComicBookUI [
top, right, bottom and left margin (in that order) between window and document
WindowMargin = 0 0 0 0
horizontal and vertical distance between two pages in facing and book view modes
PageSpacing = 4 4
if true, default to displaying Comic Book files in manga mode (from right to left if showing 2
pages at a time)
CbxMangaMode = false
]
customization options for CHM UI. If UseFixedPageUI is true, FixedPageUI settings apply instead
ChmUI [
if true, the UI used for PDF documents will be used for CHM documents as well
UseFixedPageUI = false
]
list of additional external viewers for various file types (can have multiple entries for the same
format)
ExternalViewers [
[
command line with which to call the external viewer, may contain %p for page number and "%1" for
the file name (add quotation marks around paths containing spaces)
CommandLine =
name of the external viewer to be shown in the menu (implied by CommandLine if missing)
Name =
optional filter for which file types the menu item is to be shown; separate multiple entries
using ';' and don't include any spaces (e.g. *.pdf;*.xps for all PDF and XPS documents)
Filter =
]
]
ShowMenubar = true
if true, a document will be reloaded automatically whenever it's changed (currently doesn't work for
documents shown in the ebook UI) (introduced in version 2.5)
ReloadModifiedDocuments = true
if true, we show the full path to a file in the title bar (introduced in version 3.0)
FullPathInTitle = false
sequence of zoom levels when zooming in/out; all values must lie between 8.33 and 6400
ZoomLevels = 8.33 12.5 18 25 33.33 50 66.67 75 100 125 150 200 300 400 600 800 1000 1200 1600 2000 2400 3200 4800 6400
zoom step size in percents relative to the current zoom level. if zero or negative, the values from
ZoomLevels are used instead
ZoomIncrement = 0
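One plausible reading of how ZoomLevels and ZoomIncrement interact is sketched below in Python. This is an interpretation of the two descriptions above, not SumatraPDF's actual code: the function name `next_zoom` is hypothetical, and the multiplicative step for a positive increment and the clamping to the 8.33 to 6400 range are assumptions.

```python
import bisect

ZOOM_MIN, ZOOM_MAX = 8.33, 6400.0  # limits stated for ZoomLevels


def next_zoom(current, zoom_in, zoom_levels, zoom_increment=0.0):
    """Return the zoom (in %) after one zoom-in or zoom-out step."""
    if zoom_increment > 0:
        # Assumption: the step is relative to the current zoom level.
        factor = 1.0 + zoom_increment / 100.0
        new_zoom = current * factor if zoom_in else current / factor
    else:
        levels = sorted(zoom_levels)
        if not levels:
            new_zoom = current
        elif zoom_in:
            # First level strictly above the current zoom, else the largest.
            idx = bisect.bisect_right(levels, current)
            new_zoom = levels[idx] if idx < len(levels) else levels[-1]
        else:
            # Last level strictly below the current zoom, else the smallest.
            idx = bisect.bisect_left(levels, current)
            new_zoom = levels[idx - 1] if idx > 0 else levels[0]
    return min(max(new_zoom, ZOOM_MIN), ZOOM_MAX)
```

Under these assumptions, with the default ZoomLevels and `ZoomIncrement = 0`, zooming in from 100 moves to 125 and zooming out moves to 75. With `ZoomIncrement = 10`, zooming in from 100 gives roughly 110 regardless of the list.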
these override the default settings in the Print dialog
PrinterDefaults [
default value for scaling (shrink, fit, none)
PrintScale = shrink
]
customization options for how we show forward search results (used from LaTeX editors)
ForwardSearch [
when set to a positive value, the forward search highlight style will be changed to a rectangle
at the left of the page (with the indicated amount of margin from the page margin)
HighlightOffset = 0
width of the highlight rectangle (if HighlightOffset is > 0)
HighlightWidth = 15
color used for the forward search highlight
HighlightColor = #6581ff
if true, highlight remains visible until the next mouse click (instead of fading away
immediately)
HighlightPermanent = false
]
a whitespace separated list of passwords to try when opening a password protected document
(passwords containing spaces must be quoted) (introduced in version 2.4)
DefaultPasswords =
actual resolution of the main screen in DPI (if this value isn't positive, the system's UI setting
is used) (introduced in version 2.5)
CustomScreenDPI = 0
if true, we store display settings for each document separately (i.e. everything after
UseDefaultState in FileStates)
RememberStatePerDocument = true
ISO code of the current UI language
UiLanguage =
if true, we show the toolbar at the top of the window
ShowToolbar = true
if true, we show the Favorites sidebar
ShowFavorites = false
a list of extensions that SumatraPDF has associated itself with and will reassociate if a different
application takes over (e.g. ".pdf .xps .epub")
AssociatedExtensions =
whether file associations should be fixed silently or only after user feedback
AssociateSilently = false
if true, we check once a day if an update is available
CheckForUpdates = true
we won't ask again to update to this version
VersionToSkip =
if true, we remember which files we opened and their display settings
RememberOpenedFiles = true
pattern used to launch the LaTeX editor when doing inverse search
InverseSearchCmdLine =
if true, we expose the SyncTeX inverse search command line in Settings -> Options
EnableTeXEnhancements = false
default layout of pages. valid values: automatic, single page, facing, book view, continuous,
continuous facing, continuous book view
DefaultDisplayMode = automatic
default zoom (in %) or one of these values: fit page, fit width, fit content
DefaultZoom = fit page
default state of the window. 1 is normal, 2 is maximized, 3 is fullscreen, 4 is minimized
WindowState = 1
default position (x, y) and size (width, height) of the window
WindowPos = 0 0 0 0
if true, we show table of contents (Bookmarks) sidebar if it's present in the document
ShowToc = true
SidebarDx = 0
if both favorites and bookmarks parts of sidebar are visible, this is the height of bookmarks (table
of contents) part
TocDy = 0
if true, we show a list of frequently read documents when no document is loaded
ShowStartPage = true
if true, documents are opened in tabs instead of new windows (introduced in version 3.0)
UseTabs = true
information about opened files (in most recently used order)
FileStates [
[
path of the document
FilePath =
Values which are persisted for bookmarks/favorites
Favorites [
[
name of this favorite as shown in the menu
Name =
number of the bookmarked page
PageNo = 0
label for this page (only present if logical and physical page numbers are not the same)
PageLabel =
]
]
a document can be "pinned" to the Frequently Read list so that it isn't displaced by recently
opened documents
IsPinned = false
if true, the file is considered missing and won't be shown in any list
IsMissing = false
number of times this document has been opened recently
OpenCount = 0
data required to open a password protected document without having to ask for the password again
DecryptionKey =
if true, we use global defaults when opening this file (instead of the values below)
UseDefaultState = false
layout of pages. valid values: automatic, single page, facing, book view, continuous, continuous
facing, continuous book view
DisplayMode = automatic
how far this document has been scrolled (in x and y direction)
ScrollPos = 0 0
number of the last read page
PageNo = 1
zoom (in %) or one of these values: fit page, fit width, fit content
Zoom = fit page
how far pages have been rotated as a multiple of 90 degrees
Rotation = 0
state of the window. 1 is normal, 2 is maximized, 3 is fullscreen, 4 is minimized
WindowState = 0
default position (can be on any monitor)
WindowPos = 0 0 0 0
if true, we show table of contents (Bookmarks) sidebar if it's present in the document
ShowToc = true
SidebarDx = 0
if true, the document is displayed right-to-left in facing and book view modes (only used for
comic book documents)
DisplayR2L = false
data required to restore the last read page in the ebook UI
ReparseIdx = 0
data required to determine which parts of the table of contents have been expanded
TocState =
]
]
state of the last session, usage depends on RestoreSession (introduced in version 3.1)
SessionData [
[
data required for restoring the view state of a single tab
TabStates [
[
path of the document
FilePath =
same as FileStates -> DisplayMode
DisplayMode = automatic
number of the last read page
PageNo = 1
same as FileStates -> Zoom
Zoom = fit page
same as FileStates -> Rotation
Rotation = 0
how far this document has been scrolled (in x and y direction)
ScrollPos = 0 0
if true, the table of contents was shown when the document was closed
ShowToc = true
same as FileStates -> TocState
TocState =
]
]
index of the currently selected tab (1-based)
TabIndex = 1
same as FileStates -> WindowState
WindowState = 0
default position (can be on any monitor)
WindowPos = 0 0 0 0
SidebarDx = 0
]
]
data required for reloading documents after an auto-update (introduced in version 3.0)
ReopenOnce =
data required to determine when SumatraPDF last checked for updates
TimeOfLastUpdateCheck = 0 0
value required to determine recency for the OpenCount value in FileStates
OpenCountWeek = 0
Syntax for color values
The syntax for colors is: #rrggbb.
The components are hex values (ranging from 00 to FF) and stand for:
rr : red component
gg : green component
bb : blue component
For example #ff0000 means red color. You can use
Color Picker or
Sphere or
ColorScheme Designer to pick a color.
================================================
FILE: docs/sumatrapdfrestrict.ini
================================================
; This is an example configuration file which can be used
; to disable some of SumatraPDF's functionality.
; To apply this configuration, copy this file into
; the same directory as SumatraPDF.exe.
; All settings listed below can have a value of either
; 0 for disabling the feature or 1 for enabling the feature
; (missing settings default to 0).
[Policies]
; Whether SumatraPDF should be allowed to access the Internet.
; Needed for:
; * Checking for updates
; * Sending crash reports
InternetAccess = 1
; Whether SumatraPDF should allow access to the file system.
; Needed for:
; * Opening files through dialog
; * Saving file or bookmark shortcut
; * Opening a web browser after a click on a hyperlink
; * Launching external PDF viewers, LaTeX source editors or media players
; * Displaying Frequently Read page (also requires SavePreferences)
; * Reopening recently opened files
; * Print dialog
DiskAccess = 1
; Whether SumatraPDF should save user preferences on exit.
; Needed for:
; * Changing settings
; * Favorites menu
; * Remembering recently opened files (includes Frequently Read page)
SavePreferences = 1
; Whether SumatraPDF should be allowed to write to the Registry.
; Needed for:
; * Making SumatraPDF a default PDF viewer
RegistryAccess = 1
; Whether SumatraPDF should be allowed to print.
; Needed for:
; * Printing (parts of) a document
PrinterAccess = 1
; Whether users should be allowed to select and copy content.
; Needed for:
; * Selecting with the mouse
; * Select all
; * Copying the selection
CopySelection = 1
; Whether SumatraPDF should be allowed to cover the entire screen.
; Needed for:
; * Fullscreen mode and Presentation mode
FullscreenAccess = 1
; What protocols for links inside documents should be passed
; on to the operating system (e.g. for opening a browser).
; Default: http,https,mailto (web links and email addresses)
LinkProtocols = http,https,mailto
; What file types should be opened in an external application
; if they're linked to by a (PDF) document and can't be opened
; within SumatraPDF itself (use "*" for all types)
; These file types are stored as "PerceivedType" in the Registry,
; common values: audio, video, image, document, text, system
; Default: audio,video,webpage
SafeFileTypes = audio,video,webpage
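For tooling that needs to inspect such a restriction file (for example a deployment script), the rules stated in the comments above can be sketched with Python's standard `configparser`. The helper names are hypothetical and this is not SumatraPDF's own parser; it only mirrors the documented conventions that 1 enables a feature, that missing settings default to 0, and that list values are comma-separated.

```python
import configparser


def read_policies(ini_text: str) -> dict[str, str]:
    """Return the raw key/value pairs of the [Policies] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of setting names
    parser.read_string(ini_text)
    if not parser.has_section("Policies"):
        return {}
    return dict(parser["Policies"])


def is_allowed(policies: dict[str, str], name: str) -> bool:
    """True only if the setting is present and set to 1 (missing means 0)."""
    return policies.get(name, "0").strip() == "1"


def list_value(policies: dict[str, str], name: str, default: list[str]) -> list[str]:
    """Split a comma-separated value such as LinkProtocols into a list."""
    raw = policies.get(name)
    if raw is None:
        return default
    return [item.strip() for item in raw.split(",") if item.strip()]
```

Lines starting with `;` are treated as comments by `configparser` by default, so the example file above can be read as is.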
================================================
FILE: docs/wishlist-lua.txt
================================================
## Idea
Write as much of Sumatra as possible in Lua
## Why
1. Writing Sumatra becomes harder and harder due to long compilation times
2. Writing in Lua would be faster thanks to garbage collection, closures and
other higher-level functionality
3. As a result of 1, testing different approaches (e.g. to layout system) is
too expensive so we don't do it. With lua we could hopefully test various
things by writing small scripts
4. We're writing half-assed functionality that Lua would give us for free.
Layout definitions that we parse from text (EbookWinDesc.txt) could be lua
script.
Settings file could be (de)serialized lua data.
MuiCss.h is a semi-dynamic system that could naturally be done via lua
objects.
5. With lua it would be possible to add new code to running app, without the
long re-compilation needed with C.
## How
The ultimate goal would be to have lua be the driver and call C functions
(as opposed to C code calling Lua scripts).
The road from here to that will be long. Here's a possible approach:
- validate writing UI code with luajit is sensible by writing sample apps.
In those apps recreate some of the Sumatra functionality.
- write luajit FFI bindings to libraries we use
- integrate luajit and replace some small functionality with lua code
(e.g. parsing layout definitions)
- everything else
Such a project would require lots of effort and would take a long time but I'm
very interested in exploring this approach.
## References
LuaJit
. http://luajit.org/
. Because it's faster than Lua and has great FFI which allows calling Windows
APIs
MoonScript
. http://moonscript.org
. Because it has much better syntax than Lua (inspired by CoffeeScript) and
it's fully interoperable with Lua (it compiles to Lua)
Various win32 API bindings
https://code.google.com/p/lua-files/source/browse/#hg%2Fwinapi
https://github.com/Wiladams/LJIT2Win32
https://github.com/Wiladams/TINN
================================================
FILE: docs/wishlist-tabs.txt
================================================
Summary of tab-related work.
TBD means "to be designed" i.e. a decision about the behavior must be made
Must have:
. we already are not very good at showing error messages. Need to make sure
that in the tab world, a failure to load document is shown to the user in some
way
. will probably release tabs as 3.0, as it's a big feature
Nice to have:
. tear-off tabs i.e. move out so that tab can become a window or be moved
between windows
. "start page" behaves like in chrome i.e. it stays as a tab when we open new
document (i.e. a document doesn't replace a start page). It can be closed
manually (and re-appears when last tab in the window is closed). We can
provide an option to change the behavior (i.e. don't keep "start page" as
a tab as there's argument that if we only have one document in the window,
we don't have to show tab strip, which saves screen space)
. save and restore session (opened files when tabs enabled and closing a window
with multiple tabs). Like chrome.
Done:
. TBD: go single-process? Currently we might end up with multiple Sumatra processes.
Currently it doesn't matter much but if we implement moving tabs between
windows, it'll be much easier to do it if all windows belong to the same
process. Otherwise we would have to marshall quite a bit of state across
processes.
. tabs must be above all content (i.e. toolbar will be inside the tab) to match
chrome/firefox and also because when we have tabs in ebook window, tab bar
would jump around the window since there is no toolbar in ebook window
. TBD: when a document is loaded as tab, the size of the window over-rides
the size previously remembered. But should we remember that size or try
to preserve the size from the time it was the only document in the window?
Probably remember as otherwise it'll be complicated and the rule would be
rather magical (which is not a good thing).
. tab support in ebook window
================================================
FILE: docs/wishlist.txt
================================================
A list of random ideas, big and small. In no particular order. Those that are
marked "risky" are likely to cause disturbance to existing code, so should be
done at the beginning of the dev cycle. I also estimate the complexity (time to
implement) as low/med/high
-- fullscreen (low)
See if http://stackoverflow.com/questions/2382464/win32-full-screen-and-hiding-taskbar
has better way of switching to fullscreen
-- text viewer (high)
Important thing: make it work even with gigantic files by limiting how much of
the file we load in memory (1-10 MB?). We would only build an index for each
line, consisting of:
* position in the file
* length in bytes
* encoding (to support various encodings; but we would start by only supporting
ascii/utf8)
* measured size of this line
We would build that index for the whole file in the background thread, then
only load the needed part as the user scrolls through the document.
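A minimal sketch of what such a per-line index could look like. All names here (LineInfo, LineEncoding, IndexChunk) are hypothetical, not existing Sumatra code, and the sketch assumes lines end in \n or \r\n. The file is fed in chunks so that only a small part of it has to be in memory at a time; measured size is left unset until the line is actually measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only.
enum class LineEncoding : uint8_t { Ascii, Utf8 };

struct LineInfo {
    uint64_t offset;       // position of the line's first byte in the file
    uint32_t lengthBytes;  // length in bytes, without the line terminator
    LineEncoding encoding; // only ascii/utf8 to start with
    float measuredDx;      // measured width; negative means "not measured yet"
};

// Appends one entry per complete line found in chunk. chunkOffset is the file
// position of chunk[0]. A trailing line without a terminator is not emitted;
// the returned file position is where the caller should resume reading.
uint64_t IndexChunk(const std::string& chunk, uint64_t chunkOffset,
                    std::vector<LineInfo>& index) {
    size_t lineStart = 0;
    for (size_t i = 0; i < chunk.size(); i++) {
        if (chunk[i] != '\n') {
            continue;
        }
        size_t end = i;
        if (end > lineStart && chunk[end - 1] == '\r') {
            end--; // treat \r\n as a single terminator
        }
        assert(end - lineStart <= UINT32_MAX);
        // any byte >= 0x80 means the line is not plain ascii; assume utf8
        bool ascii = true;
        for (size_t j = lineStart; j < end; j++) {
            if (static_cast<unsigned char>(chunk[j]) >= 0x80) {
                ascii = false;
                break;
            }
        }
        index.push_back(LineInfo{chunkOffset + lineStart,
                                 static_cast<uint32_t>(end - lineStart),
                                 ascii ? LineEncoding::Ascii : LineEncoding::Utf8,
                                 -1.0f});
        lineStart = i + 1;
    }
    return chunkOffset + lineStart;
}
```

The background thread would call IndexChunk repeatedly, re-reading from the returned position each time, while the UI thread only loads the byte ranges of the lines currently visible.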
-- hex viewer (high)
-- search for ebook UI (med)
-- thumbnails (med)
Many viewers have an option to navigate document via thumbnails.
For perf, we could cache thumbnails as a single, webp-encoded image + info
about where a thumbnail for a given page is within the image (similar to
sprite technique in web dev)
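One way to produce the "where is the thumbnail for page N" info is a simple row-by-row packer. This is only a sketch: ThumbSize, ThumbRect and PackThumbnails are made-up names, and the packing strategy (fill a row left to right, start a new row when the next thumbnail doesn't fit) is an assumption, not a decision.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical types for illustration only.
struct ThumbSize { int dx, dy; };
struct ThumbRect { int x, y, dx, dy; };

// Returns, for each page (in order), the rectangle its thumbnail occupies in
// the combined image. Rows are filled left to right and are at most maxDx
// wide, except that a thumbnail wider than maxDx still gets a row of its own.
std::vector<ThumbRect> PackThumbnails(const std::vector<ThumbSize>& sizes, int maxDx) {
    assert(maxDx > 0);
    std::vector<ThumbRect> rects;
    rects.reserve(sizes.size());
    int x = 0, y = 0, rowDy = 0;
    for (const ThumbSize& s : sizes) {
        if (x > 0 && x + s.dx > maxDx) {
            // doesn't fit in the current row: start a new one below it
            x = 0;
            y += rowDy;
            rowDy = 0;
        }
        rects.push_back(ThumbRect{x, y, s.dx, s.dy});
        x += s.dx;
        rowDy = std::max(rowDy, s.dy);
    }
    return rects;
}
```

The resulting vector is the info we would store next to the encoded image; rects[pageNo - 1] tells us which part of the image to blit.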
-- wclang build (med)
To get static analysis from clang, use wclang (https://github.com/tpoechtrager/wclang).
Would require writing a makefile and probably code changes to accommodate
mingw winapi headers.
Then would need another buildbot script (wclang only works on Linux).
-- toolbar improvements (med, risky)
Ditch using toolbar control for the toolbar and use more mui-like approach.
That would allow us to have overlaid toolbar (shown semi-transparent over the
content). Also we could add an option to make it vertical. Also an option to
have it user-configurable (via advanced settings, allow specifying the order
of controls in the toolbar).
This would also help in unifying full-screen modes (overlaid, auto-hidden
toolbar is a better match for full-screen mode than the current one).
-- change fit width mode
Make all pages have uniform size. Currently they all have uniform zoom ratio
and one big page can make other pages really small.
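The difference in arithmetic, as a rough sketch. UniformZoom models the current behaviour as I read it from the description above (one zoom ratio, picked so that the widest page fits); PerPageZoom is the proposed one. Both function names are hypothetical and margins/padding are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Current behaviour (as described): a single zoom for all pages, chosen so
// that the widest page fits the view width.
double UniformZoom(const std::vector<double>& pageDx, double viewDx) {
    assert(!pageDx.empty());
    double maxDx = *std::max_element(pageDx.begin(), pageDx.end());
    return viewDx / maxDx;
}

// Proposed behaviour: every page gets its own zoom so that all pages end up
// with the same on-screen width.
std::vector<double> PerPageZoom(const std::vector<double>& pageDx, double viewDx) {
    std::vector<double> zoom;
    zoom.reserve(pageDx.size());
    for (double dx : pageDx) {
        zoom.push_back(viewDx / dx);
    }
    return zoom;
}
```

With pages 500 and 1000 units wide in a 1000 unit wide view, the uniform zoom is 1.0 (so the small page only fills half the width) while the per-page zooms are 2.0 and 1.0.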
-- better looking notifications (med)
Visual style of notifications is dated. Use more modern look e.g. inspired by
Chrome or Android.
-- loading errors are not always reported (low)
In some cases we don't show document loading errors (e.g. drag&drop a file that
fails to load). We need to show them as notifications
-- more detail when page doesn't render
I think we sometimes get bug reports when PDF page doesn't render because of
running out of memory. It would be good to show the exact cause of page
rendering failure instead of generic "failed to render" message.
-- faster re-layout for ebooks (med, risky)
Layout time is dominated by measuring strings.
Split layout into 2 or 3 phases:
* generate instructions (text fragments, images, font/style changes etc.)
* measure strings, images etc.
* calculate positions of the elements given page dx/dy and break them into pages
When the user resizes the window, we would only need to redo phase 3. A small
complication: when a string doesn't fit in a single line, we need to split
it into two string instructions. We would need to be able to do it e.g. by
adding a "compound" instruction, that just contains one or more other instructions.
That way the instruction stream would be almost-immutable, and we could turn
e.g. a long string into 2 smaller strings by replacing the string instruction with
a "compound" instruction that points to the original string instruction (so that we
can undo that in the next layout) and 2 or more substrings.
Changing default font would require redoing phase 2 and 3.
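A sketch of the "compound" instruction idea, assuming C++17. Instr, InstrKind, SplitString and UndoSplits are hypothetical names; a real instruction would also carry measured sizes, images, style changes etc.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration only.
enum class InstrKind { String, Compound };

struct Instr {
    InstrKind kind = InstrKind::String;
    std::string text;                // String: the text fragment
    std::unique_ptr<Instr> original; // Compound: the instruction it replaced
    std::vector<Instr> parts;        // Compound: the substrings
};

// Replaces the string instruction at stream[i] with a compound instruction
// that keeps the original and holds its two halves, split at byte splitAt.
void SplitString(std::vector<Instr>& stream, size_t i, size_t splitAt) {
    assert(i < stream.size() && stream[i].kind == InstrKind::String);
    assert(splitAt <= stream[i].text.size());
    auto orig = std::make_unique<Instr>(std::move(stream[i]));
    Instr first, second;
    first.text = orig->text.substr(0, splitAt);
    second.text = orig->text.substr(splitAt);
    Instr compound;
    compound.kind = InstrKind::Compound;
    compound.parts.push_back(std::move(first));
    compound.parts.push_back(std::move(second));
    compound.original = std::move(orig);
    stream[i] = std::move(compound);
}

// Before the next layout: put every original string instruction back.
void UndoSplits(std::vector<Instr>& stream) {
    for (Instr& in : stream) {
        if (in.kind == InstrKind::Compound) {
            Instr orig = std::move(*in.original);
            in = std::move(orig);
        }
    }
}
```

Phase 3 would call UndoSplits on the stream, then SplitString wherever a string has to wrap at the new page width; phases 1 and 2 don't need to be re-run.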
-- ebook: re-introduce preserving top of page after re-layout (risky, med)
The code to preserve current top of page after re-layout was so complicated
that I had to remove it in order to implement dual pages in ebook mode.
It would be nice to bring it back in a saner way and without the problem
of breaking the styling.
It would be easier if we implement faster re-layout as described above and
have every element remember its reparse point (instead of having it on a
per-page basis).
That way we would generate instructions just once. A page would just be:
(index of first instruction, number of instructions).
We could avoid breaking styling because now we have access to the whole instruction
stream and we could quickly scan back instructions from any point to find
formatting instructions and recover current styling.
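A sketch of the "scan back" part. Op, PageSpan and StyleAt are hypothetical names, and the sketch assumes the simplest possible styling model where the most recent style instruction wins; real styling state (font, size, alignment...) would need one such scan per property or a richer state object.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only.
enum class Op { Text, StyleRegular, StyleBold, StyleItalic };

// A page is just a span of the shared instruction stream.
struct PageSpan {
    size_t first; // index of first instruction
    size_t count; // number of instructions
};

// Returns the style in effect just before instruction index at, found by
// scanning back to the nearest style instruction. Falls back to defaultStyle
// if no style instruction precedes that point.
Op StyleAt(const std::vector<Op>& stream, size_t at, Op defaultStyle) {
    assert(at <= stream.size());
    for (size_t i = at; i > 0; i--) {
        if (stream[i - 1] != Op::Text) {
            return stream[i - 1];
        }
    }
    return defaultStyle;
}
```

To draw a page we would call StyleAt(stream, page.first, defaultStyle) once and then replay page.count instructions from there.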
-- improve find UI (med, risky)
The one I like the best is in Chrome browser (with
modification needed for specifying search parameters like case sensitivity
and additions like 'match whole words' modifier). Another nice implementation
is in Kindle PC reader.
We could then free up significant amount of space in the toolbar.
-- manual cropping of margins for PDFs (med, risky)
Screen space is always at a premium and most PDFs have very wasteful
margins. Good Reader has a really nice feature for manual cropping of margins
(http://www.goodiware.com/gr-man-view-pdf.html#crop). They have a mode for
manually selecting visible part of the PDF, similar to how many graphics
program implement cropping. Cropping can be set for all pages or separately
for odd/even pages. Cropping can also be reset. After cropping, the program
only shows the non-cropped part. This would be especially valuable for small
screens (netbooks/small laptops).
Note: Automatic cropping is currently implemented as "Fit Content" View mode.
Note to note: it's similar but not really the same. One big difference is that
with this style of cropping, for the purpose of layout and display the cropped
part doesn't exist. Fit Content positions the page so that it's out of the view,
but change zoom and it's really there.
Plus, automatic cropping has limitations. There are many cases where a PDF has
lots of white-space but cannot be cropped because it has some small thing there,
like a page number.
-- Editing/saving of PDF forms (high)
-- PDF JavaScript support using mujs (low, risky)
-- Integration with web-based backup/viewer system (high)
The idea is that users could easily backup their PDFs on the server. They would
have convenient access to those PDFs from Sumatra as well as being able to
view them on the web. Basically it would be private Scribd for PDFs only. The
web service would have to be paid (since on-line storage is rather expensive)
but there would be free accounts (with quota, similar to how Dropbox works).
-- Direct integration with Dropbox (high)
They have APIs. Being able to list, download and read PDF files in Dropbox
account. Not sure if that makes sense, though, since having Dropbox pretty much
means the files are local anyway.
-- Integration with Scribd (high)
As an alternative (or addition) to Dropbox integration. Being able to search for
PDF documents on Scribd and download/view them in Sumatra. They have APIs
(http://www.scribd.com/developers).
-- Document library management (high)
Similar to how e.g. Picasa manages photos or iTunes manages mp3 files.
User would tell us which directories contain PDF and other supported files
(or, on Win 7, we could use windows search or scan the whole hard-drive to
automatically find all files). We would index the files (their filenames, metadata,
maybe extract text for full-text search), build thumbnails and allow efficient
browsing/searching those files.
This would be a good feature for those who have large collections of PDFs
(compared to using file explorer or open file dialog for locating the file).
-- Native client version (super high)
https://developers.google.com/native-client/.
There are some challenges (need completely custom UI, fonts aren't as easy
to get at, CHM support would be much harder). The upside is we would have
no-install, browser hosted (well, at least in Chrome) app.
-- mac version (super high)
-- use SafeInt (low)
We have quite a few places where we do integer overflow checks. Explore
doing that in a systematic way, using SafeInt (http://safeint.codeplex.com/) or
similar library (or extract useful stuff for us from SafeInt)
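For reference, the kind of check we would be centralizing, written in plain standard C++. This is not SafeInt's API; MulNoOverflow and AddNoOverflow are hypothetical helpers that only illustrate the "extract useful stuff" option for size_t arithmetic (e.g. computing allocation sizes).

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Stores a * b in *res and returns true, or returns false (leaving *res
// untouched) if the product doesn't fit in size_t.
bool MulNoOverflow(size_t a, size_t b, size_t* res) {
    assert(res != nullptr);
    if (a != 0 && b > std::numeric_limits<size_t>::max() / a) {
        return false;
    }
    *res = a * b;
    return true;
}

// Stores a + b in *res and returns true, or returns false (leaving *res
// untouched) if the sum doesn't fit in size_t.
bool AddNoOverflow(size_t a, size_t b, size_t* res) {
    assert(res != nullptr);
    if (b > std::numeric_limits<size_t>::max() - a) {
        return false;
    }
    *res = a + b;
    return true;
}
```

A call site would then read: if (!MulNoOverflow(dx, dy, &n)) return nullptr; instead of an ad-hoc comparison repeated in each place.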
-- use pdfium (high)
Google released https://pdfium.googlesource.com/pdfium/ which is Foxit codebase
under BSD license.
It would probably be a bunch of work to integrate this (although at first it
could be done alongside mupdf, since we have necessary abstractions to plug
another engine). So that's the downside.
I've briefly looked at the code and it might have some benefits over mupdf:
* they have hooks for form editing (which should make implementing this much
easier)
* their printing code looks more efficient
* they probably support some of the more advanced PDF features
* they are probably faster and (thanks to Google) more secure
================================================
FILE: drmem-sup.txt
================================================
# Suppressions for DrMemory
# Run as: ..\drmemory\bin\drmemory.exe -suppress drmem-sup.txt -- .\rel\SumatraPDF.exe
# Probably needs to be updated for different OS versions
# quite a few false positives come from inside __scrt_common_main_seh
#
# Suppression for startup code
UNINITIALIZED READ
name=crt startup issues
...
SumatraPDF.exe!__scrt_common_main_seh
# wcslen in crt startup
UNADDRESSABLE ACCESS
name=wcslen in crt startup
SumatraPDF.exe!wcsnlen
...
SumatraPDF.exe!__scrt_common_main_seh
KERNEL32.dll!BaseThreadInitThunk
# wcslen from str::FmtV
UNADDRESSABLE ACCESS
name=wcslen from FmtV
SumatraPDF.exe!wcsnlen
...
SumatraPDF.exe!str::FmtV
# parallels driver
UNINITIALIZED READ
name=parallels driver
...
PrlToolsShellExt.dll!*
# GetLocaleInfoW
UNINITIALIZED READ
name=GetLocaleInfoW
...
SumatraPDF.exe!GetMeasurementSystem
# GetUserInfoWord
UNINITIALIZED READ
name=GetUserInfoWord
KERNELBASE.dll!GetUserInfoWord
...
SumatraPDF.exe!FormatSystemTime
# GetDateFormatW from kernelbase.dll and kernel32.dll
UNINITIALIZED READ
name=GetDateFormatW
*!GetDateFormatW
...
SumatraPDF.exe!FormatSystemTime
# GetUserInfo
UNINITIALIZED READ
name=GetUserInfo
KERNELBASE.dll!GetUserInfo
# weird leak
LEAK
name=weird leak
...
SumatraPDF.exe!pre_c_initialization
# ddjvu_context_create
UNINITIALIZED READ
name=ddjvu_context_create
...
SumatraPDF.exe!ddjvu_context_create
# ddjvu_context_create #2
UNADDRESSABLE ACCESS
name=ddjvu_context_create #2
...
SumatraPDF.exe!ddjvu_context_create
================================================
FILE: ext/CHMLib/AUTHORS
================================================
Jed Wing
includes modified LZX code from cabextract-0.5 by Stuart Caie.
Thanks to:
iDEFENSE for reporting the stack overflow vulnerability.
Palasik Sandor for reporting and fixing the LZX buffer overrun vulnerability.
David Huseby for enhancements to the chm_enumerate functionality.
Vitaly Bursov for compilation fixes for x86-64.
Vadim Zeitlin for a patch to clean up and fix some deficiencies in the
configure script.
Stan Tobias for bugfixes and index-page improvement to chm_http.
Andrew Hodgetts for major portability improvement.
Rich Erwin for his work towards Windows CE support.
Pabs for bug fixes and suggestions.
Antony Dovgal for setting up autoconf/automake based build process.
Ragnar Hojland Espinosa for patches to make chm_http more useful.
Razvan Cojocaru for forwarding along information regarding building on OS X.
Anyone else I've forgotten.
================================================
FILE: ext/CHMLib/COPYING
================================================
GNU LESSER GENERAL PUBLIC LICENSE
Version 2.1, February 1999
Copyright (C) 1991, 1999 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
[This is the first released version of the Lesser GPL. It also counts
as the successor of the GNU Library Public License, version 2, hence
the version number 2.1.]
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
Licenses are intended to guarantee your freedom to share and change
free software--to make sure the software is free for all its users.
This license, the Lesser General Public License, applies to some
specially designated software packages--typically libraries--of the
Free Software Foundation and other authors who decide to use it. You
can use it too, but we suggest you first think carefully about whether
this license or the ordinary General Public License is the better
strategy to use in any particular case, based on the explanations
below.
When we speak of free software, we are referring to freedom of use,
not price. Our General Public Licenses are designed to make sure that
you have the freedom to distribute copies of free software (and charge
for this service if you wish); that you receive source code or can get
it if you want it; that you can change the software and use pieces of
it in new free programs; and that you are informed that you can do
these things.
To protect your rights, we need to make restrictions that forbid
distributors to deny you these rights or to ask you to surrender these
rights. These restrictions translate to certain responsibilities for
you if you distribute copies of the library or if you modify it.
For example, if you distribute copies of the library, whether gratis
or for a fee, you must give the recipients all the rights that we gave
you. You must make sure that they, too, receive or can get the source
code. If you link other code with the library, you must provide
complete object files to the recipients, so that they can relink them
with the library after making changes to the library and recompiling
it. And you must show them these terms so they know their rights.
We protect your rights with a two-step method: (1) we copyright the
library, and (2) we offer you this license, which gives you legal
permission to copy, distribute and/or modify the library.
To protect each distributor, we want to make it very clear that
there is no warranty for the free library. Also, if the library is
modified by someone else and passed on, the recipients should know
that what they have is not the original version, so that the original
author's reputation will not be affected by problems that might be
introduced by others.
^L
Finally, software patents pose a constant threat to the existence of
any free program. We wish to make sure that a company cannot
effectively restrict the users of a free program by obtaining a
restrictive license from a patent holder. Therefore, we insist that
any patent license obtained for a version of the library must be
consistent with the full freedom of use specified in this license.
Most GNU software, including some libraries, is covered by the
ordinary GNU General Public License. This license, the GNU Lesser
General Public License, applies to certain designated libraries, and
is quite different from the ordinary General Public License. We use
this license for certain libraries in order to permit linking those
libraries into non-free programs.
When a program is linked with a library, whether statically or using
a shared library, the combination of the two is legally speaking a
combined work, a derivative of the original library. The ordinary
General Public License therefore permits such linking only if the
entire combination fits its criteria of freedom. The Lesser General
Public License permits more lax criteria for linking other code with
the library.
We call this license the "Lesser" General Public License because it
does Less to protect the user's freedom than the ordinary General
Public License. It also provides other free software developers Less
of an advantage over competing non-free programs. These disadvantages
are the reason we use the ordinary General Public License for many
libraries. However, the Lesser license provides advantages in certain
special circumstances.
For example, on rare occasions, there may be a special need to
encourage the widest possible use of a certain library, so that it
becomes a de-facto standard. To achieve this, non-free programs must
be allowed to use the library. A more frequent case is that a free
library does the same job as widely used non-free libraries. In this
case, there is little to gain by limiting the free library to free
software only, so we use the Lesser General Public License.
In other cases, permission to use a particular library in non-free
programs enables a greater number of people to use a large body of
free software. For example, permission to use the GNU C Library in
non-free programs enables many more people to use the whole GNU
operating system, as well as its variant, the GNU/Linux operating
system.
Although the Lesser General Public License is Less protective of the
users' freedom, it does ensure that the user of a program that is
linked with the Library has the freedom and the wherewithal to run
that program using a modified version of the Library.
The precise terms and conditions for copying, distribution and
modification follow. Pay close attention to the difference between a
"work based on the library" and a "work that uses the library". The
former contains code derived from the library, whereas the latter must
be combined with the library in order to run.
^L
GNU LESSER GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License Agreement applies to any software library or other
program which contains a notice placed by the copyright holder or
other authorized party saying it may be distributed under the terms of
this Lesser General Public License (also called "this License").
Each licensee is addressed as "you".
A "library" means a collection of software functions and/or data
prepared so as to be conveniently linked with application programs
(which use some of those functions and data) to form executables.
The "Library", below, refers to any such software library or work
which has been distributed under these terms. A "work based on the
Library" means either the Library or any derivative work under
copyright law: that is to say, a work containing the Library or a
portion of it, either verbatim or with modifications and/or translated
straightforwardly into another language. (Hereinafter, translation is
included without limitation in the term "modification".)
"Source code" for a work means the preferred form of the work for
making modifications to it. For a library, complete source code means
all the source code for all modules it contains, plus any associated
interface definition files, plus the scripts used to control
compilation and installation of the library.
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running a program using the Library is not restricted, and output from
such a program is covered only if its contents constitute a work based
on the Library (independent of the use of the Library in a tool for
writing it). Whether that is true depends on what the Library does
and what the program that uses the Library does.
1. You may copy and distribute verbatim copies of the Library's
complete source code as you receive it, in any medium, provided that
you conspicuously and appropriately publish on each copy an
appropriate copyright notice and disclaimer of warranty; keep intact
all the notices that refer to this License and to the absence of any
warranty; and distribute a copy of this License along with the
Library.
You may charge a fee for the physical act of transferring a copy,
and you may at your option offer warranty protection in exchange for a
fee.
2. You may modify your copy or copies of the Library or any portion
of it, thus forming a work based on the Library, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) The modified work must itself be a software library.
b) You must cause the files modified to carry prominent notices
stating that you changed the files and the date of any change.
c) You must cause the whole of the work to be licensed at no
charge to all third parties under the terms of this License.
d) If a facility in the modified Library refers to a function or a
table of data to be supplied by an application program that uses
the facility, other than as an argument passed when the facility
is invoked, then you must make a good faith effort to ensure that,
in the event an application does not supply such function or
table, the facility still operates, and performs whatever part of
its purpose remains meaningful.
(For example, a function in a library to compute square roots has
a purpose that is entirely well-defined independent of the
application. Therefore, Subsection 2d requires that any
application-supplied function or table used by this function must
be optional: if the application does not supply it, the square
root function must still compute square roots.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Library,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Library, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote
it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Library.
In addition, mere aggregation of another work not based on the Library
with the Library (or with a work based on the Library) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may opt to apply the terms of the ordinary GNU General Public
License instead of this License to a given copy of the Library. To do
this, you must alter all the notices that refer to this License, so
that they refer to the ordinary GNU General Public License, version 2,
instead of to this License. (If a newer version than version 2 of the
ordinary GNU General Public License has appeared, then you can specify
that version instead if you wish.) Do not make any other change in
these notices.
^L
Once this change is made in a given copy, it is irreversible for
that copy, so the ordinary GNU General Public License applies to all
subsequent copies and derivative works made from that copy.
This option is useful when you wish to copy part of the code of
the Library into a program that is not a library.
4. You may copy and distribute the Library (or a portion or
derivative of it, under Section 2) in object code or executable form
under the terms of Sections 1 and 2 above provided that you accompany
it with the complete corresponding machine-readable source code, which
must be distributed under the terms of Sections 1 and 2 above on a
medium customarily used for software interchange.
If distribution of object code is made by offering access to copy
from a designated place, then offering equivalent access to copy the
source code from the same place satisfies the requirement to
distribute the source code, even though third parties are not
compelled to copy the source along with the object code.
5. A program that contains no derivative of any portion of the
Library, but is designed to work with the Library by being compiled or
linked with it, is called a "work that uses the Library". Such a
work, in isolation, is not a derivative work of the Library, and
therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library
creates an executable that is a derivative of the Library (because it
contains portions of the Library), rather than a "work that uses the
library". The executable is therefore covered by this License.
Section 6 states terms for distribution of such executables.
When a "work that uses the Library" uses material from a header file
that is part of the Library, the object code for the work may be a
derivative work of the Library even though the source code is not.
Whether this is true is especially significant if the work can be
linked without the Library, or if the work is itself a library. The
threshold for this to be true is not precisely defined by law.
If such an object file uses only numerical parameters, data
structure layouts and accessors, and small macros and small inline
functions (ten lines or less in length), then the use of the object
file is unrestricted, regardless of whether it is legally a derivative
work. (Executables containing this object code plus portions of the
Library will still fall under Section 6.)
Otherwise, if the work is a derivative of the Library, you may
distribute the object code for the work under the terms of Section 6.
Any executables containing that work also fall under Section 6,
whether or not they are linked directly with the Library itself.
^L
6. As an exception to the Sections above, you may also combine or
link a "work that uses the Library" with the Library to produce a
work containing portions of the Library, and distribute that work
under terms of your choice, provided that the terms permit
modification of the work for the customer's own use and reverse
engineering for debugging such modifications.
You must give prominent notice with each copy of the work that the
Library is used in it and that the Library and its use are covered by
this License. You must supply a copy of this License. If the work
during execution displays copyright notices, you must include the
copyright notice for the Library among them, as well as a reference
directing the user to the copy of this License. Also, you must do one
of these things:
a) Accompany the work with the complete corresponding
machine-readable source code for the Library including whatever
changes were used in the work (which must be distributed under
Sections 1 and 2 above); and, if the work is an executable linked
with the Library, with the complete machine-readable "work that
uses the Library", as object code and/or source code, so that the
user can modify the Library and then relink to produce a modified
executable containing the modified Library. (It is understood
that the user who changes the contents of definitions files in the
Library will not necessarily be able to recompile the application
to use the modified definitions.)
b) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (1) uses at run time a
copy of the library already present on the user's computer system,
rather than copying library functions into the executable, and (2)
will operate properly with a modified version of the library, if
the user installs one, as long as the modified version is
interface-compatible with the version that the work was made with.
c) Accompany the work with a written offer, valid for at least
three years, to give the same user the materials specified in
Subsection 6a, above, for a charge no more than the cost of
performing this distribution.
d) If distribution of the work is made by offering access to copy
from a designated place, offer equivalent access to copy the above
specified materials from the same place.
e) Verify that the user has already received a copy of these
materials or that you have already sent this user a copy.
For an executable, the required form of the "work that uses the
Library" must include any data and utility programs needed for
reproducing the executable from it. However, as a special exception,
the materials to be distributed need not include anything that is
normally distributed (in either source or binary form) with the major
components (compiler, kernel, and so on) of the operating system on
which the executable runs, unless that component itself accompanies
the executable.
It may happen that this requirement contradicts the license
restrictions of other proprietary libraries that do not normally
accompany the operating system. Such a contradiction means you cannot
use both them and the Library together in an executable that you
distribute.
^L
7. You may place library facilities that are a work based on the
Library side-by-side in a single library together with other library
facilities not covered by this License, and distribute such a combined
library, provided that the separate distribution of the work based on
the Library and of the other library facilities is otherwise
permitted, and provided that you do these two things:
a) Accompany the combined library with a copy of the same work
based on the Library, uncombined with any other library
facilities. This must be distributed under the terms of the
Sections above.
b) Give prominent notice with the combined library of the fact
that part of it is a work based on the Library, and explaining
where to find the accompanying uncombined form of the same work.
8. You may not copy, modify, sublicense, link with, or distribute
the Library except as expressly provided under this License. Any
attempt otherwise to copy, modify, sublicense, link with, or
distribute the Library is void, and will automatically terminate your
rights under this License. However, parties who have received copies,
or rights, from you under this License will not have their licenses
terminated so long as such parties remain in full compliance.
9. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Library or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Library (or any work based on the
Library), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Library or works based on it.
10. Each time you redistribute the Library (or any work based on the
Library), the recipient automatically receives a license from the
original licensor to copy, distribute, link with or modify the Library
subject to these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties with
this License.
11. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Library at all. For example, if a patent
license would not permit royalty-free redistribution of the Library by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Library.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply, and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
12. If the distribution and/or use of the Library is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Library under this License
may add an explicit geographical distribution limitation excluding those
countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
13. The Free Software Foundation may publish revised and/or new
versions of the Lesser General Public License from time to time.
Such new versions will be similar in spirit to the present version,
but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Library
specifies a version number of this License which applies to it and
"any later version", you have the option of following the terms and
conditions either of that version or of any later version published by
the Free Software Foundation. If the Library does not specify a
license version number, you may choose any version ever published by
the Free Software Foundation.
14. If you wish to incorporate parts of the Library into other free
programs whose distribution conditions are incompatible with these,
write to the author to ask for permission. For software which is
copyrighted by the Free Software Foundation, write to the Free
Software Foundation; we sometimes make exceptions for this. Our
decision will be guided by the two goals of preserving the free status
of all derivatives of our free software and of promoting the sharing
and reuse of software generally.
NO WARRANTY
15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO
WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY
KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE
LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME
THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU
FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF
SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Libraries
If you develop a new library, and you want it to be of the greatest
possible use to the public, we recommend making it free software that
everyone can redistribute and change. You can do so by permitting
redistribution under these terms (or, alternatively, under the terms
of the ordinary General Public License).
To apply these terms, attach the following notices to the library.
It is safest to attach them to the start of each source file to most
effectively convey the exclusion of warranty; and each file should
have at least the "copyright" line and a pointer to where the full
notice is found.
<one line to give the library's name and a brief idea of what it does.>
Copyright (C) <year>  <name of author>
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Also add information on how to contact you by electronic and paper mail.
You should also get your employer (if you work as a programmer) or
your school, if any, to sign a "copyright disclaimer" for the library,
if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the
library `Frob' (a library for tweaking knobs) written by James
Random Hacker.
<signature of Ty Coon>, 1 April 1990
Ty Coon, President of Vice
That's all there is to it!
================================================
FILE: ext/CHMLib/NEWS
================================================
Changes from 0.39 to 0.40
- chm_http bug fixed (chm_http begins to refuse connections)
- bashism in contrib/mozilla_helper.sh
- patch to use stdint.h (from Goswin von Brederlow)
- patch to fix soname (from Julien Lemoine, via Kartik Mistry)
- fix for extract_chmLib with empty files (from Paul Wise)
Changes from 0.38 to 0.39
- Security fix: eliminated all uses of alloca and similar, in favor of
malloc/free. This was in response to an iDefense security advisory.
- Added autoconf/automake support (patch from Antony Dovgal)
- Added contrib/mozilla_helper.sh from Kyle Davenport
Changes from 0.37 to 0.38
- Fix for reading some chm files. Running over a large directory of chm
files, about 1% of them turned out to be unreadable. This resulted from
an incomplete understanding of one of the header fields (index_root).
Apparently, this can take negative values other than -1.
- Security fix for extract_chmLib. Pathnames containing a ".." element
will not be extracted. There doesn't seem to be a legitimate reason to
use ".." as a path element in a chm file.
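The ".." check described above can be sketched as follows. This is only an illustration, not the code from extract_chmLib: the helper name path_has_dotdot is hypothetical, and it assumes '/' is the only path separator used inside the archive.

```c
#include <string.h>

/* Hypothetical helper (not taken from extract_chmLib): returns 1 if any
 * '/'-separated element of path is exactly "..", and 0 otherwise. */
static int path_has_dotdot(const char *path)
{
    const char *p = path;

    while (*p != '\0') {
        /* length of the current element, up to the next '/' or the end */
        size_t len = strcspn(p, "/");

        if (len == 2 && p[0] == '.' && p[1] == '.')
            return 1;

        p += len;
        if (*p == '/')
            p++;    /* skip the separator */
    }
    return 0;
}
```

An extractor built this way would skip any entry for which the helper returns 1. Names that merely contain two dots, such as "b..c", are not rejected.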
Changes from 0.36 to 0.37
- Major security fix for stack overflow vulnerability:
http://www.sven-tantau.de/public_files/chmlib/chmlib_20051126.txt
- Corrected the broken Makefile.in.
Changes from 0.35 to 0.36
- Major security fix (iDEFENSE Security Advisory IDEF1099 - Stack Overflow
Vulnerability)
- Major security fix from Palasik Sandor (LZX decompression buffer overrun)
- Bugfix/enhancement from David Huseby to make the "what" flags to
chm_enumerate work correctly, and to pass the flags along to the callback
function (via the chmUnitInfo structure) so that the callback doesn't
need to re-parse the filename.
- Compilation fixes for x86-64 from Vitaly V. Bursov.
- Miscellaneous fixes to the configure script, including some significant
cleanup by Vadim Zeitlin. The changes from Vadim should also allow the
configure script to correctly configure the build on OS X, where it was
previously failing to note that pread64 doesn't work.
- Minor update to the Makefile.in to do a mkdir before the install, in case
the specified INSTALLPREFIX directory is non-existent
Changes from 0.32 to 0.35
- UTF-8 filenames, while still not handled correctly, are handled a little
more gracefully. That is to say, the library doesn't fail to open files
with filenames using characters outside the ASCII subset. I'm very
interested in any information as to the "right" way to handle filenames
of this sort.
- Files not containing a compressed section are handled properly, such as
.chw files. These files seem to contain information about compression,
but the information is invalid or empty. The library deals gracefully
with this now.
- Files compressed with different options were not being decompressed
properly. In particular, if the "reset interval" for the compressed
section was other than 2 block sizes, it could fail to read some of the
files.
- The caching system was improved slightly, in conjunction with this
previous bugfix.
Changes from 0.3 to 0.32:
- [Rich Erwin] Minor portability fixes for Windows CE.
- [Pabs] Minor bugfix regarding detecting directory entries versus empty files.
- [Antony Dovgal] autoconf-based build process
- [Ragnar Hojland Espinosa] Feature additions for chm_http:
* Use SO_REUSEADDR
* Allow --bind= and --port= command line arguments
- Simple makefile has been fixed (finally) to use gcc instead of gcc-3.2. (Sorry, everybody!)
Changes from 0.2 to 0.3:
- initial attempt at portability to Win32.
- bugfixes from Stan Tobias:
* memory corruption error with caching system
* case insensitivity, to match with the Windows semantics
- modification to chm_http by Stan Tobias:
* when the user requests the page '/', they get a page with links to
all of the files in the archive
- Andrew Hodgetts has ported the library to Solaris and Irix. See README for details.
- Stuart Caie has granted permission to relicense under the LGPL.
================================================
FILE: ext/CHMLib/NOTES
================================================
CHMLIB 0.40 Installation
=======================
-----
Linux/Unix and Windows (Cygwin)
-----
I. Relevant options:
CHM_MT: build library with synchronization for thread-safety
CHM_USE_PREAD: use pread instead of lseek/read
CHM_USE_IO64: support 64-bit file I/O
Modify the INSTALLPREFIX to change the installation location.
Except on platforms where they need to be disabled, I recommend leaving all
three options enabled. OS X in particular seems to need pread and io64
disabled.
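For readers unfamiliar with the difference that CHM_USE_PREAD selects, here is a minimal sketch of the two ways to read at a given offset. The helper read_at and the macro USE_PREAD are hypothetical names chosen for illustration; they are not the library's own code.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 500   /* exposes pread() on strict POSIX setups */
#endif
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes starting at byte offset off.
 * USE_PREAD is an illustrative macro, not the library's CHM_USE_PREAD. */
static ssize_t read_at(int fd, void *buf, size_t len, off_t off)
{
#ifdef USE_PREAD
    /* one call; leaves the descriptor's file position untouched */
    return pread(fd, buf, len, off);
#else
    /* two calls; moves the shared file position, so concurrent callers
     * would need a lock around the pair */
    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);
#endif
}
```

Both branches return the number of bytes read, or -1 on error. The pread variant is the natural fit for a thread-safe build, since it does not depend on the shared file position.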
II. autoconf/automake-style build
./configure [options]
make
su
make install
III. old-style (plain Makefile) build
cd src
make -f Makefile.old
su
make install
To use the library, see chm_lib.h, and the included example programs:
test_chmLib.c
enum_chmLib.c
chm_http.c
-------
Windows (MSVC++, Win CE SDK)
-------
I. Relevant options:
CHM_MT: build library with synchronization for thread-safety
II. Windows Standard Build
Unzip ChmLib-vs6.zip in the src directory, and open the ChmLib.dsw file in
Developer Studio. (This was developed on Developer Studio 6. I don't know
if that matters.) You may wish to enable or disable certain features by
adding preprocessor defines under the project settings dialog:
CHM_MT: build library with synchronization for thread-safety
CHM_MT is enabled by default in the Windows build.
The resultant library is called chmlib.lib.
To use the library, see chm_lib.h, and the included example programs:
test_chmLib.c
enum_chmLib.c
chm_http.c
The example programs should also show up in the Visual Studio workspace,
except for chm_http. I don't know enough about Windows network programming
to try to get that one working. Other than that one, all the other examples
run without any problems.
III. Windows CE Build
Unzip ChmLib-ce.zip in the src directory. I don't know much beyond that,
as I have no familiarity with Windows CE, but this should be a good
starting point. These project files are from Rich Erwin, who also supplied
the necessary code changes to get it running.
Sparc (Solaris)
---------------
Andrew Hodgetts has gotten the library compilable and working on Sparc
Solaris machines, with CPUs ranging from a Sun4m (Sparc5) up through an
UltraSparcIII (SunFireV880). He has managed the compilation using both GCC and
SunProC, although, he notes, some modification to the Makefile was required,
since SunProC does not understand the -fPIC flag, which GCC uses for Position
Independent Code.
MIPS (SGI Irix)
---------------
Andrew Hodgetts has gotten the library compilable and working on SGI MIPS
machines running Irix; this was using only the standard MIPS compiler, not GCC.
He reported that the -n32 flag was required in the Makefile. He also reported
that the MIPS compiler was fairly verbose with the warning messages, but that
the simple examples that came with the library seemed to work.
OS X
----
Apparently, various people have gotten the library compiled for OS X. From
what I've heard, the secret is to disable pread and io64, and possibly to use
the 'libtool' from fink, instead of the one included with the standard
developers kit.
BSD variants
------------
I've heard that the library has been compiled on BSD variants. I haven't
heard of any particular difficulties.
Other Unix variants
-------------------
The code has been written with an eye on portability. Presently, I've only
personally compiled on Linux and Windows, albeit on a variety of Linux
configurations, but, as reported above, Andrew Hodgetts has reported successful
use of the library on both Solaris machines and MIPS machines. After I get
version 0.3 out, I may try to get it compiling on some of the machines I have
at work. This code may or may not compile out of the box with, for instance,
*BSD or other Unix variants. I welcome any patches that increase the
portability of this code.
Platforms that I have access to at work, and may attempt to support after
version 0.3:
- AIX
- maybe Tru64
================================================
FILE: ext/CHMLib/README
================================================
CHMLIB 0.40a
===========
-------
SUMMARY
-------
chmlib is a small library designed for accessing MS ITSS files. The ITSS file
format is used for Microsoft HTML Help files (.chm), which have been the
predominant medium for software documentation from Microsoft during the past
several years, having superseded the previously used .hlp file format.
Note that this is NOT the same as the OLE structured storage file format used
by MS Excel, Word, and so on. Instead, it is a different file format which
fulfills a similar purpose. Both file formats may be accessed via instances
of the IStorage COM interface, which is essentially an "acts like a
filesystem" interface.
-------
FILE FORMAT SUPPORT
-------
Lookup of files in the archive is supported, and should be relatively quick.
Reading of files in the archive is also supported.
Writing is not supported, but may be added in the future.
In terms of support for the ITSS file format, there are a few places in which
the support provided by this library is not fully general:
1. ITSS files whose names contain UTF-8 characters which are not part of the
ASCII subset will not currently be dealt with gracefully. Currently, the
filenames are not converted from UTF-8, but are instead returned as-is. I'm
very interested in hearing any suggestions as to the "right" way to handle
this.
2. Only version 3 ITSS files are supported at present, though some work has
gone towards divining the differences between different versions of the
file format. It is possible that version 2 ITSS files might work properly
with this library, but this is unconfirmed.
3. Archives larger than 4 GB should be supported just fine, but if they
contain files larger than 4GB, this library may break. Fortunately, this
seems somewhat unlikely.
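Because filenames are returned as raw UTF-8 bytes (item 1 above), a caller may want to detect names that fall outside the ASCII subset before using them. The following is a minimal sketch under that assumption; name_is_ascii is a hypothetical helper and not part of chm_lib.h.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if name consists only of 7-bit ASCII
 * bytes, and 0 if it contains any byte of a multi-byte UTF-8 sequence. */
static int name_is_ascii(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;

    for (; *p != '\0'; p++) {
        if (*p >= 0x80)     /* high bit set: not plain ASCII */
            return 0;
    }
    return 1;
}
```

A caller could use the result to decide whether a name needs conversion to the platform's native encoding before it is passed to filesystem calls.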
If you run into .chm files (or files you suspect are ITSS files) that this
library doesn't work with, please contact me so I can fix the library.
-------
PORTABILITY
-------
This software is maintained on an x86-64 Debian GNU/Linux machine using gcc
4.x. It has been compiled on various other Linux distributions, using versions
of gcc from 2.95 through 4.4. Win32 support is provided.
Chmlib apparently works on OS X, with some tweaks. In particular, disabling
pread and io64 apparently works.
Finally, Andrew Hodgetts has ported to Solaris and IRIX:
On Monday, 7 Oct 2002, Andrew Hodgetts wrote:
> Solaris(Sun):
>
> I used both SunProC and GCC on the solaris machines to compile. They
> both worked ok.
> However, both required -lsocket on the link line of the Makefile or you
> recieve linking errors.
>
> I have this working on CPUs ranging from Sun4m (Sparc5) through to
> UltraSparcIII (SunFireV880).
>
> Irix (SGI):
>
> I only testing with the MIPS compiler (not GCC). All worked ok - lots of
> warning messages, but it always does that.
He further noted that:
> ... for NON GCC compilers, a little tweaking may be required, but nothing too
> complex. ie SunProC doesn't understand -fPIC for library building. Irix
> required -n32 (new 32bit libraries) etc. These are things that someone who
> uses the OS and compiler should be used to dealing with.
-------
CREDITS
-------
* Stuart Caie: the LZX decompression code, and for granting permission to
re-license under the LGPL.
* Sven Tantau: identification of a stack-overflow security flaw and a quick fix
for the problem; identification of a possible security danger in the example
program "extract_chmLib"
* iDEFENSE Labs: identification of a nasty stack-overflow security flaw
* Palasik Sandor: identification of a potential security flaw in lzx.c as well
as a quick fix for the problem
* David Huseby: An excellent patch to the chm_enumerate functionality, relating
to the "what" flags, which didn't work entirely correctly before
* Vadim Zeitlin: Configure script cleanup, including an important update to
allow detection of platforms where pread64 doesn't work. (OS X)
* Vitaly V. Bursov: Compilation on x86-64.
* mc: A suggestion to add a "mkdir" to the install step.
* Stan Tobias: bugfixes and the added 'index page' feature of chm_http.
* Andrew Hodgetts: porting to Solaris and IRIX, as well as fixing some
little-endian biases in the code.
* Rich Erwin: Windows CE support.
* Pabs: bug fixes and suggestions.
* Kartik Mistry: Debian package maintainer
* Antony Dovgal: setting up autoconf/automake based build process.
* Ragnar Hojland Espinosa: patches to make chm_http more useful.
* Razvan Cojocaru: forwarding along information regarding building on OS X.
* Julien Lemoine: creating and maintaining the Debian package of chmlib.
* Prarit Bhargava: Compilation on ia64
* Jean-Marc Vanel: elimination of compilation warnings in extract_chmLib
* Sisyphus & Matej Spiller-Muys: Compilation under MinGW32
* Kyle Davenport: helper script for using chm_http with mozilla
* Matthew Daniel & Mark Rosenstand: help to sort out issues with the build
system.
* Anyone else I've forgotten. (?)
================================================
FILE: ext/CHMLib/src/Makefile.am
================================================
lib_LTLIBRARIES=libchm.la
libchm_la_SOURCES=chm_lib.c lzx.c
libchm_la_LDFLAGS=-version-info 1
include_HEADERS=chm_lib.h lzx.h
if EXAMPLES
bin_PROGRAMS=chm_http enum_chmLib enumdir_chmLib extract_chmLib test_chmLib
enum_chmLib_SOURCES=enum_chmLib.c
enum_chmLib_LDADD=libchm.la
chm_http_SOURCES=chm_http.c
chm_http_LDADD=libchm.la
enumdir_chmLib_SOURCES=enumdir_chmLib.c
enumdir_chmLib_LDADD=libchm.la
extract_chmLib_SOURCES=extract_chmLib.c
extract_chmLib_LDADD=libchm.la
test_chmLib_SOURCES=test_chmLib.c
test_chmLib_LDADD=libchm.la
endif ##EXAMPLES
================================================
FILE: ext/CHMLib/src/Makefile.simple
================================================
## Available defines for building chm_lib with particular options
# CHM_MT: build thread-safe version of chm_lib
# CHM_USE_PREAD: build chm_lib to use pread/pread64 for all I/O
# CHM_USE_IO64: build chm_lib to support 64-bit file I/O
#
# Note: LDFLAGS must contain -lpthread if you are using -DCHM_MT.
#
#CFLAGS=-DCHM_MT -DCHM_USE_PREAD -DCHM_USE_IO64
CFLAGS=-DCHM_MT -DCHM_USE_PREAD -DCHM_USE_IO64 -g -DDMALLOC_DISABLE
LDFLAGS=-lpthread
INSTALLPREFIX=/usr/local
CC=gcc
LD=gcc
LIBTOOL=libtool
CP=/bin/cp
EXAMPLES=test_chmLib enum_chmLib enumdir_chmLib chm_http extract_chmLib
all: libchm.la
examples: ${EXAMPLES}
%.lo: %.c
	${LIBTOOL} --mode=compile ${CC} -c -o $@ $^ ${CFLAGS}
libchm.la: chm_lib.lo lzx.lo
	${LIBTOOL} --mode=link ${LD} -o $@ $^ ${LDFLAGS} -rpath ${INSTALLPREFIX}/lib
install: libchm.la
	chmod a+r libchm.la
	${LIBTOOL} --mode=install ${CP} libchm.la ${INSTALLPREFIX}/lib
	${CP} chm_lib.h ${INSTALLPREFIX}/include
clean:
	rm -fr libchm.la *.o *.lo .libs ${EXAMPLES}
test_chmLib: test_chmLib.c
	${CC} -o $@ $^ -I${INSTALLPREFIX}/include -L${INSTALLPREFIX}/lib -lchm ${CFLAGS}
enum_chmLib: enum_chmLib.c
	${CC} -o $@ $^ -I${INSTALLPREFIX}/include -L${INSTALLPREFIX}/lib -lchm ${CFLAGS}
enumdir_chmLib: enumdir_chmLib.c
	${CC} -o $@ $^ -I${INSTALLPREFIX}/include -L${INSTALLPREFIX}/lib -lchm ${CFLAGS}
extract_chmLib: extract_chmLib.c
	${CC} -o $@ $^ -I${INSTALLPREFIX}/include -L${INSTALLPREFIX}/lib -lchm ${CFLAGS}
chm_http: chm_http.c
	${CC} -o $@ $^ -I${INSTALLPREFIX}/include -L${INSTALLPREFIX}/lib -lchm -lpthread ${CFLAGS}
================================================
FILE: ext/CHMLib/src/chm_http.c
================================================
/* $Id: chm_http.c,v 1.7 2002/10/08 03:43:33 jedwin Exp $ */
/***************************************************************************
* chm_http.c - CHM archive test driver *
* ------------------- *
* *
* author: Jed Wing *
* notes: This is a slightly more complex test driver for the chm *
* routines. It also serves the purpose of making .chm html *
* help files viewable from a text mode browser, which was my *
* original purpose for all of this. *
* *
* It is not included with the expectation that it will be of *
* use to others; nor is it included as an example of a *
* stunningly good implementation of an HTTP server. It is, *
* in fact, probably badly broken for any serious usage. *
* *
* Nevertheless, it is another example program, and it does *
* serve a purpose for me, so I've included it as well. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
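/* standard system includes */
#define _REENTRANT
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#if __sun || __sgi
#include <strings.h>
#define strrchr rindex
#endif
/* includes for networking */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
/* threading includes */
#include <pthread.h>
/* command-line option parsing */
#include <getopt.h>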
int config_port = 8080;
char config_bind[65536] = "0.0.0.0";
static void usage(const char *argv0)
{
#ifdef CHM_HTTP_SIMPLE
fprintf(stderr, "usage: %s <filename>\n", argv0);
#else
fprintf(stderr, "usage: %s [--port=PORT] [--bind=IP] <filename>\n", argv0);
#endif
exit(1);
}
static void chmhttp_server(const char *filename);
int main(int c, char **v)
{
#ifdef CHM_HTTP_SIMPLE
if (c < 2)
usage(v[0]);
/* run the server */
chmhttp_server(v[1]);
#else
int optindex = 0;
struct option longopts[] =
{
{ "port", required_argument, 0, 'p' },
{ "bind", required_argument, 0, 'b' },
{ "help", no_argument, 0, 'h' },
{ 0, 0, 0, 0 }
};
while (1)
{
int o;
o = getopt_long (c, v, "p:b:h", longopts, &optindex);
if (o < 0)
{
break;
}
switch (o)
{
case 'p':
config_port = atoi (optarg);
if (config_port <= 0)
{
fprintf(stderr, "bad port number (%s)\n", optarg);
exit(1);
}
break;
case 'b':
strncpy (config_bind, optarg, 65536);
config_bind[65535] = '\0';
break;
case 'h':
usage (v[0]);
break;
}
}
if (optind + 1 != c)
{
usage (v[0]);
}
/* run the server */
chmhttp_server(v[optind]);
#endif
/* NOT REACHED */
return 0;
}
struct chmHttpServer
{
int socket;
struct chmFile *file;
};
struct chmHttpSlave
{
int fd;
struct chmHttpServer *server;
};
static void *_slave(void *param);
static void chmhttp_server(const char *filename)
{
struct chmHttpServer server;
struct chmHttpSlave *slave;
struct sockaddr_in bindAddr;
socklen_t addrLen;
pthread_t tid;
int one = 1;
/* open file */
if ((server.file = chm_open(filename)) == NULL)
{
fprintf(stderr, "couldn't open file '%s'\n", filename);
exit(2);
}
/* create socket */
server.socket = socket(AF_INET, SOCK_STREAM, 0);
if (server.socket < 0)
{
perror ("socket");
exit(3);
}
memset(&bindAddr, 0, sizeof(struct sockaddr_in));
bindAddr.sin_family = AF_INET;
bindAddr.sin_port = htons(config_port);
bindAddr.sin_addr.s_addr = inet_addr(config_bind);
if (setsockopt (server.socket, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one)))
{
perror ("setsockopt");
exit(3);
}
if (bind(server.socket,
(struct sockaddr *)&bindAddr,
sizeof(struct sockaddr_in)) < 0)
{
close(server.socket);
server.socket = -1;
fprintf(stderr, "couldn't bind to ip %s port %d\n", config_bind, config_port);
exit(3);
}
/* listen for connections */
listen(server.socket, 75);
addrLen = sizeof(struct sockaddr);
while(1)
{
slave = (struct chmHttpSlave *)malloc(sizeof(struct chmHttpSlave));
slave->server = &server;
slave->fd = accept(server.socket, (struct sockaddr *)&bindAddr, &addrLen);
if (slave->fd == -1)
break;
pthread_create(&tid, NULL, _slave, (void *)slave);
pthread_detach(tid);
}
free(slave);
}
static void service_request(int fd, struct chmFile *file);
static void *_slave(void *param)
{
struct chmHttpSlave *slave;
struct chmFile *file;
/* grab our relevant information */
slave = (struct chmHttpSlave *)param;
file = slave->server->file;
/* handle request */
service_request(slave->fd, file);
/* free our resources and return */
close(slave->fd);
free(slave);
return NULL;
}
static const char CONTENT_404[] = "HTTP/1.1 404 File not found\r\nConnection: close\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<html><head><title>404 File Not Found</title></head><body><h1>404 File not found</h1></body></html>\r\n";
static const char CONTENT_500[] = "HTTP/1.1 500 Unknown thing\r\nConnection: close\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<html><head><title>500 Unknown thing</title></head><body><h1>500 Unknown thing</h1></body></html>\r\n";
static const char INTERNAL_ERROR[] = "HTTP/1.1 500 Internal error\r\nConnection: close\r\nContent-Type: text/html; charset=iso-8859-1\r\n\r\n<html><head><title>500 Unknown thing</title></head><body><h1>500 Server error</h1></body></html>\r\n";
struct mime_mapping
{
const char *ext;
const char *ctype;
};
struct mime_mapping mime_types[] =
{ { ".htm", "text/html" },
{ ".html", "text/html" },
{ ".css", "text/css" },
{ ".gif", "image/gif" },
{ ".jpg", "image/jpeg" },
{ ".jpeg", "image/jpeg" },
{ ".jpe", "image/jpeg" },
{ ".bmp", "image/bitmap" },
{ ".png", "image/png" }
};
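static const char *lookup_mime(const char *ext)
{
int i;
if (ext != NULL)
{
for (i=0; i<(int)(sizeof(mime_types)/sizeof(mime_types[0])); i++)
{
if (strcasecmp(mime_types[i].ext, ext) == 0)
return mime_types[i].ctype;
}
}
return "application/octet-stream";
}
static int _print_ui_index(struct chmFile *h, struct chmUnitInfo *ui, void *context)
{
FILE *fout = (FILE *)context;
fprintf(fout,
"<tr>"
"<td align=\"right\">%8d\n</td>"
"<td><a href=\"%s\">%s</a></td>"
"</tr>",
(int)ui->length, ui->path, ui->path);
return CHM_ENUMERATOR_CONTINUE;
}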
static void deliver_index(FILE *fout, struct chmFile *file)
{
fprintf(fout,
"HTTP/1.1 200 OK\r\n"
"Connection: close\r\n"
/* "Content-Length: 1000000\r\n" */
"Content-Type: text/html\r\n\r\n"
"<html><body>"
"<h2>CHM contents:</h2>"
"<table>"
"<tr><td>Size:</td><td>File:</td></tr>");
if (! chm_enumerate(file, CHM_ENUMERATE_ALL, _print_ui_index, fout))
fprintf(fout, "<tr><td colspan=\"2\">*** ERROR ***</td></tr>\r\n");
fprintf(fout, "</table></body></html>\r\n");
}
static void deliver_content(FILE *fout, const char *filename, struct chmFile *file)
{
struct chmUnitInfo ui;
const char *ext;
unsigned char buffer[65536];
int swath, offset;
if (strcmp(filename,"/") == 0)
{
deliver_index(fout,file);
fclose(fout);
return;
}
/* try to find the file */
if (chm_resolve_object(file, filename, &ui) != CHM_RESOLVE_SUCCESS)
{
fputs(CONTENT_404, fout);
fclose(fout);
return;
}
/* send the file back */
ext = strrchr(filename, '.');
fprintf(fout, "HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: %d\r\nContent-Type: %s\r\n\r\n",
(int)ui.length,
lookup_mime(ext));
/* pump the data out */
swath = 65536;
offset = 0;
while (offset < ui.length)
{
if ((ui.length - offset) < 65536)
swath = ui.length - offset;
else
swath = 65536;
swath = (int)chm_retrieve_object(file, &ui, buffer, offset, swath);
/* stop on a failed or empty read instead of looping forever */
if (swath <= 0)
break;
fwrite(buffer, 1, swath, fout);
offset += swath;
}
fclose(fout);
}
static void service_request(int fd, struct chmFile *file)
{
char buffer[4096];
char buffer2[4096];
char *end;
FILE *fout = fdopen(fd, "w+b");
if (fout == NULL)
{
perror("chm_http: failed to fdopen client stream");
write(fd, INTERNAL_ERROR, strlen(INTERNAL_ERROR));
close(fd);
return;
}
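/* read the request line; give up if the client sent nothing */
if (fgets(buffer, 4096, fout) == NULL)
{
fclose(fout);
return;
}
/* skip the remaining request headers, up to the blank line */
while (1)
{
if (fgets(buffer2, 4096, fout) == NULL)
break;
if (buffer2[0] == '\r' || buffer2[0] == '\n' || buffer2[0] == '\0')
break;
}
/* strip the trailing " HTTP/x.y" token, if present */
end = strrchr(buffer, ' ');
if (end != NULL && strncmp(end+1, "HTTP", 4) == 0)
*end = '\0';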
if (strncmp(buffer, "GET ", 4) == 0)
deliver_content(fout, buffer+4, file);
else
{
fputs(CONTENT_500, fout);
fclose(fout);
return;
}
}
================================================
FILE: ext/CHMLib/src/chm_lib.c
================================================
/* $Id: chm_lib.c,v 1.19 2003/09/07 13:01:43 jedwin Exp $ */
/***************************************************************************
* chm_lib.c - CHM archive manipulation routines *
* ------------------- *
* *
* author: Jed Wing *
* version: 0.3 *
* notes: These routines are meant for the manipulation of microsoft *
* .chm (compiled html help) files, but may likely be used *
* for the manipulation of any ITSS archive, if ever ITSS *
* archives are used for any other purpose. *
* *
* Note also that the section names are statically handled. *
* To be entirely correct, the section names should be read *
* from the section names meta-file, and then the various *
* content sections and the "transforms" to apply to the data *
* they contain should be inferred from the section name and *
* the meta-files referenced using that name; however, all of *
* the files I've been able to get my hands on appear to have *
* only two sections: Uncompressed and MSCompressed. *
* Additionally, the ITSS.DLL file included with Windows does *
* not appear to handle any different transforms than the *
* simple LZX-transform. Furthermore, the list of transforms *
* to apply is broken, in that only half the required space *
* is allocated for the list. (It appears as though the *
* space is allocated for ASCII strings, but the strings are *
* written as unicode. As a result, only the first half of *
* the string appears.) So this is probably not too big of *
* a deal, at least until CHM v4 (MS .lit files), which also *
* incorporate encryption, of some description. *
* *
* switches: CHM_MT: compile library with thread-safety *
* *
* switches (Linux only): *
* CHM_USE_PREAD: compile library to use pread instead of *
* lseek/read *
* CHM_USE_IO64: compile library to support full 64-bit I/O *
* as is needed to properly deal with the *
* 64-bit file offsets. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
#ifdef CHM_MT
#define _REENTRANT
#endif
#include "lzx.h"
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#ifdef CHM_DEBUG
#include <stdio.h>
#endif
#if __sun || __sgi
#include <strings.h>
#endif
#ifdef WIN32
#include <windows.h>
#include <malloc.h>
#ifdef _WIN32_WCE
#define strcasecmp _stricmp
#define strncasecmp _strnicmp
#else
#define strcasecmp stricmp
#define strncasecmp strnicmp
#endif
#else
/* basic Linux system includes */
#define _XOPEN_SOURCE 500
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <fcntl.h>
/* #include <dmalloc.h> */
#endif
/* includes/defines for threading, if using them */
#ifdef CHM_MT
#ifdef WIN32
#define CHM_ACQUIRE_LOCK(a) do { \
EnterCriticalSection(&(a)); \
} while(0)
#define CHM_RELEASE_LOCK(a) do { \
LeaveCriticalSection(&(a)); \
} while(0)
#else
#include <pthread.h>
#define CHM_ACQUIRE_LOCK(a) do { \
pthread_mutex_lock(&(a)); \
} while(0)
#define CHM_RELEASE_LOCK(a) do { \
pthread_mutex_unlock(&(a)); \
} while(0)
#endif
#else
#define CHM_ACQUIRE_LOCK(a) /* do nothing */
#define CHM_RELEASE_LOCK(a) /* do nothing */
#endif
#ifdef WIN32
#define CHM_NULL_FD (INVALID_HANDLE_VALUE)
#define CHM_USE_WIN32IO 1
#define CHM_CLOSE_FILE(fd) CloseHandle((fd))
#else
#define CHM_NULL_FD (-1)
#define CHM_CLOSE_FILE(fd) close((fd))
#endif
/*
* defines related to tuning
*/
#ifndef CHM_MAX_BLOCKS_CACHED
#define CHM_MAX_BLOCKS_CACHED 5
#endif
/*
* architecture specific defines
*
* Note: as soon as C99 is more widespread, the below defines should
* probably just use the C99 sized-int types.
*
* The following settings will probably work for many platforms. The sizes
* don't have to be exactly correct, but the types must accommodate at least as
* many bits as they specify.
*/
/* i386, 32-bit, Windows */
#ifdef WIN32
typedef unsigned char UChar;
typedef __int16 Int16;
typedef unsigned __int16 UInt16;
typedef __int32 Int32;
typedef unsigned __int32 UInt32;
typedef __int64 Int64;
typedef unsigned __int64 UInt64;
/* x86-64 */
/* Note that these may be appropriate for other 64-bit machines. */
#elif defined(__LP64__)
typedef unsigned char UChar;
typedef short Int16;
typedef unsigned short UInt16;
typedef int Int32;
typedef unsigned int UInt32;
typedef long Int64;
typedef unsigned long UInt64;
/* I386, 32-bit, non-Windows */
/* Sparc */
/* MIPS */
/* PPC */
#else
typedef unsigned char UChar;
typedef short Int16;
typedef unsigned short UInt16;
typedef long Int32;
typedef unsigned long UInt32;
typedef long long Int64;
typedef unsigned long long UInt64;
#endif
/* GCC */
#ifdef __GNUC__
#define memcmp __builtin_memcmp
#define memcpy __builtin_memcpy
#define strlen __builtin_strlen
#elif defined(WIN32)
static int ffs(unsigned int val)
{
int bit=1, idx=1;
while (bit != 0 && (val & bit) == 0)
{
bit <<= 1;
++idx;
}
if (bit == 0)
return 0;
else
return idx;
}
#endif
/* utilities for unmarshalling data */
static int _unmarshal_char_array(unsigned char **pData,
unsigned int *pLenRemain,
char *dest,
int count)
{
if (count <= 0 || (unsigned int)count > *pLenRemain)
return 0;
memcpy(dest, (*pData), count);
*pData += count;
*pLenRemain -= count;
return 1;
}
static int _unmarshal_uchar_array(unsigned char **pData,
unsigned int *pLenRemain,
unsigned char *dest,
int count)
{
if (count <= 0 || (unsigned int)count > *pLenRemain)
return 0;
memcpy(dest, (*pData), count);
*pData += count;
*pLenRemain -= count;
return 1;
}
#if 0
static int _unmarshal_int16(unsigned char **pData,
unsigned int *pLenRemain,
Int16 *dest)
{
if (2 > *pLenRemain)
return 0;
*dest = (*pData)[0] | (*pData)[1]<<8;
*pData += 2;
*pLenRemain -= 2;
return 1;
}
static int _unmarshal_uint16(unsigned char **pData,
unsigned int *pLenRemain,
UInt16 *dest)
{
if (2 > *pLenRemain)
return 0;
*dest = (*pData)[0] | (*pData)[1]<<8;
*pData += 2;
*pLenRemain -= 2;
return 1;
}
#endif
static int _unmarshal_int32(unsigned char **pData,
unsigned int *pLenRemain,
Int32 *dest)
{
if (4 > *pLenRemain)
return 0;
*dest = (*pData)[0] | (*pData)[1]<<8 | (*pData)[2]<<16 | (*pData)[3]<<24;
*pData += 4;
*pLenRemain -= 4;
return 1;
}
static int _unmarshal_uint32(unsigned char **pData,
unsigned int *pLenRemain,
UInt32 *dest)
{
if (4 > *pLenRemain)
return 0;
*dest = (*pData)[0] | (*pData)[1]<<8 | (*pData)[2]<<16 | (*pData)[3]<<24;
*pData += 4;
*pLenRemain -= 4;
return 1;
}
static int _unmarshal_int64(unsigned char **pData,
unsigned int *pLenRemain,
Int64 *dest)
{
Int64 temp;
int i;
if (8 > *pLenRemain)
return 0;
temp=0;
for(i=8; i>0; i--)
{
temp <<= 8;
temp |= (*pData)[i-1];
}
*dest = temp;
*pData += 8;
*pLenRemain -= 8;
return 1;
}
static int _unmarshal_uint64(unsigned char **pData,
unsigned int *pLenRemain,
UInt64 *dest)
{
UInt64 temp;
int i;
if (8 > *pLenRemain)
return 0;
temp=0;
for(i=8; i>0; i--)
{
temp <<= 8;
temp |= (*pData)[i-1];
}
*dest = temp;
*pData += 8;
*pLenRemain -= 8;
return 1;
}
static int _unmarshal_uuid(unsigned char **pData,
unsigned int *pDataLen,
unsigned char *dest)
{
return _unmarshal_uchar_array(pData, pDataLen, dest, 16);
}
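/*
 * Illustration only (not part of the original library): the unmarshalling
 * helpers above share one pattern. Each checks the remaining byte count,
 * assembles a little-endian value byte by byte, then advances the cursor
 * and shrinks the remaining length. The sketch below restates that pattern
 * with C99 fixed-width types. The names demo_read_u32le and demo_read_u64le
 * are hypothetical and do not exist in CHMLib.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a little-endian 32-bit value at *cursor.
 * On success, advance the cursor, shrink *remain, and return 1.
 * On a short buffer, leave everything untouched and return 0. */
static int demo_read_u32le(const unsigned char **cursor, size_t *remain,
                           uint32_t *out)
{
    const unsigned char *p = *cursor;
    if (*remain < 4)
        return 0;
    /* cast before shifting so the shift happens in an unsigned type */
    *out = (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
    *cursor += 4;
    *remain -= 4;
    return 1;
}

/* Hypothetical helper: same idea for 64 bits, walking from the most
 * significant byte (stored last) down to the least significant one. */
static int demo_read_u64le(const unsigned char **cursor, size_t *remain,
                           uint64_t *out)
{
    uint64_t value = 0;
    int i;
    if (*remain < 8)
        return 0;
    for (i = 8; i > 0; i--)
        value = (value << 8) | (*cursor)[i - 1];
    *out = value;
    *cursor += 8;
    *remain -= 8;
    return 1;
}
```

/*
 * Because every call updates the cursor and the remaining length together,
 * a sequence of reads can be chained without recomputing offsets by hand,
 * which is how the header unmarshalling routines below are written.
 */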
/* names of sections essential to decompression */
static const char _CHMU_RESET_TABLE[] =
"::DataSpace/Storage/MSCompressed/Transform/"
"{7FC28940-9D31-11D0-9B27-00A0C91E9C7C}/"
"InstanceData/ResetTable";
static const char _CHMU_LZXC_CONTROLDATA[] =
"::DataSpace/Storage/MSCompressed/ControlData";
static const char _CHMU_CONTENT[] =
"::DataSpace/Storage/MSCompressed/Content";
static const char _CHMU_SPANINFO[] =
"::DataSpace/Storage/MSCompressed/SpanInfo";
/*
* structures local to this module
*/
/* structure of ITSF headers */
#define _CHM_ITSF_V2_LEN (0x58)
#define _CHM_ITSF_V3_LEN (0x60)
struct chmItsfHeader
{
char signature[4]; /* 0 (ITSF) */
Int32 version; /* 4 */
Int32 header_len; /* 8 */
Int32 unknown_000c; /* c */
UInt32 last_modified; /* 10 */
UInt32 lang_id; /* 14 */
UChar dir_uuid[16]; /* 18 */
UChar stream_uuid[16]; /* 28 */
UInt64 unknown_offset; /* 38 */
UInt64 unknown_len; /* 40 */
UInt64 dir_offset; /* 48 */
UInt64 dir_len; /* 50 */
UInt64 data_offset; /* 58 (Not present before V3) */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_itsf_header(unsigned char **pData,
unsigned int *pDataLen,
struct chmItsfHeader *dest)
{
/* we only know how to deal with the 0x58 and 0x60 byte structures */
if (*pDataLen != _CHM_ITSF_V2_LEN && *pDataLen != _CHM_ITSF_V3_LEN)
return 0;
/* unmarshal common fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_int32 (pData, pDataLen, &dest->version);
_unmarshal_int32 (pData, pDataLen, &dest->header_len);
_unmarshal_int32 (pData, pDataLen, &dest->unknown_000c);
_unmarshal_uint32 (pData, pDataLen, &dest->last_modified);
_unmarshal_uint32 (pData, pDataLen, &dest->lang_id);
_unmarshal_uuid (pData, pDataLen, dest->dir_uuid);
_unmarshal_uuid (pData, pDataLen, dest->stream_uuid);
_unmarshal_uint64 (pData, pDataLen, &dest->unknown_offset);
_unmarshal_uint64 (pData, pDataLen, &dest->unknown_len);
_unmarshal_uint64 (pData, pDataLen, &dest->dir_offset);
_unmarshal_uint64 (pData, pDataLen, &dest->dir_len);
/* error check the data */
/* XXX: should also check UUIDs, probably, though with a version 3 file,
* current MS tools do not seem to use them.
*/
if (memcmp(dest->signature, "ITSF", 4) != 0)
return 0;
if (dest->version == 2)
{
if (dest->header_len < _CHM_ITSF_V2_LEN)
return 0;
}
else if (dest->version == 3)
{
if (dest->header_len < _CHM_ITSF_V3_LEN)
return 0;
}
else
return 0;
/* now, if we have a V3 structure, unmarshal the rest.
* otherwise, compute it
*/
if (dest->version == 3)
{
if (*pDataLen != 0)
_unmarshal_uint64(pData, pDataLen, &dest->data_offset);
else
return 0;
}
else
dest->data_offset = dest->dir_offset + dest->dir_len;
/* SumatraPDF: sanity check (huge values are usually due to broken files) */
if (dest->dir_offset > UINT_MAX || dest->dir_len > UINT_MAX)
return 0;
return 1;
}
/* structure of ITSP headers */
#define _CHM_ITSP_V1_LEN (0x54)
struct chmItspHeader
{
char signature[4]; /* 0 (ITSP) */
Int32 version; /* 4 */
Int32 header_len; /* 8 */
Int32 unknown_000c; /* c */
UInt32 block_len; /* 10 */
Int32 blockidx_intvl; /* 14 */
Int32 index_depth; /* 18 */
Int32 index_root; /* 1c */
Int32 index_head; /* 20 */
Int32 unknown_0024; /* 24 */
UInt32 num_blocks; /* 28 */
Int32 unknown_002c; /* 2c */
UInt32 lang_id; /* 30 */
UChar system_uuid[16]; /* 34 */
UChar unknown_0044[16]; /* 44 */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_itsp_header(unsigned char **pData,
unsigned int *pDataLen,
struct chmItspHeader *dest)
{
/* we only know how to deal with 0x54-byte structures */
if (*pDataLen != _CHM_ITSP_V1_LEN)
return 0;
/* unmarshal fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_int32 (pData, pDataLen, &dest->version);
_unmarshal_int32 (pData, pDataLen, &dest->header_len);
_unmarshal_int32 (pData, pDataLen, &dest->unknown_000c);
_unmarshal_uint32 (pData, pDataLen, &dest->block_len);
_unmarshal_int32 (pData, pDataLen, &dest->blockidx_intvl);
_unmarshal_int32 (pData, pDataLen, &dest->index_depth);
_unmarshal_int32 (pData, pDataLen, &dest->index_root);
_unmarshal_int32 (pData, pDataLen, &dest->index_head);
_unmarshal_int32 (pData, pDataLen, &dest->unknown_0024);
_unmarshal_uint32 (pData, pDataLen, &dest->num_blocks);
_unmarshal_int32 (pData, pDataLen, &dest->unknown_002c);
_unmarshal_uint32 (pData, pDataLen, &dest->lang_id);
_unmarshal_uuid (pData, pDataLen, dest->system_uuid);
_unmarshal_uchar_array(pData, pDataLen, dest->unknown_0044, 16);
/* error check the data */
if (memcmp(dest->signature, "ITSP", 4) != 0)
return 0;
if (dest->version != 1)
return 0;
if (dest->header_len != _CHM_ITSP_V1_LEN)
return 0;
/* SumatraPDF: sanity check */
if (dest->block_len == 0)
return 0;
return 1;
}
/* structure of PMGL headers */
static const char _chm_pmgl_marker[4] = "PMGL";
#define _CHM_PMGL_LEN (0x14)
struct chmPmglHeader
{
char signature[4]; /* 0 (PMGL) */
UInt32 free_space; /* 4 */
UInt32 unknown_0008; /* 8 */
Int32 block_prev; /* c */
Int32 block_next; /* 10 */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_pmgl_header(unsigned char **pData,
unsigned int *pDataLen,
unsigned int blockLen,
struct chmPmglHeader *dest)
{
/* we only know how to deal with 0x14-byte structures */
if (*pDataLen != _CHM_PMGL_LEN)
return 0;
/* SumatraPDF: sanity check */
if (blockLen < _CHM_PMGL_LEN)
return 0;
/* unmarshal fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_uint32 (pData, pDataLen, &dest->free_space);
_unmarshal_uint32 (pData, pDataLen, &dest->unknown_0008);
_unmarshal_int32 (pData, pDataLen, &dest->block_prev);
_unmarshal_int32 (pData, pDataLen, &dest->block_next);
/* check structure */
if (memcmp(dest->signature, _chm_pmgl_marker, 4) != 0)
return 0;
/* SumatraPDF: sanity check */
if (dest->free_space > blockLen - _CHM_PMGL_LEN)
return 0;
return 1;
}
/* structure of PMGI headers */
static const char _chm_pmgi_marker[4] = "PMGI";
#define _CHM_PMGI_LEN (0x08)
struct chmPmgiHeader
{
char signature[4]; /* 0 (PMGI) */
UInt32 free_space; /* 4 */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_pmgi_header(unsigned char **pData,
unsigned int *pDataLen,
unsigned int blockLen,
struct chmPmgiHeader *dest)
{
/* we only know how to deal with 0x8-byte structures */
if (*pDataLen != _CHM_PMGI_LEN)
return 0;
/* SumatraPDF: sanity check */
if (blockLen < _CHM_PMGI_LEN)
return 0;
/* unmarshal fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_uint32 (pData, pDataLen, &dest->free_space);
/* check structure */
if (memcmp(dest->signature, _chm_pmgi_marker, 4) != 0)
return 0;
/* SumatraPDF: sanity check */
if (dest->free_space > blockLen - _CHM_PMGI_LEN)
return 0;
return 1;
}
/* structure of LZXC reset table */
#define _CHM_LZXC_RESETTABLE_V1_LEN (0x28)
struct chmLzxcResetTable
{
UInt32 version;
UInt32 block_count;
UInt32 unknown;
UInt32 table_offset;
UInt64 uncompressed_len;
UInt64 compressed_len;
UInt64 block_len;
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_lzxc_reset_table(unsigned char **pData,
unsigned int *pDataLen,
struct chmLzxcResetTable *dest)
{
/* we only know how to deal with 0x28-byte structures */
if (*pDataLen != _CHM_LZXC_RESETTABLE_V1_LEN)
return 0;
/* unmarshal fields */
_unmarshal_uint32 (pData, pDataLen, &dest->version);
_unmarshal_uint32 (pData, pDataLen, &dest->block_count);
_unmarshal_uint32 (pData, pDataLen, &dest->unknown);
_unmarshal_uint32 (pData, pDataLen, &dest->table_offset);
_unmarshal_uint64 (pData, pDataLen, &dest->uncompressed_len);
_unmarshal_uint64 (pData, pDataLen, &dest->compressed_len);
_unmarshal_uint64 (pData, pDataLen, &dest->block_len);
/* check structure */
if (dest->version != 2)
return 0;
/* SumatraPDF: sanity check (huge values are usually due to broken files) */
if (dest->uncompressed_len > UINT_MAX || dest->compressed_len > UINT_MAX)
return 0;
if (dest->block_len == 0 || dest->block_len > UINT_MAX)
return 0;
return 1;
}
/* structure of LZXC control data block */
#define _CHM_LZXC_MIN_LEN (0x18)
#define _CHM_LZXC_V2_LEN (0x1c)
struct chmLzxcControlData
{
UInt32 size; /* 0 */
char signature[4]; /* 4 (LZXC) */
UInt32 version; /* 8 */
UInt32 resetInterval; /* c */
UInt32 windowSize; /* 10 */
UInt32 windowsPerReset; /* 14 */
UInt32 unknown_18; /* 18 */
};
static int _unmarshal_lzxc_control_data(unsigned char **pData,
unsigned int *pDataLen,
struct chmLzxcControlData *dest)
{
/* we want at least 0x18 bytes */
if (*pDataLen < _CHM_LZXC_MIN_LEN)
return 0;
/* unmarshal fields */
_unmarshal_uint32 (pData, pDataLen, &dest->size);
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_uint32 (pData, pDataLen, &dest->version);
_unmarshal_uint32 (pData, pDataLen, &dest->resetInterval);
_unmarshal_uint32 (pData, pDataLen, &dest->windowSize);
_unmarshal_uint32 (pData, pDataLen, &dest->windowsPerReset);
if (*pDataLen >= _CHM_LZXC_V2_LEN)
_unmarshal_uint32 (pData, pDataLen, &dest->unknown_18);
else
dest->unknown_18 = 0;
if (dest->version == 2)
{
dest->resetInterval *= 0x8000;
dest->windowSize *= 0x8000;
}
if (dest->windowSize == 0 || dest->resetInterval == 0)
return 0;
/* for now, only support resetInterval a multiple of windowSize/2 */
if (dest->windowSize == 1)
return 0;
if ((dest->resetInterval % (dest->windowSize/2)) != 0)
return 0;
/* check structure */
if (memcmp(dest->signature, "LZXC", 4) != 0)
return 0;
return 1;
}
/* the structure used for chm file handles */
struct chmFile
{
#ifdef WIN32
HANDLE fd;
#else
int fd;
#endif
#ifdef CHM_MT
#ifdef WIN32
CRITICAL_SECTION mutex;
CRITICAL_SECTION lzx_mutex;
CRITICAL_SECTION cache_mutex;
#else
pthread_mutex_t mutex;
pthread_mutex_t lzx_mutex;
pthread_mutex_t cache_mutex;
#endif
#endif
UInt64 dir_offset;
UInt64 dir_len;
UInt64 data_offset;
Int32 index_root;
Int32 index_head;
UInt32 block_len;
UInt64 span;
struct chmUnitInfo rt_unit;
struct chmUnitInfo cn_unit;
struct chmLzxcResetTable reset_table;
/* LZX control data */
int compression_enabled;
UInt32 window_size;
UInt32 reset_interval;
UInt32 reset_blkcount;
/* decompressor state */
struct LZXstate *lzx_state;
int lzx_last_block;
/* cache for decompressed blocks */
UChar **cache_blocks;
UInt64 *cache_block_indices;
Int32 cache_num_blocks;
};
/*
* utility functions local to this module
*/
/* utility function to handle differences between {pread,read}(64)? */
static Int64 _chm_fetch_bytes(struct chmFile *h,
UChar *buf,
UInt64 os,
Int64 len)
{
Int64 readLen=0, oldOs=0;
if (h->fd == CHM_NULL_FD)
return readLen;
CHM_ACQUIRE_LOCK(h->mutex);
#ifdef CHM_USE_WIN32IO
/* NOTE: this might be better done with CreateFileMapping, et cetera... */
{
DWORD origOffsetLo=0, origOffsetHi=0;
DWORD offsetLo, offsetHi;
DWORD actualLen=0;
/* awkward Win32 Seek/Tell */
offsetLo = (unsigned int)(os & 0xffffffffL);
offsetHi = (unsigned int)((os >> 32) & 0xffffffffL);
origOffsetLo = SetFilePointer(h->fd, 0, &origOffsetHi, FILE_CURRENT);
offsetLo = SetFilePointer(h->fd, offsetLo, &offsetHi, FILE_BEGIN);
/* read the data */
if (ReadFile(h->fd,
buf,
(DWORD)len,
&actualLen,
NULL) == TRUE)
readLen = actualLen;
else
readLen = 0;
/* restore original position */
SetFilePointer(h->fd, origOffsetLo, &origOffsetHi, FILE_BEGIN);
}
#else
#ifdef CHM_USE_PREAD
#ifdef CHM_USE_IO64
readLen = pread64(h->fd, buf, (long)len, os);
#else
readLen = pread(h->fd, buf, (long)len, (unsigned int)os);
#endif
#else
#ifdef CHM_USE_IO64
oldOs = lseek64(h->fd, 0, SEEK_CUR);
lseek64(h->fd, os, SEEK_SET);
readLen = read(h->fd, buf, len);
lseek64(h->fd, oldOs, SEEK_SET);
#else
oldOs = lseek(h->fd, 0, SEEK_CUR);
lseek(h->fd, (long)os, SEEK_SET);
readLen = read(h->fd, buf, len);
lseek(h->fd, (long)oldOs, SEEK_SET);
#endif
#endif
#endif
CHM_RELEASE_LOCK(h->mutex);
return readLen;
}
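/*
 * Illustration only (not part of the original library): the CHM_USE_PREAD
 * branch above relies on pread() reading at an absolute offset without
 * moving the descriptor's file position, which is why it needs no
 * save/seek/restore sequence. The sketch below shows that property on a
 * POSIX system. The name demo_fetch_bytes is hypothetical, and the retry
 * loop for short reads is an addition that the library code does not have.
 */

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes starting at offset.
 * Returns the number of bytes read (short at end of file) or -1 on error.
 * The descriptor's current position is left unchanged by pread(). */
static long demo_fetch_bytes(int fd, unsigned char *buf, off_t offset,
                             size_t len)
{
    size_t total = 0;
    while (total < len)
    {
        ssize_t n = pread(fd, buf + total, len - total,
                          offset + (off_t)total);
        if (n < 0)
            return -1;      /* I/O error */
        if (n == 0)
            break;          /* end of file */
        total += (size_t)n;
    }
    return (long)total;
}
```

/*
 * The lseek/read fallback has to restore the old position explicitly and
 * therefore depends on the handle mutex to stay consistent across threads.
 */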
/* open an ITS archive */
#ifdef PPC_BSTR
/* RWE 6/12/2003 */
struct chmFile *chm_open(BSTR filename)
#else
struct chmFile *chm_open(const char *filename)
#endif
{
unsigned char sbuffer[256];
unsigned int sremain;
unsigned char *sbufpos;
struct chmFile *newHandle=NULL;
struct chmItsfHeader itsfHeader;
struct chmItspHeader itspHeader;
#if 0
struct chmUnitInfo uiSpan;
#endif
struct chmUnitInfo uiLzxc;
struct chmLzxcControlData ctlData;
/* allocate handle */
newHandle = (struct chmFile *)malloc(sizeof(struct chmFile));
if (newHandle == NULL)
return NULL;
newHandle->fd = CHM_NULL_FD;
newHandle->lzx_state = NULL;
newHandle->cache_blocks = NULL;
newHandle->cache_block_indices = NULL;
newHandle->cache_num_blocks = 0;
/* open file */
#ifdef WIN32
#ifdef PPC_BSTR
if ((newHandle->fd=CreateFile(filename,
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL)) == CHM_NULL_FD)
{
free(newHandle);
return NULL;
}
#else
if ((newHandle->fd=CreateFileA(filename,
GENERIC_READ,
0,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL)) == CHM_NULL_FD)
{
free(newHandle);
return NULL;
}
#endif
#else
if ((newHandle->fd=open(filename, O_RDONLY)) == CHM_NULL_FD)
{
free(newHandle);
return NULL;
}
#endif
/* initialize mutexes, if needed */
#ifdef CHM_MT
#ifdef WIN32
InitializeCriticalSection(&newHandle->mutex);
InitializeCriticalSection(&newHandle->lzx_mutex);
InitializeCriticalSection(&newHandle->cache_mutex);
#else
pthread_mutex_init(&newHandle->mutex, NULL);
pthread_mutex_init(&newHandle->lzx_mutex, NULL);
pthread_mutex_init(&newHandle->cache_mutex, NULL);
#endif
#endif
/* read and verify header */
sremain = _CHM_ITSF_V3_LEN;
sbufpos = sbuffer;
if (_chm_fetch_bytes(newHandle, sbuffer, (UInt64)0, sremain) != sremain ||
!_unmarshal_itsf_header(&sbufpos, &sremain, &itsfHeader))
{
chm_close(newHandle);
return NULL;
}
/* stash important values from header */
newHandle->dir_offset = itsfHeader.dir_offset;
newHandle->dir_len = itsfHeader.dir_len;
newHandle->data_offset = itsfHeader.data_offset;
/* now, read and verify the directory header chunk */
sremain = _CHM_ITSP_V1_LEN;
sbufpos = sbuffer;
if (_chm_fetch_bytes(newHandle, sbuffer,
(UInt64)itsfHeader.dir_offset, sremain) != sremain ||
!_unmarshal_itsp_header(&sbufpos, &sremain, &itspHeader))
{
chm_close(newHandle);
return NULL;
}
/* grab essential information from ITSP header */
newHandle->dir_offset += itspHeader.header_len;
newHandle->dir_len -= itspHeader.header_len;
newHandle->index_root = itspHeader.index_root;
newHandle->index_head = itspHeader.index_head;
newHandle->block_len = itspHeader.block_len;
/* if the index root is -1, this means we don't have any PMGI blocks.
* as a result, we must use the sole PMGL block as the index root
*/
if (newHandle->index_root <= -1)
newHandle->index_root = newHandle->index_head;
/* By default, compression is enabled. */
newHandle->compression_enabled = 1;
/* Jed, Sun Jun 27: 'span' doesn't seem to be used anywhere?! */
#if 0
/* fetch span */
if (CHM_RESOLVE_SUCCESS != chm_resolve_object(newHandle,
_CHMU_SPANINFO,
&uiSpan) ||
uiSpan.space == CHM_COMPRESSED)
{
chm_close(newHandle);
return NULL;
}
/* N.B.: we've already checked that uiSpan is in the uncompressed section,
* so this should not require attempting to decompress, which may
* rely on having a valid "span"
*/
sremain = 8;
sbufpos = sbuffer;
if (chm_retrieve_object(newHandle, &uiSpan, sbuffer,
0, sremain) != sremain ||
!_unmarshal_uint64(&sbufpos, &sremain, &newHandle->span))
{
chm_close(newHandle);
return NULL;
}
#endif
/* prefetch most commonly needed unit infos */
if (CHM_RESOLVE_SUCCESS != chm_resolve_object(newHandle,
_CHMU_RESET_TABLE,
&newHandle->rt_unit) ||
newHandle->rt_unit.space == CHM_COMPRESSED ||
CHM_RESOLVE_SUCCESS != chm_resolve_object(newHandle,
_CHMU_CONTENT,
&newHandle->cn_unit) ||
newHandle->cn_unit.space == CHM_COMPRESSED ||
CHM_RESOLVE_SUCCESS != chm_resolve_object(newHandle,
_CHMU_LZXC_CONTROLDATA,
&uiLzxc) ||
uiLzxc.space == CHM_COMPRESSED)
{
newHandle->compression_enabled = 0;
}
/* read reset table info */
if (newHandle->compression_enabled)
{
sremain = _CHM_LZXC_RESETTABLE_V1_LEN;
sbufpos = sbuffer;
if (chm_retrieve_object(newHandle, &newHandle->rt_unit, sbuffer,
0, sremain) != sremain ||
!_unmarshal_lzxc_reset_table(&sbufpos, &sremain,
&newHandle->reset_table))
{
newHandle->compression_enabled = 0;
}
}
/* read control data */
if (newHandle->compression_enabled)
{
sremain = (unsigned int)uiLzxc.length;
if (uiLzxc.length > sizeof(sbuffer))
{
chm_close(newHandle);
return NULL;
}
sbufpos = sbuffer;
if (chm_retrieve_object(newHandle, &uiLzxc, sbuffer,
0, sremain) != sremain ||
!_unmarshal_lzxc_control_data(&sbufpos, &sremain,
&ctlData))
{
newHandle->compression_enabled = 0;
}
else /* SumatraPDF: prevent division by zero */
{
newHandle->window_size = ctlData.windowSize;
newHandle->reset_interval = ctlData.resetInterval;
/* Jed, Mon Jun 28: Experimentally, it appears that the reset block count */
/* must be multiplied by this formerly unknown ctrl data field in */
/* order to decompress some files. */
#if 0
newHandle->reset_blkcount = newHandle->reset_interval /
(newHandle->window_size / 2);
#else
newHandle->reset_blkcount = newHandle->reset_interval /
(newHandle->window_size / 2) *
ctlData.windowsPerReset;
#endif
}
}
/* initialize cache */
chm_set_param(newHandle, CHM_PARAM_MAX_BLOCKS_CACHED,
CHM_MAX_BLOCKS_CACHED);
return newHandle;
}
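/*
 * Illustration only (not part of the original library): the reset block
 * count computed at the end of chm_open combines three control data fields.
 * For version 2 control data, resetInterval and windowSize are stored in
 * units of 0x8000 bytes and are scaled first. The sketch below repeats that
 * arithmetic in isolation. The name demo_reset_blkcount and the convention
 * of returning 0 for rejected input are assumptions made for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of blocks between LZX resets, or 0 when the
 * control data would be rejected by the checks in the unmarshal routine. */
static uint32_t demo_reset_blkcount(uint32_t version,
                                    uint32_t reset_interval,
                                    uint32_t window_size,
                                    uint32_t windows_per_reset)
{
    if (version == 2)
    {
        /* version 2 stores both values in 0x8000-byte units */
        reset_interval *= 0x8000;
        window_size    *= 0x8000;
    }
    /* window_size of 0 or 1 would make window_size/2 zero */
    if (window_size < 2 || reset_interval == 0)
        return 0;
    /* only a reset interval that is a multiple of half a window is handled */
    if (reset_interval % (window_size / 2) != 0)
        return 0;
    return reset_interval / (window_size / 2) * windows_per_reset;
}
```

/*
 * With version 2 data where resetInterval, windowSize and windowsPerReset
 * are 2, 2 and 1, both sizes scale to 0x10000 bytes and the result is 2.
 */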
/* close an ITS archive */
void chm_close(struct chmFile *h)
{
if (h != NULL)
{
if (h->fd != CHM_NULL_FD)
CHM_CLOSE_FILE(h->fd);
h->fd = CHM_NULL_FD;
#ifdef CHM_MT
#ifdef WIN32
DeleteCriticalSection(&h->mutex);
DeleteCriticalSection(&h->lzx_mutex);
DeleteCriticalSection(&h->cache_mutex);
#else
pthread_mutex_destroy(&h->mutex);
pthread_mutex_destroy(&h->lzx_mutex);
pthread_mutex_destroy(&h->cache_mutex);
#endif
#endif
if (h->lzx_state)
LZXteardown(h->lzx_state);
h->lzx_state = NULL;
if (h->cache_blocks)
{
int i;
for (i=0; i<h->cache_num_blocks; i++)
{
if (h->cache_blocks[i])
free(h->cache_blocks[i]);
}
free(h->cache_blocks);
h->cache_blocks = NULL;
}
if (h->cache_block_indices)
free(h->cache_block_indices);
h->cache_block_indices = NULL;
free(h);
}
}
/*
* set a parameter on the file handle.
* valid parameter types:
* CHM_PARAM_MAX_BLOCKS_CACHED:
* how many decompressed blocks should be cached? A simple
* caching scheme is used, wherein the index of the block is
* used as a hash value, and hash collision results in the
* invalidation of the previously cached block.
*/
void chm_set_param(struct chmFile *h,
int paramType,
int paramVal)
{
switch (paramType)
{
case CHM_PARAM_MAX_BLOCKS_CACHED:
CHM_ACQUIRE_LOCK(h->cache_mutex);
if (paramVal != h->cache_num_blocks)
{
UChar **newBlocks;
UInt64 *newIndices;
int i;
/* allocate new cached blocks */
newBlocks = (UChar **)malloc(paramVal * sizeof (UChar *));
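if (newBlocks == NULL)
{
/* release the cache lock before bailing out */
CHM_RELEASE_LOCK(h->cache_mutex);
return;
}
newIndices = (UInt64 *)malloc(paramVal * sizeof (UInt64));
if (newIndices == NULL)
{
free(newBlocks);
CHM_RELEASE_LOCK(h->cache_mutex);
return;
}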
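for (i=0; i<paramVal; i++)
{
newBlocks[i] = NULL;
newIndices[i] = 0;
}
/* re-distribute old cached blocks */
if (h->cache_blocks)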
{
for (i=0; i<h->cache_num_blocks; i++)
{
int newSlot = (int)(h->cache_block_indices[i] % paramVal);
if (h->cache_blocks[i])
{
/* in case of collision, destroy newcomer */
if (newBlocks[newSlot])
{
free(h->cache_blocks[i]);
h->cache_blocks[i] = NULL;
}
else
{
newBlocks[newSlot] = h->cache_blocks[i];
newIndices[newSlot] =
h->cache_block_indices[i];
}
}
}
free(h->cache_blocks);
free(h->cache_block_indices);
}
/* now, set new values */
h->cache_blocks = newBlocks;
h->cache_block_indices = newIndices;
h->cache_num_blocks = paramVal;
}
CHM_RELEASE_LOCK(h->cache_mutex);
break;
default:
break;
}
}
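/*
 * Illustration only (not part of the original library): the caching scheme
 * described above is a direct-mapped cache. The block index modulo the
 * number of slots selects the slot, and a new block simply replaces
 * whatever was stored there. The sketch below shows that behaviour with an
 * int standing in for a decompressed block. All demo_* names are
 * hypothetical.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_SLOTS 5   /* same value as the default CHM_MAX_BLOCKS_CACHED */

struct demo_cache
{
    int      used[DEMO_CACHE_SLOTS];   /* 1 if the slot holds a block */
    uint64_t block[DEMO_CACHE_SLOTS];  /* index of the block in the slot */
    int      value[DEMO_CACHE_SLOTS];  /* stand-in for the block contents */
};

/* Store a block; a colliding block in the same slot is overwritten. */
static void demo_cache_put(struct demo_cache *c, uint64_t block, int value)
{
    size_t slot = (size_t)(block % DEMO_CACHE_SLOTS);
    c->used[slot]  = 1;
    c->block[slot] = block;
    c->value[slot] = value;
}

/* Look a block up; return 1 and fill *value on a hit, 0 on a miss. */
static int demo_cache_get(const struct demo_cache *c, uint64_t block,
                          int *value)
{
    size_t slot = (size_t)(block % DEMO_CACHE_SLOTS);
    if (!c->used[slot] || c->block[slot] != block)
        return 0;
    *value = c->value[slot];
    return 1;
}
```

/*
 * With five slots, blocks 2 and 7 share slot 2, so storing block 7 evicts
 * block 2. Lookups stay O(1) at the cost of such collisions.
 */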
/*
* helper methods for chm_resolve_object
*/
/* skip a compressed dword */
static void _chm_skip_cword(UChar **pEntry)
{
while (*(*pEntry)++ >= 0x80)
;
}
/* skip the data from a PMGL entry */
static void _chm_skip_PMGL_entry_data(UChar **pEntry)
{
_chm_skip_cword(pEntry);
_chm_skip_cword(pEntry);
_chm_skip_cword(pEntry);
}
/* parse a compressed dword */
static UInt64 _chm_parse_cword(UChar **pEntry)
{
UInt64 accum = 0;
UChar temp;
while ((temp=*(*pEntry)++) >= 0x80)
{
accum <<= 7;
accum += temp & 0x7f;
}
return (accum << 7) + temp;
}
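/*
 * Illustration only (not part of the original library): a compressed dword
 * is a variable-length integer stored most significant group first, seven
 * bits per byte, where a set high bit means another byte follows. The
 * sketch below decodes the same encoding as _chm_parse_cword but also
 * stops at an explicit end pointer. That bounds check and the name
 * demo_parse_cword are assumptions of this example, not library behaviour.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode one compressed dword from [*cursor, end).
 * Returns 1 and stores the value on success, or 0 if the input ends
 * before a terminating byte (high bit clear) is seen. */
static int demo_parse_cword(const unsigned char **cursor,
                            const unsigned char *end,
                            uint64_t *out)
{
    uint64_t accum = 0;
    while (*cursor < end)
    {
        unsigned char byte = *(*cursor)++;
        /* shift in the next seven payload bits */
        accum = (accum << 7) | (uint64_t)(byte & 0x7f);
        if (byte < 0x80)
        {
            *out = accum;
            return 1;
        }
    }
    return 0;
}
```

/*
 * For example, the two bytes 0x82 0x2c decode to (2 << 7) + 0x2c = 300,
 * while any single byte below 0x80 decodes to itself.
 */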
/* parse a utf-8 string into an ASCII char buffer */
static int _chm_parse_UTF8(UChar **pEntry, UInt64 count, char *path)
{
/* XXX: implement UTF-8 support, including a real mapping onto
* ISO-8859-1? probably there is a library to do this? As is
* immediately apparent from the below code, I'm presently not doing
* any special handling for files in which none of the strings contain
* UTF-8 multi-byte characters.
*/
while (count != 0)
{
*path++ = (char)(*(*pEntry)++);
--count;
}
*path = '\0';
return 1;
}
/* parse a PMGL entry into a chmUnitInfo struct; return 1 on success. */
static int _chm_parse_PMGL_entry(UChar **pEntry, struct chmUnitInfo *ui)
{
UInt64 strLen;
/* parse str len */
strLen = _chm_parse_cword(pEntry);
if (strLen > CHM_MAX_PATHLEN)
return 0;
/* parse path */
if (! _chm_parse_UTF8(pEntry, strLen, ui->path))
return 0;
/* parse info */
ui->space = (int)_chm_parse_cword(pEntry);
ui->start = _chm_parse_cword(pEntry);
ui->length = _chm_parse_cword(pEntry);
return 1;
}
/* find an exact entry in PMGL; return NULL if we fail */
static UChar *_chm_find_in_PMGL(UChar *page_buf,
UInt32 block_len,
const char *objPath)
{
/* XXX: modify this to do a binary search using the nice index structure
* that is provided for us.
*/
struct chmPmglHeader header;
unsigned int hremain;
UChar *end;
UChar *cur;
UChar *temp;
UInt64 strLen;
char buffer[CHM_MAX_PATHLEN+1];
/* figure out where to start and end */
cur = page_buf;
hremain = _CHM_PMGL_LEN;
if (! _unmarshal_pmgl_header(&cur, &hremain, block_len, &header))
return NULL;
end = page_buf + block_len - (header.free_space);
/* now, scan progressively */
while (cur < end)
{
/* grab the name */
temp = cur;
strLen = _chm_parse_cword(&cur);
if (strLen > CHM_MAX_PATHLEN)
return NULL;
if (! _chm_parse_UTF8(&cur, strLen, buffer))
return NULL;
/* check if it is the right name */
if (! strcasecmp(buffer, objPath))
return temp;
_chm_skip_PMGL_entry_data(&cur);
}
return NULL;
}
/* find which block should be searched next for the entry; -1 if no block */
static Int32 _chm_find_in_PMGI(UChar *page_buf,
UInt32 block_len,
const char *objPath)
{
/* XXX: modify this to do a binary search using the nice index structure
* that is provided for us
*/
struct chmPmgiHeader header;
unsigned int hremain;
int page=-1;
UChar *end;
UChar *cur;
UInt64 strLen;
char buffer[CHM_MAX_PATHLEN+1];
/* figure out where to start and end */
cur = page_buf;
hremain = _CHM_PMGI_LEN;
if (! _unmarshal_pmgi_header(&cur, &hremain, block_len, &header))
return -1;
end = page_buf + block_len - (header.free_space);
/* now, scan progressively */
while (cur < end)
{
/* grab the name */
strLen = _chm_parse_cword(&cur);
if (strLen > CHM_MAX_PATHLEN)
return -1;
if (! _chm_parse_UTF8(&cur, strLen, buffer))
return -1;
/* check if it is the right name */
if (strcasecmp(buffer, objPath) > 0)
return page;
/* load next value for path */
page = (int)_chm_parse_cword(&cur);
}
return page;
}
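/*
 * Illustration only (not part of the original library): _chm_find_in_PMGI
 * walks the index entries in order and remembers the page of the last
 * entry whose name does not sort after the requested path. The sketch
 * below performs the same linear scan over an in-memory array instead of
 * a raw index block. The struct, the function names and the hand-written
 * case-insensitive comparison are assumptions made for this example.
 */

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical in-memory form of one index entry. */
struct demo_index_entry
{
    const char *first_name;  /* first name stored in the referenced page */
    int         page;        /* page (block) number to descend into */
};

/* ASCII case-insensitive comparison with strcmp-style results. */
static int demo_casecmp(const char *a, const char *b)
{
    for (;;)
    {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb || ca == '\0')
            return ca - cb;
        a++;
        b++;
    }
}

/* Return the page to search next for name, or -1 if name sorts before
 * every entry in the index. */
static int demo_find_page(const struct demo_index_entry *entries,
                          size_t count, const char *name)
{
    int page = -1;
    size_t i;
    for (i = 0; i < count; i++)
    {
        /* the first entry that sorts after name ends the scan */
        if (demo_casecmp(entries[i].first_name, name) > 0)
            return page;
        page = entries[i].page;
    }
    return page;
}
```

/*
 * Since the entries are sorted, the same result could be found with a
 * binary search, which is the improvement the XXX comment above suggests.
 */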
/* resolve a particular object from the archive */
int chm_resolve_object(struct chmFile *h,
const char *objPath,
struct chmUnitInfo *ui)
{
/*
* XXX: implement caching scheme for dir pages
*/
Int32 curPage;
/* buffer to hold whatever page we're looking at */
/* RWE 6/12/2003 */
UChar *page_buf = malloc(h->block_len);
if (page_buf == NULL)
return CHM_RESOLVE_FAILURE;
/* starting page */
curPage = h->index_root;
/* until we have either returned or given up */
while (curPage != -1)
{
/* try to fetch the index page */
if (_chm_fetch_bytes(h, page_buf,
(UInt64)h->dir_offset + (UInt64)curPage*h->block_len,
h->block_len) != h->block_len)
{
free(page_buf);
return CHM_RESOLVE_FAILURE;
}
/* now, if it is a leaf node: */
if (memcmp(page_buf, _chm_pmgl_marker, 4) == 0)
{
/* scan block */
UChar *pEntry = _chm_find_in_PMGL(page_buf,
h->block_len,
objPath);
if (pEntry == NULL)
{
free(page_buf);
return CHM_RESOLVE_FAILURE;
}
/* parse entry and return */
_chm_parse_PMGL_entry(&pEntry, ui);
free(page_buf);
return CHM_RESOLVE_SUCCESS;
}
/* else, if it is a branch node: */
else if (memcmp(page_buf, _chm_pmgi_marker, 4) == 0)
curPage = _chm_find_in_PMGI(page_buf, h->block_len, objPath);
/* else, we are confused. give up. */
else
{
free(page_buf);
return CHM_RESOLVE_FAILURE;
}
}
/* didn't find anything. fail. */
free(page_buf);
return CHM_RESOLVE_FAILURE;
}
/*
* utility methods for dealing with compressed data
*/
/* get the bounds of a compressed block. return 0 on failure */
static int _chm_get_cmpblock_bounds(struct chmFile *h,
UInt64 block,
UInt64 *start,
Int64 *len)
{
UChar buffer[8], *dummy;
unsigned int remain;
/* for all but the last block, use the reset table */
if (block < h->reset_table.block_count-1)
{
/* unpack the start address */
dummy = buffer;
remain = 8;
if (_chm_fetch_bytes(h, buffer,
(UInt64)h->data_offset
+ (UInt64)h->rt_unit.start
+ (UInt64)h->reset_table.table_offset
+ (UInt64)block*8,
remain) != remain ||
!_unmarshal_uint64(&dummy, &remain, start))
return 0;
/* unpack the end address */
dummy = buffer;
remain = 8;
if (_chm_fetch_bytes(h, buffer,
(UInt64)h->data_offset
+ (UInt64)h->rt_unit.start
+ (UInt64)h->reset_table.table_offset
+ (UInt64)block*8 + 8,
remain) != remain ||
!_unmarshal_int64(&dummy, &remain, len))
return 0;
}
/* for the last block, the end address is the total compressed length
 * recorded in the reset table, since there is no following table entry */
else
{
/* unpack the start address */
dummy = buffer;
remain = 8;
if (_chm_fetch_bytes(h, buffer,
(UInt64)h->data_offset
+ (UInt64)h->rt_unit.start
+ (UInt64)h->reset_table.table_offset
+ (UInt64)block*8,
remain) != remain ||
!_unmarshal_uint64(&dummy, &remain, start))
return 0;
*len = h->reset_table.compressed_len;
}
/* compute the length and absolute start address */
*len -= *start;
*start += h->data_offset + h->cn_unit.start;
return 1;
}
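/*
 * Illustration only (not part of the original library): the bounds of a
 * compressed block come from two neighbouring reset table entries, except
 * for the last block, whose end is the total compressed length. The sketch
 * below performs that lookup on a table already held in memory and returns
 * offsets relative to the start of the compressed content, whereas the
 * library reads the entries from the file and adds data_offset and
 * cn_unit.start afterwards. The name demo_block_bounds and the extra
 * range checks are assumptions of this example.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the start and length of one compressed
 * block. Returns 1 on success, 0 if block is out of range or the table
 * entries are not in increasing order. */
static int demo_block_bounds(const uint64_t *table,
                             uint32_t block_count,
                             uint64_t compressed_len,
                             uint32_t block,
                             uint64_t *start,
                             uint64_t *len)
{
    uint64_t end;
    if (block >= block_count)
        return 0;
    *start = table[block];
    /* the last block has no successor entry; it ends at compressed_len */
    end = (block + 1 < block_count) ? table[block + 1] : compressed_len;
    if (end < *start)
        return 0;
    *len = end - *start;
    return 1;
}
```

/*
 * For a table of {0, 100, 250} and a compressed length of 400, block 1
 * spans 150 bytes starting at offset 100 and block 2 spans the final 150.
 */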
/* decompress the block. must have lzx_mutex. */
static Int64 _chm_decompress_block(struct chmFile *h,
UInt64 block,
UChar **ubuffer)
{
UChar *cbuffer = malloc(((unsigned int)h->reset_table.block_len + 6144));
UInt64 cmpStart; /* compressed start */
Int64 cmpLen; /* compressed len */
int indexSlot; /* cache index slot */
UChar *lbuffer; /* local buffer ptr */
UInt32 blockAlign = (UInt32)(block % h->reset_blkcount); /* reset intvl. aln. */
UInt32 i; /* local loop index */
if (cbuffer == NULL)
return -1;
/* let the caching system pull its weight! */
if (block - blockAlign <= h->lzx_last_block &&
block >= h->lzx_last_block)
blockAlign = (block - h->lzx_last_block);
/* check if we need previous blocks */
if (blockAlign != 0)
{
/* fetch all required previous blocks since last reset */
for (i = blockAlign; i > 0; i--)
{
UInt32 curBlockIdx = block - i;
/* check if we most recently decompressed the previous block */
if (h->lzx_last_block != (int)curBlockIdx)
{
if ((curBlockIdx % h->reset_blkcount) == 0)
{
#ifdef CHM_DEBUG
fprintf(stderr, "***RESET (1)***\n");
#endif
LZXreset(h->lzx_state);
}
indexSlot = (int)((curBlockIdx) % h->cache_num_blocks);
if (! h->cache_blocks[indexSlot])
h->cache_blocks[indexSlot] = (UChar *)malloc((unsigned int)(h->reset_table.block_len));
if (! h->cache_blocks[indexSlot])
{
free(cbuffer);
return -1;
}
h->cache_block_indices[indexSlot] = curBlockIdx;
lbuffer = h->cache_blocks[indexSlot];
/* decompress the previous block */
#ifdef CHM_DEBUG
fprintf(stderr, "Decompressing block #%4d (EXTRA)\n", curBlockIdx);
#endif
if (!_chm_get_cmpblock_bounds(h, curBlockIdx, &cmpStart, &cmpLen) ||
cmpLen < 0 ||
cmpLen > h->reset_table.block_len + 6144 ||
_chm_fetch_bytes(h, cbuffer, cmpStart, cmpLen) != cmpLen ||
LZXdecompress(h->lzx_state, cbuffer, lbuffer, (int)cmpLen,
(int)h->reset_table.block_len) != DECR_OK)
{
#ifdef CHM_DEBUG
fprintf(stderr, " (DECOMPRESS FAILED!)\n");
#endif
free(cbuffer);
return (Int64)0;
}
h->lzx_last_block = (int)curBlockIdx;
}
}
}
else
{
if ((block % h->reset_blkcount) == 0)
{
#ifdef CHM_DEBUG
fprintf(stderr, "***RESET (2)***\n");
#endif
LZXreset(h->lzx_state);
}
}
/* allocate slot in cache */
indexSlot = (int)(block % h->cache_num_blocks);
if (! h->cache_blocks[indexSlot])
h->cache_blocks[indexSlot] = (UChar *)malloc(((unsigned int)h->reset_table.block_len));
if (! h->cache_blocks[indexSlot])
{
free(cbuffer);
return -1;
}
h->cache_block_indices[indexSlot] = block;
lbuffer = h->cache_blocks[indexSlot];
*ubuffer = lbuffer;
/* decompress the block we actually want */
#ifdef CHM_DEBUG
fprintf(stderr, "Decompressing block #%4d (REAL )\n", block);
#endif
if (! _chm_get_cmpblock_bounds(h, block, &cmpStart, &cmpLen) ||
cmpLen < 0 ||
cmpLen > h->reset_table.block_len + 6144 || /* must fit in cbuffer */
_chm_fetch_bytes(h, cbuffer, cmpStart, cmpLen) != cmpLen ||
LZXdecompress(h->lzx_state, cbuffer, lbuffer, (int)cmpLen,
(int)h->reset_table.block_len) != DECR_OK)
{
#ifdef CHM_DEBUG
fprintf(stderr, " (DECOMPRESS FAILED!)\n");
#endif
free(cbuffer);
return (Int64)0;
}
h->lzx_last_block = (int)block;
/* XXX: modify LZX routines to return the length of the data they
* decompressed and return that instead, for an extra sanity check.
*/
free(cbuffer);
return h->reset_table.block_len;
}
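/* The "which earlier blocks must be decoded first" logic above can be
 * sketched in isolation. This is an illustration under assumptions, not the
 * library's API: replay_count is a hypothetical name, and signed 64-bit
 * values are used throughout so that last_block == -1 (nothing decoded yet)
 * compares naturally.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Number of blocks before `block` that still have to be decompressed,
 * given the reset interval and the most recently decompressed block. */
static int64_t replay_count(int64_t block, int64_t reset_interval,
                            int64_t last_block)
{
    int64_t align, i, count = 0;

    assert(block >= 0 && reset_interval > 0);
    /* distance back to the last reset point */
    align = block % reset_interval;
    /* if the decoder already sits inside this reset interval,
     * resume from where it stopped instead of from the reset point */
    if (last_block >= block - align && last_block <= block)
        align = block - last_block;
    for (i = align; i > 0; i--)
        if (block - i != last_block)
            count++;
    return count;
}
```

/* With a reset interval of 4, block 7 needs blocks 4, 5 and 6 on a cold
 * start, but only 5 and 6 if block 4 was the last one decompressed.
 */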
/* grab a region from a compressed block */
static Int64 _chm_decompress_region(struct chmFile *h,
UChar *buf,
UInt64 start,
Int64 len)
{
UInt64 nBlock, nOffset;
UInt64 nLen;
UInt64 gotLen;
UChar *ubuffer;
if (len <= 0)
return (Int64)0;
/* figure out what we need to read */
nBlock = start / h->reset_table.block_len;
nOffset = start % h->reset_table.block_len;
nLen = len;
if (nLen > (h->reset_table.block_len - nOffset))
nLen = h->reset_table.block_len - nOffset;
/* if block is cached, return data from it. */
CHM_ACQUIRE_LOCK(h->lzx_mutex);
CHM_ACQUIRE_LOCK(h->cache_mutex);
if (h->cache_block_indices[nBlock % h->cache_num_blocks] == nBlock &&
h->cache_blocks[nBlock % h->cache_num_blocks] != NULL)
{
memcpy(buf,
h->cache_blocks[nBlock % h->cache_num_blocks] + nOffset,
(unsigned int)nLen);
CHM_RELEASE_LOCK(h->cache_mutex);
CHM_RELEASE_LOCK(h->lzx_mutex);
return nLen;
}
CHM_RELEASE_LOCK(h->cache_mutex);
/* data request not satisfied, so... start up the decompressor machine */
if (! h->lzx_state)
{
int window_size = ffs(h->window_size) - 1;
h->lzx_last_block = -1;
h->lzx_state = LZXinit(window_size);
}
/* decompress some data */
gotLen = _chm_decompress_block(h, nBlock, &ubuffer);
/* SumatraPDF: check return value */
if (gotLen == (UInt64)-1)
{
CHM_RELEASE_LOCK(h->lzx_mutex);
return 0;
}
if (gotLen < nLen)
nLen = gotLen;
memcpy(buf, ubuffer+nOffset, (unsigned int)nLen);
CHM_RELEASE_LOCK(h->lzx_mutex);
return nLen;
}
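/* A small sketch of how a request is mapped onto one uncompressed block,
 * mirroring the nBlock / nOffset / nLen arithmetic above. The struct chunk
 * type and split_request name are hypothetical.
 */

```c
#include <assert.h>
#include <stdint.h>

struct chunk {
    uint64_t block;   /* index of the block holding the first byte */
    uint64_t offset;  /* offset of that byte inside the block */
    uint64_t len;     /* bytes that can be served from this block */
};

static struct chunk split_request(uint64_t start, uint64_t len,
                                  uint64_t block_len)
{
    struct chunk c;

    assert(block_len > 0);
    c.block = start / block_len;
    c.offset = start % block_len;
    /* never read past the end of the block; the caller loops for the rest */
    c.len = len;
    if (c.len > block_len - c.offset)
        c.len = block_len - c.offset;
    return c;
}
```

/* Because the length is clipped to the block boundary, a single call may
 * return fewer bytes than requested, which is why chm_retrieve_object calls
 * the region routine in a loop.
 */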
/* retrieve (part of) an object */
LONGINT64 chm_retrieve_object(struct chmFile *h,
struct chmUnitInfo *ui,
unsigned char *buf,
LONGUINT64 addr,
LONGINT64 len)
{
/* must be valid file handle */
if (h == NULL)
return (Int64)0;
/* starting address must be in correct range */
if (addr < 0 || addr >= ui->length)
return (Int64)0;
/* clip length */
if (addr + len > ui->length)
len = ui->length - addr;
/* if the file is uncompressed, it's simple */
if (ui->space == CHM_UNCOMPRESSED)
{
/* read data */
return _chm_fetch_bytes(h,
buf,
(UInt64)h->data_offset + (UInt64)ui->start + (UInt64)addr,
len);
}
/* else if the file is compressed, it's a little trickier */
else /* ui->space == CHM_COMPRESSED */
{
Int64 swath=0, total=0;
/* if compression is not enabled for this file... */
if (! h->compression_enabled)
return total;
do {
/* swill another mouthful */
swath = _chm_decompress_region(h, buf, ui->start + addr, len);
/* if we didn't get any... */
if (swath == 0)
return total;
/* update stats */
total += swath;
len -= swath;
addr += swath;
buf += swath;
} while (len != 0);
return total;
}
}
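/* The do/while loop above is a standard "read until satisfied or starved"
 * pattern. A self-contained sketch follows; chunk_reader, read_all and
 * mock_reader are hypothetical stand-ins (the mock serves at most 4 bytes
 * per call to imitate block-limited reads) and are not part of chmlib.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*chunk_reader)(void *src, unsigned char *buf,
                               size_t addr, size_t len);

/* Keep calling rd until len bytes are read or it returns 0. */
static size_t read_all(chunk_reader rd, void *src, unsigned char *buf,
                       size_t addr, size_t len)
{
    size_t total = 0;

    assert(rd != NULL && buf != NULL);
    while (len != 0) {
        size_t got = rd(src, buf, addr, len);
        if (got == 0)
            break;              /* nothing more available */
        total += got;
        len -= got;
        addr += got;
        buf += got;
    }
    return total;
}

/* Mock source: a NUL-terminated string, served at most 4 bytes at a time. */
static size_t mock_reader(void *src, unsigned char *buf,
                          size_t addr, size_t len)
{
    const char *s = (const char *)src;
    size_t avail = strlen(s);

    if (addr >= avail)
        return 0;
    if (len > avail - addr)
        len = avail - addr;
    if (len > 4)
        len = 4;
    memcpy(buf, s + addr, len);
    return len;
}
```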
/* enumerate the objects in the .chm archive */
int chm_enumerate(struct chmFile *h,
int what,
CHM_ENUMERATOR e,
void *context)
{
Int32 curPage;
/* buffer to hold whatever page we're looking at */
/* RWE 6/12/2003 */
UChar *page_buf = malloc((unsigned int)h->block_len);
struct chmPmglHeader header;
UChar *end;
UChar *cur;
unsigned int lenRemain;
UInt64 ui_path_len;
/* the current ui */
struct chmUnitInfo ui;
int type_bits = (what & 0x7);
int filter_bits = (what & 0xF8);
if (page_buf == NULL)
return 0;
/* starting page */
curPage = h->index_head;
/* until we have either returned or given up */
while (curPage != -1)
{
/* try to fetch the index page */
if (_chm_fetch_bytes(h,
page_buf,
(UInt64)h->dir_offset + (UInt64)curPage*h->block_len,
h->block_len) != h->block_len)
{
free(page_buf);
return 0;
}
/* figure out start and end for this page */
cur = page_buf;
lenRemain = _CHM_PMGL_LEN;
if (! _unmarshal_pmgl_header(&cur, &lenRemain, h->block_len, &header))
{
free(page_buf);
return 0;
}
end = page_buf + h->block_len - (header.free_space);
/* loop over this page */
while (cur < end)
{
ui.flags = 0;
if (! _chm_parse_PMGL_entry(&cur, &ui))
{
free(page_buf);
return 0;
}
/* get the length of the path */
ui_path_len = strlen(ui.path)-1;
/* check for DIRS */
if (ui.path[ui_path_len] == '/')
ui.flags |= CHM_ENUMERATE_DIRS;
/* check for FILES */
if (ui.path[ui_path_len] != '/')
ui.flags |= CHM_ENUMERATE_FILES;
/* check for NORMAL vs. META */
if (ui.path[0] == '/')
{
/* check for NORMAL vs. SPECIAL */
if (ui.path[1] == '#' || ui.path[1] == '$')
ui.flags |= CHM_ENUMERATE_SPECIAL;
else
ui.flags |= CHM_ENUMERATE_NORMAL;
}
else
ui.flags |= CHM_ENUMERATE_META;
if (! (type_bits & ui.flags))
continue;
if (filter_bits && ! (filter_bits & ui.flags))
continue;
/* call the enumerator */
{
int status = (*e)(h, &ui, context);
switch (status)
{
case CHM_ENUMERATOR_FAILURE:
free(page_buf);
return 0;
case CHM_ENUMERATOR_CONTINUE:
break;
case CHM_ENUMERATOR_SUCCESS:
free(page_buf);
return 1;
default:
break;
}
}
}
/* advance to next page */
curPage = header.block_next;
}
free(page_buf);
return 1;
}
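/* The flag classification and the type/filter test used above can be shown
 * on their own. This is a sketch: the ENUM_* constants mirror the
 * CHM_ENUMERATE_* values declared in chm_lib.h, while classify_path and
 * wanted are hypothetical helper names.
 */

```c
#include <assert.h>
#include <string.h>

enum {
    ENUM_NORMAL = 1, ENUM_META = 2, ENUM_SPECIAL = 4,
    ENUM_FILES = 8, ENUM_DIRS = 16
};

/* Derive the enumeration flags from an entry's path alone. */
static int classify_path(const char *path)
{
    size_t n;
    int flags;

    assert(path != NULL);
    n = strlen(path);
    /* a trailing '/' marks a directory; everything else is a file */
    flags = (n > 0 && path[n - 1] == '/') ? ENUM_DIRS : ENUM_FILES;
    if (path[0] == '/')
        flags |= (path[1] == '#' || path[1] == '$') ? ENUM_SPECIAL
                                                    : ENUM_NORMAL;
    else
        flags |= ENUM_META;
    return flags;
}

/* Apply the caller's `what` mask: low 3 bits select the entry type,
 * the remaining bits optionally restrict to files or dirs. */
static int wanted(int what, int flags)
{
    int type_bits = what & 0x7;
    int filter_bits = what & 0xF8;

    if (!(type_bits & flags))
        return 0;
    if (filter_bits && !(filter_bits & flags))
        return 0;
    return 1;
}
```

/* Passing only a type bit (for example ENUM_NORMAL) leaves filter_bits at
 * zero, so both files and directories of that type are reported.
 */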
int chm_enumerate_dir(struct chmFile *h,
const char *prefix,
int what,
CHM_ENUMERATOR e,
void *context)
{
/*
* XXX: do this efficiently (i.e. using the tree index)
*/
Int32 curPage;
/* buffer to hold whatever page we're looking at */
/* RWE 6/12/2003 */
UChar *page_buf = malloc((unsigned int)h->block_len);
struct chmPmglHeader header;
UChar *end;
UChar *cur;
unsigned int lenRemain;
/* set to 1 once we've started */
int it_has_begun=0;
/* the current ui */
struct chmUnitInfo ui;
int type_bits = (what & 0x7);
int filter_bits = (what & 0xF8);
UInt64 ui_path_len;
/* the length of the prefix */
char prefixRectified[CHM_MAX_PATHLEN+1];
int prefixLen;
char lastPath[CHM_MAX_PATHLEN+1];
int lastPathLen;
if (page_buf == NULL)
return 0;
/* starting page */
curPage = h->index_head;
/* initialize pathname state */
strncpy(prefixRectified, prefix, CHM_MAX_PATHLEN);
prefixRectified[CHM_MAX_PATHLEN] = '\0';
prefixLen = strlen(prefixRectified);
/* only append '/' when it still fits inside prefixRectified */
if (prefixLen != 0 && prefixLen < CHM_MAX_PATHLEN)
{
if (prefixRectified[prefixLen-1] != '/')
{
prefixRectified[prefixLen] = '/';
prefixRectified[prefixLen+1] = '\0';
++prefixLen;
}
}
lastPath[0] = '\0';
lastPathLen = -1;
/* until we have either returned or given up */
while (curPage != -1)
{
/* try to fetch the index page */
if (_chm_fetch_bytes(h,
page_buf,
(UInt64)h->dir_offset + (UInt64)curPage*h->block_len,
h->block_len) != h->block_len)
{
free(page_buf);
return 0;
}
/* figure out start and end for this page */
cur = page_buf;
lenRemain = _CHM_PMGL_LEN;
if (! _unmarshal_pmgl_header(&cur, &lenRemain, h->block_len, &header))
{
free(page_buf);
return 0;
}
end = page_buf + h->block_len - (header.free_space);
/* loop over this page */
while (cur < end)
{
ui.flags = 0;
if (! _chm_parse_PMGL_entry(&cur, &ui))
{
free(page_buf);
return 0;
}
/* check if we should start */
if (! it_has_begun)
{
if (ui.length == 0 && strncasecmp(ui.path, prefixRectified, prefixLen) == 0)
it_has_begun = 1;
else
continue;
if (ui.path[prefixLen] == '\0')
continue;
}
/* check if we should stop */
else
{
if (strncasecmp(ui.path, prefixRectified, prefixLen) != 0)
{
free(page_buf);
return 1;
}
}
/* check if we should include this path */
if (lastPathLen != -1)
{
if (strncasecmp(ui.path, lastPath, lastPathLen) == 0)
continue;
}
strncpy(lastPath, ui.path, CHM_MAX_PATHLEN);
lastPath[CHM_MAX_PATHLEN] = '\0';
lastPathLen = strlen(lastPath);
/* get the length of the path */
ui_path_len = strlen(ui.path)-1;
/* check for DIRS */
if (ui.path[ui_path_len] == '/')
ui.flags |= CHM_ENUMERATE_DIRS;
/* check for FILES */
if (ui.path[ui_path_len] != '/')
ui.flags |= CHM_ENUMERATE_FILES;
/* check for NORMAL vs. META */
if (ui.path[0] == '/')
{
/* check for NORMAL vs. SPECIAL */
if (ui.path[1] == '#' || ui.path[1] == '$')
ui.flags |= CHM_ENUMERATE_SPECIAL;
else
ui.flags |= CHM_ENUMERATE_NORMAL;
}
else
ui.flags |= CHM_ENUMERATE_META;
if (! (type_bits & ui.flags))
continue;
if (filter_bits && ! (filter_bits & ui.flags))
continue;
/* call the enumerator */
{
int status = (*e)(h, &ui, context);
switch (status)
{
case CHM_ENUMERATOR_FAILURE:
free(page_buf);
return 0;
case CHM_ENUMERATOR_CONTINUE:
break;
case CHM_ENUMERATOR_SUCCESS:
free(page_buf);
return 1;
default:
break;
}
}
}
/* advance to next page */
curPage = header.block_next;
}
free(page_buf);
return 1;
}
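/* The prefix handling at the top of chm_enumerate_dir (copy, then make sure
 * the prefix ends in '/') is easy to get wrong at the buffer boundary. A
 * bounds-checked sketch follows; rectify_prefix is a hypothetical helper,
 * not a chmlib function.
 */

```c
#include <assert.h>
#include <string.h>

/* Copy prefix into out (capacity cap) and append '/' if it is non-empty and
 * lacks one. Returns the resulting length, or -1 if it would not fit. */
static int rectify_prefix(const char *prefix, char *out, size_t cap)
{
    size_t n;

    assert(prefix != NULL && out != NULL);
    n = strlen(prefix);
    if (n + 2 > cap)            /* room for an added '/' and the NUL */
        return -1;
    memcpy(out, prefix, n + 1);
    if (n != 0 && out[n - 1] != '/') {
        out[n] = '/';
        out[n + 1] = '\0';
        n++;
    }
    return (int)n;
}
```

/* Reserving two extra bytes up front is slightly conservative (a prefix that
 * already ends in '/' needs only one), but it keeps the check to one line.
 */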
================================================
FILE: ext/CHMLib/src/chm_lib.h
================================================
/* $Id: chm_lib.h,v 1.10 2002/10/09 01:16:33 jedwin Exp $ */
/***************************************************************************
* chm_lib.h - CHM archive manipulation routines *
* ------------------- *
* *
* author: Jed Wing *
* version: 0.3 *
* notes: These routines are meant for the manipulation of microsoft *
* .chm (compiled html help) files, but may likely be used *
* for the manipulation of any ITSS archive, if ever ITSS *
* archives are used for any other purpose. *
* *
* Note also that the section names are statically handled. *
* To be entirely correct, the section names should be read *
* from the section names meta-file, and then the various *
* content sections and the "transforms" to apply to the data *
* they contain should be inferred from the section name and *
* the meta-files referenced using that name; however, all of *
* the files I've been able to get my hands on appear to have *
* only two sections: Uncompressed and MSCompressed. *
* Additionally, the ITSS.DLL file included with Windows does *
* not appear to handle any different transforms than the *
* simple LZX-transform. Furthermore, the list of transforms *
* to apply is broken, in that only half the required space *
* is allocated for the list. (It appears as though the *
* space is allocated for ASCII strings, but the strings are *
* written as unicode. As a result, only the first half of *
* the string appears.) So this is probably not too big of *
* a deal, at least until CHM v4 (MS .lit files), which also *
* incorporate encryption, of some description. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#ifndef INCLUDED_CHMLIB_H
#define INCLUDED_CHMLIB_H
#ifdef __cplusplus
extern "C" {
#endif
/* RWE 6/12/2003 */
#ifdef PPC_BSTR
#include <wtypes.h>
#endif
#ifdef WIN32
#ifdef __MINGW32__
#define __int64 long long
#endif
typedef unsigned __int64 LONGUINT64;
typedef __int64 LONGINT64;
#else
typedef unsigned long long LONGUINT64;
typedef long long LONGINT64;
#endif
/* the two available spaces in a CHM file */
/* N.B.: The format supports arbitrarily many spaces, but only */
/* two appear to be used at present. */
#define CHM_UNCOMPRESSED (0)
#define CHM_COMPRESSED (1)
/* structure representing an ITS (CHM) file stream */
struct chmFile;
/* structure representing an element from an ITS file stream */
#define CHM_MAX_PATHLEN (512)
struct chmUnitInfo
{
LONGUINT64 start;
LONGUINT64 length;
int space;
int flags;
char path[CHM_MAX_PATHLEN+1];
};
/* open an ITS archive */
#ifdef PPC_BSTR
/* RWE 6/12/2003 */
struct chmFile* chm_open(BSTR filename);
#else
struct chmFile* chm_open(const char *filename);
#endif
/* close an ITS archive */
void chm_close(struct chmFile *h);
/* methods for setting tuning parameters for a particular file */
#define CHM_PARAM_MAX_BLOCKS_CACHED 0
void chm_set_param(struct chmFile *h,
int paramType,
int paramVal);
/* resolve a particular object from the archive */
#define CHM_RESOLVE_SUCCESS (0)
#define CHM_RESOLVE_FAILURE (1)
int chm_resolve_object(struct chmFile *h,
const char *objPath,
struct chmUnitInfo *ui);
/* retrieve part of an object from the archive */
LONGINT64 chm_retrieve_object(struct chmFile *h,
struct chmUnitInfo *ui,
unsigned char *buf,
LONGUINT64 addr,
LONGINT64 len);
/* enumerate the objects in the .chm archive */
typedef int (*CHM_ENUMERATOR)(struct chmFile *h,
struct chmUnitInfo *ui,
void *context);
#define CHM_ENUMERATE_NORMAL (1)
#define CHM_ENUMERATE_META (2)
#define CHM_ENUMERATE_SPECIAL (4)
#define CHM_ENUMERATE_FILES (8)
#define CHM_ENUMERATE_DIRS (16)
#define CHM_ENUMERATE_ALL (31)
#define CHM_ENUMERATOR_FAILURE (0)
#define CHM_ENUMERATOR_CONTINUE (1)
#define CHM_ENUMERATOR_SUCCESS (2)
int chm_enumerate(struct chmFile *h,
int what,
CHM_ENUMERATOR e,
void *context);
int chm_enumerate_dir(struct chmFile *h,
const char *prefix,
int what,
CHM_ENUMERATOR e,
void *context);
#ifdef __cplusplus
}
#endif
#endif /* INCLUDED_CHMLIB_H */
================================================
FILE: ext/CHMLib/src/enum_chmLib.c
================================================
/* $Id: enum_chmLib.c,v 1.7 2002/10/09 12:38:12 jedwin Exp $ */
/***************************************************************************
* enum_chmLib.c - CHM archive test driver *
* ------------------- *
* *
* author: Jed Wing *
* notes: This is a quick-and-dirty test driver for the chm lib *
* routines. The program takes as its input the paths to one *
* or more .chm files. It attempts to open each .chm file in *
* turn, and display a listing of all of the files in the *
* archive. *
* *
* It is not included as a particularly useful program, but *
* rather as a sort of "simplest possible" example of how to *
* use the enumerate portion of the API. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* callback function for enumerate API
*/
int _print_ui(struct chmFile *h,
struct chmUnitInfo *ui,
void *context)
{
static char szBuf[128];
memset(szBuf, 0, 128);
if(ui->flags & CHM_ENUMERATE_NORMAL)
strcpy(szBuf, "normal ");
else if(ui->flags & CHM_ENUMERATE_SPECIAL)
strcpy(szBuf, "special ");
else if(ui->flags & CHM_ENUMERATE_META)
strcpy(szBuf, "meta ");
if(ui->flags & CHM_ENUMERATE_DIRS)
strcat(szBuf, "dir");
else if(ui->flags & CHM_ENUMERATE_FILES)
strcat(szBuf, "file");
printf(" %1d %8d %8d %s\t\t%s\n",
(int)ui->space,
(int)ui->start,
(int)ui->length,
szBuf,
ui->path);
return CHM_ENUMERATOR_CONTINUE;
}
int main(int c, char **v)
{
struct chmFile *h;
int i;
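/* open each archive in turn and list everything in it; this loop body is
 * reconstructed from the header notes above, not recovered verbatim */
for (i=1; i<c; i++)
{
h = chm_open(v[i]);
if (h == NULL)
{
fprintf(stderr, "failed to open %s\n", v[i]);
exit(1);
}
printf("%s:\n", v[i]);
printf(" spc start length type\t\t\tname\n");
printf(" === ===== ====== ====\t\t\t====\n");
if (! chm_enumerate(h,
CHM_ENUMERATE_ALL,
_print_ui,
NULL))
printf(" *** ERROR ***\n");
chm_close(h);
}
return 0;
}
================================================
FILE: ext/CHMLib/src/enumdir_chmLib.c
================================================
/***************************************************************************
* author: Jed Wing *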
* notes: This is a quick-and-dirty test driver for the chm lib *
* routines. The program takes as its input the paths to one *
* or more .chm files. It attempts to open each .chm file in *
* turn, and display a listing of all of the files in the *
* archive. *
* *
* It is not included as a particularly useful program, but *
* rather as a sort of "simplest possible" example of how to *
* use the enumerate portion of the API. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* callback function for enumerate API
*/
int _print_ui(struct chmFile *h,
struct chmUnitInfo *ui,
void *context)
{
static char szBuf[128];
memset(szBuf, 0, 128);
if(ui->flags & CHM_ENUMERATE_NORMAL)
strcpy(szBuf, "normal ");
else if(ui->flags & CHM_ENUMERATE_SPECIAL)
strcpy(szBuf, "special ");
else if(ui->flags & CHM_ENUMERATE_META)
strcpy(szBuf, "meta ");
if(ui->flags & CHM_ENUMERATE_DIRS)
strcat(szBuf, "dir");
else if(ui->flags & CHM_ENUMERATE_FILES)
strcat(szBuf, "file");
printf(" %1d %8d %8d %s\t\t%s\n",
(int)ui->space,
(int)ui->start,
(int)ui->length,
szBuf,
ui->path);
return CHM_ENUMERATOR_CONTINUE;
}
int main(int c, char **v)
{
struct chmFile *h;
int i;
if (c < 2)
{
fprintf(stderr, "%s <chmfile> [dir] [dir] ...\n", v[0]);
exit(1);
}
h = chm_open(v[1]);
if (h == NULL)
{
fprintf(stderr, "failed to open %s\n", v[1]);
exit(1);
}
if (c < 3)
{
printf("/:\n");
printf(" spc start length type\t\t\tname\n");
printf(" === ===== ====== ====\t\t\t====\n");
if (! chm_enumerate_dir(h,
"/",
CHM_ENUMERATE_ALL,
_print_ui,
NULL))
printf(" *** ERROR ***\n");
}
else
{
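/* list each requested directory; this loop body is reconstructed by
 * analogy with the "/" listing above, not recovered verbatim */
for (i=2; i<c; i++)
{
printf("%s:\n", v[i]);
printf(" spc start length type\t\t\tname\n");
printf(" === ===== ====== ====\t\t\t====\n");
if (! chm_enumerate_dir(h,
v[i],
CHM_ENUMERATE_ALL,
_print_ui,
NULL))
printf(" *** ERROR ***\n");
}
}
chm_close(h);
return 0;
}
================================================
FILE: ext/CHMLib/src/extract_chmLib.c
================================================
/***************************************************************************
* author: Jed Wing *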
* notes: This is a quick-and-dirty chm archive extractor. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef WIN32
#include <windows.h>
#include <direct.h>
#define mkdir(X, Y) _mkdir(X)
#define snprintf _snprintf
#else
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#endif
struct extract_context
{
const char *base_path;
};
static int dir_exists(const char *path)
{
#ifdef WIN32
/* a directory handle can only be opened with FILE_FLAG_BACKUP_SEMANTICS */
HANDLE hFile;
hFile = CreateFileA(path,
FILE_LIST_DIRECTORY,
FILE_SHARE_READ | FILE_SHARE_WRITE,
NULL,
OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS,
NULL);
if (hFile != INVALID_HANDLE_VALUE)
{
CloseHandle(hFile);
return 1;
}
else
return 0;
#else
struct stat statbuf;
if (stat(path, &statbuf) != -1)
return 1;
else
return 0;
#endif
}
static int rmkdir(char *path)
{
/*
* strip off trailing components unless we can stat the directory, or we
* have run out of components
*/
char *i = strrchr(path, '/');
if(path[0] == '\0' || dir_exists(path))
return 0;
if (i != NULL)
{
*i = '\0';
rmkdir(path);
*i = '/';
mkdir(path, 0777);
}
#ifdef WIN32
return 0;
#else
if (dir_exists(path))
return 0;
else
return -1;
#endif
}
/*
* callback function for enumerate API
*/
int _extract_callback(struct chmFile *h,
struct chmUnitInfo *ui,
void *context)
{
LONGUINT64 ui_path_len;
char buffer[32768];
struct extract_context *ctx = (struct extract_context *)context;
char *i;
if (ui->path[0] != '/')
return CHM_ENUMERATOR_CONTINUE;
/* quick hack for security hole mentioned by Sven Tantau */
if (strstr(ui->path, "/../") != NULL)
{
/* fprintf(stderr, "Not extracting %s (dangerous path)\n", ui->path); */
return CHM_ENUMERATOR_CONTINUE;
}
if (snprintf(buffer, sizeof(buffer), "%s%s", ctx->base_path, ui->path) > 1024)
return CHM_ENUMERATOR_FAILURE;
/* Get the length of the path */
ui_path_len = strlen(ui->path)-1;
/* Distinguish between files and dirs */
if (ui->path[ui_path_len] != '/' )
{
FILE *fout;
LONGINT64 len, remain=ui->length;
LONGUINT64 offset = 0;
printf("--> %s\n", ui->path);
if ((fout = fopen(buffer, "wb")) == NULL)
{
/* make sure that it isn't just a missing directory before we abort */
char newbuf[32768];
strcpy(newbuf, buffer);
i = strrchr(newbuf, '/');
*i = '\0';
rmkdir(newbuf);
if ((fout = fopen(buffer, "wb")) == NULL)
return CHM_ENUMERATOR_FAILURE;
}
while (remain != 0)
{
len = chm_retrieve_object(h, ui, (unsigned char *)buffer, offset, 32768);
if (len > 0)
{
fwrite(buffer, 1, (size_t)len, fout);
offset += len;
remain -= len;
}
else
{
fprintf(stderr, "incomplete file: %s\n", ui->path);
break;
}
}
fclose(fout);
}
else
{
if (rmkdir(buffer) == -1)
return CHM_ENUMERATOR_FAILURE;
}
return CHM_ENUMERATOR_CONTINUE;
}
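/* The "/../" substring test above is described in the source as a quick
 * hack. A somewhat stricter sketch walks the path one component at a time
 * and also catches a leading or trailing "..". The helper name
 * has_dotdot_component is hypothetical and this is not the check chmlib
 * itself performs.
 */

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if any '/'-separated component of path is exactly "..". */
static int has_dotdot_component(const char *path)
{
    const char *p;

    assert(path != NULL);
    p = path;
    while (*p != '\0') {
        size_t n;

        while (*p == '/')       /* skip separators */
            p++;
        n = strcspn(p, "/");    /* length of the next component */
        if (n == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += n;
    }
    return 0;
}
```

/* An extraction callback could skip any entry for which this returns 1,
 * in the same place where the strstr() test is made.
 */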
int main(int c, char **v)
{
struct chmFile *h;
struct extract_context ec;
if (c < 3)
{
fprintf(stderr, "usage: %s <chmfile> <outdir>\n", v[0]);
exit(1);
}
h = chm_open(v[1]);
if (h == NULL)
{
fprintf(stderr, "failed to open %s\n", v[1]);
exit(1);
}
printf("%s:\n", v[1]);
ec.base_path = v[2];
if (! chm_enumerate(h,
CHM_ENUMERATE_ALL,
_extract_callback,
(void *)&ec))
printf(" *** ERROR ***\n");
chm_close(h);
return 0;
}
================================================
FILE: ext/CHMLib/src/lzx.c
================================================
/* $Id: lzx.c,v 1.5 2002/10/09 01:16:33 jedwin Exp $ */
/***************************************************************************
* lzx.c - LZX decompression routines *
* ------------------- *
* *
* maintainer: Jed Wing *
* source: modified lzx.c from cabextract v0.5 *
* notes: This file was taken from cabextract v0.5, which was, *
* itself, a modified version of the lzx decompression code *
* from unlzx. *
* *
* platforms: In its current incarnation, this file has been tested on *
* two different Linux platforms (one, redhat-based, with a *
* 2.1.2 glibc and gcc 2.95.x, and the other, Debian, with *
* 2.2.4 glibc and both gcc 2.95.4 and gcc 3.0.2). Both were *
* Intel x86 compatible machines. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. Note that an exemption to this *
* license has been granted by Stuart Caie for the purposes of *
* distribution with chmlib. This does not, to the best of my *
* knowledge, constitute a change in the license of this (the LZX) code *
* in general. *
* *
***************************************************************************/
#include "lzx.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef __GNUC__
#define memcpy __builtin_memcpy
#endif
/* sized types */
typedef unsigned char UBYTE; /* 8 bits exactly */
typedef unsigned short UWORD; /* 16 bits (or more) */
typedef unsigned int ULONG; /* 32 bits (or more) */
typedef signed int LONG; /* 32 bits (or more) */
/* some constants defined by the LZX specification */
#define LZX_MIN_MATCH (2)
#define LZX_MAX_MATCH (257)
#define LZX_NUM_CHARS (256)
#define LZX_BLOCKTYPE_INVALID (0) /* also blocktypes 4-7 invalid */
#define LZX_BLOCKTYPE_VERBATIM (1)
#define LZX_BLOCKTYPE_ALIGNED (2)
#define LZX_BLOCKTYPE_UNCOMPRESSED (3)
#define LZX_PRETREE_NUM_ELEMENTS (20)
#define LZX_ALIGNED_NUM_ELEMENTS (8) /* aligned offset tree #elements */
#define LZX_NUM_PRIMARY_LENGTHS (7) /* this one missing from spec! */
#define LZX_NUM_SECONDARY_LENGTHS (249) /* length tree #elements */
/* LZX huffman defines: tweak tablebits as desired */
#define LZX_PRETREE_MAXSYMBOLS (LZX_PRETREE_NUM_ELEMENTS)
#define LZX_PRETREE_TABLEBITS (6)
#define LZX_MAINTREE_MAXSYMBOLS (LZX_NUM_CHARS + 50*8)
#define LZX_MAINTREE_TABLEBITS (12)
#define LZX_LENGTH_MAXSYMBOLS (LZX_NUM_SECONDARY_LENGTHS+1)
#define LZX_LENGTH_TABLEBITS (12)
#define LZX_ALIGNED_MAXSYMBOLS (LZX_ALIGNED_NUM_ELEMENTS)
#define LZX_ALIGNED_TABLEBITS (7)
#define LZX_LENTABLE_SAFETY (64) /* we allow length table decoding overruns */
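#define LZX_DECLARE_TABLE(tbl) \
UWORD tbl##_table[(1<<LZX_##tbl##_TABLEBITS) + (LZX_##tbl##_MAXSYMBOLS<<1)]; \
UBYTE tbl##_len [LZX_##tbl##_MAXSYMBOLS + LZX_LENTABLE_SAFETY]
/* decoder state; this declaration and the opening of LZXinit() were lost in
 * extraction and are reconstructed from the fields used below */
struct LZXstate
{
UBYTE *window; /* the actual decoding window */
ULONG window_size; /* window size (32Kb through 2Mb) */
ULONG actual_size; /* window size when it was first allocated */
ULONG window_posn; /* current offset within the window */
ULONG R0, R1, R2; /* for the LRU offset system */
UWORD main_elements; /* number of main tree elements */
int header_read; /* have we started decoding at all yet? */
UWORD block_type; /* type of this block */
ULONG block_length; /* uncompressed length of this block */
ULONG block_remaining; /* uncompressed bytes still left to decode */
ULONG frames_read; /* the number of frames processed */
LONG intel_filesize; /* magic header value used for transform */
LONG intel_curpos; /* current offset in transform space */
int intel_started; /* have we seen any translatable data yet? */
LZX_DECLARE_TABLE(PRETREE);
LZX_DECLARE_TABLE(MAINTREE);
LZX_DECLARE_TABLE(LENGTH);
LZX_DECLARE_TABLE(ALIGNED);
};
struct LZXstate *LZXinit(int window)
{
struct LZXstate *pState = NULL;
ULONG wndsize;
int i, posn_slots;
/* LZX supports window sizes of 2^15 (32Kb) through 2^21 (2Mb) */
if (window < 15 || window > 21) return NULL;
wndsize = 1 << window;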
/* allocate state and associated window */
pState = (struct LZXstate *)malloc(sizeof(struct LZXstate));
if (!pState || !(pState->window = (UBYTE *)malloc(wndsize)))
{
free(pState);
return NULL;
}
pState->actual_size = wndsize;
pState->window_size = wndsize;
/* calculate required position slots */
if (window == 20) posn_slots = 42;
else if (window == 21) posn_slots = 50;
else posn_slots = window << 1;
/** alternatively **/
/* posn_slots=i=0; while (i < wndsize) i += 1 << extra_bits[posn_slots++]; */
/* initialize other state */
pState->R0 = pState->R1 = pState->R2 = 1;
pState->main_elements = LZX_NUM_CHARS + (posn_slots << 3);
pState->header_read = 0;
pState->frames_read = 0;
pState->block_remaining = 0;
pState->block_type = LZX_BLOCKTYPE_INVALID;
pState->intel_curpos = 0;
pState->intel_started = 0;
pState->window_posn = 0;
/* initialise tables to 0 (because deltas will be applied to them) */
for (i = 0; i < LZX_MAINTREE_MAXSYMBOLS; i++) pState->MAINTREE_len[i] = 0;
for (i = 0; i < LZX_LENGTH_MAXSYMBOLS; i++) pState->LENGTH_len[i] = 0;
return pState;
}
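/* The position-slot rule in LZXinit determines how many main-tree symbols a
 * given window size needs. A small sketch of that arithmetic follows; the
 * function names are hypothetical, the numbers come from the code above.
 */

```c
#include <assert.h>

/* Position slots for a window of 2^window_bits bytes. */
static int lzx_position_slots(int window_bits)
{
    assert(window_bits >= 15 && window_bits <= 21);
    if (window_bits == 20)
        return 42;
    if (window_bits == 21)
        return 50;
    return window_bits << 1;
}

/* Main tree size: 256 literal symbols plus 8 symbols per position slot. */
static int lzx_main_elements(int window_bits)
{
    return 256 + (lzx_position_slots(window_bits) << 3);
}
```

/* For the largest window (21 bits) this gives 256 + 50*8 = 656, which is
 * exactly LZX_MAINTREE_MAXSYMBOLS, so the statically sized main tree tables
 * are large enough for every supported window.
 */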
void LZXteardown(struct LZXstate *pState)
{
if (pState)
{
if (pState->window)
free(pState->window);
free(pState);
}
}
int LZXreset(struct LZXstate *pState)
{
int i;
pState->R0 = pState->R1 = pState->R2 = 1;
pState->header_read = 0;
pState->frames_read = 0;
pState->block_remaining = 0;
pState->block_type = LZX_BLOCKTYPE_INVALID;
pState->intel_curpos = 0;
pState->intel_started = 0;
pState->window_posn = 0;
for (i = 0; i < LZX_MAINTREE_MAXSYMBOLS + LZX_LENTABLE_SAFETY; i++) pState->MAINTREE_len[i] = 0;
for (i = 0; i < LZX_LENGTH_MAXSYMBOLS + LZX_LENTABLE_SAFETY; i++) pState->LENGTH_len[i] = 0;
return DECR_OK;
}
/* Bitstream reading macros:
*
* INIT_BITSTREAM should be used first to set up the system
* READ_BITS(var,n) takes N bits from the buffer and puts them in var
*
* ENSURE_BITS(n) ensures there are at least N bits in the bit buffer
* PEEK_BITS(n) extracts (without removing) N bits from the bit buffer
* REMOVE_BITS(n) removes N bits from the bit buffer
*
* These bit access routines work by using the area beyond the MSB and the
* LSB as a free source of zeroes. This avoids having to mask any bits.
* So we have to know the bit width of the bitbuffer variable. This is
* sizeof(ULONG) * 8, also defined as ULONG_BITS
*/
/* number of bits in ULONG. Note: This must be a multiple of 16, and at
* least 32 for the bitbuffer code to work (i.e., it must be able to ensure
* up to 17 bits - that's adding 16 bits when there's one bit left, or
* adding 32 bits when there are no bits left). The code should work fine
* for machines where ULONG >= 32 bits.
*/
#define ULONG_BITS (sizeof(ULONG)<<3)
#define INIT_BITSTREAM do { bitsleft = 0; bitbuf = 0; } while (0)
#define ENSURE_BITS(n) \
while (bitsleft < (n)) { \
bitbuf |= ((inpos[1]<<8)|inpos[0]) << (ULONG_BITS-16 - bitsleft); \
bitsleft += 16; inpos+=2; \
}
#define PEEK_BITS(n) (bitbuf >> (ULONG_BITS - (n)))
#define REMOVE_BITS(n) ((bitbuf <<= (n)), (bitsleft -= (n)))
#define READ_BITS(v,n) do { \
ENSURE_BITS(n); \
(v) = PEEK_BITS(n); \
REMOVE_BITS(n); \
} while (0)
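/* The macros above keep the bit buffer MSB-aligned and refill it with 16-bit
 * little-endian words. The same scheme written as functions may be easier to
 * follow. This is a sketch under assumptions: struct bitreader, br_init and
 * br_read are hypothetical names, reads are limited to 1..16 bits, and
 * (unlike the macros) the input bounds are checked on every refill.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bitreader {
    const uint8_t *in, *end;  /* next input byte and one past the last */
    uint32_t buf;             /* pending bits, most significant first */
    int left;                 /* number of valid bits in buf */
};

static void br_init(struct bitreader *br, const uint8_t *data, size_t len)
{
    br->in = data;
    br->end = data + len;
    br->buf = 0;
    br->left = 0;
}

/* Read n bits into *out. Returns 0 on success, -1 if input runs out. */
static int br_read(struct bitreader *br, int n, uint32_t *out)
{
    assert(n >= 1 && n <= 16);
    while (br->left < n) {                       /* ENSURE_BITS */
        uint32_t word;

        if (br->end - br->in < 2)
            return -1;
        word = ((uint32_t)br->in[1] << 8) | br->in[0];
        br->buf |= word << (32 - 16 - br->left);
        br->left += 16;
        br->in += 2;
    }
    *out = br->buf >> (32 - n);                  /* PEEK_BITS */
    br->buf <<= n;                               /* REMOVE_BITS */
    br->left -= n;
    return 0;
}
```

/* Because consumed bits are shifted out on the left and zeroes shift in on
 * the right, no masking is needed when peeking.
 */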
/* Huffman macros */
#define TABLEBITS(tbl) (LZX_##tbl##_TABLEBITS)
#define MAXSYMBOLS(tbl) (LZX_##tbl##_MAXSYMBOLS)
#define SYMTABLE(tbl) (pState->tbl##_table)
#define LENTABLE(tbl) (pState->tbl##_len)
/* BUILD_TABLE(tablename) builds a huffman lookup table from code lengths.
* In reality, it just calls make_decode_table() with the appropriate
* values - they're all fixed by some #defines anyway, so there's no point
* writing each call out in full by hand.
*/
#define BUILD_TABLE(tbl) \
if (make_decode_table( \
MAXSYMBOLS(tbl), TABLEBITS(tbl), LENTABLE(tbl), SYMTABLE(tbl) \
)) { return DECR_ILLEGALDATA; }
/* READ_HUFFSYM(tablename, var) decodes one huffman symbol from the
* bitstream using the stated table and puts it in var.
*/
#define READ_HUFFSYM(tbl,var) do { \
ENSURE_BITS(16); \
hufftbl = SYMTABLE(tbl); \
if ((i = hufftbl[PEEK_BITS(TABLEBITS(tbl))]) >= MAXSYMBOLS(tbl)) { \
j = 1 << (ULONG_BITS - TABLEBITS(tbl)); \
do { \
j >>= 1; i <<= 1; i |= (bitbuf & j) ? 1 : 0; \
if (!j) { return DECR_ILLEGALDATA; } \
} while ((i = hufftbl[i]) >= MAXSYMBOLS(tbl)); \
} \
j = LENTABLE(tbl)[(var) = i]; \
REMOVE_BITS(j); \
} while (0)
/* READ_LENGTHS(tablename, first, last) reads in code lengths for symbols
* first to last in the given table. The code lengths are stored in their
* own special LZX way.
*/
#define READ_LENGTHS(tbl,first,last) do { \
lb.bb = bitbuf; lb.bl = bitsleft; lb.ip = inpos; \
if (lzx_read_lens(pState, LENTABLE(tbl),(first),(last),&lb)) { \
return DECR_ILLEGALDATA; \
} \
bitbuf = lb.bb; bitsleft = lb.bl; inpos = lb.ip; \
} while (0)
/* make_decode_table(nsyms, nbits, length[], table[])
*
* This function was coded by David Tritscher. It builds a fast huffman
* decoding table out of just a canonical huffman code lengths table.
*
* nsyms = total number of symbols in this huffman tree.
* nbits = any symbols with a code length of nbits or less can be decoded
* in one lookup of the table.
* length = A table to get code lengths from [0 to nsyms-1]
* table = The table to fill up with decoded symbols and pointers.
*
* Returns 0 for OK or 1 for error
*/
static int make_decode_table(ULONG nsyms, ULONG nbits, UBYTE *length, UWORD *table) {
register UWORD sym;
register ULONG leaf;
register UBYTE bit_num = 1;
ULONG fill;
ULONG pos = 0; /* the current position in the decode table */
ULONG table_mask = 1 << nbits;
ULONG bit_mask = table_mask >> 1; /* don't do 0 length codes */
ULONG next_symbol = bit_mask; /* base of allocation for long codes */
/* fill entries for codes short enough for a direct mapping */
while (bit_num <= nbits) {
for (sym = 0; sym < nsyms; sym++) {
if (length[sym] == bit_num) {
leaf = pos;
if((pos += bit_mask) > table_mask) return 1; /* table overrun */
/* fill all possible lookups of this symbol with the symbol itself */
fill = bit_mask;
while (fill-- > 0) table[leaf++] = sym;
}
}
bit_mask >>= 1;
bit_num++;
}
/* if there are any codes longer than nbits */
if (pos != table_mask) {
/* clear the remainder of the table */
for (sym = pos; sym < table_mask; sym++) table[sym] = 0;
/* give ourselves room for codes to grow by up to 16 more bits */
pos <<= 16;
table_mask <<= 16;
bit_mask = 1 << 15;
while (bit_num <= 16) {
for (sym = 0; sym < nsyms; sym++) {
if (length[sym] == bit_num) {
leaf = pos >> 16;
for (fill = 0; fill < bit_num - nbits; fill++) {
/* if this path hasn't been taken yet, 'allocate' two entries */
if (table[leaf] == 0) {
table[(next_symbol << 1)] = 0;
table[(next_symbol << 1) + 1] = 0;
table[leaf] = next_symbol++;
}
/* follow the path and select either left or right for next bit */
leaf = table[leaf] << 1;
if ((pos >> (15-fill)) & 1) leaf++;
}
table[leaf] = sym;
if ((pos += bit_mask) > table_mask) return 1; /* table overflow */
}
}
bit_mask >>= 1;
bit_num++;
}
}
/* full table? */
if (pos == table_mask) return 0;
/* either erroneous table, or all elements are 0 - let's find out. */
for (sym = 0; sym < nsyms; sym++) if (length[sym]) return 1;
return 0;
}
struct lzx_bits {
ULONG bb;
int bl;
UBYTE *ip;
};
static int lzx_read_lens(struct LZXstate *pState, UBYTE *lens, ULONG first, ULONG last, struct lzx_bits *lb) {
ULONG i,j, x,y;
int z;
register ULONG bitbuf = lb->bb;
register int bitsleft = lb->bl;
UBYTE *inpos = lb->ip;
UWORD *hufftbl;
for (x = 0; x < 20; x++) {
READ_BITS(y, 4);
LENTABLE(PRETREE)[x] = y;
}
BUILD_TABLE(PRETREE);
for (x = first; x < last; ) {
READ_HUFFSYM(PRETREE, z);
if (z == 17) {
READ_BITS(y, 4); y += 4;
while (y--) lens[x++] = 0;
}
else if (z == 18) {
READ_BITS(y, 5); y += 20;
while (y--) lens[x++] = 0;
}
else if (z == 19) {
READ_BITS(y, 1); y += 4;
READ_HUFFSYM(PRETREE, z);
z = lens[x] - z; if (z < 0) z += 17;
while (y--) lens[x++] = z;
}
else {
z = lens[x] - z; if (z < 0) z += 17;
lens[x++] = z;
}
}
lb->bb = bitbuf;
lb->bl = bitsleft;
lb->ip = inpos;
return 0;
}
int LZXdecompress(struct LZXstate *pState, unsigned char *inpos, unsigned char *outpos, int inlen, int outlen) {
UBYTE *endinp = inpos + inlen;
UBYTE *window = pState->window;
UBYTE *runsrc, *rundest;
UWORD *hufftbl; /* used in READ_HUFFSYM macro as chosen decoding table */
ULONG window_posn = pState->window_posn;
ULONG window_size = pState->window_size;
ULONG R0 = pState->R0;
ULONG R1 = pState->R1;
ULONG R2 = pState->R2;
register ULONG bitbuf;
register int bitsleft;
ULONG match_offset, i,j,k; /* ijk used in READ_HUFFSYM macro */
struct lzx_bits lb; /* used in READ_LENGTHS macro */
int togo = outlen, this_run, main_element, aligned_bits;
int match_length, length_footer, extra, verbatim_bits;
INIT_BITSTREAM;
/* read header if necessary */
if (!pState->header_read) {
i = j = 0;
READ_BITS(k, 1); if (k) { READ_BITS(i,16); READ_BITS(j,16); }
pState->intel_filesize = (i << 16) | j; /* or 0 if not encoded */
pState->header_read = 1;
}
/* main decoding loop */
while (togo > 0) {
/* last block finished, new block expected */
if (pState->block_remaining == 0) {
if (pState->block_type == LZX_BLOCKTYPE_UNCOMPRESSED) {
if (pState->block_length & 1) inpos++; /* realign bitstream to word */
INIT_BITSTREAM;
}
READ_BITS(pState->block_type, 3);
READ_BITS(i, 16);
READ_BITS(j, 8);
pState->block_remaining = pState->block_length = (i << 8) | j;
switch (pState->block_type) {
case LZX_BLOCKTYPE_ALIGNED:
for (i = 0; i < 8; i++) { READ_BITS(j, 3); LENTABLE(ALIGNED)[i] = j; }
BUILD_TABLE(ALIGNED);
/* rest of aligned header is same as verbatim */
case LZX_BLOCKTYPE_VERBATIM:
READ_LENGTHS(MAINTREE, 0, 256);
READ_LENGTHS(MAINTREE, 256, pState->main_elements);
BUILD_TABLE(MAINTREE);
if (LENTABLE(MAINTREE)[0xE8] != 0) pState->intel_started = 1;
READ_LENGTHS(LENGTH, 0, LZX_NUM_SECONDARY_LENGTHS);
BUILD_TABLE(LENGTH);
break;
case LZX_BLOCKTYPE_UNCOMPRESSED:
pState->intel_started = 1; /* because we can't assume otherwise */
ENSURE_BITS(16); /* get up to 16 pad bits into the buffer */
if (bitsleft > 16) inpos -= 2; /* and align the bitstream! */
R0 = inpos[0]|(inpos[1]<<8)|(inpos[2]<<16)|(inpos[3]<<24);inpos+=4;
R1 = inpos[0]|(inpos[1]<<8)|(inpos[2]<<16)|(inpos[3]<<24);inpos+=4;
R2 = inpos[0]|(inpos[1]<<8)|(inpos[2]<<16)|(inpos[3]<<24);inpos+=4;
break;
default:
return DECR_ILLEGALDATA;
}
}
/* buffer exhaustion check */
if (inpos > endinp) {
/* it's possible to have a file where the next run is less than
* 16 bits in size. In this case, the READ_HUFFSYM() macro used
* in building the tables will exhaust the buffer, so we should
* allow for this, but not allow those accidentally read bits to
* be used (so we check that there are at least 16 bits
* remaining - in this boundary case they aren't really part of
* the compressed data)
*/
if (inpos > (endinp+2) || bitsleft < 16) return DECR_ILLEGALDATA;
}
while ((this_run = pState->block_remaining) > 0 && togo > 0) {
if (this_run > togo) this_run = togo;
togo -= this_run;
pState->block_remaining -= this_run;
/* apply 2^x-1 mask */
window_posn &= window_size - 1;
/* runs can't straddle the window wraparound */
if ((window_posn + this_run) > window_size)
return DECR_DATAFORMAT;
switch (pState->block_type) {
case LZX_BLOCKTYPE_VERBATIM:
while (this_run > 0) {
READ_HUFFSYM(MAINTREE, main_element);
if (main_element < LZX_NUM_CHARS) {
/* literal: 0 to LZX_NUM_CHARS-1 */
window[window_posn++] = main_element;
this_run--;
}
else {
/* match: LZX_NUM_CHARS + ((slot<<3) | length_header (3 bits)) */
main_element -= LZX_NUM_CHARS;
match_length = main_element & LZX_NUM_PRIMARY_LENGTHS;
if (match_length == LZX_NUM_PRIMARY_LENGTHS) {
READ_HUFFSYM(LENGTH, length_footer);
match_length += length_footer;
}
match_length += LZX_MIN_MATCH;
match_offset = main_element >> 3;
if (match_offset > 2) {
/* not repeated offset */
if (match_offset != 3) {
extra = extra_bits[match_offset];
READ_BITS(verbatim_bits, extra);
match_offset = position_base[match_offset] - 2 + verbatim_bits;
}
else {
match_offset = 1;
}
/* update repeated offset LRU queue */
R2 = R1; R1 = R0; R0 = match_offset;
}
else if (match_offset == 0) {
match_offset = R0;
}
else if (match_offset == 1) {
match_offset = R1;
R1 = R0; R0 = match_offset;
}
else /* match_offset == 2 */ {
match_offset = R2;
R2 = R0; R0 = match_offset;
}
rundest = window + window_posn;
runsrc = rundest - match_offset;
window_posn += match_length;
if (window_posn > window_size) return DECR_ILLEGALDATA;
this_run -= match_length;
/* copy any wrapped around source data */
while ((runsrc < window) && (match_length-- > 0)) {
*rundest++ = *(runsrc + window_size); runsrc++;
}
/* copy match data - no worries about destination wraps */
while (match_length-- > 0) *rundest++ = *runsrc++;
}
}
break;
case LZX_BLOCKTYPE_ALIGNED:
while (this_run > 0) {
READ_HUFFSYM(MAINTREE, main_element);
if (main_element < LZX_NUM_CHARS) {
/* literal: 0 to LZX_NUM_CHARS-1 */
window[window_posn++] = main_element;
this_run--;
}
else {
/* match: LZX_NUM_CHARS + ((slot<<3) | length_header (3 bits)) */
main_element -= LZX_NUM_CHARS;
match_length = main_element & LZX_NUM_PRIMARY_LENGTHS;
if (match_length == LZX_NUM_PRIMARY_LENGTHS) {
READ_HUFFSYM(LENGTH, length_footer);
match_length += length_footer;
}
match_length += LZX_MIN_MATCH;
match_offset = main_element >> 3;
if (match_offset > 2) {
/* not repeated offset */
extra = extra_bits[match_offset];
match_offset = position_base[match_offset] - 2;
if (extra > 3) {
/* verbatim and aligned bits */
extra -= 3;
READ_BITS(verbatim_bits, extra);
match_offset += (verbatim_bits << 3);
READ_HUFFSYM(ALIGNED, aligned_bits);
match_offset += aligned_bits;
}
else if (extra == 3) {
/* aligned bits only */
READ_HUFFSYM(ALIGNED, aligned_bits);
match_offset += aligned_bits;
}
else if (extra > 0) { /* extra==1, extra==2 */
/* verbatim bits only */
READ_BITS(verbatim_bits, extra);
match_offset += verbatim_bits;
}
else /* extra == 0 */ {
/* ??? */
match_offset = 1;
}
/* update repeated offset LRU queue */
R2 = R1; R1 = R0; R0 = match_offset;
}
else if (match_offset == 0) {
match_offset = R0;
}
else if (match_offset == 1) {
match_offset = R1;
R1 = R0; R0 = match_offset;
}
else /* match_offset == 2 */ {
match_offset = R2;
R2 = R0; R0 = match_offset;
}
rundest = window + window_posn;
runsrc = rundest - match_offset;
window_posn += match_length;
if (window_posn > window_size) return DECR_ILLEGALDATA;
this_run -= match_length;
/* copy any wrapped around source data */
while ((runsrc < window) && (match_length-- > 0)) {
*rundest++ = *(runsrc + window_size); runsrc++;
}
/* copy match data - no worries about destination wraps */
while (match_length-- > 0) *rundest++ = *runsrc++;
}
}
break;
case LZX_BLOCKTYPE_UNCOMPRESSED:
if ((inpos + this_run) > endinp) return DECR_ILLEGALDATA;
memcpy(window + window_posn, inpos, (size_t) this_run);
inpos += this_run; window_posn += this_run;
break;
default:
return DECR_ILLEGALDATA; /* might as well */
}
}
}
if (togo != 0) return DECR_ILLEGALDATA;
memcpy(outpos, window + ((!window_posn) ? window_size : window_posn) - outlen, (size_t) outlen);
pState->window_posn = window_posn;
pState->R0 = R0;
pState->R1 = R1;
pState->R2 = R2;
/* intel E8 decoding */
if ((pState->frames_read++ < 32768) && pState->intel_filesize != 0) {
if (outlen <= 6 || !pState->intel_started) {
pState->intel_curpos += outlen;
}
else {
UBYTE *data = outpos;
UBYTE *dataend = data + outlen - 10;
LONG curpos = pState->intel_curpos;
LONG filesize = pState->intel_filesize;
LONG abs_off, rel_off;
pState->intel_curpos = curpos + outlen;
while (data < dataend) {
if (*data++ != 0xE8) { curpos++; continue; }
abs_off = data[0] | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);
if ((abs_off >= -curpos) && (abs_off < filesize)) {
rel_off = (abs_off >= 0) ? abs_off - curpos : abs_off + filesize;
data[0] = (UBYTE) rel_off;
data[1] = (UBYTE) (rel_off >> 8);
data[2] = (UBYTE) (rel_off >> 16);
data[3] = (UBYTE) (rel_off >> 24);
}
data += 4;
curpos += 5;
}
}
}
return DECR_OK;
}
#ifdef LZX_CHM_TESTDRIVER
int main(int c, char **v)
{
FILE *fin, *fout;
struct LZXstate state;
UBYTE ibuf[16384];
UBYTE obuf[32768];
int ilen, olen;
int status;
int i;
int count=0;
int w = atoi(v[1]);
LZXinit(&state, w);
fout = fopen(v[2], "wb");
for (i=3; i *
* source: modified lzx.c from cabextract v0.5 *
* notes: This file was taken from cabextract v0.5, which was, *
* itself, a modified version of the lzx decompression code *
* from unlzx. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU General Public License as published by *
* the Free Software Foundation; either version 2 of the License, or *
* (at your option) any later version. Note that an exemption to this *
* license has been granted by Stuart Caie for the purposes of *
* distribution with chmlib. This does not, to the best of my *
* knowledge, constitute a change in the license of this (the LZX) code *
* in general. *
* *
***************************************************************************/
#ifndef INCLUDED_LZX_H
#define INCLUDED_LZX_H
#ifdef __cplusplus
extern "C" {
#endif
/* return codes */
#define DECR_OK (0)
#define DECR_DATAFORMAT (1)
#define DECR_ILLEGALDATA (2)
#define DECR_NOMEMORY (3)
/* opaque state structure */
struct LZXstate;
/* create an lzx state object */
struct LZXstate *LZXinit(int window);
/* destroy an lzx state object */
void LZXteardown(struct LZXstate *pState);
/* reset an lzx stream */
int LZXreset(struct LZXstate *pState);
/* decompress an LZX compressed block */
int LZXdecompress(struct LZXstate *pState,
unsigned char *inpos,
unsigned char *outpos,
int inlen,
int outlen);
#ifdef __cplusplus
}
#endif
#endif /* INCLUDED_LZX_H */
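
The "intel E8 decoding" step at the end of LZXdecompress can be read in isolation. What follows is a minimal sketch rather than the library's own code: the helper name e8_translate and its parameters are hypothetical, and it assumes 32-bit two's-complement integers. It mirrors the loop above: every 0xE8 (x86 CALL) opcode in the scanned part of the frame has its 4-byte little-endian operand converted from an absolute target back to a relative displacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of lzx.c): undo the E8 preprocessing on one
 * output frame. curpos is the frame's offset in the uncompressed stream and
 * filesize is the translation size read from the LZX header. */
static void e8_translate(uint8_t *data, size_t len, int32_t curpos, int32_t filesize)
{
    uint8_t *end;

    assert(data != NULL);
    if (len <= 10) return;          /* nothing to scan in very short frames */
    end = data + (len - 10);        /* the last 10 bytes are never scanned */

    while (data < end) {
        int32_t abs_off, rel_off;
        uint32_t out;

        if (*data++ != 0xE8) { curpos++; continue; }

        /* little-endian 32-bit operand that follows the opcode */
        abs_off = (int32_t)((uint32_t)data[0] | ((uint32_t)data[1] << 8) |
                            ((uint32_t)data[2] << 16) | ((uint32_t)data[3] << 24));

        /* only operands inside the translation window are rewritten */
        if (abs_off >= -curpos && abs_off < filesize) {
            rel_off = (abs_off >= 0) ? abs_off - curpos : abs_off + filesize;
            out = (uint32_t)rel_off;
            data[0] = (uint8_t)out;
            data[1] = (uint8_t)(out >> 8);
            data[2] = (uint8_t)(out >> 16);
            data[3] = (uint8_t)(out >> 24);
        }
        data += 4;                  /* skip the operand */
        curpos += 5;                /* opcode byte plus four operand bytes */
    }
}
```

As in the original loop, curpos still holds the position of the 0xE8 byte itself when the operand is examined, because it is only advanced after the operand has been handled.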
================================================
FILE: ext/CHMLib/src/test_chmLib.c
================================================
/* $Id: test_chmLib.c,v 1.5 2002/10/09 12:38:12 jedwin Exp $ */
/***************************************************************************
* test_chmLib.c - CHM archive test driver *
* ------------------- *
* *
* author: Jed Wing *
* notes: This is the quick-and-dirty test driver for the chm lib *
* routines. The program takes as its inputs the path to a *
* .chm file, a path within the .chm file, and a destination *
* path. It attempts to open the .chm file, locate the *
* desired file in the archive, and extract to the specified *
* destination. *
* *
* It is not included as a particularly useful program, but *
* rather as a sort of "simplest possible" example of how to *
* use the resolve/retrieve portion of the API. *
***************************************************************************/
/***************************************************************************
* *
* This program is free software; you can redistribute it and/or modify *
* it under the terms of the GNU Lesser General Public License as *
* published by the Free Software Foundation; either version 2.1 of the *
* License, or (at your option) any later version. *
* *
***************************************************************************/
#include "chm_lib.h"
#ifdef WIN32
#include <malloc.h>
#endif
#include <stdio.h>
#include <stdlib.h>
int main(int c, char **v)
{
struct chmFile *h;
struct chmUnitInfo ui;
if (c < 4)
{
fprintf(stderr, "usage: %s <chmfile> <filename> <destfile>\n", v[0]);
exit(1);
}
h = chm_open(v[1]);
if (h == NULL)
{
fprintf(stderr, "failed to open %s\n", v[1]);
exit(1);
}
printf("resolving %s\n", v[2]);
if (CHM_RESOLVE_SUCCESS == chm_resolve_object(h,
v[2],
&ui))
{
#ifdef WIN32
unsigned char *buffer = (unsigned char *)alloca((unsigned int)ui.length);
#else
unsigned char buffer[ui.length];
#endif
LONGINT64 gotLen;
FILE *fout;
printf(" object: <%d, %lu, %lu>\n",
ui.space,
(unsigned long)ui.start,
(unsigned long)ui.length);
printf("extracting to '%s'\n", v[3]);
gotLen = chm_retrieve_object(h, &ui, buffer, 0, ui.length);
if (gotLen == 0)
{
printf(" extract failed\n");
return 2;
}
else if ((fout = fopen(v[3], "wb")) == NULL)
{
printf(" create failed\n");
return 3;
}
else
{
/* write only the bytes actually retrieved, which may be fewer than ui.length */
fwrite(buffer, 1, (unsigned int)gotLen, fout);
fclose(fout);
printf(" finished\n");
}
}
else
printf(" failed\n");
return 0;
}
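
The driver above ignores the return values of fwrite and fclose, so a short write would go unnoticed. One way the final "extract to the specified destination" step could be hardened is sketched below. The helper name save_buffer is hypothetical and is not part of chmlib; it uses only standard C I/O.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of chmlib): write len bytes to path.
 * Returns 0 on success and -1 if the file cannot be created, the write
 * is short, or closing the file fails. */
static int save_buffer(const char *path, const unsigned char *buf, size_t len)
{
    FILE *fout;
    size_t written;
    int close_failed;

    assert(path != NULL && buf != NULL);

    fout = fopen(path, "wb");
    if (fout == NULL) return -1;

    written = fwrite(buf, 1, len, fout);
    /* fclose flushes buffered data, so its result matters as well */
    close_failed = (fclose(fout) != 0);

    return (written == len && !close_failed) ? 0 : -1;
}
```

In the driver, a call such as save_buffer(v[3], buffer, (size_t)gotLen) would stand in for the fopen/fwrite/fclose sequence, with a non-zero result reported as a failure.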
================================================
FILE: ext/_patches/CHMLib.patch
================================================
diff -rPu5 CHMLib.orig\src\chm_lib.c CHMLib\src\chm_lib.c
--- CHMLib.orig\src\chm_lib.c Fri Jul 03 08:34:54 2009
+++ CHMLib\src\chm_lib.c Tue Apr 09 20:53:17 2013
@@ -92,11 +92,11 @@
#ifdef WIN32
#define CHM_ACQUIRE_LOCK(a) do { \
EnterCriticalSection(&(a)); \
} while(0)
#define CHM_RELEASE_LOCK(a) do { \
- EnterCriticalSection(&(a)); \
+ LeaveCriticalSection(&(a)); \
} while(0)
#else
#include <pthread.h>
@@ -410,10 +410,14 @@
return 0;
}
else
dest->data_offset = dest->dir_offset + dest->dir_len;
+ /* SumatraPDF: sanity check (huge values are usually due to broken files) */
+ if (dest->dir_offset > UINT_MAX || dest->dir_len > UINT_MAX)
+ return 0;
+
return 1;
}
/* structure of ITSP headers */
#define _CHM_ITSP_V1_LEN (0x54)
@@ -466,10 +470,13 @@
return 0;
if (dest->version != 1)
return 0;
if (dest->header_len != _CHM_ITSP_V1_LEN)
return 0;
+ /* SumatraPDF: sanity check */
+ if (dest->block_len == 0)
+ return 0;
return 1;
}
/* structure of PMGL headers */
@@ -484,15 +491,19 @@
Int32 block_next; /* 10 */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_pmgl_header(unsigned char **pData,
unsigned int *pDataLen,
+ unsigned int blockLen,
struct chmPmglHeader *dest)
{
/* we only know how to deal with a 0x14 byte structures */
if (*pDataLen != _CHM_PMGL_LEN)
return 0;
+ /* SumatraPDF: sanity check */
+ if (blockLen < _CHM_PMGL_LEN)
+ return 0;
/* unmarshal fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_uint32 (pData, pDataLen, &dest->free_space);
_unmarshal_uint32 (pData, pDataLen, &dest->unknown_0008);
@@ -500,10 +511,13 @@
_unmarshal_int32 (pData, pDataLen, &dest->block_next);
/* check structure */
if (memcmp(dest->signature, _chm_pmgl_marker, 4) != 0)
return 0;
+ /* SumatraPDF: sanity check */
+ if (dest->free_space > blockLen - _CHM_PMGL_LEN)
+ return 0;
return 1;
}
/* structure of PMGI headers */
@@ -515,23 +529,30 @@
UInt32 free_space; /* 4 */
}; /* __attribute__ ((aligned (1))); */
static int _unmarshal_pmgi_header(unsigned char **pData,
unsigned int *pDataLen,
+ unsigned int blockLen,
struct chmPmgiHeader *dest)
{
/* we only know how to deal with a 0x8 byte structures */
if (*pDataLen != _CHM_PMGI_LEN)
return 0;
+ /* SumatraPDF: sanity check */
+ if (blockLen < _CHM_PMGI_LEN)
+ return 0;
/* unmarshal fields */
_unmarshal_char_array(pData, pDataLen, dest->signature, 4);
_unmarshal_uint32 (pData, pDataLen, &dest->free_space);
/* check structure */
if (memcmp(dest->signature, _chm_pmgi_marker, 4) != 0)
return 0;
+ /* SumatraPDF: sanity check */
+ if (dest->free_space > blockLen - _CHM_PMGI_LEN)
+ return 0;
return 1;
}
/* structure of LZXC reset table */
@@ -565,10 +586,15 @@
_unmarshal_uint64 (pData, pDataLen, &dest->block_len);
/* check structure */
if (dest->version != 2)
return 0;
+ /* SumatraPDF: sanity check (huge values are usually due to broken files) */
+ if (dest->uncompressed_len > UINT_MAX || dest->compressed_len > UINT_MAX)
+ return 0;
+ if (dest->block_len == 0 || dest->block_len > UINT_MAX)
+ return 0;
return 1;
}
/* structure of LZXC control data block */
@@ -936,10 +962,12 @@
!_unmarshal_lzxc_control_data(&sbufpos, &sremain,
&ctlData))
{
newHandle->compression_enabled = 0;
}
+ else /* SumatraPDF: prevent division by zero */
+ {
newHandle->window_size = ctlData.windowSize;
newHandle->reset_interval = ctlData.resetInterval;
/* Jed, Mon Jun 28: Experimentally, it appears that the reset block count */
@@ -951,10 +979,11 @@
#else
newHandle->reset_blkcount = newHandle->reset_interval /
(newHandle->window_size / 2) *
ctlData.windowsPerReset;
#endif
+ }
}
/* initialize cache */
chm_set_param(newHandle, CHM_PARAM_MAX_BLOCKS_CACHED,
CHM_MAX_BLOCKS_CACHED);
@@ -1172,11 +1201,11 @@
char buffer[CHM_MAX_PATHLEN+1];
/* figure out where to start and end */
cur = page_buf;
hremain = _CHM_PMGL_LEN;
- if (! _unmarshal_pmgl_header(&cur, &hremain, &header))
+ if (! _unmarshal_pmgl_header(&cur, &hremain, block_len, &header))
return NULL;
end = page_buf + block_len - (header.free_space);
/* now, scan progressively */
while (cur < end)
@@ -1216,11 +1245,11 @@
char buffer[CHM_MAX_PATHLEN+1];
/* figure out where to start and end */
cur = page_buf;
hremain = _CHM_PMGI_LEN;
- if (! _unmarshal_pmgi_header(&cur, &hremain, &header))
+ if (! _unmarshal_pmgi_header(&cur, &hremain, block_len, &header))
return -1;
end = page_buf + block_len - (header.free_space);
/* now, scan progressively */
while (cur < end)
@@ -1406,11 +1435,11 @@
for (i = blockAlign; i > 0; i--)
{
UInt32 curBlockIdx = block - i;
/* check if we most recently decompressed the previous block */
- if (h->lzx_last_block != curBlockIdx)
+ if (h->lzx_last_block != (int)curBlockIdx)
{
if ((curBlockIdx % h->reset_blkcount) == 0)
{
#ifdef CHM_DEBUG
fprintf(stderr, "***RESET (1)***\n");
@@ -1543,10 +1572,16 @@
h->lzx_state = LZXinit(window_size);
}
/* decompress some data */
gotLen = _chm_decompress_block(h, nBlock, &ubuffer);
+ /* SumatraPDF: check return value */
+ if (gotLen == (UInt64)-1)
+ {
+ CHM_RELEASE_LOCK(h->lzx_mutex);
+ return 0;
+ }
if (gotLen < nLen)
nLen = gotLen;
memcpy(buf, ubuffer+nOffset, (unsigned int)nLen);
CHM_RELEASE_LOCK(h->lzx_mutex);
return nLen;
@@ -1654,11 +1689,11 @@
}
/* figure out start and end for this page */
cur = page_buf;
lenRemain = _CHM_PMGL_LEN;
- if (! _unmarshal_pmgl_header(&cur, &lenRemain, &header))
+ if (! _unmarshal_pmgl_header(&cur, &lenRemain, h->block_len, &header))
{
free(page_buf);
return 0;
}
end = page_buf + h->block_len - (header.free_space);
@@ -1803,11 +1838,11 @@
}
/* figure out start and end for this page */
cur = page_buf;
lenRemain = _CHM_PMGL_LEN;
- if (! _unmarshal_pmgl_header(&cur, &lenRemain, &header))
+ if (! _unmarshal_pmgl_header(&cur, &lenRemain, h->block_len, &header))
{
free(page_buf);
return 0;
}
end = page_buf + h->block_len - (header.free_space);
diff -rPu5 CHMLib.orig\src\lzx.c CHMLib\src\lzx.c
--- CHMLib.orig\src\lzx.c Fri Jul 03 08:34:54 2009
+++ CHMLib\src\lzx.c Mon Oct 01 12:36:01 2012
@@ -175,11 +175,11 @@
/* if a previously allocated window is big enough, keep it */
if (window < 15 || window > 21) return NULL;
/* allocate state and associated window */
pState = (struct LZXstate *)malloc(sizeof(struct LZXstate));
- if (!(pState->window = (UBYTE *)malloc(wndsize)))
+ if (!pState || !(pState->window = (UBYTE *)malloc(wndsize)))
{
free(pState);
return NULL;
}
pState->actual_size = wndsize;
================================================
FILE: ext/_patches/bzip2.patch
================================================
diff -rPu5 bzip2.orig\bz_internal_error.c bzip2\bz_internal_error.c
--- bzip2.orig\bz_internal_error.c Thu Jan 01 01:00:00 1970
+++ bzip2\bz_internal_error.c Sun Feb 05 23:13:51 2012
@@ -0,0 +1,8 @@
+/* Use when compiling with BZ_NO_STDIO */
+
+#include <assert.h>
+
+void bz_internal_error(int errcode)
+{
+ assert(0);
+}
diff -rPu5 bzip2.orig\bzip_all.c bzip2\bzip_all.c
--- bzip2.orig\bzip_all.c Thu Jan 01 01:00:00 1970
+++ bzip2\bzip_all.c Tue Mar 19 10:17:15 2013
@@ -0,0 +1,8 @@
+#include "blocksort.c"
+#include "bzlib.c"
+#include "compress.c"
+#include "crctable.c"
+#include "decompress.c"
+#include "huffman.c"
+#include "randtable.c"
+#include "bz_internal_error.c"
================================================
FILE: ext/_patches/freetype2.patch
================================================
diff -rPu5 freetype2.orig\include\config\ftstdlib.h freetype2\include\config\ftstdlib.h
--- freetype2.orig\include\config\ftstdlib.h Thu Jan 01 15:45:12 2015
+++ freetype2\include\config\ftstdlib.h Thu Jan 01 15:48:53 2015
@@ -106,10 +106,16 @@
#define ft_fread fread
#define ft_fseek fseek
#define ft_ftell ftell
#define ft_sprintf sprintf
+/* cf. http://lists.gnu.org/archive/html/freetype/2006-09/msg00036.html */
+#ifdef _WIN32
+#undef ft_fopen
+#define ft_fopen ft_fopen_win32
+#endif
+
/**********************************************************************/
/* */
/* sorting */
/* */
diff -rPu5 freetype2.orig\include\ftsystem.h freetype2\include\ftsystem.h
--- freetype2.orig\include\ftsystem.h Thu Jan 01 15:45:12 2015
+++ freetype2\include\ftsystem.h Thu Jan 01 15:49:09 2015
@@ -345,10 +345,16 @@
} FT_StreamRec;
/* */
+/* cf. http://lists.gnu.org/archive/html/freetype/2006-09/msg00036.html */
+#ifdef _WIN32
+FT_FILE* ft_fopen_win32(const char *fname, const char *mode);
+#endif
+
+
FT_END_HEADER
#endif /* __FTSYSTEM_H__ */
diff -rPu5 freetype2.orig\src\base\ftsystem.c freetype2\src\base\ftsystem.c
--- freetype2.orig\src\base\ftsystem.c Thu Jan 01 15:45:12 2015
+++ freetype2\src\base\ftsystem.c Thu Jan 01 15:49:23 2015
@@ -315,6 +315,29 @@
#endif
ft_sfree( memory );
}
+/* cf. http://lists.gnu.org/archive/html/freetype/2006-09/msg00036.html */
+#ifdef _WIN32
+#include <windows.h>
+
+ FT_FILE* ft_fopen_win32(const char *fname, const char *mode)
+ {
+ // First try fopen, assuming nothing about character encodings.
+ FT_FILE *file = fopen(fname, mode);
+ if (!file)
+ {
+ // fopen failed. Assume the filename is UTF-8, convert to UTF-16, and try _wfopen.
+ WCHAR fnameW[MAX_PATH], modeW[8];
+ if (MultiByteToWideChar(CP_UTF8, 0, fname, -1, fnameW, _countof(fnameW)) &&
+ MultiByteToWideChar(CP_UTF8, 0, mode, -1, modeW, _countof(modeW)))
+ {
+ file = _wfopen(fnameW, modeW);
+ }
+ }
+ return file;
+ }
+#endif
+
+
/* END */
diff -rPu5 freetype2.orig\src\sfnt\sfobjs.c freetype2\src\sfnt\sfobjs.c
--- freetype2.orig\src\sfnt\sfobjs.c Thu Jan 01 15:45:12 2015
+++ freetype2\src\sfnt\sfobjs.c Thu Jan 01 15:49:41 2015
@@ -1077,10 +1077,14 @@
get_glyph_metrics )
{
face->horizontal.number_Of_HMetrics = 0;
error = FT_Err_Ok;
}
+#else /* cf. https://code.google.com/p/sumatrapdf/issues/detail?id=2778 */
+ FT_ERROR(("sfnt_load_face: horizontal metrics (hmtx) table missing\n"));
+ face->horizontal.number_Of_HMetrics = 0;
+ error = FT_Err_Ok;
#endif
}
}
else if ( FT_ERR_EQ( error, Table_Missing ) )
{
================================================
FILE: ext/_patches/libdjvu.patch
================================================
diff -rPu5 libdjvu.orig\ddjvuapi.cpp libdjvu\ddjvuapi.cpp
--- libdjvu.orig\ddjvuapi.cpp Tue May 08 04:56:53 2012
+++ libdjvu\ddjvuapi.cpp Mon Aug 11 14:45:43 2014
@@ -1051,10 +1051,49 @@
int cache)
{
return ddjvu_document_create_by_filename_imp(ctx,filename,cache,1);
}
+/* SumatraPDF: ddjvu_document_create_by_data */
+ddjvu_document_t *
+ddjvu_document_create_by_data(ddjvu_context_t *ctx,
+ const char *data,
+ unsigned long datalen)
+{
+ ddjvu_document_t *d = 0;
+ G_TRY
+ {
+ d = new ddjvu_document_s;
+ ref(d);
+ GMonitorLock lock(&d->monitor);
+ d->streams[0] = DataPool::create();
+ d->streamid = -1;
+ d->fileflag = false;
+ d->docinfoflag = false;
+ d->pageinfoflag = false;
+ d->myctx = ctx;
+ d->mydoc = 0;
+ d->doc = DjVuDocument::create_noinit();
+ ddjvu_stream_write(d, 0, data, datalen);
+ ddjvu_stream_close(d, 0, 0);
+ GUTF8String s;
+ s.format("ddjvu:///doc%d/index.djvu", ++(ctx->uniqueid));
+ GURL gurl = s;
+ d->urlflag = false;
+ d->doc->start_init(gurl, d, 0);
+ }
+ G_CATCH(ex)
+ {
+ if (d)
+ unref(d);
+ d = 0;
+ ERROR1(ctx, ex);
+ }
+ G_ENDCATCH;
+ return d;
+}
+
ddjvu_job_t *
ddjvu_document_job(ddjvu_document_t *document)
{
return document;
}
@@ -3841,10 +3880,12 @@
{
if (file->get_flags() & DjVuFile::STOPPED)
return miniexp_status(DDJVU_JOB_STOPPED);
return miniexp_status(DDJVU_JOB_FAILED);
}
+ /* SumatraPDF: TODO: how to prevent a potentially infinite loop? */
+ return miniexp_status(DDJVU_JOB_FAILED);
}
return miniexp_dummy;
}
// Access annotation data
return get_bytestream_anno(file->get_merged_anno());
@@ -4090,5 +4131,10 @@
{
return document->doc;
}
+/* SumatraPDF: access to free() mirroring malloc() above */
+void ddjvu_free(void *ptr)
+{
+ free(ptr);
+}
diff -rPu5 libdjvu.orig\ddjvuapi.h libdjvu\ddjvuapi.h
--- libdjvu.orig\ddjvuapi.h Tue May 08 04:56:53 2012
+++ libdjvu\ddjvuapi.h Mon Aug 11 14:49:17 2014
@@ -528,10 +528,22 @@
DDJVUAPI ddjvu_document_t *
ddjvu_document_create_by_filename_utf8(ddjvu_context_t *context,
const char *filename,
int cache);
+
+/* SumatraPDF: ddjvu_document_create_by_data ---
+ Creates a document from in-memory data
+ (needed as an alternative to ddjvu_document_create when
+ compiling libdjvu without thread support) */
+
+DDJVUAPI ddjvu_document_t *
+ddjvu_document_create_by_data(ddjvu_context_t *context,
+ const char *data,
+ unsigned long datalen);
+
+
/* ddjvu_document_job ---
Access the job object in charge of decoding the document header.
In fact <ddjvu_document_t> is a subclass of <ddjvu_job_t>
and this function is a type cast. */
@@ -1673,7 +1685,10 @@
DDJVUAPI GP<DjVuDocument>
ddjvu_get_DjVuDocument(ddjvu_document_t *document);
# endif
# endif
#endif
+
+/* SumatraPDF: implementation of ddjvu_free mentioned above */
+void ddjvu_free(void *ptr);
#endif /* DDJVUAPI_H */
diff -rPu5 libdjvu.orig\djvu_all.cpp libdjvu\djvu_all.cpp
--- libdjvu.orig\djvu_all.cpp Thu Jan 01 01:00:00 1970
+++ libdjvu\djvu_all.cpp Tue May 14 21:21:21 2013
@@ -0,0 +1,44 @@
+#include "Arrays.cpp"
+#include "atomic.cpp"
+#include "BSByteStream.cpp"
+#include "BSEncodeByteStream.cpp"
+#include "ByteStream.cpp"
+#include "DataPool.cpp"
+#include "DjVmDir0.cpp"
+#include "DjVmDoc.cpp"
+#include "DjVmNav.cpp"
+#include "DjVuAnno.cpp"
+#include "DjVuDumpHelper.cpp"
+#include "DjVuErrorList.cpp"
+#include "DjVuFile.cpp"
+#include "DjVuFileCache.cpp"
+#include "DjVuGlobal.cpp"
+#include "DjVuGlobalMemory.cpp"
+#include "DjVuImage.cpp"
+#include "DjVuInfo.cpp"
+#include "DjVuMessage.cpp"
+#include "DjVuNavDir.cpp"
+#include "DjVuPalette.cpp"
+#include "DjVuPort.cpp"
+#include "DjVuText.cpp"
+#include "GBitmap.cpp"
+#include "GContainer.cpp"
+#include "GException.cpp"
+#include "GIFFManager.cpp"
+#include "GOS.cpp"
+#include "GRect.cpp"
+#include "GSmartPointer.cpp"
+#include "GString.cpp"
+#include "GThreads.cpp"
+#include "GUnicode.cpp"
+#include "IFFByteStream.cpp"
+#include "JB2EncodeCodec.cpp"
+#include "DjVmDir.cpp"
+#include "MMRDecoder.cpp"
+#include "MMX.cpp"
+#include "UnicodeByteStream.cpp"
+#include "XMLTags.cpp"
+#include "ZPCodec.cpp"
+#include "ddjvuapi.cpp"
+#include "debug.cpp"
+
diff -rPu5 libdjvu.orig\DjVuGlobal.h libdjvu\DjVuGlobal.h
--- libdjvu.orig\DjVuGlobal.h Tue May 08 04:56:53 2012
+++ libdjvu\DjVuGlobal.h Thu Dec 27 14:30:53 2012
@@ -70,11 +70,12 @@
# include
#else
# include
#endif
-#ifdef WIN32
+// SumatraPDF: allow to build as a static library (built-in)
+#ifdef WIN32_AND_NOT_STATIC
# ifdef DLL_EXPORT
# define DJVUAPI __declspec(dllexport)
# else
# define DJVUAPI __declspec(dllimport)
# endif
diff -rPu5 libdjvu.orig\DjVuMessage.cpp libdjvu\DjVuMessage.cpp
--- libdjvu.orig\DjVuMessage.cpp Tue May 08 04:56:53 2012
+++ libdjvu\DjVuMessage.cpp Mon Aug 12 21:07:40 2013
@@ -498,10 +498,11 @@
static GUTF8String
parse(GMap > &retval)
{
GUTF8String errors;
GPList body;
+ if (0) /* SumatraPDF: don't bother looking for messages.xml and languages.xml using broken code */
{
GList paths=DjVuMessage::GetProfilePaths();
GMap map;
GUTF8String m(MessageFile);
errors=getbodies(paths,m,body,map);
diff -rPu5 libdjvu.orig\DjVuPalette.cpp libdjvu\DjVuPalette.cpp
--- libdjvu.orig\DjVuPalette.cpp Tue May 08 04:56:53 2012
+++ libdjvu\DjVuPalette.cpp Fri Oct 25 13:39:08 2013
@@ -96,13 +96,16 @@
inline unsigned char
umin(unsigned char a, unsigned char b)
{ return (a>b) ? b : a; }
+/* SumatraPDF: in VS 2013 math.h already defines fmin */
+#if !defined(_MSC_VER) || (_MSC_VER < 1800)
inline float
fmin(float a, float b)
{ return (a>b) ? b : a; }
+#endif
// ------- DJVUPALETTE
diff -rPu5 libdjvu.orig\GException.cpp libdjvu\GException.cpp
--- libdjvu.orig\GException.cpp Tue May 08 04:56:53 2012
+++ libdjvu\GException.cpp Tue May 14 21:21:21 2013
@@ -251,10 +251,12 @@
// ------ MEMORY MANAGEMENT HANDLER
+/* SumatraPDF: prevent exception handler overriding when not building stand-alone libdjvu */
+#ifdef ALLOW_GLOBAL_OOM_HANDLING
#ifndef NEED_DJVU_MEMORY
// This is not activated when C++ memory management
// is overidden. The overriding functions handle
// memory exceptions by themselves.
# if defined(_MSC_VER)
@@ -271,10 +273,11 @@
static void (*old_handler)() = set_new_handler(throw_memory_error);
# endif // HAVE_STDINCLUDES
# endif // ! WIN32
# endif // !_MSC_VER
#endif // !NEED_DJVU_MEMORY
+#endif
#ifdef HAVE_NAMESPACES
}
# ifndef NOT_USING_DJVU_NAMESPACE
diff -rPu5 libdjvu.orig\GException.h libdjvu\GException.h
--- libdjvu.orig\GException.h Tue May 08 04:56:53 2012
+++ libdjvu\GException.h Tue May 14 21:21:21 2013
@@ -310,12 +310,13 @@
#ifdef __GNUG__
#define G_THROW_TYPE(msg,xtype) GExceptionHandler::emthrow \
(GException(msg, __FILE__, __LINE__, __PRETTY_FUNCTION__, xtype))
#define G_EMTHROW(ex) GExceptionHandler::emthrow(ex)
#else
+// SumatraPDF: don't collect messages, file and line for smaller size
#define G_THROW_TYPE(m,xtype) GExceptionHandler::emthrow \
- (GException(m, __FILE__, __LINE__,0, xtype))
+ (GException(0, 0, 0, 0, xtype))
#define G_EMTHROW(ex) GExceptionHandler::emthrow(ex)
#endif
#endif // !CPP_SUPPORTS_EXCEPTIONS
diff -rPu5 libdjvu.orig\GThreads.h libdjvu\GThreads.h
--- libdjvu.orig\GThreads.h Tue May 08 04:56:53 2012
+++ libdjvu\GThreads.h Sat Aug 18 20:17:23 2012
@@ -105,10 +105,13 @@
#include "GException.h"
#define NOTHREADS 0
#define POSIXTHREADS 10
#define WINTHREADS 11
+/* SumatraPDF: prevent these constants from being confused with NOTHREADS */
+#define MACTHREADS -1
+#define COTHREADS -1
// Known platforms
#ifndef THREADMODEL
#if defined(WIN32)
#define THREADMODEL WINTHREADS
diff -rPu5 libdjvu.orig\GURL.cpp libdjvu\GURL.cpp
--- libdjvu.orig\GURL.cpp Tue May 08 04:56:53 2012
+++ libdjvu\GURL.cpp Sun Dec 16 16:00:23 2012
@@ -482,11 +482,11 @@
GURL::protocol(const GUTF8String& url)
{
const char * const url_ptr=url;
const char * ptr=url_ptr;
for(char c=*ptr;
- c && (isalnum(c) || c == '+' || c == '-' || c == '.');
+ c && (isalnum((unsigned char)c) || c == '+' || c == '-' || c == '.');
c=*(++ptr)) EMPTY_LOOP;
if (ptr[0]==colon && ptr[1]=='/' && ptr[2]=='/')
return GUTF8String(url_ptr, ptr-url_ptr);
return GUTF8String();
}
diff -rPu5 libdjvu.orig\IW44Image.cpp libdjvu\IW44Image.cpp
--- libdjvu.orig\IW44Image.cpp Tue May 08 04:56:53 2012
+++ libdjvu\IW44Image.cpp Sat Jul 26 23:31:55 2014
@@ -682,10 +682,13 @@
void
IW44Image::Map::image(signed char *img8, int rowsize, int pixsep, int fast)
{
// Allocate reconstruction buffer
short *data16;
+ // cf. http://sourceforge.net/p/djvu/djvulibre-git/ci/7993b445f071a15248bd4be788a10643213cb9d2/
+ if (INT_MAX / bw < bh)
+ G_THROW("IW44Image: image size exceeds maximum (corrupted file?)");
GPBuffer gdata16(data16,bw*bh);
// Copy coefficients
int i;
short *p = data16;
const IW44Image::Block *block = blocks;
diff -rPu5 libdjvu.orig\miniexp.cpp libdjvu\miniexp.cpp
--- libdjvu.orig\miniexp.cpp Tue May 08 04:56:53 2012
+++ libdjvu\miniexp.cpp Tue May 14 21:21:21 2013
@@ -899,11 +899,12 @@
}
int
miniexp_stringp(miniexp_t p)
{
- return miniexp_isa(p, ministring_t::classname) ? 1 : 0;
+ // SumatraPDF: don't execute code until asked to
+ return miniexp_isa(p, miniexp_symbol("string")) ? 1 : 0;
}
const char *
miniexp_to_str(miniexp_t p)
{
@@ -1333,10 +1334,13 @@
}
/* ---- PNAME */
+// SumatraPDF: don't compile as it's not used and it's the only place
+// using try/catch, which is not compatible with compiling as /EHs-c-
+#if 0
static int
pname_fputs(miniexp_io_t *io, const char *s)
{
char *b = (char*)(io->data[0]);
size_t l = (size_t)(io->data[2]);
@@ -1380,10 +1384,11 @@
{
delete [] (char*)(io.data[0]);
}
return r;
}
+#endif
/* ---- INPUT */
static void
diff -rPu5 libdjvu.orig\miniexp.h libdjvu\miniexp.h
--- libdjvu.orig\miniexp.h Tue May 08 04:56:53 2012
+++ libdjvu\miniexp.h Sat Aug 18 20:14:08 2012
@@ -679,15 +679,16 @@
public: static const miniexp_t classname; \
virtual miniexp_t classof() const; \
virtual bool isa(miniexp_t) const;
#define MINIOBJ_IMPLEMENT(cls, supercls, name)\
- const miniexp_t cls::classname = miniexp_symbol(name);\
+ /* SumatraPDF: don't execute code until asked to */\
+ const miniexp_t cls::classname = 0;\
miniexp_t cls::classof() const {\
- return cls::classname; }\
+ return miniexp_symbol(name); }\
bool cls::isa(miniexp_t n) const {\
- return (cls::classname==n) || (supercls::isa(n)); }
+ return (classof()==n) || (supercls::isa(n)); }
/* miniexp_to_obj --
Returns a pointer to the object represented by an lisp
expression. Returns NULL if the expression is not an
================================================
FILE: ext/_patches/libjpeg-turbo.patch
================================================
diff -rPu5 libjpeg-turbo.orig\config.h libjpeg-turbo\config.h
--- libjpeg-turbo.orig\config.h Thu Jan 01 01:00:00 1970
+++ libjpeg-turbo\config.h Sun Apr 20 14:13:22 2014
@@ -0,0 +1,13 @@
+#define VERSION 1.3.1
+#define BUILD 0
+#define PACKAGE_NAME "libjpeg-turbo"
+
+#ifndef INLINE
+#if defined(__GNUC__)
+#define INLINE __attribute__((always_inline))
+#elif defined(_MSC_VER)
+#define INLINE __forceinline
+#else
+#define INLINE
+#endif
+#endif
diff -rPu5 libjpeg-turbo.orig\jconfig.h libjpeg-turbo\jconfig.h
--- libjpeg-turbo.orig\jconfig.h Thu Jan 01 01:00:00 1970
+++ libjpeg-turbo\jconfig.h Sun Apr 20 14:13:14 2014
@@ -0,0 +1,43 @@
+/* jconfig.vc --- jconfig.h for Microsoft Visual C++ on Windows 95 or NT. */
+/* see jconfig.txt for explanations */
+
+#define JPEG_LIB_VERSION 80
+#define LIBJPEG_TURBO_VERSION 1.3.1
+#define C_ARITH_CODING_SUPPORTED
+#define D_ARITH_CODING_SUPPORTED
+
+#define HAVE_PROTOTYPES
+#define HAVE_UNSIGNED_CHAR
+#define HAVE_UNSIGNED_SHORT
+/* #define void char */
+/* #define const */
+#undef CHAR_IS_UNSIGNED
+#define HAVE_STDDEF_H
+#define HAVE_STDLIB_H
+#undef NEED_BSD_STRINGS
+#undef NEED_SYS_TYPES_H
+#undef NEED_FAR_POINTERS /* we presume a 32-bit flat memory model */
+#undef NEED_SHORT_EXTERNAL_NAMES
+#undef INCOMPLETE_TYPES_BROKEN
+
+/* Define "boolean" as unsigned char, not int, per Windows custom */
+#ifndef __RPCNDR_H__ /* don't conflict if rpcndr.h already read */
+typedef unsigned char boolean;
+#endif
+#define HAVE_BOOLEAN /* prevent jmorecfg.h from redefining it */
+
+/* Define "INT32" as int, not long, per Windows custom */
+#if !(defined(_BASETSD_H_) || defined(_BASETSD_H)) /* don't conflict if basetsd.h already read */
+typedef short INT16;
+typedef signed int INT32;
+#endif
+#define XMD_H /* prevent jmorecfg.h from redefining it */
+
+#ifdef JPEG_INTERNALS
+
+#undef RIGHT_SHIFT_IS_UNSIGNED
+
+#endif /* JPEG_INTERNALS */
+
+/* SumatraPDF: enable SIMD under Win32 */
+#define WITH_SIMD
================================================
FILE: ext/_patches/openjpeg.patch
================================================
diff -rPu5 openjpeg.orig\j2k.c openjpeg\j2k.c
--- openjpeg.orig\j2k.c Tue Apr 29 09:15:02 2014
+++ openjpeg\j2k.c Wed Jul 09 02:05:04 2014
@@ -3162,11 +3162,11 @@
&l_cp->tcps[p_j2k->m_current_tile_number] :
p_j2k->m_specific_param.m_decoder.m_default_tcp;
l_old_poc_nb = l_tcp->POC ? l_tcp->numpocs + 1 : 0;
l_current_poc_nb += l_old_poc_nb;
- if(l_current_poc_nb >= 32)
+ if(l_current_poc_nb >= sizeof(l_tcp->pocs) / sizeof(l_tcp->pocs[0]))
{
opj_event_msg(p_manager, EVT_ERROR, "Too many POCs %d\n", l_current_poc_nb);
return OPJ_FALSE;
}
assert(l_current_poc_nb < 32);
@@ -3645,11 +3645,12 @@
p_header_size -= l_N_ppm;
p_header_data += l_N_ppm;
l_cp->ppm_data_read += l_N_ppm; /* Increase the number of data read*/
- if (p_header_size)
+ /* cf. https://code.google.com/p/openjpeg/issues/detail?id=362 */
+ if (p_header_size >= 4)
{
opj_read_bytes(p_header_data,&l_N_ppm,4); /* N_ppm^i */
p_header_data+=4;
p_header_size-=4;
}
@@ -3686,11 +3687,12 @@
/* Need to read an incomplete Ippm series*/
if (l_remaining_data) {
OPJ_BYTE *new_ppm_data;
assert(l_cp->ppm_data == l_cp->ppm_buffer && "We need ppm_data and ppm_buffer to be the same when reallocating");
- new_ppm_data = (OPJ_BYTE *) opj_realloc(l_cp->ppm_data, l_cp->ppm_len + l_N_ppm);
+ /* cf. https://code.google.com/p/openjpeg/issues/detail?id=362 */
+ new_ppm_data = (OPJ_BYTE *) opj_realloc(l_cp->ppm_data, l_cp->ppm_len + l_remaining_data);
if (! new_ppm_data) {
opj_free(l_cp->ppm_data);
l_cp->ppm_data = NULL;
l_cp->ppm_buffer = NULL; /* TODO: no need for a new local variable: ppm_buffer and ppm_data are enough */
l_cp->ppm_len = 0;
@@ -4070,10 +4072,14 @@
"number of tile-part (%d), giving up\n", l_current_part, l_tcp->m_nb_tile_parts );
p_j2k->m_specific_param.m_decoder.m_last_tile_part = 1;
return OPJ_FALSE;
}
}
+ /* cf. https://code.google.com/p/openjpeg/issues/detail?id=254 */
+ if (++l_num_parts < l_tcp->m_nb_tile_parts) {
+ l_num_parts = l_tcp->m_nb_tile_parts;
+ }
if( l_current_part >= l_num_parts ) {
/* testcase 451.pdf.SIGSEGV.ce9.3723 */
opj_event_msg(p_manager, EVT_ERROR, "In SOT marker, TPSot (%d) is not valid regards to the current "
"number of tile-part (header) (%d), giving up\n", l_current_part, l_num_parts );
p_j2k->m_specific_param.m_decoder.m_last_tile_part = 1;
@@ -4314,10 +4320,16 @@
l_current_data = &(l_tcp->m_data);
l_tile_len = &l_tcp->m_data_size;
/* Patch to support new PHR data */
if (p_j2k->m_specific_param.m_decoder.m_sot_length) {
+ /* cf. https://code.google.com/p/openjpeg/issues/detail?id=348 */
+ if (p_j2k->m_specific_param.m_decoder.m_sot_length > opj_stream_get_number_byte_left(p_stream)) {
+ opj_event_msg(p_manager, EVT_ERROR, "Not enough data to decode tile\n");
+ return OPJ_FALSE;
+ }
+
if (! *l_current_data) {
/* LH: oddly enough, in this path, l_tile_len!=0.
* TODO: If this was consistant, we could simplify the code to only use realloc(), as realloc(0,...) default to malloc(0,...).
*/
*l_current_data = (OPJ_BYTE*) opj_malloc(p_j2k->m_specific_param.m_decoder.m_sot_length);
diff -rPu5 openjpeg.orig\opj_config.h openjpeg\opj_config.h
--- openjpeg.orig\opj_config.h Thu Jan 01 01:00:00 1970
+++ openjpeg\opj_config.h Thu May 15 21:53:22 2014
@@ -0,0 +1,5 @@
+// #define OPJ_HAVE_STDINT_H
+
+#define OPJ_VERSION_MAJOR 2
+#define OPJ_VERSION_MINOR 1
+#define OPJ_VERSION_BUILD 0
diff -rPu5 openjpeg.orig\opj_config_private.h openjpeg\opj_config_private.h
--- openjpeg.orig\opj_config_private.h Thu Jan 01 01:00:00 1970
+++ openjpeg\opj_config_private.h Thu May 15 21:56:02 2014
@@ -0,0 +1,11 @@
+// #define OPJ_HAVE_STDINT_H
+
+#define OPJ_PACKAGE_VERSION "2.1.0"
+
+// #define OPJ_HAVE_INTTYPES_H
+// #define OPJ_HAVE_FSEEKO
+
+#define OPJ_STATIC
+#define OPJ_EXPORTS
+
+#define USE_JPIP
diff -rPu5 openjpeg.orig\opj_malloc.h openjpeg\opj_malloc.h
--- openjpeg.orig\opj_malloc.h Tue Apr 29 09:15:02 2014
+++ openjpeg\opj_malloc.h Thu May 15 22:00:14 2014
@@ -53,11 +53,11 @@
#ifdef ALLOC_PERF_OPT
void * OPJ_CALLCONV opj_malloc(size_t size);
#else
/* prevent assertion on overflow for MSVC */
#ifdef _MSC_VER
-#define opj_malloc(size) ((size_t)(size) >= (size_t)-0x100 ? NULL : malloc(size))
+#define opj_malloc(size) ((size_t)(size) >= 0x7ffdefff ? NULL : malloc(size))
#else
#define opj_malloc(size) malloc(size)
#endif
#endif
@@ -70,11 +70,11 @@
#ifdef ALLOC_PERF_OPT
void * OPJ_CALLCONV opj_calloc(size_t _NumOfElements, size_t _SizeOfElements);
#else
/* prevent assertion on overflow for MSVC */
#ifdef _MSC_VER
-#define opj_calloc(num, size) ((size_t)(num) != 0 && (size_t)(num) >= (size_t)-0x100 / (size_t)(size) ? NULL : calloc(num, size))
+#define opj_calloc(num, size) ((size_t)(num) != 0 && (size_t)(num) >= 0x7ffdefff / (size_t)(size) ? NULL : calloc(num, size))
#else
#define opj_calloc(num, size) calloc(num, size)
#endif
#endif
@@ -154,11 +154,11 @@
#ifdef ALLOC_PERF_OPT
void * OPJ_CALLCONV opj_realloc(void * m, size_t s);
#else
/* prevent assertion on overflow for MSVC */
#ifdef _MSC_VER
-#define opj_realloc(m, s) ((size_t)(s) >= (size_t)-0x100 ? NULL : realloc(m, s))
+#define opj_realloc(m, s) ((size_t)(s) >= 0x7ffdefff ? NULL : realloc(m, s))
#else
#define opj_realloc(m, s) realloc(m, s)
#endif
#endif
diff -rPu5 openjpeg.orig\t2.c openjpeg\t2.c
--- openjpeg.orig\t2.c Tue Apr 29 09:15:02 2014
+++ openjpeg\t2.c Thu May 15 22:12:09 2014
@@ -861,14 +861,13 @@
/* SOP markers */
if (p_tcp->csty & J2K_CP_CSTY_SOP) {
if (p_max_length < 6) {
/* TODO opj_event_msg(p_t2->cinfo->event_mgr, EVT_WARNING, "Not enough space for expected SOP marker\n"); */
- printf("Not enough space for expected SOP marker\n");
+ fprintf(stderr, "Not enough space for expected SOP marker\n");
} else if ((*l_current_data) != 0xff || (*(l_current_data + 1) != 0x91)) {
/* TODO opj_event_msg(p_t2->cinfo->event_mgr, EVT_WARNING, "Expected SOP marker\n"); */
- printf("Expected SOP marker\n");
fprintf(stderr, "Error : expected SOP marker\n");
} else {
l_current_data += 6;
}
@@ -1015,10 +1014,15 @@
do {
l_cblk->segs[l_segno].numnewpasses = (OPJ_UINT32)opj_int_min((OPJ_INT32)(l_cblk->segs[l_segno].maxpasses - l_cblk->segs[l_segno].numpasses), n);
l_cblk->segs[l_segno].newlen = opj_bio_read(l_bio, l_cblk->numlenbits + opj_uint_floorlog2(l_cblk->segs[l_segno].numnewpasses));
JAS_FPRINTF(stderr, "included=%d numnewpasses=%d increment=%d len=%d \n", l_included, l_cblk->segs[l_segno].numnewpasses, l_increment, l_cblk->segs[l_segno].newlen );
+ /* testcase 1802.pdf.SIGSEGV.36e.894 */
+ if (l_cblk->segs[l_segno].newlen > *l_modified_length_ptr) {
+ opj_bio_destroy(l_bio);
+ return OPJ_FALSE;
+ }
n -= (OPJ_INT32)l_cblk->segs[l_segno].numnewpasses;
if (n > 0) {
++l_segno;
@@ -1157,10 +1161,11 @@
/* Check if the cblk->data have allocated enough memory */
if ((l_cblk->data_current_size + l_seg->newlen) > l_cblk->data_max_size) {
OPJ_BYTE* new_cblk_data = (OPJ_BYTE*) opj_realloc(l_cblk->data, l_cblk->data_current_size + l_seg->newlen);
if(! new_cblk_data) {
opj_free(l_cblk->data);
+ l_cblk->data = NULL;
l_cblk->data_max_size = 0;
/* opj_event_msg(p_manager, EVT_ERROR, "Not enough memory to realloc code block cata!\n"); */
return OPJ_FALSE;
}
l_cblk->data_max_size = l_cblk->data_current_size + l_seg->newlen;
================================================
FILE: ext/_patches/synctex.patch
================================================
diff -rPu5 synctex.orig\synctex_parser.c synctex\synctex_parser.c
--- synctex.orig\synctex_parser.c Tue Jun 14 15:40:56 2011
+++ synctex\synctex_parser.c Mon Feb 04 17:41:54 2013
@@ -224,37 +224,43 @@
# define SYNCTEX_SET_NEXT_HORIZ_BOX(NODE,NEXT_BOX) if (NODE && NEXT_BOX){\
SYNCTEX_GETTER(NODE,next_box)[0]=NEXT_BOX;\
}
void _synctex_free_node(synctex_node_t node);
-void _synctex_free_leaf(synctex_node_t node);
+/* SumatraPDF: prevent stack overflow */
+# define _synctex_free_leaf _synctex_free_node
/* A node is meant to own its child and sibling.
* It is not owned by its parent, unless it is its first child.
* This destructor is for all nodes with children.
*/
void _synctex_free_node(synctex_node_t node) {
- if (node) {
+ /* SumatraPDF: prevent stack overflow */
+ synctex_node_t next;
+ while (node) {
(*((node->class)->sibling))(node);
- SYNCTEX_FREE(SYNCTEX_SIBLING(node));
+ next = SYNCTEX_SIBLING(node);
SYNCTEX_FREE(SYNCTEX_CHILD(node));
free(node);
+ node = next;
}
return;
}
/* A node is meant to own its child and sibling.
* It is not owned by its parent, unless it is its first child.
* This destructor is for nodes with no child.
*/
+/* SumatraPDF: prevent stack overflow * /
void _synctex_free_leaf(synctex_node_t node) {
if (node) {
SYNCTEX_FREE(SYNCTEX_SIBLING(node));
free(node);
}
return;
}
+*/
# ifdef __SYNCTEX_WORK__
# include "/usr/include/zlib.h"
# else
# include <zlib.h>
# endif
@@ -1418,10 +1424,13 @@
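/* We have current_size+len+1<=UINT_MAX
 * or equivalently new_size<UINT_MAX,
 * where we have assumed that len<UINT_MAX */
if ((* value_ref = realloc(* value_ref,new_size+1)) != NULL) {
if (memcpy((*value_ref)+current_size,SYNCTEX_CUR,len)) {
+ /* SumatraPDF: fix reading files with DOS line endings*/
+ if (new_size > 0 && (*value_ref)[new_size - 1] == '\r')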
+ new_size--;
(* value_ref)[new_size]='\0'; /* Terminate the string */
SYNCTEX_CUR += len;/* Advance to the terminating '\n' */
return SYNCTEX_STATUS_OK;
}
free(* value_ref);
@@ -4146,11 +4155,11 @@
typedef int (*synctex_fprintf_t)(void *, const char * , ...); /* print formatted to either FILE * or gzFile */
# define SYNCTEX_BITS_PER_BYTE 8
struct __synctex_updater_t {
- void *file; /* the foo.synctex or foo.synctex.gz I/O identifier */
+ gzFile file; /* the foo.synctex or foo.synctex.gz I/O identifier */
synctex_fprintf_t fprintf; /* either fprintf or gzprintf */
int length; /* the number of chars appended */
struct _flags {
unsigned int no_gz:1; /* Whether zlib is used or not */
unsigned int reserved:SYNCTEX_BITS_PER_BYTE*sizeof(int)-1; /* Align */
diff -rPu5 synctex.orig\synctex_parser_utils.c synctex\synctex_parser_utils.c
--- synctex.orig\synctex_parser_utils.c Tue Jun 14 10:23:56 2011
+++ synctex\synctex_parser_utils.c Mon Mar 12 19:56:52 2012
@@ -166,10 +166,13 @@
next_character:
if(SYNCTEX_IS_PATH_SEPARATOR(*lhs)) {/* lhs points to a path separator */
if(!SYNCTEX_IS_PATH_SEPARATOR(*rhs)) {/* but not rhs */
return synctex_NO;
}
+ /* SumatraPDF: ignore spurious "./" parts (caused by TeXlive 2011) */
+ lhs = synctex_ignore_leading_dot_slash(lhs + 1) - 1;
+ rhs = synctex_ignore_leading_dot_slash(rhs + 1) - 1;
} else if(SYNCTEX_IS_PATH_SEPARATOR(*rhs)) {/* rhs points to a path separator but not lhs */
return synctex_NO;
} else if(toupper(*lhs) != toupper(*rhs)){/* uppercase do not match */
return synctex_NO;
} else if (!*lhs) {/* lhs is at the end of the string */
================================================
FILE: ext/bzip2/CHANGES
================================================
------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------
0.9.0
~~~~~
First version.
0.9.0a
~~~~~~
Removed 'ranlib' from Makefile, since most modern Unix-es
don't need it, or even know about it.
0.9.0b
~~~~~~
Fixed a problem with error reporting in bzip2.c. This does not affect
the library in any way. Problem is: versions 0.9.0 and 0.9.0a (of the
program proper) compress and decompress correctly, but give misleading
error messages (internal panics) when an I/O error occurs, instead of
reporting the problem correctly. This shouldn't give any data loss
(as far as I can see), but is confusing.
Made the inline declarations disappear for non-GCC compilers.
0.9.0c
~~~~~~
Fixed some problems in the library pertaining to some boundary cases.
This makes the library behave more correctly in those situations. The
fixes apply only to features (calls and parameters) not used by
bzip2.c, so the non-fixedness of them in previous versions has no
effect on reliability of bzip2.c.
In bzlib.c:
* made zero-length BZ_FLUSH work correctly in bzCompress().
* fixed bzWrite/bzRead to ignore zero-length requests.
* fixed bzread to correctly handle read requests after EOF.
* wrong parameter order in call to bzDecompressInit in
bzBuffToBuffDecompress. Fixed.
In compress.c:
* changed setting of nGroups in sendMTFValues() so as to
do a bit better on small files. This _does_ affect
bzip2.c.
0.9.5a
~~~~~~
Major change: add a fallback sorting algorithm (blocksort.c)
to give reasonable behaviour even for very repetitive inputs.
Nuked --repetitive-best and --repetitive-fast since they are
no longer useful.
Minor changes: mostly a whole bunch of small changes/
bugfixes in the driver (bzip2.c). Changes pertaining to the
user interface are:
allow decompression of symlink'd files to stdout
decompress/test files even without .bz2 extension
give more accurate error messages for I/O errors
when compressing/decompressing to stdout, don't catch control-C
read flags from BZIP2 and BZIP environment variables
decline to break hard links to a file unless forced with -f
allow -c flag even with no filenames
preserve file ownerships as far as possible
make -s -1 give the expected block size (100k)
add a flag -q --quiet to suppress nonessential warnings
stop decoding flags after --, so files beginning in - can be handled
resolved inconsistent naming: bzcat or bz2cat ?
bzip2 --help now returns 0
Programming-level changes are:
fixed syntax error in GET_LL4 for Borland C++ 5.02
let bzBuffToBuffDecompress return BZ_DATA_ERROR{_MAGIC}
fix overshoot of mode-string end in bzopen_or_bzdopen
wrapped bzlib.h in #ifdef __cplusplus ... extern "C" { ... }
close file handles under all error conditions
added minor mods so it compiles with DJGPP out of the box
fixed Makefile so it doesn't give problems with BSD make
fix uninitialised memory reads in dlltest.c
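The "wrapped bzlib.h in #ifdef __cplusplus" item above refers to the
usual guard that lets a C header be included from C++ without name
mangling. A minimal sketch of that pattern follows; the function name
demo_version_major is hypothetical and is not part of bzlib.h.

```c
#include <assert.h>

/* Open an extern "C" block only when compiled as C++. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical declaration standing in for a library entry point. */
int demo_version_major(void);

#ifdef __cplusplus
}
#endif

/* Definition, so that the sketch links on its own. */
int demo_version_major(void) { return 1; }
```

A C compiler sees only the plain declaration; a C++ compiler sees the
same declaration with C linkage.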
0.9.5b
~~~~~~
Open stdin/stdout in binary mode for DJGPP.
0.9.5c
~~~~~~
Changed BZ_N_OVERSHOOT to be ... + 2 instead of ... + 1. The + 1
version could cause the sorted order to be wrong in some extremely
obscure cases. Also changed setting of quadrant in blocksort.c.
0.9.5d
~~~~~~
The only functional change is to make bzlibVersion() in the library
return the correct string. This has no effect whatsoever on the
functioning of the bzip2 program or library. Added a couple of casts
so the library compiles without warnings at level 3 in MS Visual
Studio 6.0. Included a Y2K statement in the file Y2K_INFO. All other
changes are minor documentation changes.
1.0
~~~
Several minor bugfixes and enhancements:
* Large file support. The library uses 64-bit counters to
count the volume of data passing through it. bzip2.c
is now compiled with -D_FILE_OFFSET_BITS=64 to get large
file support from the C library. -v correctly prints out
file sizes greater than 4 gigabytes. All these changes have
been made without assuming a 64-bit platform or a C compiler
which supports 64-bit ints, so, except for the C library
aspect, they are fully portable.
* Decompression robustness. The library/program should be
robust to any corruption of compressed data, detecting and
handling _all_ corruption, instead of merely relying on
the CRCs. What this means is that the program should
never crash, given corrupted data, and the library should
always return BZ_DATA_ERROR.
* Fixed an obscure race-condition bug only ever observed on
Solaris, in which, if you were very unlucky and issued
control-C at exactly the wrong time, both input and output
files would be deleted.
* Don't run out of file handles on test/decompression when
large numbers of files have invalid magic numbers.
* Avoid library namespace pollution. Prefix all exported
symbols with BZ2_.
* Minor sorting enhancements from my DCC2000 paper.
* Advance the version number to 1.0, so as to counteract the
(false-in-this-case) impression some people have that programs
with version numbers less than 1.0 are in some way, experimental,
pre-release versions.
* Create an initial Makefile-libbz2_so to build a shared library.
Yes, I know I should really use libtool et al ...
* Make the program exit with 2 instead of 0 when decompression
fails due to a bad magic number (ie, an invalid bzip2 header).
Also exit with 1 (as the manual claims :-) whenever a diagnostic
message would have been printed AND the corresponding operation
is aborted, for example
bzip2: Output file xx already exists.
When a diagnostic message is printed but the operation is not
aborted, for example
bzip2: Can't guess original name for wurble -- using wurble.out
then the exit value 0 is returned, unless some other problem is
also detected.
I think it corresponds more closely to what the manual claims now.
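The large-file item above says the library counts data with 64-bit
counters without assuming a compiler that has 64-bit ints. One way to
do that is to keep the count as two 32-bit halves and carry by hand.
The sketch below illustrates the idea only; the type ByteCount64 and
the function bytecount_add are hypothetical names, not library code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter type: a 64-bit count kept as two 32-bit halves,
   so that no 64-bit integer type is required. */
typedef struct {
   uint32_t lo32;   /* low 32 bits of the byte count  */
   uint32_t hi32;   /* high 32 bits of the byte count */
} ByteCount64;

/* Add n bytes to the counter, carrying into the high half on wrap. */
static void bytecount_add ( ByteCount64* c, uint32_t n )
{
   c->lo32 += n;                 /* unsigned arithmetic wraps mod 2^32 */
   if (c->lo32 < n) c->hi32++;   /* a wrapped sum is smaller than n    */
}
```

The public bz_stream structure exposes its totals in the same split
form (total_in_lo32 / total_in_hi32 and the matching total_out fields).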
1.0.1
~~~~~
* Modified dlltest.c so it uses the new BZ2_ naming scheme.
* Modified makefile-msc to fix minor build probs on Win2k.
* Updated README.COMPILATION.PROBLEMS.
There are no functionality changes or bug fixes relative to version
1.0.0. This is just a documentation update + a fix for minor Win32
build problems. For almost everyone, upgrading from 1.0.0 to 1.0.1 is
utterly pointless. Don't bother.
1.0.2
~~~~~
A bug fix release, addressing various minor issues which have appeared
in the 18 or so months since 1.0.1 was released. Most of the fixes
are to do with file-handling or documentation bugs. To the best of my
knowledge, there have been no data-loss-causing bugs reported in the
compression/decompression engine of 1.0.0 or 1.0.1.
Note that this release does not improve the rather crude build system
for Unix platforms. The general plan here is to autoconfiscate/
libtoolise 1.0.2 soon after release, and release the result as 1.1.0
or perhaps 1.2.0. That, however, is still just a plan at this point.
Here are the changes in 1.0.2. Bug-reporters and/or patch-senders in
parentheses.
* Fix an infinite segfault loop in 1.0.1 when a directory is
encountered in -f (force) mode.
(Trond Eivind Glomsrod, Nicholas Nethercote, Volker Schmidt)
* Avoid double fclose() of output file on certain I/O error paths.
(Solar Designer)
* Don't fail with internal error 1007 when fed a long stream (> 48MB)
of byte 251. Also print useful message suggesting that 1007s may be
caused by bad memory.
(noticed by Juan Pedro Vallejo, fixed by me)
* Fix uninitialised variable silly bug in demo prog dlltest.c.
(Jorj Bauer)
* Remove 512-MB limitation on recovered file size for bzip2recover
on selected platforms which support 64-bit ints. At the moment
all GCC supported platforms, and Win32.
(me, Alson van der Meulen)
* Hard-code header byte values, to give correct operation on platforms
using EBCDIC as their native character set (IBM's OS/390).
(Leland Lucius)
* Copy file access times correctly.
(Marty Leisner)
* Add distclean and check targets to Makefile.
(Michael Carmack)
* Parameterise use of ar and ranlib in Makefile. Also add $(LDFLAGS).
(Rich Ireland, Bo Thorsen)
* Pass -p (create parent dirs as needed) to mkdir during make install.
(Jeremy Fusco)
* Dereference symlinks when copying file permissions in -f mode.
(Volker Schmidt)
* Majorly simplify implementation of uInt64_qrm10.
(Bo Lindbergh)
* Check the input file still exists before deleting the output one,
when aborting in cleanUpAndFail().
(Joerg Prante, Robert Linden, Matthias Krings)
Also a bunch of patches courtesy of Philippe Troin, the Debian maintainer
of bzip2:
* Wrapper scripts (with manpages): bzdiff, bzgrep, bzmore.
* Spelling changes and minor enhancements in bzip2.1.
* Avoid race condition between creating the output file and setting its
interim permissions safely, by using fopen_output_safely().
No changes to bzip2recover since there is no issue with file
permissions there.
* do not print senseless report with -v when compressing an empty
file.
* bzcat -f works on non-bzip2 files.
* do not try to escape shell meta-characters on unix (the shell takes
care of these).
* added --fast and --best aliases for -1 -9 for gzip compatibility.
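The EBCDIC item in the 1.0.2 list above (hard-coded header byte values)
can be illustrated with a small header check. A bzip2 stream starts
with the ASCII bytes "BZh" followed by a digit '1'..'9'; comparing
against numeric values instead of character literals keeps the check
correct when the compiler's native character set is not ASCII. The
function name looks_like_bzip2_header is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if buf starts with a bzip2 stream header, 0 otherwise.
   The bytes are compared as numeric (ASCII) values rather than as
   'B','Z','h' literals, so the result does not depend on the
   execution character set. */
static int looks_like_bzip2_header ( const unsigned char* buf, size_t len )
{
   if (len < 4) return 0;
   if (buf[0] != 0x42) return 0;                   /* ASCII 'B' */
   if (buf[1] != 0x5a) return 0;                   /* ASCII 'Z' */
   if (buf[2] != 0x68) return 0;                   /* ASCII 'h' */
   if (buf[3] < 0x31 || buf[3] > 0x39) return 0;   /* ASCII '1'..'9' */
   return 1;
}
```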
1.0.3 (15 Feb 05)
~~~~~~~~~~~~~~~~~
Fixes some minor bugs since the last version, 1.0.2.
* Further robustification against corrupted compressed data.
There are currently no known bitstreams which can cause the
decompressor to crash, loop or access memory which does not
belong to it. If you are using bzip2 or the library to
decompress bitstreams from untrusted sources, an upgrade
to 1.0.3 is recommended. This fixes CAN-2005-1260.
* The documentation has been converted to XML, from which html
and pdf can be derived.
* Various minor bugs in the documentation have been fixed.
* Fixes for various compilation warnings with newer versions of
gcc, and on 64-bit platforms.
* The BZ_NO_STDIO cpp symbol was not properly observed in 1.0.2.
This has been fixed.
1.0.4 (20 Dec 06)
~~~~~~~~~~~~~~~~~
Fixes some minor bugs since the last version, 1.0.3.
* Fix file permissions race problem (CAN-2005-0953).
* Avoid possible segfault in BZ2_bzclose. From Coverity's NetBSD
scan.
* 'const'/prototype cleanups in the C code.
* Change default install location to /usr/local, and handle multiple
'make install's without error.
* Sanitise file names more carefully in bzgrep. Fixes CAN-2005-0758
to the extent that applies to bzgrep.
* Use 'mktemp' rather than 'tempfile' in bzdiff.
* Tighten up a couple of assertions in blocksort.c following automated
analysis.
* Fix minor doc/comment bugs.
1.0.5 (10 Dec 07)
~~~~~~~~~~~~~~~~~
Security fix only. Fixes CERT-FI 20469 as it applies to bzip2.
1.0.6 (6 Sept 10)
~~~~~~~~~~~~~~~~~
* Security fix for CVE-2010-0405. This was reported by Mikolaj
Izdebski.
* Make the documentation build on Ubuntu 10.04
================================================
FILE: ext/bzip2/LICENSE
================================================
--------------------------------------------------------------------------
This program, "bzip2", the associated library "libbzip2", and all
documentation, are copyright (C) 1996-2010 Julian R Seward. All
rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
3. Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
4. The name of the author may not be used to endorse or promote
products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Julian Seward, jseward@bzip.org
bzip2/libbzip2 version 1.0.6 of 6 September 2010
--------------------------------------------------------------------------
================================================
FILE: ext/bzip2/blocksort.c
================================================
/*-------------------------------------------------------------*/
/*--- Block sorting machinery ---*/
/*--- blocksort.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#include "bzlib_private.h"
/*---------------------------------------------*/
/*--- Fallback O(N log(N)^2) sorting ---*/
/*--- algorithm, for repetitive blocks ---*/
/*---------------------------------------------*/
/*---------------------------------------------*/
static
__inline__
void fallbackSimpleSort ( UInt32* fmap,
UInt32* eclass,
Int32 lo,
Int32 hi )
{
Int32 i, j, tmp;
UInt32 ec_tmp;
if (lo == hi) return;
if (hi - lo > 3) {
for ( i = hi-4; i >= lo; i-- ) {
tmp = fmap[i];
ec_tmp = eclass[tmp];
for ( j = i+4; j <= hi && ec_tmp > eclass[fmap[j]]; j += 4 )
fmap[j-4] = fmap[j];
fmap[j-4] = tmp;
}
}
for ( i = hi-1; i >= lo; i-- ) {
tmp = fmap[i];
ec_tmp = eclass[tmp];
for ( j = i+1; j <= hi && ec_tmp > eclass[fmap[j]]; j++ )
fmap[j-1] = fmap[j];
fmap[j-1] = tmp;
}
}
/*---------------------------------------------*/
#define fswap(zz1, zz2) \
{ Int32 zztmp = zz1; zz1 = zz2; zz2 = zztmp; }
#define fvswap(zzp1, zzp2, zzn) \
{ \
Int32 yyp1 = (zzp1); \
Int32 yyp2 = (zzp2); \
Int32 yyn = (zzn); \
while (yyn > 0) { \
fswap(fmap[yyp1], fmap[yyp2]); \
yyp1++; yyp2++; yyn--; \
} \
}
#define fmin(a,b) (((a) < (b)) ? (a) : (b))
#define fpush(lz,hz) { stackLo[sp] = lz; \
stackHi[sp] = hz; \
sp++; }
#define fpop(lz,hz) { sp--; \
lz = stackLo[sp]; \
hz = stackHi[sp]; }
#define FALLBACK_QSORT_SMALL_THRESH 10
#define FALLBACK_QSORT_STACK_SIZE 100
static
void fallbackQSort3 ( UInt32* fmap,
UInt32* eclass,
Int32 loSt,
Int32 hiSt )
{
Int32 unLo, unHi, ltLo, gtHi, n, m;
Int32 sp, lo, hi;
UInt32 med, r, r3;
Int32 stackLo[FALLBACK_QSORT_STACK_SIZE];
Int32 stackHi[FALLBACK_QSORT_STACK_SIZE];
r = 0;
sp = 0;
fpush ( loSt, hiSt );
while (sp > 0) {
AssertH ( sp < FALLBACK_QSORT_STACK_SIZE - 1, 1004 );
fpop ( lo, hi );
if (hi - lo < FALLBACK_QSORT_SMALL_THRESH) {
fallbackSimpleSort ( fmap, eclass, lo, hi );
continue;
}
/* Random partitioning. Median of 3 sometimes fails to
avoid bad cases. Median of 9 seems to help but
looks rather expensive. This too seems to work but
is cheaper. Guidance for the magic constants
7621 and 32768 is taken from Sedgewick's algorithms
book, chapter 35.
*/
r = ((r * 7621) + 1) % 32768;
r3 = r % 3;
if (r3 == 0) med = eclass[fmap[lo]]; else
if (r3 == 1) med = eclass[fmap[(lo+hi)>>1]]; else
med = eclass[fmap[hi]];
unLo = ltLo = lo;
unHi = gtHi = hi;
while (1) {
while (1) {
if (unLo > unHi) break;
n = (Int32)eclass[fmap[unLo]] - (Int32)med;
if (n == 0) {
fswap(fmap[unLo], fmap[ltLo]);
ltLo++; unLo++;
continue;
};
if (n > 0) break;
unLo++;
}
while (1) {
if (unLo > unHi) break;
n = (Int32)eclass[fmap[unHi]] - (Int32)med;
if (n == 0) {
fswap(fmap[unHi], fmap[gtHi]);
gtHi--; unHi--;
continue;
};
if (n < 0) break;
unHi--;
}
if (unLo > unHi) break;
fswap(fmap[unLo], fmap[unHi]); unLo++; unHi--;
}
AssertD ( unHi == unLo-1, "fallbackQSort3(2)" );
if (gtHi < ltLo) continue;
n = fmin(ltLo-lo, unLo-ltLo); fvswap(lo, unLo-n, n);
m = fmin(hi-gtHi, gtHi-unHi); fvswap(unLo, hi-m+1, m);
n = lo + unLo - ltLo - 1;
m = hi - (gtHi - unHi) + 1;
if (n - lo > hi - m) {
fpush ( lo, n );
fpush ( m, hi );
} else {
fpush ( m, hi );
fpush ( lo, n );
}
}
}
#undef fmin
#undef fpush
#undef fpop
#undef fswap
#undef fvswap
#undef FALLBACK_QSORT_SMALL_THRESH
#undef FALLBACK_QSORT_STACK_SIZE
/*---------------------------------------------*/
/* Pre:
nblock > 0
eclass exists for [0 .. nblock-1]
((UChar*)eclass) [0 .. nblock-1] holds block
ptr exists for [0 .. nblock-1]
Post:
((UChar*)eclass) [0 .. nblock-1] holds block
All other areas of eclass destroyed
fmap [0 .. nblock-1] holds sorted order
bhtab [ 0 .. 2+(nblock/32) ] destroyed
*/
#define SET_BH(zz) bhtab[(zz) >> 5] |= (1 << ((zz) & 31))
#define CLEAR_BH(zz) bhtab[(zz) >> 5] &= ~(1 << ((zz) & 31))
#define ISSET_BH(zz) (bhtab[(zz) >> 5] & (1 << ((zz) & 31)))
#define WORD_BH(zz) bhtab[(zz) >> 5]
#define UNALIGNED_BH(zz) ((zz) & 0x01f)
static
void fallbackSort ( UInt32* fmap,
UInt32* eclass,
UInt32* bhtab,
Int32 nblock,
Int32 verb )
{
Int32 ftab[257];
Int32 ftabCopy[256];
Int32 H, i, j, k, l, r, cc, cc1;
Int32 nNotDone;
Int32 nBhtab;
UChar* eclass8 = (UChar*)eclass;
/*--
Initial 1-char radix sort to generate
initial fmap and initial BH bits.
--*/
if (verb >= 4)
VPrintf0 ( " bucket sorting ...\n" );
for (i = 0; i < 257; i++) ftab[i] = 0;
for (i = 0; i < nblock; i++) ftab[eclass8[i]]++;
for (i = 0; i < 256; i++) ftabCopy[i] = ftab[i];
for (i = 1; i < 257; i++) ftab[i] += ftab[i-1];
for (i = 0; i < nblock; i++) {
j = eclass8[i];
k = ftab[j] - 1;
ftab[j] = k;
fmap[k] = i;
}
nBhtab = 2 + (nblock / 32);
for (i = 0; i < nBhtab; i++) bhtab[i] = 0;
for (i = 0; i < 256; i++) SET_BH(ftab[i]);
/*--
Inductively refine the buckets. Kind-of an
"exponential radix sort" (!), inspired by the
Manber-Myers suffix array construction algorithm.
--*/
/*-- set sentinel bits for block-end detection --*/
for (i = 0; i < 32; i++) {
SET_BH(nblock + 2*i);
CLEAR_BH(nblock + 2*i + 1);
}
/*-- the log(N) loop --*/
H = 1;
while (1) {
if (verb >= 4)
VPrintf1 ( " depth %6d has ", H );
j = 0;
for (i = 0; i < nblock; i++) {
if (ISSET_BH(i)) j = i;
k = fmap[i] - H; if (k < 0) k += nblock;
eclass[k] = j;
}
nNotDone = 0;
r = -1;
while (1) {
/*-- find the next non-singleton bucket --*/
k = r + 1;
while (ISSET_BH(k) && UNALIGNED_BH(k)) k++;
if (ISSET_BH(k)) {
while (WORD_BH(k) == 0xffffffff) k += 32;
while (ISSET_BH(k)) k++;
}
l = k - 1;
if (l >= nblock) break;
while (!ISSET_BH(k) && UNALIGNED_BH(k)) k++;
if (!ISSET_BH(k)) {
while (WORD_BH(k) == 0x00000000) k += 32;
while (!ISSET_BH(k)) k++;
}
r = k - 1;
if (r >= nblock) break;
/*-- now [l, r] bracket current bucket --*/
if (r > l) {
nNotDone += (r - l + 1);
fallbackQSort3 ( fmap, eclass, l, r );
/*-- scan bucket and generate header bits --*/
cc = -1;
for (i = l; i <= r; i++) {
cc1 = eclass[fmap[i]];
if (cc != cc1) { SET_BH(i); cc = cc1; };
}
}
}
if (verb >= 4)
VPrintf1 ( "%6d unresolved strings\n", nNotDone );
H *= 2;
if (H > nblock || nNotDone == 0) break;
}
/*--
Reconstruct the original block in
eclass8 [0 .. nblock-1], since the
previous phase destroyed it.
--*/
if (verb >= 4)
VPrintf0 ( " reconstructing block ...\n" );
j = 0;
for (i = 0; i < nblock; i++) {
while (ftabCopy[j] == 0) j++;
ftabCopy[j]--;
eclass8[fmap[i]] = (UChar)j;
}
AssertH ( j < 256, 1005 );
}
#undef SET_BH
#undef CLEAR_BH
#undef ISSET_BH
#undef WORD_BH
#undef UNALIGNED_BH
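/*--
   Illustration only, not part of bzip2: a minimal sketch of the
   bucket-header bitmap idea used by fallbackSort above. A set bit
   marks the first position of a bucket, so a bucket extends up to
   (but not including) the next set bit. The helper names bh_set,
   bh_clear, bh_isset and bh_bucket_end are hypothetical, and the
   linear scan here does not reproduce the word-at-a-time skipping
   that the real loop performs with WORD_BH.
--*/

```c
#include <stdint.h>

/* bits[] plays the role of bhtab: one bit per block position. */
static void bh_set(uint32_t *bits, int32_t i)
{
    bits[i >> 5] |= ((uint32_t)1 << (i & 31));
}

static void bh_clear(uint32_t *bits, int32_t i)
{
    bits[i >> 5] &= ~((uint32_t)1 << (i & 31));
}

static int bh_isset(const uint32_t *bits, int32_t i)
{
    return (int)((bits[i >> 5] >> (i & 31)) & 1u);
}

/* Given a bucket whose header bit is at 'start', return the last
   position (inclusive) of that bucket. 'limit' is an exclusive
   upper bound on positions that may be inspected. */
static int32_t bh_bucket_end(const uint32_t *bits, int32_t start, int32_t limit)
{
    int32_t k = start + 1;
    while (k < limit && !bh_isset(bits, k)) k++;
    return k - 1;
}
```

/*--
   With header bits at positions 0, 3 and 40, the buckets are
   [0,2], [3,39] and [40,...]. Clearing bit 3 merges the first two.
--*/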
/*---------------------------------------------*/
/*--- The main, O(N^2 log(N)) sorting ---*/
/*--- algorithm. Faster for "normal" ---*/
/*--- non-repetitive blocks. ---*/
/*---------------------------------------------*/
/*---------------------------------------------*/
static
__inline__
Bool mainGtU ( UInt32 i1,
UInt32 i2,
UChar* block,
UInt16* quadrant,
UInt32 nblock,
Int32* budget )
{
Int32 k;
UChar c1, c2;
UInt16 s1, s2;
AssertD ( i1 != i2, "mainGtU" );
/* 1 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 2 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 3 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 4 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 5 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 6 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 7 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 8 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 9 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 10 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 11 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
/* 12 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
i1++; i2++;
k = nblock + 8;
do {
/* 1 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 2 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 3 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 4 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 5 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 6 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 7 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
/* 8 */
c1 = block[i1]; c2 = block[i2];
if (c1 != c2) return (c1 > c2);
s1 = quadrant[i1]; s2 = quadrant[i2];
if (s1 != s2) return (s1 > s2);
i1++; i2++;
if (i1 >= nblock) i1 -= nblock;
if (i2 >= nblock) i2 -= nblock;
k -= 8;
(*budget)--;
}
while (k >= 0);
return False;
}
/*---------------------------------------------*/
/*--
Knuth's increments seem to work better
than Incerpi-Sedgewick here. Possibly
because the number of elems to sort is
usually small, typically <= 20.
--*/
static
Int32 incs[14] = { 1, 4, 13, 40, 121, 364, 1093, 3280,
9841, 29524, 88573, 265720,
797161, 2391484 };
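/*--
   Illustration only, not part of bzip2: the table above follows
   Knuth's recurrence h(0) = 1, h(k+1) = 3*h(k) + 1. The sketch
   below regenerates it and mirrors how mainSimpleSort picks its
   starting increment. The names knuth_increments and
   first_increment_index are hypothetical.
--*/

```c
#include <stddef.h>

/* Write the first max_count terms of 1, 4, 13, 40, ... into out[]
   and return how many were written. Terms up to the 14th fit in a
   32-bit int. */
static size_t knuth_increments(int *out, size_t max_count)
{
    size_t n = 0;
    int h = 1;
    while (n < max_count) {
        out[n++] = h;
        h = 3 * h + 1;
    }
    return n;
}

/* Index of the largest increment strictly smaller than n, or -1 if
   there is none. Same result as "while (incs[hp] < bigN) hp++; hp--;"
   but with an explicit bound on the table length. */
static int first_increment_index(const int *incs, size_t count, int n)
{
    size_t hp = 0;
    while (hp < count && incs[hp] < n) hp++;
    return (int)hp - 1;
}
```

/*--
   For a range of 20 elements the first pass therefore uses h = 13,
   followed by h = 4 and h = 1.
--*/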
static
void mainSimpleSort ( UInt32* ptr,
UChar* block,
UInt16* quadrant,
Int32 nblock,
Int32 lo,
Int32 hi,
Int32 d,
Int32* budget )
{
Int32 i, j, h, bigN, hp;
UInt32 v;
bigN = hi - lo + 1;
if (bigN < 2) return;
hp = 0;
while (incs[hp] < bigN) hp++;
hp--;
for (; hp >= 0; hp--) {
h = incs[hp];
i = lo + h;
while (True) {
/*-- copy 1 --*/
if (i > hi) break;
v = ptr[i];
j = i;
while ( mainGtU (
ptr[j-h]+d, v+d, block, quadrant, nblock, budget
) ) {
ptr[j] = ptr[j-h];
j = j - h;
if (j <= (lo + h - 1)) break;
}
ptr[j] = v;
i++;
/*-- copy 2 --*/
if (i > hi) break;
v = ptr[i];
j = i;
while ( mainGtU (
ptr[j-h]+d, v+d, block, quadrant, nblock, budget
) ) {
ptr[j] = ptr[j-h];
j = j - h;
if (j <= (lo + h - 1)) break;
}
ptr[j] = v;
i++;
/*-- copy 3 --*/
if (i > hi) break;
v = ptr[i];
j = i;
while ( mainGtU (
ptr[j-h]+d, v+d, block, quadrant, nblock, budget
) ) {
ptr[j] = ptr[j-h];
j = j - h;
if (j <= (lo + h - 1)) break;
}
ptr[j] = v;
i++;
if (*budget < 0) return;
}
}
}
/*---------------------------------------------*/
/*--
The following is an implementation of
an elegant 3-way quicksort for strings,
described in a paper "Fast Algorithms for
Sorting and Searching Strings", by Robert
Sedgewick and Jon L. Bentley.
--*/
#define mswap(zz1, zz2) \
{ Int32 zztmp = zz1; zz1 = zz2; zz2 = zztmp; }
#define mvswap(zzp1, zzp2, zzn) \
{ \
Int32 yyp1 = (zzp1); \
Int32 yyp2 = (zzp2); \
Int32 yyn = (zzn); \
while (yyn > 0) { \
mswap(ptr[yyp1], ptr[yyp2]); \
yyp1++; yyp2++; yyn--; \
} \
}
static
__inline__
UChar mmed3 ( UChar a, UChar b, UChar c )
{
UChar t;
if (a > b) { t = a; a = b; b = t; };
if (b > c) {
b = c;
if (a > b) b = a;
}
return b;
}
#define mmin(a,b) (((a) < (b)) ? (a) : (b))
#define mpush(lz,hz,dz) { stackLo[sp] = lz; \
stackHi[sp] = hz; \
stackD [sp] = dz; \
sp++; }
#define mpop(lz,hz,dz) { sp--; \
lz = stackLo[sp]; \
hz = stackHi[sp]; \
dz = stackD [sp]; }
#define mnextsize(az) (nextHi[az]-nextLo[az])
#define mnextswap(az,bz) \
{ Int32 tz; \
tz = nextLo[az]; nextLo[az] = nextLo[bz]; nextLo[bz] = tz; \
tz = nextHi[az]; nextHi[az] = nextHi[bz]; nextHi[bz] = tz; \
tz = nextD [az]; nextD [az] = nextD [bz]; nextD [bz] = tz; }
#define MAIN_QSORT_SMALL_THRESH 20
#define MAIN_QSORT_DEPTH_THRESH (BZ_N_RADIX + BZ_N_QSORT)
#define MAIN_QSORT_STACK_SIZE 100
static
void mainQSort3 ( UInt32* ptr,
UChar* block,
UInt16* quadrant,
Int32 nblock,
Int32 loSt,
Int32 hiSt,
Int32 dSt,
Int32* budget )
{
Int32 unLo, unHi, ltLo, gtHi, n, m, med;
Int32 sp, lo, hi, d;
Int32 stackLo[MAIN_QSORT_STACK_SIZE];
Int32 stackHi[MAIN_QSORT_STACK_SIZE];
Int32 stackD [MAIN_QSORT_STACK_SIZE];
Int32 nextLo[3];
Int32 nextHi[3];
Int32 nextD [3];
sp = 0;
mpush ( loSt, hiSt, dSt );
while (sp > 0) {
AssertH ( sp < MAIN_QSORT_STACK_SIZE - 2, 1001 );
mpop ( lo, hi, d );
if (hi - lo < MAIN_QSORT_SMALL_THRESH ||
d > MAIN_QSORT_DEPTH_THRESH) {
mainSimpleSort ( ptr, block, quadrant, nblock, lo, hi, d, budget );
if (*budget < 0) return;
continue;
}
med = (Int32)
mmed3 ( block[ptr[ lo ]+d],
block[ptr[ hi ]+d],
block[ptr[ (lo+hi)>>1 ]+d] );
unLo = ltLo = lo;
unHi = gtHi = hi;
while (True) {
while (True) {
if (unLo > unHi) break;
n = ((Int32)block[ptr[unLo]+d]) - med;
if (n == 0) {
mswap(ptr[unLo], ptr[ltLo]);
ltLo++; unLo++; continue;
};
if (n > 0) break;
unLo++;
}
while (True) {
if (unLo > unHi) break;
n = ((Int32)block[ptr[unHi]+d]) - med;
if (n == 0) {
mswap(ptr[unHi], ptr[gtHi]);
gtHi--; unHi--; continue;
};
if (n < 0) break;
unHi--;
}
if (unLo > unHi) break;
mswap(ptr[unLo], ptr[unHi]); unLo++; unHi--;
}
AssertD ( unHi == unLo-1, "mainQSort3(2)" );
if (gtHi < ltLo) {
mpush(lo, hi, d+1 );
continue;
}
n = mmin(ltLo-lo, unLo-ltLo); mvswap(lo, unLo-n, n);
m = mmin(hi-gtHi, gtHi-unHi); mvswap(unLo, hi-m+1, m);
n = lo + unLo - ltLo - 1;
m = hi - (gtHi - unHi) + 1;
nextLo[0] = lo; nextHi[0] = n; nextD[0] = d;
nextLo[1] = m; nextHi[1] = hi; nextD[1] = d;
nextLo[2] = n+1; nextHi[2] = m-1; nextD[2] = d+1;
if (mnextsize(0) < mnextsize(1)) mnextswap(0,1);
if (mnextsize(1) < mnextsize(2)) mnextswap(1,2);
if (mnextsize(0) < mnextsize(1)) mnextswap(0,1);
AssertD (mnextsize(0) >= mnextsize(1), "mainQSort3(8)" );
AssertD (mnextsize(1) >= mnextsize(2), "mainQSort3(9)" );
mpush (nextLo[0], nextHi[0], nextD[0]);
mpush (nextLo[1], nextHi[1], nextD[1]);
mpush (nextLo[2], nextHi[2], nextD[2]);
}
}
#undef mswap
#undef mvswap
#undef mpush
#undef mpop
#undef mmin
#undef mnextsize
#undef mnextswap
#undef MAIN_QSORT_SMALL_THRESH
#undef MAIN_QSORT_DEPTH_THRESH
#undef MAIN_QSORT_STACK_SIZE
/*---------------------------------------------*/
/* Pre:
nblock > N_OVERSHOOT
block32 exists for [0 .. nblock-1 +N_OVERSHOOT]
((UChar*)block32) [0 .. nblock-1] holds block
ptr exists for [0 .. nblock-1]
Post:
((UChar*)block32) [0 .. nblock-1] holds block
All other areas of block32 destroyed
ftab [0 .. 65536 ] destroyed
ptr [0 .. nblock-1] holds sorted order
if (*budget < 0), sorting was abandoned
*/
#define BIGFREQ(b) (ftab[((b)+1) << 8] - ftab[(b) << 8])
#define SETMASK (1 << 21)
#define CLEARMASK (~(SETMASK))
static
void mainSort ( UInt32* ptr,
UChar* block,
UInt16* quadrant,
UInt32* ftab,
Int32 nblock,
Int32 verb,
Int32* budget )
{
Int32 i, j, k, ss, sb;
Int32 runningOrder[256];
Bool bigDone[256];
Int32 copyStart[256];
Int32 copyEnd [256];
UChar c1;
Int32 numQSorted;
UInt16 s;
if (verb >= 4) VPrintf0 ( " main sort initialise ...\n" );
/*-- set up the 2-byte frequency table --*/
for (i = 65536; i >= 0; i--) ftab[i] = 0;
j = block[0] << 8;
i = nblock-1;
for (; i >= 3; i -= 4) {
quadrant[i] = 0;
j = (j >> 8) | ( ((UInt16)block[i]) << 8);
ftab[j]++;
quadrant[i-1] = 0;
j = (j >> 8) | ( ((UInt16)block[i-1]) << 8);
ftab[j]++;
quadrant[i-2] = 0;
j = (j >> 8) | ( ((UInt16)block[i-2]) << 8);
ftab[j]++;
quadrant[i-3] = 0;
j = (j >> 8) | ( ((UInt16)block[i-3]) << 8);
ftab[j]++;
}
for (; i >= 0; i--) {
quadrant[i] = 0;
j = (j >> 8) | ( ((UInt16)block[i]) << 8);
ftab[j]++;
}
/*-- (emphasises close relationship of block & quadrant) --*/
for (i = 0; i < BZ_N_OVERSHOOT; i++) {
block [nblock+i] = block[i];
quadrant[nblock+i] = 0;
}
if (verb >= 4) VPrintf0 ( " bucket sorting ...\n" );
/*-- Complete the initial radix sort --*/
for (i = 1; i <= 65536; i++) ftab[i] += ftab[i-1];
s = block[0] << 8;
i = nblock-1;
for (; i >= 3; i -= 4) {
s = (s >> 8) | (block[i] << 8);
j = ftab[s] -1;
ftab[s] = j;
ptr[j] = i;
s = (s >> 8) | (block[i-1] << 8);
j = ftab[s] -1;
ftab[s] = j;
ptr[j] = i-1;
s = (s >> 8) | (block[i-2] << 8);
j = ftab[s] -1;
ftab[s] = j;
ptr[j] = i-2;
s = (s >> 8) | (block[i-3] << 8);
j = ftab[s] -1;
ftab[s] = j;
ptr[j] = i-3;
}
for (; i >= 0; i--) {
s = (s >> 8) | (block[i] << 8);
j = ftab[s] -1;
ftab[s] = j;
ptr[j] = i;
}
/*--
Now ftab contains the first loc of every small bucket.
Calculate the running order, from smallest to largest
big bucket.
--*/
for (i = 0; i <= 255; i++) {
bigDone [i] = False;
runningOrder[i] = i;
}
{
Int32 vv;
Int32 h = 1;
do h = 3 * h + 1; while (h <= 256);
do {
h = h / 3;
for (i = h; i <= 255; i++) {
vv = runningOrder[i];
j = i;
while ( BIGFREQ(runningOrder[j-h]) > BIGFREQ(vv) ) {
runningOrder[j] = runningOrder[j-h];
j = j - h;
if (j <= (h - 1)) goto zero;
}
zero:
runningOrder[j] = vv;
}
} while (h != 1);
}
/*--
The main sorting loop.
--*/
numQSorted = 0;
for (i = 0; i <= 255; i++) {
/*--
Process big buckets, starting with the least full.
Basically this is a 3-step process in which we call
mainQSort3 to sort the small buckets [ss, j], but
also make a big effort to avoid the calls if we can.
--*/
ss = runningOrder[i];
/*--
Step 1:
Complete the big bucket [ss] by quicksorting
any unsorted small buckets [ss, j], for j != ss.
Hopefully previous pointer-scanning phases have already
completed many of the small buckets [ss, j], so
we don't have to sort them at all.
--*/
for (j = 0; j <= 255; j++) {
if (j != ss) {
sb = (ss << 8) + j;
if ( ! (ftab[sb] & SETMASK) ) {
Int32 lo = ftab[sb] & CLEARMASK;
Int32 hi = (ftab[sb+1] & CLEARMASK) - 1;
if (hi > lo) {
if (verb >= 4)
VPrintf4 ( " qsort [0x%x, 0x%x] "
"done %d this %d\n",
ss, j, numQSorted, hi - lo + 1 );
mainQSort3 (
ptr, block, quadrant, nblock,
lo, hi, BZ_N_RADIX, budget
);
numQSorted += (hi - lo + 1);
if (*budget < 0) return;
}
}
ftab[sb] |= SETMASK;
}
}
AssertH ( !bigDone[ss], 1006 );
/*--
Step 2:
Now scan this big bucket [ss] so as to synthesise the
sorted order for small buckets [t, ss] for all t,
including, magically, the bucket [ss,ss] too.
This will avoid doing Real Work in subsequent Step 1's.
--*/
{
for (j = 0; j <= 255; j++) {
copyStart[j] = ftab[(j << 8) + ss] & CLEARMASK;
copyEnd [j] = (ftab[(j << 8) + ss + 1] & CLEARMASK) - 1;
}
for (j = ftab[ss << 8] & CLEARMASK; j < copyStart[ss]; j++) {
k = ptr[j]-1; if (k < 0) k += nblock;
c1 = block[k];
if (!bigDone[c1])
ptr[ copyStart[c1]++ ] = k;
}
for (j = (ftab[(ss+1) << 8] & CLEARMASK) - 1; j > copyEnd[ss]; j--) {
k = ptr[j]-1; if (k < 0) k += nblock;
c1 = block[k];
if (!bigDone[c1])
ptr[ copyEnd[c1]-- ] = k;
}
}
AssertH ( (copyStart[ss]-1 == copyEnd[ss])
||
/* Extremely rare case missing in bzip2-1.0.0 and 1.0.1.
Necessity for this case is demonstrated by compressing
a sequence of approximately 48.5 million copies of character
251; 1.0.0/1.0.1 will then die here. */
(copyStart[ss] == 0 && copyEnd[ss] == nblock-1),
1007 );
for (j = 0; j <= 255; j++) ftab[(j << 8) + ss] |= SETMASK;
/*--
Step 3:
The [ss] big bucket is now done. Record this fact,
and update the quadrant descriptors. Remember to
update quadrants in the overshoot area too, if
necessary. The "if (i < 255)" test merely skips
this updating for the last bucket processed, since
updating for the last bucket is pointless.
The quadrant array provides a way to incrementally
cache sort orderings, as they appear, so as to
make subsequent comparisons in mainGtU() complete
faster. For repetitive blocks this makes a big
difference (but not big enough to be able to avoid
the fallback sorting mechanism, exponential radix sort).
The precise meaning is: at all times:
for 0 <= i < nblock and 0 <= j <= nblock
if block[i] != block[j],
then the relative values of quadrant[i] and
quadrant[j] are meaningless.
else {
if quadrant[i] < quadrant[j]
then the string starting at i lexicographically
precedes the string starting at j
else if quadrant[i] > quadrant[j]
then the string starting at j lexicographically
precedes the string starting at i
else
the relative ordering of the strings starting
at i and j has not yet been determined.
}
--*/
bigDone[ss] = True;
if (i < 255) {
Int32 bbStart = ftab[ss << 8] & CLEARMASK;
Int32 bbSize = (ftab[(ss+1) << 8] & CLEARMASK) - bbStart;
Int32 shifts = 0;
while ((bbSize >> shifts) > 65534) shifts++;
for (j = bbSize-1; j >= 0; j--) {
Int32 a2update = ptr[bbStart + j];
UInt16 qVal = (UInt16)(j >> shifts);
quadrant[a2update] = qVal;
if (a2update < BZ_N_OVERSHOOT)
quadrant[a2update + nblock] = qVal;
}
AssertH ( ((bbSize-1) >> shifts) <= 65535, 1002 );
}
}
if (verb >= 4)
VPrintf3 ( " %d pointers, %d sorted, %d scanned\n",
nblock, numQSorted, nblock - numQSorted );
}
#undef BIGFREQ
#undef SETMASK
#undef CLEARMASK
/*---------------------------------------------*/
/* Pre:
nblock > 0
arr2 exists for [0 .. nblock-1 +N_OVERSHOOT]
((UChar*)arr2) [0 .. nblock-1] holds block
arr1 exists for [0 .. nblock-1]
Post:
((UChar*)arr2) [0 .. nblock-1] holds block
All other areas of block destroyed
ftab [ 0 .. 65536 ] destroyed
arr1 [0 .. nblock-1] holds sorted order
*/
void BZ2_blockSort ( EState* s )
{
UInt32* ptr = s->ptr;
UChar* block = s->block;
UInt32* ftab = s->ftab;
Int32 nblock = s->nblock;
Int32 verb = s->verbosity;
Int32 wfact = s->workFactor;
UInt16* quadrant;
Int32 budget;
Int32 budgetInit;
Int32 i;
if (nblock < 10000) {
fallbackSort ( s->arr1, s->arr2, ftab, nblock, verb );
} else {
/* Calculate the location for quadrant, remembering to get
the alignment right. Assumes that &(block[0]) is at least
2-byte aligned -- this should be ok since block is really
the first section of arr2.
*/
i = nblock+BZ_N_OVERSHOOT;
if (i & 1) i++;
quadrant = (UInt16*)(&(block[i]));
/* (wfact-1) / 3 puts the default-factor-30
transition point at very roughly the same place as
with v0.1 and v0.9.0.
Not that it particularly matters any more, since the
resulting compressed stream is now the same regardless
of whether or not we use the main sort or fallback sort.
*/
if (wfact < 1 ) wfact = 1;
if (wfact > 100) wfact = 100;
budgetInit = nblock * ((wfact-1) / 3);
budget = budgetInit;
mainSort ( ptr, block, quadrant, ftab, nblock, verb, &budget );
if (verb >= 3)
VPrintf3 ( " %d work, %d block, ratio %5.2f\n",
budgetInit - budget,
nblock,
(float)(budgetInit - budget) /
(float)(nblock==0 ? 1 : nblock) );
if (budget < 0) {
if (verb >= 2)
VPrintf0 ( " too repetitive; using fallback"
" sorting algorithm\n" );
fallbackSort ( s->arr1, s->arr2, ftab, nblock, verb );
}
}
s->origPtr = -1;
for (i = 0; i < s->nblock; i++)
if (ptr[i] == 0)
{ s->origPtr = i; break; };
AssertH( s->origPtr != -1, 1003 );
}
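/*--
   Illustration only, not part of bzip2: a deliberately naive sketch
   of what the sort delivers. It orders the cyclic rotations of a
   block with qsort and reports the position of rotation 0, which is
   the role s->origPtr plays above. It assumes the block is compared
   as cyclic rotations, as the wrap-around in mainGtU suggests. The
   names g_block, g_nblock, cmp_rotations and naive_block_sort are
   hypothetical, and the cost is far worse than the real algorithms.
--*/

```c
#include <stdlib.h>

/* State shared with the qsort comparator (qsort has no context argument). */
static const unsigned char *g_block;
static int g_nblock;

/* Compare the rotations starting at *pa and *pb byte by byte. */
static int cmp_rotations(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    int k;
    for (k = 0; k < g_nblock; k++) {
        unsigned char ca = g_block[(a + k) % g_nblock];
        unsigned char cb = g_block[(b + k) % g_nblock];
        if (ca != cb) return ca < cb ? -1 : 1;
    }
    return 0;
}

/* Fill ptr[0 .. n-1] with rotation start indices in sorted order and
   return the index i for which ptr[i] == 0, or -1 if n <= 0. */
static int naive_block_sort(const unsigned char *block, int n, int *ptr)
{
    int i;
    for (i = 0; i < n; i++) ptr[i] = i;
    g_block = block;
    g_nblock = n;
    qsort(ptr, (size_t)n, sizeof ptr[0], cmp_rotations);
    for (i = 0; i < n; i++)
        if (ptr[i] == 0) return i;
    return -1;
}
```

/*--
   For the 6-byte block "banana" the sorted rotation order is
   5, 3, 1, 0, 4, 2, so the original string sits at position 3.
--*/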
/*-------------------------------------------------------------*/
/*--- end blocksort.c ---*/
/*-------------------------------------------------------------*/
================================================
FILE: ext/bzip2/bz_internal_error.c
================================================
/* Use when compiling with BZ_NO_STDIO */
#include <assert.h>
void bz_internal_error(int errcode)
{
assert(0);
}
================================================
FILE: ext/bzip2/bzip_all.c
================================================
#include "blocksort.c"
#include "bzlib.c"
#include "compress.c"
#include "crctable.c"
#include "decompress.c"
#include "huffman.c"
#include "randtable.c"
#include "bz_internal_error.c"
================================================
FILE: ext/bzip2/bzlib.c
================================================
/*-------------------------------------------------------------*/
/*--- Library top-level functions. ---*/
/*--- bzlib.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* CHANGES
0.9.0 -- original version.
0.9.0a/b -- no changes in this file.
0.9.0c -- made zero-length BZ_FLUSH work correctly in bzCompress().
fixed bzWrite/bzRead to ignore zero-length requests.
fixed bzread to correctly handle read requests after EOF.
wrong parameter order in call to bzDecompressInit in
bzBuffToBuffDecompress. Fixed.
*/
#include "bzlib_private.h"
/*---------------------------------------------------*/
/*--- Compression stuff ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
#ifndef BZ_NO_STDIO
void BZ2_bz__AssertH__fail ( int errcode )
{
fprintf(stderr,
"\n\nbzip2/libbzip2: internal error number %d.\n"
"This is a bug in bzip2/libbzip2, %s.\n"
"Please report it to me at: jseward@bzip.org. If this happened\n"
"when you were using some program which uses libbzip2 as a\n"
"component, you should also report this bug to the author(s)\n"
"of that program. Please make an effort to report this bug;\n"
"timely and accurate bug reports eventually lead to higher\n"
"quality software. Thanks. Julian Seward, 10 December 2007.\n\n",
errcode,
BZ2_bzlibVersion()
);
if (errcode == 1007) {
fprintf(stderr,
"\n*** A special note about internal error number 1007 ***\n"
"\n"
"Experience suggests that a common cause of internal error 1007\n"
"is unreliable memory or other hardware. The 1007 assertion\n"
"just happens to cross-check the results of huge numbers of\n"
"memory reads/writes, and so acts (unintendedly) as a stress\n"
"test of your memory system.\n"
"\n"
"I suggest the following: try compressing the file again,\n"
"possibly monitoring progress in detail with the -vv flag.\n"
"\n"
"* If the error cannot be reproduced, and/or happens at different\n"
" points in compression, you may have a flaky memory system.\n"
" Try a memory-test program. I have used Memtest86\n"
" (www.memtest86.com). At the time of writing it is free (GPLd).\n"
" Memtest86 tests memory much more thoroughly than your BIOS's\n"
" power-on test, and may find failures that the BIOS doesn't.\n"
"\n"
"* If the error can be repeatably reproduced, this is a bug in\n"
" bzip2, and I would very much like to hear about it. Please\n"
" let me know, and, ideally, save a copy of the file causing the\n"
" problem -- without which I will be unable to investigate it.\n"
"\n"
);
}
exit(3);
}
#endif
/*---------------------------------------------------*/
static
int bz_config_ok ( void )
{
if (sizeof(int) != 4) return 0;
if (sizeof(short) != 2) return 0;
if (sizeof(char) != 1) return 0;
return 1;
}
/*---------------------------------------------------*/
static
void* default_bzalloc ( void* opaque, Int32 items, Int32 size )
{
void* v = malloc ( items * size );
return v;
}
static
void default_bzfree ( void* opaque, void* addr )
{
if (addr != NULL) free ( addr );
}
/*---------------------------------------------------*/
static
void prepare_new_block ( EState* s )
{
Int32 i;
s->nblock = 0;
s->numZ = 0;
s->state_out_pos = 0;
BZ_INITIALISE_CRC ( s->blockCRC );
for (i = 0; i < 256; i++) s->inUse[i] = False;
s->blockNo++;
}
/*---------------------------------------------------*/
static
void init_RL ( EState* s )
{
s->state_in_ch = 256;
s->state_in_len = 0;
}
static
Bool isempty_RL ( EState* s )
{
if (s->state_in_ch < 256 && s->state_in_len > 0)
return False; else
return True;
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzCompressInit)
( bz_stream* strm,
int blockSize100k,
int verbosity,
int workFactor )
{
Int32 n;
EState* s;
if (!bz_config_ok()) return BZ_CONFIG_ERROR;
if (strm == NULL ||
blockSize100k < 1 || blockSize100k > 9 ||
workFactor < 0 || workFactor > 250)
return BZ_PARAM_ERROR;
if (workFactor == 0) workFactor = 30;
if (strm->bzalloc == NULL) strm->bzalloc = default_bzalloc;
if (strm->bzfree == NULL) strm->bzfree = default_bzfree;
s = BZALLOC( sizeof(EState) );
if (s == NULL) return BZ_MEM_ERROR;
s->strm = strm;
s->arr1 = NULL;
s->arr2 = NULL;
s->ftab = NULL;
n = 100000 * blockSize100k;
s->arr1 = BZALLOC( n * sizeof(UInt32) );
s->arr2 = BZALLOC( (n+BZ_N_OVERSHOOT) * sizeof(UInt32) );
s->ftab = BZALLOC( 65537 * sizeof(UInt32) );
if (s->arr1 == NULL || s->arr2 == NULL || s->ftab == NULL) {
if (s->arr1 != NULL) BZFREE(s->arr1);
if (s->arr2 != NULL) BZFREE(s->arr2);
if (s->ftab != NULL) BZFREE(s->ftab);
if (s != NULL) BZFREE(s);
return BZ_MEM_ERROR;
}
s->blockNo = 0;
s->state = BZ_S_INPUT;
s->mode = BZ_M_RUNNING;
s->combinedCRC = 0;
s->blockSize100k = blockSize100k;
s->nblockMAX = 100000 * blockSize100k - 19;
s->verbosity = verbosity;
s->workFactor = workFactor;
s->block = (UChar*)s->arr2;
s->mtfv = (UInt16*)s->arr1;
s->zbits = NULL;
s->ptr = (UInt32*)s->arr1;
strm->state = s;
strm->total_in_lo32 = 0;
strm->total_in_hi32 = 0;
strm->total_out_lo32 = 0;
strm->total_out_hi32 = 0;
init_RL ( s );
prepare_new_block ( s );
return BZ_OK;
}
/*---------------------------------------------------*/
static
void add_pair_to_block ( EState* s )
{
Int32 i;
UChar ch = (UChar)(s->state_in_ch);
for (i = 0; i < s->state_in_len; i++) {
BZ_UPDATE_CRC( s->blockCRC, ch );
}
s->inUse[s->state_in_ch] = True;
switch (s->state_in_len) {
case 1:
s->block[s->nblock] = (UChar)ch; s->nblock++;
break;
case 2:
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
break;
case 3:
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
break;
default:
s->inUse[s->state_in_len-4] = True;
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = (UChar)ch; s->nblock++;
s->block[s->nblock] = ((UChar)(s->state_in_len-4));
s->nblock++;
break;
}
}
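/*--
   Illustration only, not part of bzip2: a standalone sketch of the
   run-length scheme that add_pair_to_block implements. Runs of 1 to
   3 equal bytes are copied as they are; a run of 4 to 255 becomes
   four copies of the byte followed by one count byte holding
   (run length - 4). The name rle1_encode is hypothetical, and the
   CRC and inUse bookkeeping of the real code is left out.
--*/

```c
#include <stddef.h>

/* Encode in[0 .. n-1] into out[] and return the number of bytes
   written. out must have room for at least n + n/4 bytes, since
   every run of exactly 4 grows to 5. */
static size_t rle1_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i = 0;
    size_t o = 0;
    while (i < n) {
        unsigned char ch = in[i];
        size_t run = 1;
        size_t lit, k;
        /* Measure the run, capped at 255 as in ADD_CHAR_TO_BLOCK. */
        while (i + run < n && in[i + run] == ch && run < 255) run++;
        lit = run < 4 ? run : 4;
        for (k = 0; k < lit; k++) out[o++] = ch;
        if (run >= 4) out[o++] = (unsigned char)(run - 4);
        i += run;
    }
    return o;
}
```

/*--
   Six 'a' bytes followed by 'b' encode as 'a' 'a' 'a' 'a' 2 'b'.
   A run of exactly four encodes as four copies plus a zero count.
--*/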
/*---------------------------------------------------*/
static
void flush_RL ( EState* s )
{
if (s->state_in_ch < 256) add_pair_to_block ( s );
init_RL ( s );
}
/*---------------------------------------------------*/
#define ADD_CHAR_TO_BLOCK(zs,zchh0) \
{ \
UInt32 zchh = (UInt32)(zchh0); \
/*-- fast track the common case --*/ \
if (zchh != zs->state_in_ch && \
zs->state_in_len == 1) { \
UChar ch = (UChar)(zs->state_in_ch); \
BZ_UPDATE_CRC( zs->blockCRC, ch ); \
zs->inUse[zs->state_in_ch] = True; \
zs->block[zs->nblock] = (UChar)ch; \
zs->nblock++; \
zs->state_in_ch = zchh; \
} \
else \
/*-- general, uncommon cases --*/ \
if (zchh != zs->state_in_ch || \
zs->state_in_len == 255) { \
if (zs->state_in_ch < 256) \
add_pair_to_block ( zs ); \
zs->state_in_ch = zchh; \
zs->state_in_len = 1; \
} else { \
zs->state_in_len++; \
} \
}
/*---------------------------------------------------*/
static
Bool copy_input_until_stop ( EState* s )
{
Bool progress_in = False;
if (s->mode == BZ_M_RUNNING) {
/*-- fast track the common case --*/
while (True) {
/*-- block full? --*/
if (s->nblock >= s->nblockMAX) break;
/*-- no input? --*/
if (s->strm->avail_in == 0) break;
progress_in = True;
ADD_CHAR_TO_BLOCK ( s, (UInt32)(*((UChar*)(s->strm->next_in))) );
s->strm->next_in++;
s->strm->avail_in--;
s->strm->total_in_lo32++;
if (s->strm->total_in_lo32 == 0) s->strm->total_in_hi32++;
}
} else {
/*-- general, uncommon case --*/
while (True) {
/*-- block full? --*/
if (s->nblock >= s->nblockMAX) break;
/*-- no input? --*/
if (s->strm->avail_in == 0) break;
/*-- flush/finish end? --*/
if (s->avail_in_expect == 0) break;
progress_in = True;
ADD_CHAR_TO_BLOCK ( s, (UInt32)(*((UChar*)(s->strm->next_in))) );
s->strm->next_in++;
s->strm->avail_in--;
s->strm->total_in_lo32++;
if (s->strm->total_in_lo32 == 0) s->strm->total_in_hi32++;
s->avail_in_expect--;
}
}
return progress_in;
}
/*---------------------------------------------------*/
static
Bool copy_output_until_stop ( EState* s )
{
Bool progress_out = False;
while (True) {
/*-- no output space? --*/
if (s->strm->avail_out == 0) break;
/*-- block done? --*/
if (s->state_out_pos >= s->numZ) break;
progress_out = True;
*(s->strm->next_out) = s->zbits[s->state_out_pos];
s->state_out_pos++;
s->strm->avail_out--;
s->strm->next_out++;
s->strm->total_out_lo32++;
if (s->strm->total_out_lo32 == 0) s->strm->total_out_hi32++;
}
return progress_out;
}
/*---------------------------------------------------*/
static
Bool handle_compress ( bz_stream* strm )
{
Bool progress_in = False;
Bool progress_out = False;
EState* s = strm->state;
while (True) {
if (s->state == BZ_S_OUTPUT) {
progress_out |= copy_output_until_stop ( s );
if (s->state_out_pos < s->numZ) break;
if (s->mode == BZ_M_FINISHING &&
s->avail_in_expect == 0 &&
isempty_RL(s)) break;
prepare_new_block ( s );
s->state = BZ_S_INPUT;
if (s->mode == BZ_M_FLUSHING &&
s->avail_in_expect == 0 &&
isempty_RL(s)) break;
}
if (s->state == BZ_S_INPUT) {
progress_in |= copy_input_until_stop ( s );
if (s->mode != BZ_M_RUNNING && s->avail_in_expect == 0) {
flush_RL ( s );
BZ2_compressBlock ( s, (Bool)(s->mode == BZ_M_FINISHING) );
s->state = BZ_S_OUTPUT;
}
else
if (s->nblock >= s->nblockMAX) {
BZ2_compressBlock ( s, False );
s->state = BZ_S_OUTPUT;
}
else
if (s->strm->avail_in == 0) {
break;
}
}
}
return progress_in || progress_out;
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzCompress) ( bz_stream *strm, int action )
{
Bool progress;
EState* s;
if (strm == NULL) return BZ_PARAM_ERROR;
s = strm->state;
if (s == NULL) return BZ_PARAM_ERROR;
if (s->strm != strm) return BZ_PARAM_ERROR;
preswitch:
switch (s->mode) {
case BZ_M_IDLE:
return BZ_SEQUENCE_ERROR;
case BZ_M_RUNNING:
if (action == BZ_RUN) {
progress = handle_compress ( strm );
return progress ? BZ_RUN_OK : BZ_PARAM_ERROR;
}
else
if (action == BZ_FLUSH) {
s->avail_in_expect = strm->avail_in;
s->mode = BZ_M_FLUSHING;
goto preswitch;
}
else
if (action == BZ_FINISH) {
s->avail_in_expect = strm->avail_in;
s->mode = BZ_M_FINISHING;
goto preswitch;
}
else
return BZ_PARAM_ERROR;
case BZ_M_FLUSHING:
if (action != BZ_FLUSH) return BZ_SEQUENCE_ERROR;
if (s->avail_in_expect != s->strm->avail_in)
return BZ_SEQUENCE_ERROR;
progress = handle_compress ( strm );
if (s->avail_in_expect > 0 || !isempty_RL(s) ||
s->state_out_pos < s->numZ) return BZ_FLUSH_OK;
s->mode = BZ_M_RUNNING;
return BZ_RUN_OK;
case BZ_M_FINISHING:
if (action != BZ_FINISH) return BZ_SEQUENCE_ERROR;
if (s->avail_in_expect != s->strm->avail_in)
return BZ_SEQUENCE_ERROR;
progress = handle_compress ( strm );
if (!progress) return BZ_SEQUENCE_ERROR;
if (s->avail_in_expect > 0 || !isempty_RL(s) ||
s->state_out_pos < s->numZ) return BZ_FINISH_OK;
s->mode = BZ_M_IDLE;
return BZ_STREAM_END;
}
return BZ_OK; /*--not reached--*/
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzCompressEnd) ( bz_stream *strm )
{
EState* s;
if (strm == NULL) return BZ_PARAM_ERROR;
s = strm->state;
if (s == NULL) return BZ_PARAM_ERROR;
if (s->strm != strm) return BZ_PARAM_ERROR;
if (s->arr1 != NULL) BZFREE(s->arr1);
if (s->arr2 != NULL) BZFREE(s->arr2);
if (s->ftab != NULL) BZFREE(s->ftab);
BZFREE(strm->state);
strm->state = NULL;
return BZ_OK;
}
/*---------------------------------------------------*/
/*--- Decompression stuff ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
int BZ_API(BZ2_bzDecompressInit)
( bz_stream* strm,
int verbosity,
int small )
{
DState* s;
if (!bz_config_ok()) return BZ_CONFIG_ERROR;
if (strm == NULL) return BZ_PARAM_ERROR;
if (small != 0 && small != 1) return BZ_PARAM_ERROR;
if (verbosity < 0 || verbosity > 4) return BZ_PARAM_ERROR;
if (strm->bzalloc == NULL) strm->bzalloc = default_bzalloc;
if (strm->bzfree == NULL) strm->bzfree = default_bzfree;
s = BZALLOC( sizeof(DState) );
if (s == NULL) return BZ_MEM_ERROR;
s->strm = strm;
strm->state = s;
s->state = BZ_X_MAGIC_1;
s->bsLive = 0;
s->bsBuff = 0;
s->calculatedCombinedCRC = 0;
strm->total_in_lo32 = 0;
strm->total_in_hi32 = 0;
strm->total_out_lo32 = 0;
strm->total_out_hi32 = 0;
s->smallDecompress = (Bool)small;
s->ll4 = NULL;
s->ll16 = NULL;
s->tt = NULL;
s->currBlockNo = 0;
s->verbosity = verbosity;
return BZ_OK;
}
/*---------------------------------------------------*/
/* Returns True iff data corruption is discovered.
Returns False if there is no problem.
*/
static
Bool unRLE_obuf_to_output_FAST ( DState* s )
{
UChar k1;
if (s->blockRandomised) {
while (True) {
/* try to finish existing run */
while (True) {
if (s->strm->avail_out == 0) return False;
if (s->state_out_len == 0) break;
*( (UChar*)(s->strm->next_out) ) = s->state_out_ch;
BZ_UPDATE_CRC ( s->calculatedBlockCRC, s->state_out_ch );
s->state_out_len--;
s->strm->next_out++;
s->strm->avail_out--;
s->strm->total_out_lo32++;
if (s->strm->total_out_lo32 == 0) s->strm->total_out_hi32++;
}
/* can a new run be started? */
if (s->nblock_used == s->save_nblock+1) return False;
/* Only caused by corrupt data stream? */
if (s->nblock_used > s->save_nblock+1)
return True;
s->state_out_len = 1;
s->state_out_ch = s->k0;
BZ_GET_FAST(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 2;
BZ_GET_FAST(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 3;
BZ_GET_FAST(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
BZ_GET_FAST(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
s->state_out_len = ((Int32)k1) + 4;
BZ_GET_FAST(s->k0); BZ_RAND_UPD_MASK;
s->k0 ^= BZ_RAND_MASK; s->nblock_used++;
}
} else {
/* restore */
UInt32 c_calculatedBlockCRC = s->calculatedBlockCRC;
UChar c_state_out_ch = s->state_out_ch;
Int32 c_state_out_len = s->state_out_len;
Int32 c_nblock_used = s->nblock_used;
Int32 c_k0 = s->k0;
UInt32* c_tt = s->tt;
UInt32 c_tPos = s->tPos;
char* cs_next_out = s->strm->next_out;
unsigned int cs_avail_out = s->strm->avail_out;
Int32 ro_blockSize100k = s->blockSize100k;
/* end restore */
UInt32 avail_out_INIT = cs_avail_out;
Int32 s_save_nblockPP = s->save_nblock+1;
unsigned int total_out_lo32_old;
while (True) {
/* try to finish existing run */
if (c_state_out_len > 0) {
while (True) {
if (cs_avail_out == 0) goto return_notr;
if (c_state_out_len == 1) break;
*( (UChar*)(cs_next_out) ) = c_state_out_ch;
BZ_UPDATE_CRC ( c_calculatedBlockCRC, c_state_out_ch );
c_state_out_len--;
cs_next_out++;
cs_avail_out--;
}
s_state_out_len_eq_one:
{
if (cs_avail_out == 0) {
c_state_out_len = 1; goto return_notr;
};
*( (UChar*)(cs_next_out) ) = c_state_out_ch;
BZ_UPDATE_CRC ( c_calculatedBlockCRC, c_state_out_ch );
cs_next_out++;
cs_avail_out--;
}
}
/* Only caused by corrupt data stream? */
if (c_nblock_used > s_save_nblockPP)
return True;
/* can a new run be started? */
if (c_nblock_used == s_save_nblockPP) {
c_state_out_len = 0; goto return_notr;
};
c_state_out_ch = c_k0;
BZ_GET_FAST_C(k1); c_nblock_used++;
if (k1 != c_k0) {
c_k0 = k1; goto s_state_out_len_eq_one;
};
if (c_nblock_used == s_save_nblockPP)
goto s_state_out_len_eq_one;
c_state_out_len = 2;
BZ_GET_FAST_C(k1); c_nblock_used++;
if (c_nblock_used == s_save_nblockPP) continue;
if (k1 != c_k0) { c_k0 = k1; continue; };
c_state_out_len = 3;
BZ_GET_FAST_C(k1); c_nblock_used++;
if (c_nblock_used == s_save_nblockPP) continue;
if (k1 != c_k0) { c_k0 = k1; continue; };
BZ_GET_FAST_C(k1); c_nblock_used++;
c_state_out_len = ((Int32)k1) + 4;
BZ_GET_FAST_C(c_k0); c_nblock_used++;
}
return_notr:
total_out_lo32_old = s->strm->total_out_lo32;
s->strm->total_out_lo32 += (avail_out_INIT - cs_avail_out);
if (s->strm->total_out_lo32 < total_out_lo32_old)
s->strm->total_out_hi32++;
/* save */
s->calculatedBlockCRC = c_calculatedBlockCRC;
s->state_out_ch = c_state_out_ch;
s->state_out_len = c_state_out_len;
s->nblock_used = c_nblock_used;
s->k0 = c_k0;
s->tt = c_tt;
s->tPos = c_tPos;
s->strm->next_out = cs_next_out;
s->strm->avail_out = cs_avail_out;
/* end save */
}
return False;
}
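/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of the carry handling used at return_notr above, where the
   64-bit output total is kept as two 32-bit halves. The names
   counter_add and demo_counter_add are hypothetical.
--*/
```c
#include <assert.h>
#include <stdint.h>

/* Add n to a 64-bit counter stored as two 32-bit halves.
   Unsigned addition wraps, so a result smaller than the old
   low word means a carry into the high word. */
static void counter_add(uint32_t *lo32, uint32_t *hi32, uint32_t n)
{
    uint32_t old = *lo32;
    *lo32 += n;
    if (*lo32 < old) (*hi32)++;
}

static void demo_counter_add(void)
{
    uint32_t lo = 0xfffffff0u, hi = 0;
    counter_add(&lo, &hi, 0x20u);   /* wraps past 2^32 */
    assert(lo == 0x10u);
    assert(hi == 1u);
}
```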
/*---------------------------------------------------*/
__inline__ Int32 BZ2_indexIntoF ( Int32 indx, Int32 *cftab )
{
Int32 nb, na, mid;
nb = 0;
na = 256;
do {
mid = (nb + na) >> 1;
if (indx >= cftab[mid]) nb = mid; else na = mid;
}
while (na - nb != 1);
return nb;
}
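/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of the search in BZ2_indexIntoF, assuming cftab has 257
   entries where cftab[b] counts the bytes smaller than b, and that
   cftab[0] <= indx < cftab[256]. The names index_into_cftab,
   build_cftab and demo_index_into_cftab are hypothetical.
--*/
```c
#include <assert.h>
#include <stdint.h>

/* Binary search for the byte value nb with
   cftab[nb] <= indx < cftab[nb+1]. */
static int32_t index_into_cftab(int32_t indx, const int32_t *cftab)
{
    int32_t nb = 0, na = 256, mid;
    do {
        mid = (nb + na) >> 1;
        if (indx >= cftab[mid]) nb = mid; else na = mid;
    } while (na - nb != 1);
    return nb;
}

/* Build the cumulative table from per-byte counts. */
static void build_cftab(const int32_t counts[256], int32_t cftab[257])
{
    int32_t sum = 0;
    for (int b = 0; b < 256; b++) {
        cftab[b] = sum;
        sum += counts[b];
    }
    cftab[256] = sum;
}

static void demo_index_into_cftab(void)
{
    int32_t counts[256] = { 0 };
    int32_t cftab[257];
    counts['a'] = 3;   /* sorted positions 0..2 hold 'a' */
    counts['b'] = 2;   /* sorted positions 3..4 hold 'b' */
    build_cftab(counts, cftab);
    assert(index_into_cftab(0, cftab) == 'a');
    assert(index_into_cftab(3, cftab) == 'b');
}
```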
/*---------------------------------------------------*/
/* Return True iff data corruption is discovered.
Returns False if there is no problem.
*/
static
Bool unRLE_obuf_to_output_SMALL ( DState* s )
{
UChar k1;
if (s->blockRandomised) {
while (True) {
/* try to finish existing run */
while (True) {
if (s->strm->avail_out == 0) return False;
if (s->state_out_len == 0) break;
*( (UChar*)(s->strm->next_out) ) = s->state_out_ch;
BZ_UPDATE_CRC ( s->calculatedBlockCRC, s->state_out_ch );
s->state_out_len--;
s->strm->next_out++;
s->strm->avail_out--;
s->strm->total_out_lo32++;
if (s->strm->total_out_lo32 == 0) s->strm->total_out_hi32++;
}
/* can a new run be started? */
if (s->nblock_used == s->save_nblock+1) return False;
/* Only caused by corrupt data stream? */
if (s->nblock_used > s->save_nblock+1)
return True;
s->state_out_len = 1;
s->state_out_ch = s->k0;
BZ_GET_SMALL(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 2;
BZ_GET_SMALL(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 3;
BZ_GET_SMALL(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
BZ_GET_SMALL(k1); BZ_RAND_UPD_MASK;
k1 ^= BZ_RAND_MASK; s->nblock_used++;
s->state_out_len = ((Int32)k1) + 4;
BZ_GET_SMALL(s->k0); BZ_RAND_UPD_MASK;
s->k0 ^= BZ_RAND_MASK; s->nblock_used++;
}
} else {
while (True) {
/* try to finish existing run */
while (True) {
if (s->strm->avail_out == 0) return False;
if (s->state_out_len == 0) break;
*( (UChar*)(s->strm->next_out) ) = s->state_out_ch;
BZ_UPDATE_CRC ( s->calculatedBlockCRC, s->state_out_ch );
s->state_out_len--;
s->strm->next_out++;
s->strm->avail_out--;
s->strm->total_out_lo32++;
if (s->strm->total_out_lo32 == 0) s->strm->total_out_hi32++;
}
/* can a new run be started? */
if (s->nblock_used == s->save_nblock+1) return False;
/* Only caused by corrupt data stream? */
if (s->nblock_used > s->save_nblock+1)
return True;
s->state_out_len = 1;
s->state_out_ch = s->k0;
BZ_GET_SMALL(k1); s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 2;
BZ_GET_SMALL(k1); s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
s->state_out_len = 3;
BZ_GET_SMALL(k1); s->nblock_used++;
if (s->nblock_used == s->save_nblock+1) continue;
if (k1 != s->k0) { s->k0 = k1; continue; };
BZ_GET_SMALL(k1); s->nblock_used++;
s->state_out_len = ((Int32)k1) + 4;
BZ_GET_SMALL(s->k0); s->nblock_used++;
}
}
}
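/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of the run-length scheme that the unRLE routines above
   undo: four equal bytes are followed by a count byte giving the
   number of extra copies (state_out_len = k1 + 4). It works on a
   plain byte buffer rather than on DState, and it tolerates a
   truncated run, which is an assumption of this sketch. The names
   rle1_decode and demo_rle1_decode are hypothetical.
--*/
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the number of bytes written, or (size_t)-1 if out is
   too small. */
static size_t rle1_decode(const unsigned char *in, size_t in_len,
                          unsigned char *out, size_t out_cap)
{
    size_t i = 0, o = 0;
    while (i < in_len) {
        unsigned char ch = in[i++];
        size_t run = 1;
        /* Collect up to three more copies of ch. */
        while (run < 4 && i < in_len && in[i] == ch) { run++; i++; }
        /* Four in a row: the next byte counts the extra copies. */
        if (run == 4 && i < in_len) run += in[i++];
        if (run > out_cap - o) return (size_t)-1;
        memset(out + o, ch, run);
        o += run;
    }
    return o;
}

static void demo_rle1_decode(void)
{
    const unsigned char in[] = { 'a', 'a', 'a', 'a', 2, 'b' };
    unsigned char out[16];
    size_t n = rle1_decode(in, sizeof in, out, sizeof out);
    assert(n == 7);
    assert(memcmp(out, "aaaaaab", 7) == 0);
}
```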
/*---------------------------------------------------*/
int BZ_API(BZ2_bzDecompress) ( bz_stream *strm )
{
Bool corrupt;
DState* s;
if (strm == NULL) return BZ_PARAM_ERROR;
s = strm->state;
if (s == NULL) return BZ_PARAM_ERROR;
if (s->strm != strm) return BZ_PARAM_ERROR;
while (True) {
if (s->state == BZ_X_IDLE) return BZ_SEQUENCE_ERROR;
if (s->state == BZ_X_OUTPUT) {
if (s->smallDecompress)
corrupt = unRLE_obuf_to_output_SMALL ( s ); else
corrupt = unRLE_obuf_to_output_FAST ( s );
if (corrupt) return BZ_DATA_ERROR;
if (s->nblock_used == s->save_nblock+1 && s->state_out_len == 0) {
BZ_FINALISE_CRC ( s->calculatedBlockCRC );
if (s->verbosity >= 3)
VPrintf2 ( " {0x%08x, 0x%08x}", s->storedBlockCRC,
s->calculatedBlockCRC );
if (s->verbosity >= 2) VPrintf0 ( "]" );
if (s->calculatedBlockCRC != s->storedBlockCRC)
return BZ_DATA_ERROR;
s->calculatedCombinedCRC
= (s->calculatedCombinedCRC << 1) |
(s->calculatedCombinedCRC >> 31);
s->calculatedCombinedCRC ^= s->calculatedBlockCRC;
s->state = BZ_X_BLKHDR_1;
} else {
return BZ_OK;
}
}
if (s->state >= BZ_X_MAGIC_1) {
Int32 r = BZ2_decompress ( s );
if (r == BZ_STREAM_END) {
if (s->verbosity >= 3)
VPrintf2 ( "\n combined CRCs: stored = 0x%08x, computed = 0x%08x",
s->storedCombinedCRC, s->calculatedCombinedCRC );
if (s->calculatedCombinedCRC != s->storedCombinedCRC)
return BZ_DATA_ERROR;
return r;
}
if (s->state != BZ_X_OUTPUT) return r;
}
}
AssertH ( 0, 6001 );
return 0; /*NOTREACHED*/
}
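/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of how BZ2_bzDecompress folds each block CRC into the
   combined CRC: rotate the running value left by one bit, then XOR
   in the block CRC. The names combine_crc and demo_combine_crc are
   hypothetical.
--*/
```c
#include <assert.h>
#include <stdint.h>

static uint32_t combine_crc(uint32_t combined, uint32_t block_crc)
{
    combined = (combined << 1) | (combined >> 31);   /* rotate left by 1 */
    return combined ^ block_crc;
}

static void demo_combine_crc(void)
{
    /* Starting from 0, the first block CRC passes through unchanged. */
    assert(combine_crc(0u, 0x12345678u) == 0x12345678u);
    /* The top bit wraps around to bit 0. */
    assert(combine_crc(0x80000001u, 0u) == 0x00000003u);
}
```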
/*---------------------------------------------------*/
int BZ_API(BZ2_bzDecompressEnd) ( bz_stream *strm )
{
DState* s;
if (strm == NULL) return BZ_PARAM_ERROR;
s = strm->state;
if (s == NULL) return BZ_PARAM_ERROR;
if (s->strm != strm) return BZ_PARAM_ERROR;
if (s->tt != NULL) BZFREE(s->tt);
if (s->ll16 != NULL) BZFREE(s->ll16);
if (s->ll4 != NULL) BZFREE(s->ll4);
BZFREE(strm->state);
strm->state = NULL;
return BZ_OK;
}
#ifndef BZ_NO_STDIO
/*---------------------------------------------------*/
/*--- File I/O stuff ---*/
/*---------------------------------------------------*/
#define BZ_SETERR(eee) \
{ \
if (bzerror != NULL) *bzerror = eee; \
if (bzf != NULL) bzf->lastErr = eee; \
}
typedef
struct {
FILE* handle;
Char buf[BZ_MAX_UNUSED];
Int32 bufN;
Bool writing;
bz_stream strm;
Int32 lastErr;
Bool initialisedOk;
}
bzFile;
/*---------------------------------------------*/
static Bool myfeof ( FILE* f )
{
Int32 c = fgetc ( f );
if (c == EOF) return True;
ungetc ( c, f );
return False;
}
/*---------------------------------------------------*/
BZFILE* BZ_API(BZ2_bzWriteOpen)
( int* bzerror,
FILE* f,
int blockSize100k,
int verbosity,
int workFactor )
{
Int32 ret;
bzFile* bzf = NULL;
BZ_SETERR(BZ_OK);
if (f == NULL ||
(blockSize100k < 1 || blockSize100k > 9) ||
(workFactor < 0 || workFactor > 250) ||
(verbosity < 0 || verbosity > 4))
{ BZ_SETERR(BZ_PARAM_ERROR); return NULL; };
if (ferror(f))
{ BZ_SETERR(BZ_IO_ERROR); return NULL; };
bzf = malloc ( sizeof(bzFile) );
if (bzf == NULL)
{ BZ_SETERR(BZ_MEM_ERROR); return NULL; };
BZ_SETERR(BZ_OK);
bzf->initialisedOk = False;
bzf->bufN = 0;
bzf->handle = f;
bzf->writing = True;
bzf->strm.bzalloc = NULL;
bzf->strm.bzfree = NULL;
bzf->strm.opaque = NULL;
if (workFactor == 0) workFactor = 30;
ret = BZ2_bzCompressInit ( &(bzf->strm), blockSize100k,
verbosity, workFactor );
if (ret != BZ_OK)
{ BZ_SETERR(ret); free(bzf); return NULL; };
bzf->strm.avail_in = 0;
bzf->initialisedOk = True;
return bzf;
}
/*---------------------------------------------------*/
void BZ_API(BZ2_bzWrite)
( int* bzerror,
BZFILE* b,
void* buf,
int len )
{
Int32 n, n2, ret;
bzFile* bzf = (bzFile*)b;
BZ_SETERR(BZ_OK);
if (bzf == NULL || buf == NULL || len < 0)
{ BZ_SETERR(BZ_PARAM_ERROR); return; };
if (!(bzf->writing))
{ BZ_SETERR(BZ_SEQUENCE_ERROR); return; };
if (ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return; };
if (len == 0)
{ BZ_SETERR(BZ_OK); return; };
bzf->strm.avail_in = len;
bzf->strm.next_in = buf;
while (True) {
bzf->strm.avail_out = BZ_MAX_UNUSED;
bzf->strm.next_out = bzf->buf;
ret = BZ2_bzCompress ( &(bzf->strm), BZ_RUN );
if (ret != BZ_RUN_OK)
{ BZ_SETERR(ret); return; };
if (bzf->strm.avail_out < BZ_MAX_UNUSED) {
n = BZ_MAX_UNUSED - bzf->strm.avail_out;
n2 = fwrite ( (void*)(bzf->buf), sizeof(UChar),
n, bzf->handle );
if (n != n2 || ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return; };
}
if (bzf->strm.avail_in == 0)
{ BZ_SETERR(BZ_OK); return; };
}
}
/*---------------------------------------------------*/
void BZ_API(BZ2_bzWriteClose)
( int* bzerror,
BZFILE* b,
int abandon,
unsigned int* nbytes_in,
unsigned int* nbytes_out )
{
BZ2_bzWriteClose64 ( bzerror, b, abandon,
nbytes_in, NULL, nbytes_out, NULL );
}
void BZ_API(BZ2_bzWriteClose64)
( int* bzerror,
BZFILE* b,
int abandon,
unsigned int* nbytes_in_lo32,
unsigned int* nbytes_in_hi32,
unsigned int* nbytes_out_lo32,
unsigned int* nbytes_out_hi32 )
{
Int32 n, n2, ret;
bzFile* bzf = (bzFile*)b;
if (bzf == NULL)
{ BZ_SETERR(BZ_OK); return; };
if (!(bzf->writing))
{ BZ_SETERR(BZ_SEQUENCE_ERROR); return; };
if (ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return; };
if (nbytes_in_lo32 != NULL) *nbytes_in_lo32 = 0;
if (nbytes_in_hi32 != NULL) *nbytes_in_hi32 = 0;
if (nbytes_out_lo32 != NULL) *nbytes_out_lo32 = 0;
if (nbytes_out_hi32 != NULL) *nbytes_out_hi32 = 0;
if ((!abandon) && bzf->lastErr == BZ_OK) {
while (True) {
bzf->strm.avail_out = BZ_MAX_UNUSED;
bzf->strm.next_out = bzf->buf;
ret = BZ2_bzCompress ( &(bzf->strm), BZ_FINISH );
if (ret != BZ_FINISH_OK && ret != BZ_STREAM_END)
{ BZ_SETERR(ret); return; };
if (bzf->strm.avail_out < BZ_MAX_UNUSED) {
n = BZ_MAX_UNUSED - bzf->strm.avail_out;
n2 = fwrite ( (void*)(bzf->buf), sizeof(UChar),
n, bzf->handle );
if (n != n2 || ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return; };
}
if (ret == BZ_STREAM_END) break;
}
}
if ( !abandon && !ferror ( bzf->handle ) ) {
fflush ( bzf->handle );
if (ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return; };
}
if (nbytes_in_lo32 != NULL)
*nbytes_in_lo32 = bzf->strm.total_in_lo32;
if (nbytes_in_hi32 != NULL)
*nbytes_in_hi32 = bzf->strm.total_in_hi32;
if (nbytes_out_lo32 != NULL)
*nbytes_out_lo32 = bzf->strm.total_out_lo32;
if (nbytes_out_hi32 != NULL)
*nbytes_out_hi32 = bzf->strm.total_out_hi32;
BZ_SETERR(BZ_OK);
BZ2_bzCompressEnd ( &(bzf->strm) );
free ( bzf );
}
/*---------------------------------------------------*/
BZFILE* BZ_API(BZ2_bzReadOpen)
( int* bzerror,
FILE* f,
int verbosity,
int small,
void* unused,
int nUnused )
{
bzFile* bzf = NULL;
int ret;
BZ_SETERR(BZ_OK);
if (f == NULL ||
(small != 0 && small != 1) ||
(verbosity < 0 || verbosity > 4) ||
(unused == NULL && nUnused != 0) ||
(unused != NULL && (nUnused < 0 || nUnused > BZ_MAX_UNUSED)))
{ BZ_SETERR(BZ_PARAM_ERROR); return NULL; };
if (ferror(f))
{ BZ_SETERR(BZ_IO_ERROR); return NULL; };
bzf = malloc ( sizeof(bzFile) );
if (bzf == NULL)
{ BZ_SETERR(BZ_MEM_ERROR); return NULL; };
BZ_SETERR(BZ_OK);
bzf->initialisedOk = False;
bzf->handle = f;
bzf->bufN = 0;
bzf->writing = False;
bzf->strm.bzalloc = NULL;
bzf->strm.bzfree = NULL;
bzf->strm.opaque = NULL;
while (nUnused > 0) {
bzf->buf[bzf->bufN] = *((UChar*)(unused)); bzf->bufN++;
unused = ((void*)( 1 + ((UChar*)(unused)) ));
nUnused--;
}
ret = BZ2_bzDecompressInit ( &(bzf->strm), verbosity, small );
if (ret != BZ_OK)
{ BZ_SETERR(ret); free(bzf); return NULL; };
bzf->strm.avail_in = bzf->bufN;
bzf->strm.next_in = bzf->buf;
bzf->initialisedOk = True;
return bzf;
}
/*---------------------------------------------------*/
void BZ_API(BZ2_bzReadClose) ( int *bzerror, BZFILE *b )
{
bzFile* bzf = (bzFile*)b;
BZ_SETERR(BZ_OK);
if (bzf == NULL)
{ BZ_SETERR(BZ_OK); return; };
if (bzf->writing)
{ BZ_SETERR(BZ_SEQUENCE_ERROR); return; };
if (bzf->initialisedOk)
(void)BZ2_bzDecompressEnd ( &(bzf->strm) );
free ( bzf );
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzRead)
( int* bzerror,
BZFILE* b,
void* buf,
int len )
{
Int32 n, ret;
bzFile* bzf = (bzFile*)b;
BZ_SETERR(BZ_OK);
if (bzf == NULL || buf == NULL || len < 0)
{ BZ_SETERR(BZ_PARAM_ERROR); return 0; };
if (bzf->writing)
{ BZ_SETERR(BZ_SEQUENCE_ERROR); return 0; };
if (len == 0)
{ BZ_SETERR(BZ_OK); return 0; };
bzf->strm.avail_out = len;
bzf->strm.next_out = buf;
while (True) {
if (ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return 0; };
if (bzf->strm.avail_in == 0 && !myfeof(bzf->handle)) {
n = fread ( bzf->buf, sizeof(UChar),
BZ_MAX_UNUSED, bzf->handle );
if (ferror(bzf->handle))
{ BZ_SETERR(BZ_IO_ERROR); return 0; };
bzf->bufN = n;
bzf->strm.avail_in = bzf->bufN;
bzf->strm.next_in = bzf->buf;
}
ret = BZ2_bzDecompress ( &(bzf->strm) );
if (ret != BZ_OK && ret != BZ_STREAM_END)
{ BZ_SETERR(ret); return 0; };
if (ret == BZ_OK && myfeof(bzf->handle) &&
bzf->strm.avail_in == 0 && bzf->strm.avail_out > 0)
{ BZ_SETERR(BZ_UNEXPECTED_EOF); return 0; };
if (ret == BZ_STREAM_END)
{ BZ_SETERR(BZ_STREAM_END);
return len - bzf->strm.avail_out; };
if (bzf->strm.avail_out == 0)
{ BZ_SETERR(BZ_OK); return len; };
}
return 0; /*not reached*/
}
/*---------------------------------------------------*/
void BZ_API(BZ2_bzReadGetUnused)
( int* bzerror,
BZFILE* b,
void** unused,
int* nUnused )
{
bzFile* bzf = (bzFile*)b;
if (bzf == NULL)
{ BZ_SETERR(BZ_PARAM_ERROR); return; };
if (bzf->lastErr != BZ_STREAM_END)
{ BZ_SETERR(BZ_SEQUENCE_ERROR); return; };
if (unused == NULL || nUnused == NULL)
{ BZ_SETERR(BZ_PARAM_ERROR); return; };
BZ_SETERR(BZ_OK);
*nUnused = bzf->strm.avail_in;
*unused = bzf->strm.next_in;
}
#endif
/*---------------------------------------------------*/
/*--- Misc convenience stuff ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
int BZ_API(BZ2_bzBuffToBuffCompress)
( char* dest,
unsigned int* destLen,
char* source,
unsigned int sourceLen,
int blockSize100k,
int verbosity,
int workFactor )
{
bz_stream strm;
int ret;
if (dest == NULL || destLen == NULL ||
source == NULL ||
blockSize100k < 1 || blockSize100k > 9 ||
verbosity < 0 || verbosity > 4 ||
workFactor < 0 || workFactor > 250)
return BZ_PARAM_ERROR;
if (workFactor == 0) workFactor = 30;
strm.bzalloc = NULL;
strm.bzfree = NULL;
strm.opaque = NULL;
ret = BZ2_bzCompressInit ( &strm, blockSize100k,
verbosity, workFactor );
if (ret != BZ_OK) return ret;
strm.next_in = source;
strm.next_out = dest;
strm.avail_in = sourceLen;
strm.avail_out = *destLen;
ret = BZ2_bzCompress ( &strm, BZ_FINISH );
if (ret == BZ_FINISH_OK) goto output_overflow;
if (ret != BZ_STREAM_END) goto errhandler;
/* normal termination */
*destLen -= strm.avail_out;
BZ2_bzCompressEnd ( &strm );
return BZ_OK;
output_overflow:
BZ2_bzCompressEnd ( &strm );
return BZ_OUTBUFF_FULL;
errhandler:
BZ2_bzCompressEnd ( &strm );
return ret;
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzBuffToBuffDecompress)
( char* dest,
unsigned int* destLen,
char* source,
unsigned int sourceLen,
int small,
int verbosity )
{
bz_stream strm;
int ret;
if (dest == NULL || destLen == NULL ||
source == NULL ||
(small != 0 && small != 1) ||
verbosity < 0 || verbosity > 4)
return BZ_PARAM_ERROR;
strm.bzalloc = NULL;
strm.bzfree = NULL;
strm.opaque = NULL;
ret = BZ2_bzDecompressInit ( &strm, verbosity, small );
if (ret != BZ_OK) return ret;
strm.next_in = source;
strm.next_out = dest;
strm.avail_in = sourceLen;
strm.avail_out = *destLen;
ret = BZ2_bzDecompress ( &strm );
if (ret == BZ_OK) goto output_overflow_or_eof;
if (ret != BZ_STREAM_END) goto errhandler;
/* normal termination */
*destLen -= strm.avail_out;
BZ2_bzDecompressEnd ( &strm );
return BZ_OK;
output_overflow_or_eof:
if (strm.avail_out > 0) {
BZ2_bzDecompressEnd ( &strm );
return BZ_UNEXPECTED_EOF;
} else {
BZ2_bzDecompressEnd ( &strm );
return BZ_OUTBUFF_FULL;
};
errhandler:
BZ2_bzDecompressEnd ( &strm );
return ret;
}
/*---------------------------------------------------*/
/*--
Code contributed by Yoshioka Tsuneo (tsuneo@rr.iij4u.or.jp)
to support better zlib compatibility.
This code is not _officially_ part of libbzip2 (yet);
I haven't tested it, documented it, or considered the
threading-safeness of it.
If this code breaks, please contact both Yoshioka and me.
--*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
/*--
return version like "0.9.5d, 4-Sept-1999".
--*/
const char * BZ_API(BZ2_bzlibVersion)(void)
{
return BZ_VERSION;
}
#ifndef BZ_NO_STDIO
/*---------------------------------------------------*/
#if defined(_WIN32) || defined(OS2) || defined(MSDOS)
# include <fcntl.h>
# include <io.h>
# define SET_BINARY_MODE(file) setmode(fileno(file),O_BINARY)
#else
# define SET_BINARY_MODE(file)
#endif
static
BZFILE * bzopen_or_bzdopen
( const char *path, /* no use when bzdopen */
int fd, /* no use when bzopen */
const char *mode,
int open_mode) /* bzopen: 0, bzdopen:1 */
{
int bzerr;
char unused[BZ_MAX_UNUSED];
int blockSize100k = 9;
int writing = 0;
char mode2[10] = "";
FILE *fp = NULL;
BZFILE *bzfp = NULL;
int verbosity = 0;
int workFactor = 30;
int smallMode = 0;
int nUnused = 0;
if (mode == NULL) return NULL;
while (*mode) {
switch (*mode) {
case 'r':
writing = 0; break;
case 'w':
writing = 1; break;
case 's':
smallMode = 1; break;
default:
if (isdigit((int)(*mode))) {
blockSize100k = *mode-BZ_HDR_0;
}
}
mode++;
}
strcat(mode2, writing ? "w" : "r" );
strcat(mode2,"b"); /* binary mode */
if (open_mode==0) {
if (path==NULL || strcmp(path,"")==0) {
fp = (writing ? stdout : stdin);
SET_BINARY_MODE(fp);
} else {
fp = fopen(path,mode2);
}
} else {
#ifdef BZ_STRICT_ANSI
fp = NULL;
#else
fp = fdopen(fd,mode2);
#endif
}
if (fp == NULL) return NULL;
if (writing) {
/* Guard against total chaos and anarchy -- JRS */
if (blockSize100k < 1) blockSize100k = 1;
if (blockSize100k > 9) blockSize100k = 9;
bzfp = BZ2_bzWriteOpen(&bzerr,fp,blockSize100k,
verbosity,workFactor);
} else {
bzfp = BZ2_bzReadOpen(&bzerr,fp,verbosity,smallMode,
unused,nUnused);
}
if (bzfp == NULL) {
if (fp != stdin && fp != stdout) fclose(fp);
return NULL;
}
return bzfp;
}
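/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of the mode-string parsing done by bzopen_or_bzdopen:
   'r' and 'w' pick the direction, 's' asks for the small
   decompressor, and a digit sets the block size. Unlike the
   function above, this sketch clamps the block size to 1..9 in both
   directions. The names bz_mode, parse_bz_mode and
   demo_parse_bz_mode are hypothetical.
--*/
```c
#include <assert.h>
#include <ctype.h>

struct bz_mode {
    int writing;
    int small;
    int block_size_100k;
};

static struct bz_mode parse_bz_mode(const char *mode)
{
    struct bz_mode m = { 0, 0, 9 };   /* defaults: read, fast, 900k */
    for (; *mode; mode++) {
        switch (*mode) {
        case 'r': m.writing = 0; break;
        case 'w': m.writing = 1; break;
        case 's': m.small = 1; break;
        default:
            if (isdigit((unsigned char)*mode))
                m.block_size_100k = *mode - '0';
        }
    }
    if (m.block_size_100k < 1) m.block_size_100k = 1;
    if (m.block_size_100k > 9) m.block_size_100k = 9;
    return m;
}

static void demo_parse_bz_mode(void)
{
    struct bz_mode m = parse_bz_mode("w5");
    assert(m.writing == 1);
    assert(m.small == 0);
    assert(m.block_size_100k == 5);
}
```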
/*---------------------------------------------------*/
/*--
open file for read or write.
ex) bzopen("file","w9")
case path="" or NULL => use stdin or stdout.
--*/
BZFILE * BZ_API(BZ2_bzopen)
( const char *path,
const char *mode )
{
return bzopen_or_bzdopen(path,-1,mode,/*bzopen*/0);
}
/*---------------------------------------------------*/
BZFILE * BZ_API(BZ2_bzdopen)
( int fd,
const char *mode )
{
return bzopen_or_bzdopen(NULL,fd,mode,/*bzdopen*/1);
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzread) (BZFILE* b, void* buf, int len )
{
int bzerr, nread;
if (((bzFile*)b)->lastErr == BZ_STREAM_END) return 0;
nread = BZ2_bzRead(&bzerr,b,buf,len);
if (bzerr == BZ_OK || bzerr == BZ_STREAM_END) {
return nread;
} else {
return -1;
}
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzwrite) (BZFILE* b, void* buf, int len )
{
int bzerr;
BZ2_bzWrite(&bzerr,b,buf,len);
if(bzerr == BZ_OK){
return len;
}else{
return -1;
}
}
/*---------------------------------------------------*/
int BZ_API(BZ2_bzflush) (BZFILE *b)
{
/* do nothing now... */
return 0;
}
/*---------------------------------------------------*/
void BZ_API(BZ2_bzclose) (BZFILE* b)
{
int bzerr;
FILE *fp;
if (b==NULL) {return;}
fp = ((bzFile *)b)->handle;
if(((bzFile*)b)->writing){
BZ2_bzWriteClose(&bzerr,b,0,NULL,NULL);
if(bzerr != BZ_OK){
BZ2_bzWriteClose(NULL,b,1,NULL,NULL);
}
}else{
BZ2_bzReadClose(&bzerr,b);
}
if(fp!=stdin && fp!=stdout){
fclose(fp);
}
}
/*---------------------------------------------------*/
/*--
return last error code
--*/
static const char *bzerrorstrings[] = {
"OK"
,"SEQUENCE_ERROR"
,"PARAM_ERROR"
,"MEM_ERROR"
,"DATA_ERROR"
,"DATA_ERROR_MAGIC"
,"IO_ERROR"
,"UNEXPECTED_EOF"
,"OUTBUFF_FULL"
,"CONFIG_ERROR"
,"???" /* for future */
,"???" /* for future */
,"???" /* for future */
,"???" /* for future */
,"???" /* for future */
,"???" /* for future */
};
const char * BZ_API(BZ2_bzerror) (BZFILE *b, int *errnum)
{
int err = ((bzFile *)b)->lastErr;
if(err>0) err = 0;
*errnum = err;
return bzerrorstrings[err*-1];
}
#endif
/*-------------------------------------------------------------*/
/*--- end bzlib.c ---*/
/*-------------------------------------------------------------*/
================================================
FILE: ext/bzip2/bzlib.h
================================================
/*-------------------------------------------------------------*/
/*--- Public header file for the library. ---*/
/*--- bzlib.h ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#ifndef _BZLIB_H
#define _BZLIB_H
#ifdef __cplusplus
extern "C" {
#endif
#define BZ_RUN 0
#define BZ_FLUSH 1
#define BZ_FINISH 2
#define BZ_OK 0
#define BZ_RUN_OK 1
#define BZ_FLUSH_OK 2
#define BZ_FINISH_OK 3
#define BZ_STREAM_END 4
#define BZ_SEQUENCE_ERROR (-1)
#define BZ_PARAM_ERROR (-2)
#define BZ_MEM_ERROR (-3)
#define BZ_DATA_ERROR (-4)
#define BZ_DATA_ERROR_MAGIC (-5)
#define BZ_IO_ERROR (-6)
#define BZ_UNEXPECTED_EOF (-7)
#define BZ_OUTBUFF_FULL (-8)
#define BZ_CONFIG_ERROR (-9)
typedef
struct {
char *next_in;
unsigned int avail_in;
unsigned int total_in_lo32;
unsigned int total_in_hi32;
char *next_out;
unsigned int avail_out;
unsigned int total_out_lo32;
unsigned int total_out_hi32;
void *state;
void *(*bzalloc)(void *,int,int);
void (*bzfree)(void *,void *);
void *opaque;
}
bz_stream;
#ifndef BZ_IMPORT
#define BZ_EXPORT
#endif
#ifndef BZ_NO_STDIO
/* Need a definition for FILE */
#include <stdio.h>
#endif
#ifdef _WIN32
# include <windows.h>
# ifdef small
/* windows.h define small to char */
# undef small
# endif
# ifdef BZ_EXPORT
# define BZ_API(func) WINAPI func
# define BZ_EXTERN extern
# else
/* import windows dll dynamically */
# define BZ_API(func) (WINAPI * func)
# define BZ_EXTERN
# endif
#else
# define BZ_API(func) func
# define BZ_EXTERN extern
#endif
/*-- Core (low-level) library functions --*/
BZ_EXTERN int BZ_API(BZ2_bzCompressInit) (
bz_stream* strm,
int blockSize100k,
int verbosity,
int workFactor
);
BZ_EXTERN int BZ_API(BZ2_bzCompress) (
bz_stream* strm,
int action
);
BZ_EXTERN int BZ_API(BZ2_bzCompressEnd) (
bz_stream* strm
);
BZ_EXTERN int BZ_API(BZ2_bzDecompressInit) (
bz_stream *strm,
int verbosity,
int small
);
BZ_EXTERN int BZ_API(BZ2_bzDecompress) (
bz_stream* strm
);
BZ_EXTERN int BZ_API(BZ2_bzDecompressEnd) (
bz_stream *strm
);
/*-- High(er) level library functions --*/
#ifndef BZ_NO_STDIO
#define BZ_MAX_UNUSED 5000
typedef void BZFILE;
BZ_EXTERN BZFILE* BZ_API(BZ2_bzReadOpen) (
int* bzerror,
FILE* f,
int verbosity,
int small,
void* unused,
int nUnused
);
BZ_EXTERN void BZ_API(BZ2_bzReadClose) (
int* bzerror,
BZFILE* b
);
BZ_EXTERN void BZ_API(BZ2_bzReadGetUnused) (
int* bzerror,
BZFILE* b,
void** unused,
int* nUnused
);
BZ_EXTERN int BZ_API(BZ2_bzRead) (
int* bzerror,
BZFILE* b,
void* buf,
int len
);
BZ_EXTERN BZFILE* BZ_API(BZ2_bzWriteOpen) (
int* bzerror,
FILE* f,
int blockSize100k,
int verbosity,
int workFactor
);
BZ_EXTERN void BZ_API(BZ2_bzWrite) (
int* bzerror,
BZFILE* b,
void* buf,
int len
);
BZ_EXTERN void BZ_API(BZ2_bzWriteClose) (
int* bzerror,
BZFILE* b,
int abandon,
unsigned int* nbytes_in,
unsigned int* nbytes_out
);
BZ_EXTERN void BZ_API(BZ2_bzWriteClose64) (
int* bzerror,
BZFILE* b,
int abandon,
unsigned int* nbytes_in_lo32,
unsigned int* nbytes_in_hi32,
unsigned int* nbytes_out_lo32,
unsigned int* nbytes_out_hi32
);
#endif
/*-- Utility functions --*/
BZ_EXTERN int BZ_API(BZ2_bzBuffToBuffCompress) (
char* dest,
unsigned int* destLen,
char* source,
unsigned int sourceLen,
int blockSize100k,
int verbosity,
int workFactor
);
BZ_EXTERN int BZ_API(BZ2_bzBuffToBuffDecompress) (
char* dest,
unsigned int* destLen,
char* source,
unsigned int sourceLen,
int small,
int verbosity
);
/*--
Code contributed by Yoshioka Tsuneo (tsuneo@rr.iij4u.or.jp)
to support better zlib compatibility.
This code is not _officially_ part of libbzip2 (yet);
I haven't tested it, documented it, or considered the
threading-safeness of it.
If this code breaks, please contact both Yoshioka and me.
--*/
BZ_EXTERN const char * BZ_API(BZ2_bzlibVersion) (
void
);
#ifndef BZ_NO_STDIO
BZ_EXTERN BZFILE * BZ_API(BZ2_bzopen) (
const char *path,
const char *mode
);
BZ_EXTERN BZFILE * BZ_API(BZ2_bzdopen) (
int fd,
const char *mode
);
BZ_EXTERN int BZ_API(BZ2_bzread) (
BZFILE* b,
void* buf,
int len
);
BZ_EXTERN int BZ_API(BZ2_bzwrite) (
BZFILE* b,
void* buf,
int len
);
BZ_EXTERN int BZ_API(BZ2_bzflush) (
BZFILE* b
);
BZ_EXTERN void BZ_API(BZ2_bzclose) (
BZFILE* b
);
BZ_EXTERN const char * BZ_API(BZ2_bzerror) (
BZFILE *b,
int *errnum
);
#endif
#ifdef __cplusplus
}
#endif
#endif
/*-------------------------------------------------------------*/
/*--- end bzlib.h ---*/
/*-------------------------------------------------------------*/
================================================
FILE: ext/bzip2/bzlib_private.h
================================================
/*-------------------------------------------------------------*/
/*--- Private header file for the library. ---*/
/*--- bzlib_private.h ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#ifndef _BZLIB_PRIVATE_H
#define _BZLIB_PRIVATE_H
#include <stdlib.h>
#ifndef BZ_NO_STDIO
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#endif
#include "bzlib.h"
/*-- General stuff. --*/
#define BZ_VERSION "1.0.6, 6-Sept-2010"
typedef char Char;
typedef unsigned char Bool;
typedef unsigned char UChar;
typedef int Int32;
typedef unsigned int UInt32;
typedef short Int16;
typedef unsigned short UInt16;
#define True ((Bool)1)
#define False ((Bool)0)
#ifndef __GNUC__
#define __inline__ /* */
#endif
#ifndef BZ_NO_STDIO
extern void BZ2_bz__AssertH__fail ( int errcode );
#define AssertH(cond,errcode) \
{ if (!(cond)) BZ2_bz__AssertH__fail ( errcode ); }
#if BZ_DEBUG
#define AssertD(cond,msg) \
{ if (!(cond)) { \
fprintf ( stderr, \
"\n\nlibbzip2(debug build): internal error\n\t%s\n", msg );\
exit(1); \
}}
#else
#define AssertD(cond,msg) /* */
#endif
#define VPrintf0(zf) \
fprintf(stderr,zf)
#define VPrintf1(zf,za1) \
fprintf(stderr,zf,za1)
#define VPrintf2(zf,za1,za2) \
fprintf(stderr,zf,za1,za2)
#define VPrintf3(zf,za1,za2,za3) \
fprintf(stderr,zf,za1,za2,za3)
#define VPrintf4(zf,za1,za2,za3,za4) \
fprintf(stderr,zf,za1,za2,za3,za4)
#define VPrintf5(zf,za1,za2,za3,za4,za5) \
fprintf(stderr,zf,za1,za2,za3,za4,za5)
#else
extern void bz_internal_error ( int errcode );
#define AssertH(cond,errcode) \
{ if (!(cond)) bz_internal_error ( errcode ); }
#define AssertD(cond,msg) do { } while (0)
#define VPrintf0(zf) do { } while (0)
#define VPrintf1(zf,za1) do { } while (0)
#define VPrintf2(zf,za1,za2) do { } while (0)
#define VPrintf3(zf,za1,za2,za3) do { } while (0)
#define VPrintf4(zf,za1,za2,za3,za4) do { } while (0)
#define VPrintf5(zf,za1,za2,za3,za4,za5) do { } while (0)
#endif
#define BZALLOC(nnn) (strm->bzalloc)(strm->opaque,(nnn),1)
#define BZFREE(ppp) (strm->bzfree)(strm->opaque,(ppp))
/*-- Header bytes. --*/
#define BZ_HDR_B 0x42 /* 'B' */
#define BZ_HDR_Z 0x5a /* 'Z' */
#define BZ_HDR_h 0x68 /* 'h' */
#define BZ_HDR_0 0x30 /* '0' */
/*-- Constants for the back end. --*/
#define BZ_MAX_ALPHA_SIZE 258
#define BZ_MAX_CODE_LEN 23
#define BZ_RUNA 0
#define BZ_RUNB 1
#define BZ_N_GROUPS 6
#define BZ_G_SIZE 50
#define BZ_N_ITERS 4
#define BZ_MAX_SELECTORS (2 + (900000 / BZ_G_SIZE))
/*-- Stuff for randomising repetitive blocks. --*/
extern Int32 BZ2_rNums[512];
#define BZ_RAND_DECLS \
Int32 rNToGo; \
Int32 rTPos
#define BZ_RAND_INIT_MASK \
s->rNToGo = 0; \
s->rTPos = 0
#define BZ_RAND_MASK ((s->rNToGo == 1) ? 1 : 0)
#define BZ_RAND_UPD_MASK \
if (s->rNToGo == 0) { \
s->rNToGo = BZ2_rNums[s->rTPos]; \
s->rTPos++; \
if (s->rTPos == 512) s->rTPos = 0; \
} \
s->rNToGo--;
/*-- Stuff for doing CRCs. --*/
extern UInt32 BZ2_crc32Table[256];
#define BZ_INITIALISE_CRC(crcVar) \
{ \
crcVar = 0xffffffffL; \
}
#define BZ_FINALISE_CRC(crcVar) \
{ \
crcVar = ~(crcVar); \
}
#define BZ_UPDATE_CRC(crcVar,cha) \
{ \
crcVar = (crcVar << 8) ^ \
BZ2_crc32Table[(crcVar >> 24) ^ \
((UChar)cha)]; \
}
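/*--
   Illustration only, not part of libbzip2. A minimal standalone
   sketch of the block CRC that the three macros above implement:
   start at 0xffffffff, update most-significant-bit first, and
   complement at the end. The library uses the precomputed
   BZ2_crc32Table; this sketch instead builds an equivalent table at
   run time from the polynomial 0x04c11db7, which is an assumption
   of the example. The names crc_table, crc_table_init, block_crc
   and demo_block_crc are hypothetical.
--*/
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Fill the table for an MSB-first CRC with polynomial 0x04c11db7. */
static void crc_table_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (int bit = 0; bit < 8; bit++)
            c = (c & 0x80000000u) ? (c << 1) ^ 0x04c11db7u : (c << 1);
        crc_table[i] = c;
    }
}

static uint32_t block_crc(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xffffffffu;                    /* BZ_INITIALISE_CRC */
    for (size_t i = 0; i < len; i++)               /* BZ_UPDATE_CRC */
        crc = (crc << 8) ^ crc_table[(crc >> 24) ^ buf[i]];
    return ~crc;                                   /* BZ_FINALISE_CRC */
}

static void demo_block_crc(void)
{
    crc_table_init();
    assert(crc_table[1] == 0x04c11db7u);
    /* With no input, the initial value is simply complemented. */
    assert(block_crc((const unsigned char *)"", 0) == 0u);
}
```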
/*-- States and modes for compression. --*/
#define BZ_M_IDLE 1
#define BZ_M_RUNNING 2
#define BZ_M_FLUSHING 3
#define BZ_M_FINISHING 4
#define BZ_S_OUTPUT 1
#define BZ_S_INPUT 2
#define BZ_N_RADIX 2
#define BZ_N_QSORT 12
#define BZ_N_SHELL 18
#define BZ_N_OVERSHOOT (BZ_N_RADIX + BZ_N_QSORT + BZ_N_SHELL + 2)
/*-- Structure holding all the compression-side stuff. --*/
typedef
struct {
/* pointer back to the struct bz_stream */
bz_stream* strm;
/* mode this stream is in, and whether inputting */
/* or outputting data */
Int32 mode;
Int32 state;
/* remembers avail_in when flush/finish requested */
UInt32 avail_in_expect;
/* for doing the block sorting */
UInt32* arr1;
UInt32* arr2;
UInt32* ftab;
Int32 origPtr;
/* aliases for arr1 and arr2 */
UInt32* ptr;
UChar* block;
UInt16* mtfv;
UChar* zbits;
/* for deciding when to use the fallback sorting algorithm */
Int32 workFactor;
/* run-length-encoding of the input */
UInt32 state_in_ch;
Int32 state_in_len;
BZ_RAND_DECLS;
/* input and output limits and current posns */
Int32 nblock;
Int32 nblockMAX;
Int32 numZ;
Int32 state_out_pos;
/* map of bytes used in block */
Int32 nInUse;
Bool inUse[256];
UChar unseqToSeq[256];
/* the buffer for bit stream creation */
UInt32 bsBuff;
Int32 bsLive;
/* block and combined CRCs */
UInt32 blockCRC;
UInt32 combinedCRC;
/* misc administratium */
Int32 verbosity;
Int32 blockNo;
Int32 blockSize100k;
/* stuff for coding the MTF values */
Int32 nMTF;
Int32 mtfFreq [BZ_MAX_ALPHA_SIZE];
UChar selector [BZ_MAX_SELECTORS];
UChar selectorMtf[BZ_MAX_SELECTORS];
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 code [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 rfreq [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
/* second dimension: only 3 needed; 4 makes index calculations faster */
UInt32 len_pack[BZ_MAX_ALPHA_SIZE][4];
}
EState;
/*-- externs for compression. --*/
extern void
BZ2_blockSort ( EState* );
extern void
BZ2_compressBlock ( EState*, Bool );
extern void
BZ2_bsInitWrite ( EState* );
extern void
BZ2_hbAssignCodes ( Int32*, UChar*, Int32, Int32, Int32 );
extern void
BZ2_hbMakeCodeLengths ( UChar*, Int32*, Int32, Int32 );
/*-- states for decompression. --*/
#define BZ_X_IDLE 1
#define BZ_X_OUTPUT 2
#define BZ_X_MAGIC_1 10
#define BZ_X_MAGIC_2 11
#define BZ_X_MAGIC_3 12
#define BZ_X_MAGIC_4 13
#define BZ_X_BLKHDR_1 14
#define BZ_X_BLKHDR_2 15
#define BZ_X_BLKHDR_3 16
#define BZ_X_BLKHDR_4 17
#define BZ_X_BLKHDR_5 18
#define BZ_X_BLKHDR_6 19
#define BZ_X_BCRC_1 20
#define BZ_X_BCRC_2 21
#define BZ_X_BCRC_3 22
#define BZ_X_BCRC_4 23
#define BZ_X_RANDBIT 24
#define BZ_X_ORIGPTR_1 25
#define BZ_X_ORIGPTR_2 26
#define BZ_X_ORIGPTR_3 27
#define BZ_X_MAPPING_1 28
#define BZ_X_MAPPING_2 29
#define BZ_X_SELECTOR_1 30
#define BZ_X_SELECTOR_2 31
#define BZ_X_SELECTOR_3 32
#define BZ_X_CODING_1 33
#define BZ_X_CODING_2 34
#define BZ_X_CODING_3 35
#define BZ_X_MTF_1 36
#define BZ_X_MTF_2 37
#define BZ_X_MTF_3 38
#define BZ_X_MTF_4 39
#define BZ_X_MTF_5 40
#define BZ_X_MTF_6 41
#define BZ_X_ENDHDR_2 42
#define BZ_X_ENDHDR_3 43
#define BZ_X_ENDHDR_4 44
#define BZ_X_ENDHDR_5 45
#define BZ_X_ENDHDR_6 46
#define BZ_X_CCRC_1 47
#define BZ_X_CCRC_2 48
#define BZ_X_CCRC_3 49
#define BZ_X_CCRC_4 50
/*-- Constants for the fast MTF decoder. --*/
#define MTFA_SIZE 4096
#define MTFL_SIZE 16
/*-- Structure holding all the decompression-side stuff. --*/
typedef
struct {
/* pointer back to the struct bz_stream */
bz_stream* strm;
/* state indicator for this stream */
Int32 state;
/* for doing the final run-length decoding */
UChar state_out_ch;
Int32 state_out_len;
Bool blockRandomised;
BZ_RAND_DECLS;
/* the buffer for bit stream reading */
UInt32 bsBuff;
Int32 bsLive;
/* misc administratium */
Int32 blockSize100k;
Bool smallDecompress;
Int32 currBlockNo;
Int32 verbosity;
/* for undoing the Burrows-Wheeler transform */
Int32 origPtr;
UInt32 tPos;
Int32 k0;
Int32 unzftab[256];
Int32 nblock_used;
Int32 cftab[257];
Int32 cftabCopy[257];
/* for undoing the Burrows-Wheeler transform (FAST) */
UInt32 *tt;
/* for undoing the Burrows-Wheeler transform (SMALL) */
UInt16 *ll16;
UChar *ll4;
/* stored and calculated CRCs */
UInt32 storedBlockCRC;
UInt32 storedCombinedCRC;
UInt32 calculatedBlockCRC;
UInt32 calculatedCombinedCRC;
/* map of bytes used in block */
Int32 nInUse;
Bool inUse[256];
Bool inUse16[16];
UChar seqToUnseq[256];
/* for decoding the MTF values */
UChar mtfa [MTFA_SIZE];
Int32 mtfbase[256 / MTFL_SIZE];
UChar selector [BZ_MAX_SELECTORS];
UChar selectorMtf[BZ_MAX_SELECTORS];
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 limit [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 base [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 perm [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 minLens[BZ_N_GROUPS];
/* save area for scalars in the main decompress code */
Int32 save_i;
Int32 save_j;
Int32 save_t;
Int32 save_alphaSize;
Int32 save_nGroups;
Int32 save_nSelectors;
Int32 save_EOB;
Int32 save_groupNo;
Int32 save_groupPos;
Int32 save_nextSym;
Int32 save_nblockMAX;
Int32 save_nblock;
Int32 save_es;
Int32 save_N;
Int32 save_curr;
Int32 save_zt;
Int32 save_zn;
Int32 save_zvec;
Int32 save_zj;
Int32 save_gSel;
Int32 save_gMinlen;
Int32* save_gLimit;
Int32* save_gBase;
Int32* save_gPerm;
}
DState;
/*-- Macros for decompression. --*/
#define BZ_GET_FAST(cccc) \
/* s->tPos is unsigned, hence test < 0 is pointless. */ \
if (s->tPos >= (UInt32)100000 * (UInt32)s->blockSize100k) return True; \
s->tPos = s->tt[s->tPos]; \
cccc = (UChar)(s->tPos & 0xff); \
s->tPos >>= 8;
#define BZ_GET_FAST_C(cccc) \
/* c_tPos is unsigned, hence test < 0 is pointless. */ \
if (c_tPos >= (UInt32)100000 * (UInt32)ro_blockSize100k) return True; \
c_tPos = c_tt[c_tPos]; \
cccc = (UChar)(c_tPos & 0xff); \
c_tPos >>= 8;
#define SET_LL4(i,n) \
{ if (((i) & 0x1) == 0) \
s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0xf0) | (n); else \
s->ll4[(i) >> 1] = (s->ll4[(i) >> 1] & 0x0f) | ((n) << 4); \
}
#define GET_LL4(i) \
((((UInt32)(s->ll4[(i) >> 1])) >> (((i) << 2) & 0x4)) & 0xF)
#define SET_LL(i,n) \
{ s->ll16[i] = (UInt16)(n & 0x0000ffff); \
SET_LL4(i, n >> 16); \
}
#define GET_LL(i) \
(((UInt32)s->ll16[i]) | (GET_LL4(i) << 16))
#define BZ_GET_SMALL(cccc) \
/* s->tPos is unsigned, hence test < 0 is pointless. */ \
if (s->tPos >= (UInt32)100000 * (UInt32)s->blockSize100k) return True; \
cccc = BZ2_indexIntoF ( s->tPos, s->cftab ); \
s->tPos = GET_LL(s->tPos);
/*-- externs for decompression. --*/
extern Int32
BZ2_indexIntoF ( Int32, Int32* );
extern Int32
BZ2_decompress ( DState* );
extern void
BZ2_hbCreateDecodeTables ( Int32*, Int32*, Int32*, UChar*,
Int32, Int32, Int32 );
#endif
/*-- BZ_NO_STDIO seems to make NULL disappear on some platforms. --*/
#ifdef BZ_NO_STDIO
#ifndef NULL
#define NULL 0
#endif
#endif
/*-------------------------------------------------------------*/
/*--- end bzlib_private.h ---*/
/*-------------------------------------------------------------*/
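/* Illustration only, not part of bzip2: a minimal standalone sketch of
   the packing performed by SET_LL / GET_LL above. Each 20-bit value is
   split into a 16-bit low part (ll16) and a 4-bit high nibble, with two
   nibbles stored per byte (ll4). The names ll_set, ll_get and N_ENTRIES,
   and the fixed-size static arrays, are hypothetical stand-ins for the
   DState fields; the real arrays are allocated with BZALLOC. */

```c
#include <assert.h>
#include <stdint.h>

#define N_ENTRIES 8   /* hypothetical table size, for illustration only */

static uint16_t ll16[N_ENTRIES];             /* low 16 bits of each entry */
static uint8_t  ll4[(N_ENTRIES + 1) >> 1];   /* high 4 bits, two entries per byte */

/* Store a 20-bit value n at index i (mirrors SET_LL / SET_LL4). */
static void ll_set(uint32_t i, uint32_t n)
{
    uint32_t hi = (n >> 16) & 0xF;
    assert(i < N_ENTRIES && n < (1u << 20));
    ll16[i] = (uint16_t)(n & 0xFFFF);
    if ((i & 1) == 0)
        ll4[i >> 1] = (uint8_t)((ll4[i >> 1] & 0xF0) | hi);        /* even: low nibble  */
    else
        ll4[i >> 1] = (uint8_t)((ll4[i >> 1] & 0x0F) | (hi << 4)); /* odd:  high nibble */
}

/* Read back the 20-bit value at index i (mirrors GET_LL / GET_LL4). */
static uint32_t ll_get(uint32_t i)
{
    uint32_t hi;
    assert(i < N_ENTRIES);
    /* (i << 2) & 0x4 is 0 for even i and 4 for odd i: the nibble shift. */
    hi = ((uint32_t)ll4[i >> 1] >> ((i << 2) & 0x4)) & 0xF;
    return (uint32_t)ll16[i] | (hi << 16);
}
```

/* Usage: ll_set(0, 0xABCDE); ll_set(1, 0x12345); then ll_get(0) and
   ll_get(1) return the stored values, and both high nibbles share the
   single byte ll4[0]. This costs 2.5 bytes per entry instead of the
   4 bytes used by the fast path's tt array. */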
================================================
FILE: ext/bzip2/compress.c
================================================
/*-------------------------------------------------------------*/
/*--- Compression machinery (not incl block sorting) ---*/
/*--- compress.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
/* CHANGES
0.9.0 -- original version.
0.9.0a/b -- no changes in this file.
0.9.0c -- changed setting of nGroups in sendMTFValues()
so as to do a bit better on small files
*/
#include "bzlib_private.h"
/*---------------------------------------------------*/
/*--- Bit stream I/O ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
void BZ2_bsInitWrite ( EState* s )
{
s->bsLive = 0;
s->bsBuff = 0;
}
/*---------------------------------------------------*/
static
void bsFinishWrite ( EState* s )
{
while (s->bsLive > 0) {
s->zbits[s->numZ] = (UChar)(s->bsBuff >> 24);
s->numZ++;
s->bsBuff <<= 8;
s->bsLive -= 8;
}
}
/*---------------------------------------------------*/
#define bsNEEDW(nz) \
{ \
while (s->bsLive >= 8) { \
s->zbits[s->numZ] \
= (UChar)(s->bsBuff >> 24); \
s->numZ++; \
s->bsBuff <<= 8; \
s->bsLive -= 8; \
} \
}
/*---------------------------------------------------*/
static
__inline__
void bsW ( EState* s, Int32 n, UInt32 v )
{
bsNEEDW ( n );
s->bsBuff |= (v << (32 - s->bsLive - n));
s->bsLive += n;
}
/*---------------------------------------------------*/
static
void bsPutUInt32 ( EState* s, UInt32 u )
{
bsW ( s, 8, (u >> 24) & 0xffL );
bsW ( s, 8, (u >> 16) & 0xffL );
bsW ( s, 8, (u >> 8) & 0xffL );
bsW ( s, 8, u & 0xffL );
}
/*---------------------------------------------------*/
static
void bsPutUChar ( EState* s, UChar c )
{
bsW( s, 8, (UInt32)c );
}
/*---------------------------------------------------*/
/*--- The back end proper ---*/
/*---------------------------------------------------*/
/*---------------------------------------------------*/
static
void makeMaps_e ( EState* s )
{
Int32 i;
s->nInUse = 0;
for (i = 0; i < 256; i++)
if (s->inUse[i]) {
s->unseqToSeq[i] = s->nInUse;
s->nInUse++;
}
}
/*---------------------------------------------------*/
static
void generateMTFValues ( EState* s )
{
UChar yy[256];
Int32 i, j;
Int32 zPend;
Int32 wr;
Int32 EOB;
/*
After sorting (eg, here),
s->arr1 [ 0 .. s->nblock-1 ] holds sorted order,
and
((UChar*)s->arr2) [ 0 .. s->nblock-1 ]
holds the original block data.
The first thing to do is generate the MTF values,
and put them in
((UInt16*)s->arr1) [ 0 .. s->nblock-1 ].
Because there are never more MTF values than block values,
ptr values in this area are overwritten with MTF values
only once they are no longer needed.
The final compressed bitstream is generated into the
area starting at
(UChar*) (&((UChar*)s->arr2)[s->nblock])
These storage aliases are set up in bzCompressInit(),
except for the last one, which is arranged in
compressBlock().
*/
UInt32* ptr = s->ptr;
UChar* block = s->block;
UInt16* mtfv = s->mtfv;
makeMaps_e ( s );
EOB = s->nInUse+1;
for (i = 0; i <= EOB; i++) s->mtfFreq[i] = 0;
wr = 0;
zPend = 0;
for (i = 0; i < s->nInUse; i++) yy[i] = (UChar) i;
for (i = 0; i < s->nblock; i++) {
UChar ll_i;
AssertD ( wr <= i, "generateMTFValues(1)" );
j = ptr[i]-1; if (j < 0) j += s->nblock;
ll_i = s->unseqToSeq[block[j]];
AssertD ( ll_i < s->nInUse, "generateMTFValues(2a)" );
if (yy[0] == ll_i) {
zPend++;
} else {
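if (zPend > 0) {
/* Flush the pending run of zeroes as RUNA/RUNB symbols
   (bijective base 2, least significant digit first). */
zPend--;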
while (True) {
if (zPend & 1) {
mtfv[wr] = BZ_RUNB; wr++;
s->mtfFreq[BZ_RUNB]++;
} else {
mtfv[wr] = BZ_RUNA; wr++;
s->mtfFreq[BZ_RUNA]++;
}
if (zPend < 2) break;
zPend = (zPend - 2) / 2;
};
zPend = 0;
}
{
register UChar rtmp;
register UChar* ryy_j;
register UChar rll_i;
rtmp = yy[1];
yy[1] = yy[0];
ryy_j = &(yy[1]);
rll_i = ll_i;
while ( rll_i != rtmp ) {
register UChar rtmp2;
ryy_j++;
rtmp2 = rtmp;
rtmp = *ryy_j;
*ryy_j = rtmp2;
};
yy[0] = rtmp;
j = ryy_j - &(yy[0]);
mtfv[wr] = j+1; wr++; s->mtfFreq[j+1]++;
}
}
}
if (zPend > 0) {
zPend--;
while (True) {
if (zPend & 1) {
mtfv[wr] = BZ_RUNB; wr++;
s->mtfFreq[BZ_RUNB]++;
} else {
mtfv[wr] = BZ_RUNA; wr++;
s->mtfFreq[BZ_RUNA]++;
}
if (zPend < 2) break;
zPend = (zPend - 2) / 2;
};
zPend = 0;
}
mtfv[wr] = EOB; wr++; s->mtfFreq[EOB]++;
s->nMTF = wr;
}
/*---------------------------------------------------*/
#define BZ_LESSER_ICOST 0
#define BZ_GREATER_ICOST 15
static
void sendMTFValues ( EState* s )
{
Int32 v, t, i, j, gs, ge, totc, bt, bc, iter;
Int32 nSelectors, alphaSize, minLen, maxLen, selCtr;
Int32 nGroups, nBytes;
/*--
UChar len [BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 code[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
Int32 rfreq[BZ_N_GROUPS][BZ_MAX_ALPHA_SIZE];
are fields of EState (s->len, s->code, s->rfreq) rather than
locals of this proc, to keep stack frame size small.
--*/
UInt16 cost[BZ_N_GROUPS];
Int32 fave[BZ_N_GROUPS];
UInt16* mtfv = s->mtfv;
if (s->verbosity >= 3)
VPrintf3( " %d in block, %d after MTF & 1-2 coding, "
"%d+2 syms in use\n",
s->nblock, s->nMTF, s->nInUse );
alphaSize = s->nInUse+2;
for (t = 0; t < BZ_N_GROUPS; t++)
for (v = 0; v < alphaSize; v++)
s->len[t][v] = BZ_GREATER_ICOST;
/*--- Decide how many coding tables to use ---*/
AssertH ( s->nMTF > 0, 3001 );
if (s->nMTF < 200) nGroups = 2; else
if (s->nMTF < 600) nGroups = 3; else
if (s->nMTF < 1200) nGroups = 4; else
if (s->nMTF < 2400) nGroups = 5; else
nGroups = 6;
/*--- Generate an initial set of coding tables ---*/
{
Int32 nPart, remF, tFreq, aFreq;
nPart = nGroups;
remF = s->nMTF;
gs = 0;
while (nPart > 0) {
tFreq = remF / nPart;
ge = gs-1;
aFreq = 0;
while (aFreq < tFreq && ge < alphaSize-1) {
ge++;
aFreq += s->mtfFreq[ge];
}
if (ge > gs
&& nPart != nGroups && nPart != 1
&& ((nGroups-nPart) % 2 == 1)) {
aFreq -= s->mtfFreq[ge];
ge--;
}
if (s->verbosity >= 3)
VPrintf5( " initial group %d, [%d .. %d], "
"has %d syms (%4.1f%%)\n",
nPart, gs, ge, aFreq,
(100.0 * (float)aFreq) / (float)(s->nMTF) );
for (v = 0; v < alphaSize; v++)
if (v >= gs && v <= ge)
s->len[nPart-1][v] = BZ_LESSER_ICOST; else
s->len[nPart-1][v] = BZ_GREATER_ICOST;
nPart--;
gs = ge+1;
remF -= aFreq;
}
}
/*---
Iterate up to BZ_N_ITERS times to improve the tables.
---*/
for (iter = 0; iter < BZ_N_ITERS; iter++) {
for (t = 0; t < nGroups; t++) fave[t] = 0;
for (t = 0; t < nGroups; t++)
for (v = 0; v < alphaSize; v++)
s->rfreq[t][v] = 0;
/*---
Set up an auxiliary length table which is used to fast-track
the common case (nGroups == 6).
---*/
if (nGroups == 6) {
for (v = 0; v < alphaSize; v++) {
s->len_pack[v][0] = (s->len[1][v] << 16) | s->len[0][v];
s->len_pack[v][1] = (s->len[3][v] << 16) | s->len[2][v];
s->len_pack[v][2] = (s->len[5][v] << 16) | s->len[4][v];
}
}
nSelectors = 0;
totc = 0;
gs = 0;
while (True) {
/*--- Set group start & end marks. --*/
if (gs >= s->nMTF) break;
ge = gs + BZ_G_SIZE - 1;
if (ge >= s->nMTF) ge = s->nMTF-1;
/*--
Calculate the cost of this group as coded
by each of the coding tables.
--*/
for (t = 0; t < nGroups; t++) cost[t] = 0;
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
register UInt32 cost01, cost23, cost45;
register UInt16 icv;
cost01 = cost23 = cost45 = 0;
# define BZ_ITER(nn) \
icv = mtfv[gs+(nn)]; \
cost01 += s->len_pack[icv][0]; \
cost23 += s->len_pack[icv][1]; \
cost45 += s->len_pack[icv][2]; \
BZ_ITER(0); BZ_ITER(1); BZ_ITER(2); BZ_ITER(3); BZ_ITER(4);
BZ_ITER(5); BZ_ITER(6); BZ_ITER(7); BZ_ITER(8); BZ_ITER(9);
BZ_ITER(10); BZ_ITER(11); BZ_ITER(12); BZ_ITER(13); BZ_ITER(14);
BZ_ITER(15); BZ_ITER(16); BZ_ITER(17); BZ_ITER(18); BZ_ITER(19);
BZ_ITER(20); BZ_ITER(21); BZ_ITER(22); BZ_ITER(23); BZ_ITER(24);
BZ_ITER(25); BZ_ITER(26); BZ_ITER(27); BZ_ITER(28); BZ_ITER(29);
BZ_ITER(30); BZ_ITER(31); BZ_ITER(32); BZ_ITER(33); BZ_ITER(34);
BZ_ITER(35); BZ_ITER(36); BZ_ITER(37); BZ_ITER(38); BZ_ITER(39);
BZ_ITER(40); BZ_ITER(41); BZ_ITER(42); BZ_ITER(43); BZ_ITER(44);
BZ_ITER(45); BZ_ITER(46); BZ_ITER(47); BZ_ITER(48); BZ_ITER(49);
# undef BZ_ITER
cost[0] = cost01 & 0xffff; cost[1] = cost01 >> 16;
cost[2] = cost23 & 0xffff; cost[3] = cost23 >> 16;
cost[4] = cost45 & 0xffff; cost[5] = cost45 >> 16;
} else {
/*--- slow version which correctly handles all situations ---*/
for (i = gs; i <= ge; i++) {
UInt16 icv = mtfv[i];
for (t = 0; t < nGroups; t++) cost[t] += s->len[t][icv];
}
}
/*--
Find the coding table which is best for this group,
and record its identity in the selector table.
--*/
bc = 999999999; bt = -1;
for (t = 0; t < nGroups; t++)
if (cost[t] < bc) { bc = cost[t]; bt = t; };
totc += bc;
fave[bt]++;
s->selector[nSelectors] = bt;
nSelectors++;
/*--
Increment the symbol frequencies for the selected table.
--*/
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
# define BZ_ITUR(nn) s->rfreq[bt][ mtfv[gs+(nn)] ]++
BZ_ITUR(0); BZ_ITUR(1); BZ_ITUR(2); BZ_ITUR(3); BZ_ITUR(4);
BZ_ITUR(5); BZ_ITUR(6); BZ_ITUR(7); BZ_ITUR(8); BZ_ITUR(9);
BZ_ITUR(10); BZ_ITUR(11); BZ_ITUR(12); BZ_ITUR(13); BZ_ITUR(14);
BZ_ITUR(15); BZ_ITUR(16); BZ_ITUR(17); BZ_ITUR(18); BZ_ITUR(19);
BZ_ITUR(20); BZ_ITUR(21); BZ_ITUR(22); BZ_ITUR(23); BZ_ITUR(24);
BZ_ITUR(25); BZ_ITUR(26); BZ_ITUR(27); BZ_ITUR(28); BZ_ITUR(29);
BZ_ITUR(30); BZ_ITUR(31); BZ_ITUR(32); BZ_ITUR(33); BZ_ITUR(34);
BZ_ITUR(35); BZ_ITUR(36); BZ_ITUR(37); BZ_ITUR(38); BZ_ITUR(39);
BZ_ITUR(40); BZ_ITUR(41); BZ_ITUR(42); BZ_ITUR(43); BZ_ITUR(44);
BZ_ITUR(45); BZ_ITUR(46); BZ_ITUR(47); BZ_ITUR(48); BZ_ITUR(49);
# undef BZ_ITUR
} else {
/*--- slow version which correctly handles all situations ---*/
for (i = gs; i <= ge; i++)
s->rfreq[bt][ mtfv[i] ]++;
}
gs = ge+1;
}
if (s->verbosity >= 3) {
VPrintf2 ( " pass %d: size is %d, grp uses are ",
iter+1, totc/8 );
for (t = 0; t < nGroups; t++)
VPrintf1 ( "%d ", fave[t] );
VPrintf0 ( "\n" );
}
/*--
Recompute the tables based on the accumulated frequencies.
--*/
/* maxLen was changed from 20 to 17 in bzip2-1.0.3. See
comment in huffman.c for details. */
for (t = 0; t < nGroups; t++)
BZ2_hbMakeCodeLengths ( &(s->len[t][0]), &(s->rfreq[t][0]),
alphaSize, 17 /*20*/ );
}
AssertH( nGroups < 8, 3002 );
AssertH( nSelectors < 32768 &&
nSelectors <= (2 + (900000 / BZ_G_SIZE)),
3003 );
/*--- Compute MTF values for the selectors. ---*/
{
UChar pos[BZ_N_GROUPS], ll_i, tmp2, tmp;
for (i = 0; i < nGroups; i++) pos[i] = i;
for (i = 0; i < nSelectors; i++) {
ll_i = s->selector[i];
j = 0;
tmp = pos[j];
while ( ll_i != tmp ) {
j++;
tmp2 = tmp;
tmp = pos[j];
pos[j] = tmp2;
};
pos[0] = tmp;
s->selectorMtf[i] = j;
}
};
/*--- Assign actual codes for the tables. --*/
for (t = 0; t < nGroups; t++) {
minLen = 32;
maxLen = 0;
for (i = 0; i < alphaSize; i++) {
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
if (s->len[t][i] < minLen) minLen = s->len[t][i];
}
AssertH ( !(maxLen > 17 /*20*/ ), 3004 );
AssertH ( !(minLen < 1), 3005 );
BZ2_hbAssignCodes ( &(s->code[t][0]), &(s->len[t][0]),
minLen, maxLen, alphaSize );
}
/*--- Transmit the mapping table. ---*/
{
Bool inUse16[16];
for (i = 0; i < 16; i++) {
inUse16[i] = False;
for (j = 0; j < 16; j++)
if (s->inUse[i * 16 + j]) inUse16[i] = True;
}
nBytes = s->numZ;
for (i = 0; i < 16; i++)
if (inUse16[i]) bsW(s,1,1); else bsW(s,1,0);
for (i = 0; i < 16; i++)
if (inUse16[i])
for (j = 0; j < 16; j++) {
if (s->inUse[i * 16 + j]) bsW(s,1,1); else bsW(s,1,0);
}
if (s->verbosity >= 3)
VPrintf1( " bytes: mapping %d, ", s->numZ-nBytes );
}
/*--- Now the selectors. ---*/
nBytes = s->numZ;
bsW ( s, 3, nGroups );
bsW ( s, 15, nSelectors );
for (i = 0; i < nSelectors; i++) {
for (j = 0; j < s->selectorMtf[i]; j++) bsW(s,1,1);
bsW(s,1,0);
}
if (s->verbosity >= 3)
VPrintf1( "selectors %d, ", s->numZ-nBytes );
/*--- Now the coding tables. ---*/
nBytes = s->numZ;
for (t = 0; t < nGroups; t++) {
Int32 curr = s->len[t][0];
bsW ( s, 5, curr );
for (i = 0; i < alphaSize; i++) {
while (curr < s->len[t][i]) { bsW(s,2,2); curr++; /* 10 */ };
while (curr > s->len[t][i]) { bsW(s,2,3); curr--; /* 11 */ };
bsW ( s, 1, 0 );
}
}
if (s->verbosity >= 3)
VPrintf1 ( "code lengths %d, ", s->numZ-nBytes );
/*--- And finally, the block data proper ---*/
nBytes = s->numZ;
selCtr = 0;
gs = 0;
while (True) {
if (gs >= s->nMTF) break;
ge = gs + BZ_G_SIZE - 1;
if (ge >= s->nMTF) ge = s->nMTF-1;
AssertH ( s->selector[selCtr] < nGroups, 3006 );
if (nGroups == 6 && 50 == ge-gs+1) {
/*--- fast track the common case ---*/
UInt16 mtfv_i;
UChar* s_len_sel_selCtr
= &(s->len[s->selector[selCtr]][0]);
Int32* s_code_sel_selCtr
= &(s->code[s->selector[selCtr]][0]);
# define BZ_ITAH(nn) \
mtfv_i = mtfv[gs+(nn)]; \
bsW ( s, \
s_len_sel_selCtr[mtfv_i], \
s_code_sel_selCtr[mtfv_i] )
BZ_ITAH(0); BZ_ITAH(1); BZ_ITAH(2); BZ_ITAH(3); BZ_ITAH(4);
BZ_ITAH(5); BZ_ITAH(6); BZ_ITAH(7); BZ_ITAH(8); BZ_ITAH(9);
BZ_ITAH(10); BZ_ITAH(11); BZ_ITAH(12); BZ_ITAH(13); BZ_ITAH(14);
BZ_ITAH(15); BZ_ITAH(16); BZ_ITAH(17); BZ_ITAH(18); BZ_ITAH(19);
BZ_ITAH(20); BZ_ITAH(21); BZ_ITAH(22); BZ_ITAH(23); BZ_ITAH(24);
BZ_ITAH(25); BZ_ITAH(26); BZ_ITAH(27); BZ_ITAH(28); BZ_ITAH(29);
BZ_ITAH(30); BZ_ITAH(31); BZ_ITAH(32); BZ_ITAH(33); BZ_ITAH(34);
BZ_ITAH(35); BZ_ITAH(36); BZ_ITAH(37); BZ_ITAH(38); BZ_ITAH(39);
BZ_ITAH(40); BZ_ITAH(41); BZ_ITAH(42); BZ_ITAH(43); BZ_ITAH(44);
BZ_ITAH(45); BZ_ITAH(46); BZ_ITAH(47); BZ_ITAH(48); BZ_ITAH(49);
# undef BZ_ITAH
} else {
/*--- slow version which correctly handles all situations ---*/
for (i = gs; i <= ge; i++) {
bsW ( s,
s->len [s->selector[selCtr]] [mtfv[i]],
s->code [s->selector[selCtr]] [mtfv[i]] );
}
}
gs = ge+1;
selCtr++;
}
AssertH( selCtr == nSelectors, 3007 );
if (s->verbosity >= 3)
VPrintf1( "codes %d\n", s->numZ-nBytes );
}
/*---------------------------------------------------*/
void BZ2_compressBlock ( EState* s, Bool is_last_block )
{
if (s->nblock > 0) {
BZ_FINALISE_CRC ( s->blockCRC );
/* rotate the running combined CRC left by one bit,
   then fold in this block's CRC */
s->combinedCRC = (s->combinedCRC << 1) | (s->combinedCRC >> 31);
s->combinedCRC ^= s->blockCRC;
if (s->blockNo > 1) s->numZ = 0;
if (s->verbosity >= 2)
VPrintf4( " block %d: crc = 0x%08x, "
"combined CRC = 0x%08x, size = %d\n",
s->blockNo, s->blockCRC, s->combinedCRC, s->nblock );
BZ2_blockSort ( s );
}
s->zbits = (UChar*) (&((UChar*)s->arr2)[s->nblock]);
/*-- If this is the first block, create the stream header. --*/
if (s->blockNo == 1) {
BZ2_bsInitWrite ( s );
bsPutUChar ( s, BZ_HDR_B );
bsPutUChar ( s, BZ_HDR_Z );
bsPutUChar ( s, BZ_HDR_h );
bsPutUChar ( s, (UChar)(BZ_HDR_0 + s->blockSize100k) );
}
if (s->nblock > 0) {
/*-- Block header magic: 0x314159265359. --*/
bsPutUChar ( s, 0x31 ); bsPutUChar ( s, 0x41 );
bsPutUChar ( s, 0x59 ); bsPutUChar ( s, 0x26 );
bsPutUChar ( s, 0x53 ); bsPutUChar ( s, 0x59 );
/*-- Now the block's CRC, so it is in a known place. --*/
bsPutUInt32 ( s, s->blockCRC );
/*--
Now a single bit indicating (non-)randomisation.
As of version 0.9.5, we use a better sorting algorithm
which makes randomisation unnecessary. So always set
the randomised bit to 'no'. Of course, the decoder
still needs to be able to handle randomised blocks
so as to maintain backwards compatibility with
older versions of bzip2.
--*/
bsW(s,1,0);
bsW ( s, 24, s->origPtr );
generateMTFValues ( s );
sendMTFValues ( s );
}
/*-- If this is the last block, add the stream trailer. --*/
if (is_last_block) {
/*-- End-of-stream magic: 0x177245385090. --*/
bsPutUChar ( s, 0x17 ); bsPutUChar ( s, 0x72 );
bsPutUChar ( s, 0x45 ); bsPutUChar ( s, 0x38 );
bsPutUChar ( s, 0x50 ); bsPutUChar ( s, 0x90 );
bsPutUInt32 ( s, s->combinedCRC );
if (s->verbosity >= 2)
VPrintf1( " final combined CRC = 0x%08x\n ", s->combinedCRC );
bsFinishWrite ( s );
}
}
/*-------------------------------------------------------------*/
/*--- end compress.c ---*/
/*-------------------------------------------------------------*/
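/* Illustration only, not part of bzip2: a minimal standalone sketch of
   the run-length scheme used in generateMTFValues() above, where a run
   of MTF zeroes is written as a short sequence of RUNA/RUNB symbols.
   The helper names encode_run and decode_run are hypothetical; the
   decoder shown is an assumption about how the digits are weighted
   (RUNA counts 1 and RUNB counts 2 at weight 2^k), chosen because it
   inverts the encoder loop. */

```c
#include <assert.h>
#include <stddef.h>

#define RUNA 0
#define RUNB 1

/* Encode a run of `run` (>= 1) zeroes into out[]; returns the number of
   symbols written. Follows the zPend loop in generateMTFValues(). */
static size_t encode_run(unsigned run, unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(run >= 1);
    run--;                                  /* zPend-- */
    for (;;) {
        assert(n < cap);
        out[n++] = (run & 1) ? RUNB : RUNA;
        if (run < 2) break;
        run = (run - 2) / 2;
    }
    return n;
}

/* Decode n RUNA/RUNB symbols back into a run length: the symbol at
   position k contributes (symbol + 1) * 2^k. */
static unsigned decode_run(const unsigned char *in, size_t n)
{
    unsigned run = 0;
    size_t k;
    for (k = 0; k < n; k++)
        run += (unsigned)(in[k] + 1) << k;
    return run;
}
```

/* Usage: encode_run(5, buf, sizeof buf) writes RUNA, RUNB and returns 2;
   decode_run(buf, 2) gives back 5. Every run length has exactly one
   encoding, so no symbol sequence is wasted. */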
================================================
FILE: ext/bzip2/crctable.c
================================================
/*-------------------------------------------------------------*/
/*--- Table for doing CRCs ---*/
/*--- crctable.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#include "bzlib_private.h"
/*--
I think this is an implementation of the AUTODIN-II,
Ethernet & FDDI 32-bit CRC standard. Vaguely derived
from code by Rob Warnock, in Section 51 of the
comp.compression FAQ.
--*/
UInt32 BZ2_crc32Table[256] = {
/*-- Ugly, innit? --*/
0x00000000L, 0x04c11db7L, 0x09823b6eL, 0x0d4326d9L,
0x130476dcL, 0x17c56b6bL, 0x1a864db2L, 0x1e475005L,
0x2608edb8L, 0x22c9f00fL, 0x2f8ad6d6L, 0x2b4bcb61L,
0x350c9b64L, 0x31cd86d3L, 0x3c8ea00aL, 0x384fbdbdL,
0x4c11db70L, 0x48d0c6c7L, 0x4593e01eL, 0x4152fda9L,
0x5f15adacL, 0x5bd4b01bL, 0x569796c2L, 0x52568b75L,
0x6a1936c8L, 0x6ed82b7fL, 0x639b0da6L, 0x675a1011L,
0x791d4014L, 0x7ddc5da3L, 0x709f7b7aL, 0x745e66cdL,
0x9823b6e0L, 0x9ce2ab57L, 0x91a18d8eL, 0x95609039L,
0x8b27c03cL, 0x8fe6dd8bL, 0x82a5fb52L, 0x8664e6e5L,
0xbe2b5b58L, 0xbaea46efL, 0xb7a96036L, 0xb3687d81L,
0xad2f2d84L, 0xa9ee3033L, 0xa4ad16eaL, 0xa06c0b5dL,
0xd4326d90L, 0xd0f37027L, 0xddb056feL, 0xd9714b49L,
0xc7361b4cL, 0xc3f706fbL, 0xceb42022L, 0xca753d95L,
0xf23a8028L, 0xf6fb9d9fL, 0xfbb8bb46L, 0xff79a6f1L,
0xe13ef6f4L, 0xe5ffeb43L, 0xe8bccd9aL, 0xec7dd02dL,
0x34867077L, 0x30476dc0L, 0x3d044b19L, 0x39c556aeL,
0x278206abL, 0x23431b1cL, 0x2e003dc5L, 0x2ac12072L,
0x128e9dcfL, 0x164f8078L, 0x1b0ca6a1L, 0x1fcdbb16L,
0x018aeb13L, 0x054bf6a4L, 0x0808d07dL, 0x0cc9cdcaL,
0x7897ab07L, 0x7c56b6b0L, 0x71159069L, 0x75d48ddeL,
0x6b93dddbL, 0x6f52c06cL, 0x6211e6b5L, 0x66d0fb02L,
0x5e9f46bfL, 0x5a5e5b08L, 0x571d7dd1L, 0x53dc6066L,
0x4d9b3063L, 0x495a2dd4L, 0x44190b0dL, 0x40d816baL,
0xaca5c697L, 0xa864db20L, 0xa527fdf9L, 0xa1e6e04eL,
0xbfa1b04bL, 0xbb60adfcL, 0xb6238b25L, 0xb2e29692L,
0x8aad2b2fL, 0x8e6c3698L, 0x832f1041L, 0x87ee0df6L,
0x99a95df3L, 0x9d684044L, 0x902b669dL, 0x94ea7b2aL,
0xe0b41de7L, 0xe4750050L, 0xe9362689L, 0xedf73b3eL,
0xf3b06b3bL, 0xf771768cL, 0xfa325055L, 0xfef34de2L,
0xc6bcf05fL, 0xc27dede8L, 0xcf3ecb31L, 0xcbffd686L,
0xd5b88683L, 0xd1799b34L, 0xdc3abdedL, 0xd8fba05aL,
0x690ce0eeL, 0x6dcdfd59L, 0x608edb80L, 0x644fc637L,
0x7a089632L, 0x7ec98b85L, 0x738aad5cL, 0x774bb0ebL,
0x4f040d56L, 0x4bc510e1L, 0x46863638L, 0x42472b8fL,
0x5c007b8aL, 0x58c1663dL, 0x558240e4L, 0x51435d53L,
0x251d3b9eL, 0x21dc2629L, 0x2c9f00f0L, 0x285e1d47L,
0x36194d42L, 0x32d850f5L, 0x3f9b762cL, 0x3b5a6b9bL,
0x0315d626L, 0x07d4cb91L, 0x0a97ed48L, 0x0e56f0ffL,
0x1011a0faL, 0x14d0bd4dL, 0x19939b94L, 0x1d528623L,
0xf12f560eL, 0xf5ee4bb9L, 0xf8ad6d60L, 0xfc6c70d7L,
0xe22b20d2L, 0xe6ea3d65L, 0xeba91bbcL, 0xef68060bL,
0xd727bbb6L, 0xd3e6a601L, 0xdea580d8L, 0xda649d6fL,
0xc423cd6aL, 0xc0e2d0ddL, 0xcda1f604L, 0xc960ebb3L,
0xbd3e8d7eL, 0xb9ff90c9L, 0xb4bcb610L, 0xb07daba7L,
0xae3afba2L, 0xaafbe615L, 0xa7b8c0ccL, 0xa379dd7bL,
0x9b3660c6L, 0x9ff77d71L, 0x92b45ba8L, 0x9675461fL,
0x8832161aL, 0x8cf30badL, 0x81b02d74L, 0x857130c3L,
0x5d8a9099L, 0x594b8d2eL, 0x5408abf7L, 0x50c9b640L,
0x4e8ee645L, 0x4a4ffbf2L, 0x470cdd2bL, 0x43cdc09cL,
0x7b827d21L, 0x7f436096L, 0x7200464fL, 0x76c15bf8L,
0x68860bfdL, 0x6c47164aL, 0x61043093L, 0x65c52d24L,
0x119b4be9L, 0x155a565eL, 0x18197087L, 0x1cd86d30L,
0x029f3d35L, 0x065e2082L, 0x0b1d065bL, 0x0fdc1becL,
0x3793a651L, 0x3352bbe6L, 0x3e119d3fL, 0x3ad08088L,
0x2497d08dL, 0x2056cd3aL, 0x2d15ebe3L, 0x29d4f654L,
0xc5a92679L, 0xc1683bceL, 0xcc2b1d17L, 0xc8ea00a0L,
0xd6ad50a5L, 0xd26c4d12L, 0xdf2f6bcbL, 0xdbee767cL,
0xe3a1cbc1L, 0xe760d676L, 0xea23f0afL, 0xeee2ed18L,
0xf0a5bd1dL, 0xf464a0aaL, 0xf9278673L, 0xfde69bc4L,
0x89b8fd09L, 0x8d79e0beL, 0x803ac667L, 0x84fbdbd0L,
0x9abc8bd5L, 0x9e7d9662L, 0x933eb0bbL, 0x97ffad0cL,
0xafb010b1L, 0xab710d06L, 0xa6322bdfL, 0xa2f33668L,
0xbcb4666dL, 0xb8757bdaL, 0xb5365d03L, 0xb1f740b4L
};
/*-------------------------------------------------------------*/
/*--- end crctable.c ---*/
/*-------------------------------------------------------------*/
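/* Illustration only, not part of bzip2: a minimal standalone sketch
   showing how a table like BZ2_crc32Table can be generated and used.
   It assumes the table is the MSB-first (non-reflected) CRC-32 table
   for polynomial 0x04c11db7, which matches the first entries listed
   above. The names crc_table, crc_build_table and crc_bytes are
   hypothetical; crc_bytes chains the steps of BZ_INITIALISE_CRC,
   BZ_UPDATE_CRC and BZ_FINALISE_CRC from bzlib_private.h. */

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Fill crc_table: entry i is the CRC register after shifting byte i
   through the MSB-first polynomial division, one bit at a time. */
static void crc_build_table(void)
{
    uint32_t i;
    int bit;
    for (i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (bit = 0; bit < 8; bit++)
            c = (c & 0x80000000u) ? (c << 1) ^ 0x04c11db7u : (c << 1);
        crc_table[i] = c;
    }
}

/* CRC of a whole buffer: initialise to all ones, update per byte,
   then invert. crc_build_table() must have been called first. */
static uint32_t crc_bytes(const unsigned char *p, size_t n)
{
    uint32_t crc = 0xffffffffu;
    size_t k;
    for (k = 0; k < n; k++)
        crc = (crc << 8) ^ crc_table[(crc >> 24) ^ p[k]];
    return ~crc;
}
```

/* Usage: call crc_build_table() once, then crc_bytes(buf, len) for each
   buffer. Unlike the zlib CRC-32, this variant processes bits most
   significant first, so the two produce different values for the same
   input. */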
================================================
FILE: ext/bzip2/decompress.c
================================================
/*-------------------------------------------------------------*/
/*--- Decompression machinery ---*/
/*--- decompress.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#include "bzlib_private.h"
/*---------------------------------------------------*/
static
void makeMaps_d ( DState* s )
{
Int32 i;
s->nInUse = 0;
for (i = 0; i < 256; i++)
if (s->inUse[i]) {
s->seqToUnseq[s->nInUse] = i;
s->nInUse++;
}
}
/*---------------------------------------------------*/
#define RETURN(rrr) \
{ retVal = rrr; goto save_state_and_return; };
#define GET_BITS(lll,vvv,nnn) \
case lll: s->state = lll; \
while (True) { \
if (s->bsLive >= nnn) { \
UInt32 v; \
v = (s->bsBuff >> \
(s->bsLive-nnn)) & ((1 << nnn)-1); \
s->bsLive -= nnn; \
vvv = v; \
break; \
} \
if (s->strm->avail_in == 0) RETURN(BZ_OK); \
s->bsBuff \
= (s->bsBuff << 8) | \
((UInt32) \
(*((UChar*)(s->strm->next_in)))); \
s->bsLive += 8; \
s->strm->next_in++; \
s->strm->avail_in--; \
s->strm->total_in_lo32++; \
if (s->strm->total_in_lo32 == 0) \
s->strm->total_in_hi32++; \
}
#define GET_UCHAR(lll,uuu) \
GET_BITS(lll,uuu,8)
#define GET_BIT(lll,uuu) \
GET_BITS(lll,uuu,1)
/*---------------------------------------------------*/
#define GET_MTF_VAL(label1,label2,lval) \
{ \
if (groupPos == 0) { \
groupNo++; \
if (groupNo >= nSelectors) \
RETURN(BZ_DATA_ERROR); \
groupPos = BZ_G_SIZE; \
gSel = s->selector[groupNo]; \
gMinlen = s->minLens[gSel]; \
gLimit = &(s->limit[gSel][0]); \
gPerm = &(s->perm[gSel][0]); \
gBase = &(s->base[gSel][0]); \
} \
groupPos--; \
zn = gMinlen; \
GET_BITS(label1, zvec, zn); \
while (1) { \
if (zn > 20 /* the longest code */) \
RETURN(BZ_DATA_ERROR); \
if (zvec <= gLimit[zn]) break; \
zn++; \
GET_BIT(label2, zj); \
zvec = (zvec << 1) | zj; \
}; \
if (zvec - gBase[zn] < 0 \
|| zvec - gBase[zn] >= BZ_MAX_ALPHA_SIZE) \
RETURN(BZ_DATA_ERROR); \
lval = gPerm[zvec - gBase[zn]]; \
}
/*---------------------------------------------------*/
Int32 BZ2_decompress ( DState* s )
{
UChar uc;
Int32 retVal;
Int32 minLen, maxLen;
bz_stream* strm = s->strm;
/* stuff that needs to be saved/restored */
Int32 i;
Int32 j;
Int32 t;
Int32 alphaSize;
Int32 nGroups;
Int32 nSelectors;
Int32 EOB;
Int32 groupNo;
Int32 groupPos;
Int32 nextSym;
Int32 nblockMAX;
Int32 nblock;
Int32 es;
Int32 N;
Int32 curr;
Int32 zt;
Int32 zn;
Int32 zvec;
Int32 zj;
Int32 gSel;
Int32 gMinlen;
Int32* gLimit;
Int32* gBase;
Int32* gPerm;
if (s->state == BZ_X_MAGIC_1) {
/*initialise the save area*/
s->save_i = 0;
s->save_j = 0;
s->save_t = 0;
s->save_alphaSize = 0;
s->save_nGroups = 0;
s->save_nSelectors = 0;
s->save_EOB = 0;
s->save_groupNo = 0;
s->save_groupPos = 0;
s->save_nextSym = 0;
s->save_nblockMAX = 0;
s->save_nblock = 0;
s->save_es = 0;
s->save_N = 0;
s->save_curr = 0;
s->save_zt = 0;
s->save_zn = 0;
s->save_zvec = 0;
s->save_zj = 0;
s->save_gSel = 0;
s->save_gMinlen = 0;
s->save_gLimit = NULL;
s->save_gBase = NULL;
s->save_gPerm = NULL;
}
/*restore from the save area*/
i = s->save_i;
j = s->save_j;
t = s->save_t;
alphaSize = s->save_alphaSize;
nGroups = s->save_nGroups;
nSelectors = s->save_nSelectors;
EOB = s->save_EOB;
groupNo = s->save_groupNo;
groupPos = s->save_groupPos;
nextSym = s->save_nextSym;
nblockMAX = s->save_nblockMAX;
nblock = s->save_nblock;
es = s->save_es;
N = s->save_N;
curr = s->save_curr;
zt = s->save_zt;
zn = s->save_zn;
zvec = s->save_zvec;
zj = s->save_zj;
gSel = s->save_gSel;
gMinlen = s->save_gMinlen;
gLimit = s->save_gLimit;
gBase = s->save_gBase;
gPerm = s->save_gPerm;
retVal = BZ_OK;
switch (s->state) {
GET_UCHAR(BZ_X_MAGIC_1, uc);
if (uc != BZ_HDR_B) RETURN(BZ_DATA_ERROR_MAGIC);
GET_UCHAR(BZ_X_MAGIC_2, uc);
if (uc != BZ_HDR_Z) RETURN(BZ_DATA_ERROR_MAGIC);
GET_UCHAR(BZ_X_MAGIC_3, uc);
if (uc != BZ_HDR_h) RETURN(BZ_DATA_ERROR_MAGIC);
GET_BITS(BZ_X_MAGIC_4, s->blockSize100k, 8);
if (s->blockSize100k < (BZ_HDR_0 + 1) ||
s->blockSize100k > (BZ_HDR_0 + 9)) RETURN(BZ_DATA_ERROR_MAGIC);
s->blockSize100k -= BZ_HDR_0;
if (s->smallDecompress) {
s->ll16 = BZALLOC( s->blockSize100k * 100000 * sizeof(UInt16) );
s->ll4 = BZALLOC(
((1 + s->blockSize100k * 100000) >> 1) * sizeof(UChar)
);
if (s->ll16 == NULL || s->ll4 == NULL) RETURN(BZ_MEM_ERROR);
} else {
s->tt = BZALLOC( s->blockSize100k * 100000 * sizeof(Int32) );
if (s->tt == NULL) RETURN(BZ_MEM_ERROR);
}
GET_UCHAR(BZ_X_BLKHDR_1, uc);
if (uc == 0x17) goto endhdr_2;
if (uc != 0x31) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_BLKHDR_2, uc);
if (uc != 0x41) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_BLKHDR_3, uc);
if (uc != 0x59) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_BLKHDR_4, uc);
if (uc != 0x26) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_BLKHDR_5, uc);
if (uc != 0x53) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_BLKHDR_6, uc);
if (uc != 0x59) RETURN(BZ_DATA_ERROR);
s->currBlockNo++;
if (s->verbosity >= 2)
VPrintf1 ( "\n [%d: huff+mtf ", s->currBlockNo );
s->storedBlockCRC = 0;
GET_UCHAR(BZ_X_BCRC_1, uc);
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_BCRC_2, uc);
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_BCRC_3, uc);
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_BCRC_4, uc);
s->storedBlockCRC = (s->storedBlockCRC << 8) | ((UInt32)uc);
GET_BITS(BZ_X_RANDBIT, s->blockRandomised, 1);
s->origPtr = 0;
GET_UCHAR(BZ_X_ORIGPTR_1, uc);
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
GET_UCHAR(BZ_X_ORIGPTR_2, uc);
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
GET_UCHAR(BZ_X_ORIGPTR_3, uc);
s->origPtr = (s->origPtr << 8) | ((Int32)uc);
if (s->origPtr < 0)
RETURN(BZ_DATA_ERROR);
if (s->origPtr > 10 + 100000*s->blockSize100k)
RETURN(BZ_DATA_ERROR);
/*--- Receive the mapping table ---*/
for (i = 0; i < 16; i++) {
GET_BIT(BZ_X_MAPPING_1, uc);
if (uc == 1)
s->inUse16[i] = True; else
s->inUse16[i] = False;
}
for (i = 0; i < 256; i++) s->inUse[i] = False;
for (i = 0; i < 16; i++)
if (s->inUse16[i])
for (j = 0; j < 16; j++) {
GET_BIT(BZ_X_MAPPING_2, uc);
if (uc == 1) s->inUse[i * 16 + j] = True;
}
makeMaps_d ( s );
if (s->nInUse == 0) RETURN(BZ_DATA_ERROR);
alphaSize = s->nInUse+2;
/*--- Now the selectors ---*/
GET_BITS(BZ_X_SELECTOR_1, nGroups, 3);
if (nGroups < 2 || nGroups > 6) RETURN(BZ_DATA_ERROR);
GET_BITS(BZ_X_SELECTOR_2, nSelectors, 15);
if (nSelectors < 1) RETURN(BZ_DATA_ERROR);
for (i = 0; i < nSelectors; i++) {
j = 0;
while (True) {
GET_BIT(BZ_X_SELECTOR_3, uc);
if (uc == 0) break;
j++;
if (j >= nGroups) RETURN(BZ_DATA_ERROR);
}
s->selectorMtf[i] = j;
}
/*--- Undo the MTF values for the selectors. ---*/
{
UChar pos[BZ_N_GROUPS], tmp, v;
for (v = 0; v < nGroups; v++) pos[v] = v;
for (i = 0; i < nSelectors; i++) {
v = s->selectorMtf[i];
tmp = pos[v];
while (v > 0) { pos[v] = pos[v-1]; v--; }
pos[0] = tmp;
s->selector[i] = tmp;
}
}
/*--- Now the coding tables ---*/
for (t = 0; t < nGroups; t++) {
GET_BITS(BZ_X_CODING_1, curr, 5);
for (i = 0; i < alphaSize; i++) {
while (True) {
if (curr < 1 || curr > 20) RETURN(BZ_DATA_ERROR);
GET_BIT(BZ_X_CODING_2, uc);
if (uc == 0) break;
GET_BIT(BZ_X_CODING_3, uc);
if (uc == 0) curr++; else curr--;
}
s->len[t][i] = curr;
}
}
/*--- Create the Huffman decoding tables ---*/
for (t = 0; t < nGroups; t++) {
minLen = 32;
maxLen = 0;
for (i = 0; i < alphaSize; i++) {
if (s->len[t][i] > maxLen) maxLen = s->len[t][i];
if (s->len[t][i] < minLen) minLen = s->len[t][i];
}
BZ2_hbCreateDecodeTables (
&(s->limit[t][0]),
&(s->base[t][0]),
&(s->perm[t][0]),
&(s->len[t][0]),
minLen, maxLen, alphaSize
);
s->minLens[t] = minLen;
}
/*--- Now the MTF values ---*/
EOB = s->nInUse+1;
nblockMAX = 100000 * s->blockSize100k;
groupNo = -1;
groupPos = 0;
for (i = 0; i <= 255; i++) s->unzftab[i] = 0;
/*-- MTF init --*/
{
Int32 ii, jj, kk;
kk = MTFA_SIZE-1;
for (ii = 256 / MTFL_SIZE - 1; ii >= 0; ii--) {
for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
s->mtfa[kk] = (UChar)(ii * MTFL_SIZE + jj);
kk--;
}
s->mtfbase[ii] = kk + 1;
}
}
/*-- end MTF init --*/
nblock = 0;
GET_MTF_VAL(BZ_X_MTF_1, BZ_X_MTF_2, nextSym);
while (True) {
if (nextSym == EOB) break;
if (nextSym == BZ_RUNA || nextSym == BZ_RUNB) {
es = -1;
N = 1;
do {
/* Check that N doesn't get too big, so that es doesn't
go negative. The maximum value that can be
RUNA/RUNB encoded is equal to the block size (post
the initial RLE), viz, 900k, so bounding N at 2
million should guard against overflow without
rejecting any legitimate inputs. */
if (N >= 2*1024*1024) RETURN(BZ_DATA_ERROR);
if (nextSym == BZ_RUNA) es = es + (0+1) * N; else
if (nextSym == BZ_RUNB) es = es + (1+1) * N;
N = N * 2;
GET_MTF_VAL(BZ_X_MTF_3, BZ_X_MTF_4, nextSym);
}
while (nextSym == BZ_RUNA || nextSym == BZ_RUNB);
es++;
uc = s->seqToUnseq[ s->mtfa[s->mtfbase[0]] ];
s->unzftab[uc] += es;
if (s->smallDecompress)
while (es > 0) {
if (nblock >= nblockMAX) RETURN(BZ_DATA_ERROR);
s->ll16[nblock] = (UInt16)uc;
nblock++;
es--;
}
else
while (es > 0) {
if (nblock >= nblockMAX) RETURN(BZ_DATA_ERROR);
s->tt[nblock] = (UInt32)uc;
nblock++;
es--;
};
continue;
} else {
if (nblock >= nblockMAX) RETURN(BZ_DATA_ERROR);
/*-- uc = MTF ( nextSym-1 ) --*/
{
Int32 ii, jj, kk, pp, lno, off;
UInt32 nn;
nn = (UInt32)(nextSym - 1);
if (nn < MTFL_SIZE) {
/* avoid general-case expense */
pp = s->mtfbase[0];
uc = s->mtfa[pp+nn];
while (nn > 3) {
Int32 z = pp+nn;
s->mtfa[(z) ] = s->mtfa[(z)-1];
s->mtfa[(z)-1] = s->mtfa[(z)-2];
s->mtfa[(z)-2] = s->mtfa[(z)-3];
s->mtfa[(z)-3] = s->mtfa[(z)-4];
nn -= 4;
}
while (nn > 0) {
s->mtfa[(pp+nn)] = s->mtfa[(pp+nn)-1]; nn--;
};
s->mtfa[pp] = uc;
} else {
/* general case */
lno = nn / MTFL_SIZE;
off = nn % MTFL_SIZE;
pp = s->mtfbase[lno] + off;
uc = s->mtfa[pp];
while (pp > s->mtfbase[lno]) {
s->mtfa[pp] = s->mtfa[pp-1]; pp--;
};
s->mtfbase[lno]++;
while (lno > 0) {
s->mtfbase[lno]--;
s->mtfa[s->mtfbase[lno]]
= s->mtfa[s->mtfbase[lno-1] + MTFL_SIZE - 1];
lno--;
}
s->mtfbase[0]--;
s->mtfa[s->mtfbase[0]] = uc;
if (s->mtfbase[0] == 0) {
kk = MTFA_SIZE-1;
for (ii = 256 / MTFL_SIZE-1; ii >= 0; ii--) {
for (jj = MTFL_SIZE-1; jj >= 0; jj--) {
s->mtfa[kk] = s->mtfa[s->mtfbase[ii] + jj];
kk--;
}
s->mtfbase[ii] = kk + 1;
}
}
}
}
/*-- end uc = MTF ( nextSym-1 ) --*/
s->unzftab[s->seqToUnseq[uc]]++;
if (s->smallDecompress)
s->ll16[nblock] = (UInt16)(s->seqToUnseq[uc]); else
s->tt[nblock] = (UInt32)(s->seqToUnseq[uc]);
nblock++;
GET_MTF_VAL(BZ_X_MTF_5, BZ_X_MTF_6, nextSym);
continue;
}
}
/* Now we know what nblock is, we can do a better sanity
check on s->origPtr.
*/
if (s->origPtr < 0 || s->origPtr >= nblock)
RETURN(BZ_DATA_ERROR);
/*-- Set up cftab to facilitate generation of T^(-1) --*/
/* Check: unzftab entries in range. */
for (i = 0; i <= 255; i++) {
if (s->unzftab[i] < 0 || s->unzftab[i] > nblock)
RETURN(BZ_DATA_ERROR);
}
/* Actually generate cftab. */
s->cftab[0] = 0;
for (i = 1; i <= 256; i++) s->cftab[i] = s->unzftab[i-1];
for (i = 1; i <= 256; i++) s->cftab[i] += s->cftab[i-1];
/* Check: cftab entries in range. */
for (i = 0; i <= 256; i++) {
if (s->cftab[i] < 0 || s->cftab[i] > nblock) {
/* s->cftab[i] can legitimately be == nblock */
RETURN(BZ_DATA_ERROR);
}
}
/* Check: cftab entries non-descending. */
for (i = 1; i <= 256; i++) {
if (s->cftab[i-1] > s->cftab[i]) {
RETURN(BZ_DATA_ERROR);
}
}
s->state_out_len = 0;
s->state_out_ch = 0;
BZ_INITIALISE_CRC ( s->calculatedBlockCRC );
s->state = BZ_X_OUTPUT;
if (s->verbosity >= 2) VPrintf0 ( "rt+rld" );
if (s->smallDecompress) {
/*-- Make a copy of cftab, used in generation of T --*/
for (i = 0; i <= 256; i++) s->cftabCopy[i] = s->cftab[i];
/*-- compute the T vector --*/
for (i = 0; i < nblock; i++) {
uc = (UChar)(s->ll16[i]);
SET_LL(i, s->cftabCopy[uc]);
s->cftabCopy[uc]++;
}
/*-- Compute T^(-1) by pointer reversal on T --*/
i = s->origPtr;
j = GET_LL(i);
do {
Int32 tmp = GET_LL(j);
SET_LL(j, i);
i = j;
j = tmp;
}
while (i != s->origPtr);
s->tPos = s->origPtr;
s->nblock_used = 0;
if (s->blockRandomised) {
BZ_RAND_INIT_MASK;
BZ_GET_SMALL(s->k0); s->nblock_used++;
BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
} else {
BZ_GET_SMALL(s->k0); s->nblock_used++;
}
} else {
/*-- compute the T^(-1) vector --*/
for (i = 0; i < nblock; i++) {
uc = (UChar)(s->tt[i] & 0xff);
s->tt[s->cftab[uc]] |= (i << 8);
s->cftab[uc]++;
}
s->tPos = s->tt[s->origPtr] >> 8;
s->nblock_used = 0;
if (s->blockRandomised) {
BZ_RAND_INIT_MASK;
BZ_GET_FAST(s->k0); s->nblock_used++;
BZ_RAND_UPD_MASK; s->k0 ^= BZ_RAND_MASK;
} else {
BZ_GET_FAST(s->k0); s->nblock_used++;
}
}
RETURN(BZ_OK);
endhdr_2:
GET_UCHAR(BZ_X_ENDHDR_2, uc);
if (uc != 0x72) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_ENDHDR_3, uc);
if (uc != 0x45) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_ENDHDR_4, uc);
if (uc != 0x38) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_ENDHDR_5, uc);
if (uc != 0x50) RETURN(BZ_DATA_ERROR);
GET_UCHAR(BZ_X_ENDHDR_6, uc);
if (uc != 0x90) RETURN(BZ_DATA_ERROR);
s->storedCombinedCRC = 0;
GET_UCHAR(BZ_X_CCRC_1, uc);
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_CCRC_2, uc);
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_CCRC_3, uc);
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
GET_UCHAR(BZ_X_CCRC_4, uc);
s->storedCombinedCRC = (s->storedCombinedCRC << 8) | ((UInt32)uc);
s->state = BZ_X_IDLE;
RETURN(BZ_STREAM_END);
default: AssertH ( False, 4001 );
}
AssertH ( False, 4002 );
save_state_and_return:
s->save_i = i;
s->save_j = j;
s->save_t = t;
s->save_alphaSize = alphaSize;
s->save_nGroups = nGroups;
s->save_nSelectors = nSelectors;
s->save_EOB = EOB;
s->save_groupNo = groupNo;
s->save_groupPos = groupPos;
s->save_nextSym = nextSym;
s->save_nblockMAX = nblockMAX;
s->save_nblock = nblock;
s->save_es = es;
s->save_N = N;
s->save_curr = curr;
s->save_zt = zt;
s->save_zn = zn;
s->save_zvec = zvec;
s->save_zj = zj;
s->save_gSel = gSel;
s->save_gMinlen = gMinlen;
s->save_gLimit = gLimit;
s->save_gBase = gBase;
s->save_gPerm = gPerm;
return retVal;
}
/*-------------------------------------------------------------*/
/*--- end decompress.c ---*/
/*-------------------------------------------------------------*/
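The RUNA/RUNB loop in the decoder above builds a run length from a sequence of two symbols, each weighted by a factor N that doubles at every step (a bijective base-2 number). A minimal standalone sketch of that arithmetic follows. The names `decode_run`, `RUNA`, and `RUNB` are illustrative only and are not part of libbzip2; the 2*1024*1024 bound is taken from the comment inside the loop.

```c
#include <assert.h>

/* Illustrative symbol values; the library uses BZ_RUNA and BZ_RUNB. */
enum { RUNA = 0, RUNB = 1 };

/* Turn a sequence of RUNA/RUNB symbols into a run length.
   Returns -1 when the sequence is so long that N reaches the
   2*1024*1024 bound the decoder uses to reject hostile input. */
static int decode_run(const int *syms, int count)
{
    int es = -1;   /* same starting value as in the decoder */
    int N  = 1;    /* weight of the current symbol */
    int k;

    assert(count >= 0);
    for (k = 0; k < count; k++) {
        if (N >= 2 * 1024 * 1024) return -1;
        if (syms[k] == RUNA) es += 1 * N;   /* RUNA counts as digit 1 */
        else                 es += 2 * N;   /* RUNB counts as digit 2 */
        N *= 2;
    }
    return es + 1;   /* matches the es++ after the loop */
}
```

For example, the sequence RUNB, RUNA gives 2*1 + 1*2 = 4 repeats of the byte currently at the front of the MTF list.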
================================================
FILE: ext/bzip2/huffman.c
================================================
/*-------------------------------------------------------------*/
/*--- Huffman coding low-level stuff ---*/
/*--- huffman.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#include "bzlib_private.h"
/*---------------------------------------------------*/
#define WEIGHTOF(zz0) ((zz0) & 0xffffff00)
#define DEPTHOF(zz1) ((zz1) & 0x000000ff)
#define MYMAX(zz2,zz3) ((zz2) > (zz3) ? (zz2) : (zz3))
#define ADDWEIGHTS(zw1,zw2) \
((WEIGHTOF(zw1)+WEIGHTOF(zw2)) | \
(1 + MYMAX(DEPTHOF(zw1),DEPTHOF(zw2))))
#define UPHEAP(z) \
{ \
Int32 zz, tmp; \
zz = z; tmp = heap[zz]; \
while (weight[tmp] < weight[heap[zz >> 1]]) { \
heap[zz] = heap[zz >> 1]; \
zz >>= 1; \
} \
heap[zz] = tmp; \
}
#define DOWNHEAP(z) \
{ \
Int32 zz, yy, tmp; \
zz = z; tmp = heap[zz]; \
while (True) { \
yy = zz << 1; \
if (yy > nHeap) break; \
if (yy < nHeap && \
weight[heap[yy+1]] < weight[heap[yy]]) \
yy++; \
if (weight[tmp] < weight[heap[yy]]) break; \
heap[zz] = heap[yy]; \
zz = yy; \
} \
heap[zz] = tmp; \
}
/*---------------------------------------------------*/
void BZ2_hbMakeCodeLengths ( UChar *len,
Int32 *freq,
Int32 alphaSize,
Int32 maxLen )
{
/*--
Nodes and heap entries run from 1. Entry 0
for both the heap and nodes is a sentinel.
--*/
Int32 nNodes, nHeap, n1, n2, i, j, k;
Bool tooLong;
Int32 heap [ BZ_MAX_ALPHA_SIZE + 2 ];
Int32 weight [ BZ_MAX_ALPHA_SIZE * 2 ];
Int32 parent [ BZ_MAX_ALPHA_SIZE * 2 ];
for (i = 0; i < alphaSize; i++)
weight[i+1] = (freq[i] == 0 ? 1 : freq[i]) << 8;
while (True) {
nNodes = alphaSize;
nHeap = 0;
heap[0] = 0;
weight[0] = 0;
parent[0] = -2;
for (i = 1; i <= alphaSize; i++) {
parent[i] = -1;
nHeap++;
heap[nHeap] = i;
UPHEAP(nHeap);
}
AssertH( nHeap < (BZ_MAX_ALPHA_SIZE+2), 2001 );
while (nHeap > 1) {
n1 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
n2 = heap[1]; heap[1] = heap[nHeap]; nHeap--; DOWNHEAP(1);
nNodes++;
parent[n1] = parent[n2] = nNodes;
weight[nNodes] = ADDWEIGHTS(weight[n1], weight[n2]);
parent[nNodes] = -1;
nHeap++;
heap[nHeap] = nNodes;
UPHEAP(nHeap);
}
AssertH( nNodes < (BZ_MAX_ALPHA_SIZE * 2), 2002 );
tooLong = False;
for (i = 1; i <= alphaSize; i++) {
j = 0;
k = i;
while (parent[k] >= 0) { k = parent[k]; j++; }
len[i-1] = j;
if (j > maxLen) tooLong = True;
}
if (! tooLong) break;
/* 17 Oct 04: keep-going condition for the following loop used
to be 'i < alphaSize', which missed the last element,
theoretically leading to the possibility of the compressor
looping. However, this count-scaling step is only needed if
one of the generated Huffman code words is longer than
maxLen, which up to and including version 1.0.2 was 20 bits,
which is extremely unlikely. In version 1.0.3 maxLen was
changed to 17 bits, which has minimal effect on compression
ratio, but does mean this scaling step is used from time to
time, enough to verify that it works.
This means that bzip2-1.0.3 and later will only produce
Huffman codes with a maximum length of 17 bits. However, in
order to preserve backwards compatibility with bitstreams
produced by versions pre-1.0.3, the decompressor must still
handle lengths of up to 20. */
for (i = 1; i <= alphaSize; i++) {
j = weight[i] >> 8;
j = 1 + (j / 2);
weight[i] = j << 8;
}
}
}
/*---------------------------------------------------*/
void BZ2_hbAssignCodes ( Int32 *code,
UChar *length,
Int32 minLen,
Int32 maxLen,
Int32 alphaSize )
{
Int32 n, vec, i;
vec = 0;
for (n = minLen; n <= maxLen; n++) {
for (i = 0; i < alphaSize; i++)
if (length[i] == n) { code[i] = vec; vec++; };
vec <<= 1;
}
}
/*---------------------------------------------------*/
void BZ2_hbCreateDecodeTables ( Int32 *limit,
Int32 *base,
Int32 *perm,
UChar *length,
Int32 minLen,
Int32 maxLen,
Int32 alphaSize )
{
Int32 pp, i, j, vec;
pp = 0;
for (i = minLen; i <= maxLen; i++)
for (j = 0; j < alphaSize; j++)
if (length[j] == i) { perm[pp] = j; pp++; };
for (i = 0; i < BZ_MAX_CODE_LEN; i++) base[i] = 0;
for (i = 0; i < alphaSize; i++) base[length[i]+1]++;
for (i = 1; i < BZ_MAX_CODE_LEN; i++) base[i] += base[i-1];
for (i = 0; i < BZ_MAX_CODE_LEN; i++) limit[i] = 0;
vec = 0;
for (i = minLen; i <= maxLen; i++) {
vec += (base[i+1] - base[i]);
limit[i] = vec-1;
vec <<= 1;
}
for (i = minLen + 1; i <= maxLen; i++)
base[i] = ((limit[i-1] + 1) << 1) - base[i];
}
/*-------------------------------------------------------------*/
/*--- end huffman.c ---*/
/*-------------------------------------------------------------*/
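The `limit`, `base`, and `perm` tables built by BZ2_hbCreateDecodeTables support canonical Huffman decoding: bits are read until the accumulated value is no larger than `limit[len]`, and `perm[value - base[len]]` then gives the symbol. The sketch below illustrates this under some assumptions: `create_decode_tables` restates the function above with plain C types, `decode_symbol` and its '0'/'1' string input are hypothetical stand-ins for the decoder's bit reader, and MAX_CODE_LEN is assumed to equal 23.

```c
#include <assert.h>

#define MAX_CODE_LEN 23   /* assumed value of BZ_MAX_CODE_LEN */

/* Same construction as BZ2_hbCreateDecodeTables, with plain C types. */
static void create_decode_tables(int *limit, int *base, int *perm,
                                 const unsigned char *length,
                                 int minLen, int maxLen, int alphaSize)
{
    int pp = 0, i, j, vec = 0;

    assert(minLen >= 1 && maxLen + 1 < MAX_CODE_LEN);
    /* perm lists the symbols ordered by code length, then by index */
    for (i = minLen; i <= maxLen; i++)
        for (j = 0; j < alphaSize; j++)
            if (length[j] == i) perm[pp++] = j;
    for (i = 0; i < MAX_CODE_LEN; i++) base[i] = 0;
    for (i = 0; i < alphaSize; i++) base[length[i] + 1]++;
    for (i = 1; i < MAX_CODE_LEN; i++) base[i] += base[i - 1];
    for (i = 0; i < MAX_CODE_LEN; i++) limit[i] = 0;
    /* limit[i] is the largest code value that has length i */
    for (i = minLen; i <= maxLen; i++) {
        vec += base[i + 1] - base[i];
        limit[i] = vec - 1;
        vec <<= 1;
    }
    for (i = minLen + 1; i <= maxLen; i++)
        base[i] = ((limit[i - 1] + 1) << 1) - base[i];
}

/* Decode one symbol from a string of '0'/'1' characters, starting at
   *pos. Returns the symbol index, or -1 on malformed or short input. */
static int decode_symbol(const char *bits, int *pos,
                         const int *limit, const int *base, const int *perm,
                         int minLen, int maxLen, int alphaSize)
{
    int zn, zvec = 0, k;

    /* read the first minLen bits in one go */
    for (zn = 0; zn < minLen; zn++) {
        if (bits[*pos] == '\0') return -1;
        zvec = (zvec << 1) | (bits[(*pos)++] - '0');
    }
    /* extend by one bit at a time until the value fits this length */
    while (zvec > limit[zn]) {
        if (zn >= maxLen || bits[*pos] == '\0') return -1;
        zvec = (zvec << 1) | (bits[(*pos)++] - '0');
        zn++;
    }
    k = zvec - base[zn];
    if (k < 0 || k >= alphaSize) return -1;
    return perm[k];
}
```

With code lengths {2, 1, 3, 3} the canonical codes are 10, 0, 110, and 111, so the bit string "0101101110" decodes to the symbols 1, 0, 2, 3, 1.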
================================================
FILE: ext/bzip2/randtable.c
================================================
/*-------------------------------------------------------------*/
/*--- Table for randomising repetitive blocks ---*/
/*--- randtable.c ---*/
/*-------------------------------------------------------------*/
/* ------------------------------------------------------------------
This file is part of bzip2/libbzip2, a program and library for
lossless, block-sorting data compression.
bzip2/libbzip2 version 1.0.6 of 6 September 2010
Copyright (C) 1996-2010 Julian Seward
Please read the WARNING, DISCLAIMER and PATENTS sections in the
README file.
This program is released under the terms of the license contained
in the file LICENSE.
------------------------------------------------------------------ */
#include "bzlib_private.h"
/*---------------------------------------------*/
Int32 BZ2_rNums[512] = {
619, 720, 127, 481, 931, 816, 813, 233, 566, 247,
985, 724, 205, 454, 863, 491, 741, 242, 949, 214,
733, 859, 335, 708, 621, 574, 73, 654, 730, 472,
419, 436, 278, 496, 867, 210, 399, 680, 480, 51,
878, 465, 811, 169, 869, 675, 611, 697, 867, 561,
862, 687, 507, 283, 482, 129, 807, 591, 733, 623,
150, 238, 59, 379, 684, 877, 625, 169, 643, 105,
170, 607, 520, 932, 727, 476, 693, 425, 174, 647,
73, 122, 335, 530, 442, 853, 695, 249, 445, 515,
909, 545, 703, 919, 874, 474, 882, 500, 594, 612,
641, 801, 220, 162, 819, 984, 589, 513, 495, 799,
161, 604, 958, 533, 221, 400, 386, 867, 600, 782,
382, 596, 414, 171, 516, 375, 682, 485, 911, 276,
98, 553, 163, 354, 666, 933, 424, 341, 533, 870,
227, 730, 475, 186, 263, 647, 537, 686, 600, 224,
469, 68, 770, 919, 190, 373, 294, 822, 808, 206,
184, 943, 795, 384, 383, 461, 404, 758, 839, 887,
715, 67, 618, 276, 204, 918, 873, 777, 604, 560,
951, 160, 578, 722, 79, 804, 96, 409, 713, 940,
652, 934, 970, 447, 318, 353, 859, 672, 112, 785,
645, 863, 803, 350, 139, 93, 354, 99, 820, 908,
609, 772, 154, 274, 580, 184, 79, 626, 630, 742,
653, 282, 762, 623, 680, 81, 927, 626, 789, 125,
411, 521, 938, 300, 821, 78, 343, 175, 128, 250,
170, 774, 972, 275, 999, 639, 495, 78, 352, 126,
857, 956, 358, 619, 580, 124, 737, 594, 701, 612,
669, 112, 134, 694, 363, 992, 809, 743, 168, 974,
944, 375, 748, 52, 600, 747, 642, 182, 862, 81,
344, 805, 988, 739, 511, 655, 814, 334, 249, 515,
897, 955, 664, 981, 649, 113, 974, 459, 893, 228,
433, 837, 553, 268, 926, 240, 102, 654, 459, 51,
686, 754, 806, 760, 493, 403, 415, 394, 687, 700,
946, 670, 656, 610, 738, 392, 760, 799, 887, 653,
978, 321, 576, 617, 626, 502, 894, 679, 243, 440,
680, 879, 194, 572, 640, 724, 926, 56, 204, 700,
707, 151, 457, 449, 797, 195, 791, 558, 945, 679,
297, 59, 87, 824, 713, 663, 412, 693, 342, 606,
134, 108, 571, 364, 631, 212, 174, 643, 304, 329,
343, 97, 430, 751, 497, 314, 983, 374, 822, 928,
140, 206, 73, 263, 980, 736, 876, 478, 430, 305,
170, 514, 364, 692, 829, 82, 855, 953, 676, 246,
369, 970, 294, 750, 807, 827, 150, 790, 288, 923,
804, 378, 215, 828, 592, 281, 565, 555, 710, 82,
896, 831, 547, 261, 524, 462, 293, 465, 502, 56,
661, 821, 976, 991, 658, 869, 905, 758, 745, 193,
768, 550, 608, 933, 378, 286, 215, 979, 792, 961,
61, 688, 793, 644, 986, 403, 106, 366, 905, 644,
372, 567, 466, 434, 645, 210, 389, 550, 919, 135,
780, 773, 635, 389, 707, 100, 626, 958, 165, 504,
920, 176, 193, 713, 857, 265, 203, 50, 668, 108,
645, 990, 626, 197, 510, 357, 358, 850, 858, 364,
936, 638
};
/*-------------------------------------------------------------*/
/*--- end randtable.c ---*/
/*-------------------------------------------------------------*/
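The decoder consumes this table through the BZ_RAND_INIT_MASK, BZ_RAND_UPD_MASK, and BZ_RAND_MASK macros, which are defined elsewhere and not shown here. The sketch below assumes they implement a countdown: each table entry gives the distance to the next output byte whose low bit is flipped. The four-entry `demo_rnums` table, the `RandState` struct, and the function names are hypothetical stand-ins for BZ2_rNums and the decoder state.

```c
#include <assert.h>

/* Hypothetical 4-entry table standing in for the 512-entry BZ2_rNums. */
static const int demo_rnums[4] = { 3, 2, 4, 2 };

typedef struct {
    int rNToGo;   /* bytes left until the next flipped byte */
    int rTPos;    /* next table index to load */
} RandState;

static void rand_init(RandState *r)
{
    assert(r != 0);
    r->rNToGo = 0;
    r->rTPos  = 0;
}

/* Advance by one output byte and return the mask (0 or 1) that the
   decoder would XOR into that byte. */
static int rand_next_mask(RandState *r)
{
    if (r->rNToGo == 0) {
        r->rNToGo = demo_rnums[r->rTPos];
        r->rTPos  = (r->rTPos + 1) % 4;   /* wrap around the table */
    }
    r->rNToGo--;
    return (r->rNToGo == 1) ? 1 : 0;
}
```

Under these assumptions the first table entry, 3, flips the second output byte, and the following entry, 2, flips the fourth.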
================================================
FILE: ext/freetype2/ChangeLog
================================================
2014-12-30 Werner Lemberg
* Version 2.5.5 released.
=========================
Tag sources with `VER-2-5-5'.
* docs/VERSION.DLL: Update documentation and bump version number to
2.5.5.
* README, Jamfile (RefDoc), builds/windows/vc2005/freetype.vcproj,
builds/windows/vc2005/index.html,
builds/windows/vc2008/freetype.vcproj,
builds/windows/vc2008/index.html,
builds/windows/vc2010/freetype.vcxproj,
builds/windows/vc2010/index.html,
builds/windows/visualc/freetype.dsp,
builds/windows/visualc/freetype.vcproj,
builds/windows/visualc/index.html,
builds/windows/visualce/freetype.dsp,
builds/windows/visualce/freetype.vcproj,
builds/windows/visualce/index.html,
builds/wince/vc2005-ce/freetype.vcproj,
builds/wince/vc2005-ce/index.html,
builds/wince/vc2008-ce/freetype.vcproj,
builds/wince/vc2008-ce/index.html: s/2.5.4/2.5.5/, s/254/255/.
* include/freetype/freetype.h (FREETYPE_PATCH): Set to 5.
* builds/unix/configure.raw (version_info): Set to 17:4:11.
* CMakeLists.txt (VERSION_PATCH): Set to 5.
* docs/CHANGES: Updated.
2014-12-24 Alexei Podtelezhnikov
[base] Formatting and nanooptimizations.
* src/base/ftcalc.c,
* src/base/fttrigon.c: Revise sign restoration.
2014-12-13 Werner Lemberg
* src/pcf/pcfread.c (pcf_read_TOC): Improve fix from 2014-12-08.
2014-12-11 Werner Lemberg
* builds/toplevel.mk (dist): Use older POSIX standard for `tar'.
Apparently, BSD tar isn't capable yet of handling POSIX-1.2001
(contrary to GNU tar), so force the POSIX-1.1988 format.
Problem reported by Stephen Fisher.
2014-12-11 Werner Lemberg
* src/type42/t42parse.c (t42_parse_sfnts): Reject invalid TTF size.
2014-12-11 Werner Lemberg
* src/base/ftobjs.c (FT_Get_Glyph_Name): Fix off-by-one check.
Problem reported by Dennis Felsing.
2014-12-11 Werner Lemberg
* src/type42/t42parse.c (t42_parse_sfnts): Check `string_size'.
Problem reported by Dennis Felsing.
2014-12-09 suzuki toshiya
[gxvalid] Fix a naming convention conflicting with ftvalid.
See previous changeset for otvalid.
* src/gxvalid/{gxvcommn.h, gxvmort.h, gxvmorx.h}: Replace
`valid' by `gxvalid'.
* src/gxvalid/{gxvbsln.c, gxvcommn.c, gxvfeat.c, gxvjust.c,
gxvkern.c, gxvlcar.c, gxvmort.c, gxvmort0.c, gxvmort1.c,
gxvmort2.c, gxvmort4.c, gxvmort5.c, gxvmorx.c, gxvmorx0.c,
gxvmorx1.c, gxvmorx2.c, gxvmorx4.c, gxvmorx5.c, gxvopbd.c,
gxvprop.c, gxvtrak.c}: Replace `valid' by `gxvalid' if
it is typed as GXV_Validator.
2014-12-09 suzuki toshiya
[otvalid] Fix a naming convention conflicting with ftvalid.
Some prototypes in ftvalid.h use `valid' for variables typed as
FT_Validator. Their implementations in src/base/ftobjs.c and
their uses in src/sfnt/ttcmap.c do the same.
Some macros in otvcommn.h assume the existence of a variable
`valid' typed as OTV_Validator in the caller.
Mixing these two conventions causes invalid pointer conversions
and unexpected SEGVs in longjmp. To prevent this, all variables
typed as OTV_Validator are renamed to `otvalid'.
* src/otvalid/otvcommn.h: Replace `valid' by `otvalid'.
* src/otvalid/{otvcommn.c, otvbase.c, otvgdef.c, otvgpos.c,
otvgsub.c, otvjstf.c, otvmath.c}: Replace `valid' by `otvalid'
if it is typed as OTV_Validator.
2014-12-09 suzuki toshiya
[ftvalid] Introduce FT_THROW() in FT_INVALID_XXX macros.
The original patch was designed by Werner Lemberg. The extra parts
for otvalid and gxvalid were added by suzuki toshiya; see the
discussion:
http://lists.nongnu.org/archive/html/freetype-devel/2014-12/msg00002.html
http://lists.nongnu.org/archive/html/freetype-devel/2014-12/msg00007.html
* include/internal/ftvalid.h: Introduce FT_THROW() in FT_INVALID_().
* src/gxvalid/gxvcommn.h: Ditto.
* src/otvalid/otvcommn.h: Ditto.
2014-12-08 Werner Lemberg
[pcf] Fix Savannah bug #43774.
Work around `features' of X11's `pcfWriteFont' and `pcfReadFont'
functions. Since the PCF format doesn't have an official
specification, we have to exactly follow these functions' behaviour.
The problem was unveiled with a patch from 2014-11-06, fixing issue
#43547.
* src/pcf/pcfread.c (pcf_read_TOC): Don't check table size for last
element. Instead, assign real size.
2014-12-07 Werner Lemberg
Work around a bug in Borland's C++ compiler.
See
http://qc.embarcadero.com/wc/qcmain.aspx?d=118998
for Borland's bug tracker entry.
Reported by Yuliana Zigangirova,
http://lists.gnu.org/archive/html/freetype-devel/2014-04/msg00001.html.
* include/internal/ftvalid.h (FT_ValidatorRec), src/smooth/ftgrays.c
(gray_TWorker_): Move `ft_jmp_buf' field to be the first element.
2014-12-07 Werner Lemberg
*/*: Decorate hex constants with `U' and `L' where appropriate.
2014-12-07 Werner Lemberg
[truetype] Prevent memory leak for buggy fonts.
* src/truetype/ttobjs.c (tt_size_done): Unconditionally call
`tt_size_done_bytecode'.
2014-12-06 Werner Lemberg
* Version 2.5.4 released.
=========================
Tag sources with `VER-2-5-4'.
* docs/VERSION.DLL: Update documentation and bump version number to
2.5.4.
* README, Jamfile (RefDoc), builds/windows/vc2005/freetype.vcproj,
builds/windows/vc2005/index.html,
builds/windows/vc2008/freetype.vcproj,
builds/windows/vc2008/index.html,
builds/windows/vc2010/freetype.vcxproj,
builds/windows/vc2010/index.html,
builds/windows/visualc/freetype.dsp,
builds/windows/visualc/freetype.vcproj,
builds/windows/visualc/index.html,
builds/windows/visualce/freetype.dsp,
builds/windows/visualce/freetype.vcproj,
builds/windows/visualce/index.html,
builds/wince/vc2005-ce/freetype.vcproj,
builds/wince/vc2005-ce/index.html,
builds/wince/vc2008-ce/freetype.vcproj,
builds/wince/vc2008-ce/index.html: s/2.5.3/2.5.4/, s/253/254/.
* include/freetype/freetype.h (FREETYPE_PATCH): Set to 4.
* builds/unix/configure.raw (version_info): Set to 17:3:11.
* CMakeLists.txt (VERSION_PATCH): Set to 4.
* docs/CHANGES: Updated.
2014-12-04 Werner Lemberg
docs/CHANGES: Updated, formatted.
2014-12-04 Dave Arnold
[cff] Modify an FT_ASSERT.
* src/cff/cf2hints.c (cf2_hintmap_map): After the fix for Savannah
bug #43661, the test font `...aspartam.otf' still triggers an
FT_ASSERT. Since hintmap still works with count==0, ...
(cf2_glyphpath_lineTo, cf2_glyphpath_curveTo): ... add that term to
suppress the assert.
2014-12-04 Dave Arnold
[cff] Fix Savannah bug #43661.
* src/cff/cf2intrp.c (cf2_interpT2CharString) : Don't append to stem arrays after
hintmask is constructed.
* src/cff/cf2hints.c (cf2_hintmap_build): Add defensive code to
avoid reading past end of hintmask.
2014-12-03 Werner Lemberg
docs/CHANGES: Updated.
2014-12-03 Werner Lemberg
[autofit] Better fix for conversion specifiers in debug messages.
Using `%ld' for pointer differences causes warnings on 32bit
platforms. The correct conversion specifier would be (the relatively
new) `%td'; however, this is missing on some important platforms.
This patch improves the change from 2014-11-28.
* src/autofit/afhints.c (AF_INDEX_NUM): Use `int' typecast. Our
pointer differences are always sufficiently small.
(af_glyph_hints_dump_points, af_glyph_hints_dump_segments,
af_glyph_hints_dump_edge): Revert to `%d' and use `AF_INDEX_NUM'.
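A minimal sketch of the approach described in this entry: a pointer difference is cast to `int` so that plain `%d' can be used in format strings. The `INDEX_NUM` macro, the `Point` type, and `format_index` are hypothetical illustrations and do not reproduce the actual `AF_INDEX_NUM' definition; the cast is only safe when the difference is known to be small.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { int x, y; } Point;   /* hypothetical element type */

/* Index of `p' within the array starting at `base', as an int,
   or -1 when `p' is a null pointer. */
#define INDEX_NUM(p, base)  ((p) ? (int)((p) - (base)) : -1)

/* Write the index into buf using `%d'. Returns the number of
   characters snprintf reports, excluding the terminating NUL. */
static int format_index(char *buf, size_t size,
                        const Point *p, const Point *base)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d", INDEX_NUM(p, base));
}
```

Because the value passed to `%d' already has type `int', the format string needs no length modifier on either 32bit or 64bit platforms.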
2014-12-03 Werner Lemberg
FT_Sfnt_Tag: s/ft_sfnt_xxx/FT_SFNT_XXX/ for orthogonality.
All public FreeType enumeration and flag values are uppercase...
* include/tttables.h (FT_Sfnt_Tag): Implement it. For backwards
compatibility, retain the old values as macros.
* src/base/ftfstype.c (FT_Get_FSType_Flags), src/sfnt/sfdriver.c
(get_sfnt_table): Updated.
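The renaming pattern described in this entry can be sketched as follows: the enumeration values become uppercase, and the old lowercase spellings remain available as macros. All `Demo_' and `demo_' names below are hypothetical and only illustrate the technique; they are not the FreeType declarations.

```c
#include <assert.h>

/* New-style enumeration with uppercase values. */
typedef enum Demo_Sfnt_Tag_ {
    DEMO_SFNT_HEAD,
    DEMO_SFNT_MAXP,
    DEMO_SFNT_OS2,
    DEMO_SFNT_MAX     /* number of tags, not a tag itself */
} Demo_Sfnt_Tag;

/* Old lowercase spellings kept as macros so existing callers
   continue to compile unchanged. */
#define demo_sfnt_head  DEMO_SFNT_HEAD
#define demo_sfnt_maxp  DEMO_SFNT_MAXP
#define demo_sfnt_os2   DEMO_SFNT_OS2

/* Return a printable table name for a tag. */
static const char *demo_tag_name(Demo_Sfnt_Tag tag)
{
    assert(tag < DEMO_SFNT_MAX);
    switch (tag) {
    case DEMO_SFNT_HEAD: return "head";
    case DEMO_SFNT_MAXP: return "maxp";
    case DEMO_SFNT_OS2:  return "OS/2";
    default:             return "";
    }
}
```

A call such as `demo_tag_name(demo_sfnt_maxp)` expands to the uppercase value at preprocessing time, so old and new spellings are interchangeable.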
2014-12-02 Werner Lemberg
* include/*: Improve structure of documentation.
. Add and update many `' tags.
. Apply various documentation fixes.
. Remove details about deprecated (or never implemented) data.
2014-12-02 Werner Lemberg
[docmaker] Always handle `' section elements.
Previously, those elements were handled only for sections present in
a `' chapter element.
* src/tools/docmaker/content.py (ContentProcessor::finish):
Implement it.
2014-12-02 Werner Lemberg
[docmaker] Properly handle empty rows in Synopsis.
* src/tools/docmaker/tohtml.py (HtmlFormatter::section_enter): Emit
` ' for empty fields.
2014-12-02 Werner Lemberg
[docmaker] Thinko.
* src/tools/docmaker/content.py (DocBlock::get_markup_words_all):
Emit `/empty/' string for first element also.
2014-12-02 Werner Lemberg
[docmaker] Honour empty lines in `' section element.
This greatly improves the readability of the `Synopsis' links.
* src/tools/docmaker/content.py (DocBlock::get_markup_words_all):
Insert string `/empty/' between items.
* src/tools/docmaker/formatter.py (Formatter::section_dump): Make it
robust against nonexistent keys.
* src/tools/docmaker/tohtml.py (HtmlFormatter::section_enter): Emit
empty elements for `/empty/'.
2014-12-02 Werner Lemberg
[docmaker] Ensure Python 3 compatibility.
* src/tools/docmaker/content.py (ContentProcessor::set_section,
ContentProcessor::finish): Replace `has_key' function with `in'
keyword.
* src/tools/docmaker/formatter.py (Formatter::__init__): Replace
sorting function with a key generator.
(Formatter::add_identifier): Replace `has_key' function with `in'
keyword.
* src/tools/docmaker/tohtml.py (HtmlFormatter::html_source_quote):
Replace `has_key' function with `in' keyword.
(HtmlFormatter::index_exit, HtmlFormatter::section_enter): Use
integer division.
s/<>/>/.
* src/tools/docmaker/utils.py: Import `itertools'.
(index_sort): Replaced by...
(index_key): ... this new key generator (doing exactly the same).
2014-11-29 Werner Lemberg
[docmaker] Don't output a block multiple times.
This bug was hidden by not processing all lines of `' blocks.
* src/tools/docmaker/formatter.py (Formatter::section_dump): Filter
out field names.
2014-11-29 Werner Lemberg
[docmaker] Use field values as HTML link targets where possible.
* src/tools/docmaker/tohtml.py (HtmlFormatter::make_block_url):
Accept second, optional argument to specify a name.
(HtmlFormatter::html_source_quote): Link to field ID if possible.
(HtmlFormatter::print_html_field_list): Emit `id' attribute.
2014-11-29 Werner Lemberg
[docmaker] Allow empty lines in `' blocks.
Before this patch, the suggested order of entries stopped at the
first empty line.
Obviously, nobody noticed that this problem caused a much reduced
set of links in the `Synopsis' sections; in particular, the
`' blocks contain a lot of entries that wouldn't be listed
otherwise...
* src/tools/docmaker/content.py (DocBlock::get_markup_words_all):
New function to iterate over all items.
(DocSection::process): Use it.
2014-11-29 Werner Lemberg
* src/tools/docmaker/sources.py (column) [Format 2]: Fix regexp.
After the single asterisk there must be no other immediately following
asterisk.
2014-11-29 Werner Lemberg
* src/tools/docmaker/tohtml.py: Improve CSS for vertical spacing.
2014-11-29 Werner Lemberg
[docmaker] Improve HTML code for table of contents.
* src/tools/docmaker/tohtml.py: Introduce a new table class `toc',
together with proper CSS.
2014-11-29 Werner Lemberg
[docmaker] Provide higher-level markup and simplify HTML.
* src/tools/docmaker/tohtml.py: Instead of using extraneous `'
elements, use CSS descendants (of class `section') to format the
data.
Also remove redundant and elements, replacing them with
proper CSS.
Globally reduce page width to 75%.
(block_header): Rename class to `section'.
2014-11-29 Werner Lemberg
[docmaker] Add `top' links after blocks.
* src/tools/docmaker/tohtml.py (block_footer_middle): Implement it.
2014-11-29 Werner Lemberg
* src/tools/docmaker/tohtml.py: Improve CSS for fields.
Make fields align horizontally relative to full line width.
2014-11-29 Werner Lemberg
* src/tools/docmaker/tohtml.py: Fix index and TOC templates.
This thinko was introduced 2014-11-27.
2014-11-28 Werner Lemberg
[docmaker] Format field lists with CSS.
This also simplifies the inserted HTML code.
* src/tools/docmaker/tohtml.py
(HtmlFormatter::print_html_field_list): Do it.
2014-11-28 suzuki toshiya
Fix compiler warning about a comparison between signed and
unsigned variables.
* src/pfr/pfrsbit.c (pfr_slot_load_bitmap): Fix the comparison
between `ypos + ysize' and FT_INT_{MAX,MIN}.
2014-11-28 Werner Lemberg
[docmaker] Replace empty `' with CSS.
* src/tools/docmaker/tohtml.py (HtmlFormatter::section_enter): Do
it.
2014-11-28 Werner Lemberg
[docmaker] Replace some ` | |