Repository: karnov/htmltoword Branch: master Commit: d58f91145356 Files: 75 Total size: 377.3 KB Directory structure: gitextract_jrowts2d/ ├── .gitattributes ├── .gitignore ├── CHANGELOG.txt ├── Gemfile ├── LICENSE.txt ├── README.md ├── Rakefile ├── bin/ │ └── htmltoword ├── docs/ │ ├── styles.md │ └── supported_elements.md ├── htmltoword.gemspec ├── lib/ │ ├── htmltoword/ │ │ ├── configuration.rb │ │ ├── document.rb │ │ ├── helpers/ │ │ │ ├── templates_helper.rb │ │ │ └── xslt_helper.rb │ │ ├── railtie.rb │ │ ├── renderer.rb │ │ ├── templates/ │ │ │ └── default.docx │ │ ├── version.rb │ │ └── xslt/ │ │ ├── base.xslt │ │ ├── cleanup.xslt │ │ ├── extras.xslt │ │ ├── functions.xslt │ │ ├── htmltoword.xslt │ │ ├── image_functions.xslt │ │ ├── images.xslt │ │ ├── inline_elements.xslt │ │ ├── links.xslt │ │ ├── numbering.xslt │ │ ├── relations.xslt │ │ ├── style2.xslt │ │ └── tables.xslt │ └── htmltoword.rb ├── script/ │ ├── build-template │ ├── extract-template │ └── setup ├── spec/ │ ├── document_spec.rb │ ├── fixtures/ │ │ ├── complex/ │ │ │ ├── nestings.html │ │ │ └── nestings.xml │ │ ├── description_lists/ │ │ │ ├── test01.html │ │ │ ├── test01.xml │ │ │ ├── test02.html │ │ │ ├── test02.xml │ │ │ ├── test03.html │ │ │ ├── test03.xml │ │ │ ├── test04.html │ │ │ └── test04.xml │ │ └── lists/ │ │ ├── lists_inline_elements.html │ │ └── lists_inline_elements.xml │ ├── spec_helper.rb │ ├── tmp/ │ │ └── .gitignore │ ├── xslt_alignment_spec.rb │ ├── xslt_breaks_spec.rb │ ├── xslt_complex_spec.rb │ ├── xslt_description_lists_spec.rb │ ├── xslt_heading_spec.rb │ ├── xslt_images_spec.rb │ ├── xslt_links_spec.rb │ ├── xslt_lists_spec.rb │ ├── xslt_simple_text_style_spec.rb │ ├── xslt_spec.rb │ └── xslt_tables_spec.rb └── templates/ └── default/ ├── [Content_Types].xml ├── _rels/ │ └── .rels ├── docProps/ │ ├── app.xml │ └── core.xml └── word/ ├── _rels/ │ └── document.xml.rels ├── document.xml ├── fontTable.xml ├── numbering.xml ├── settings.xml ├── styles.xml ├── 
stylesWithEffects.xml ├── theme/ │ └── theme1.xml └── webSettings.xml ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitattributes ================================================ templates/**/*.xml filter=xml-c14n templates/**/*.rels filter=xml-c14n ================================================ FILE: .gitignore ================================================ *.gem *.rbc .bundle .config .yardoc Gemfile.lock InstalledFiles _yardoc coverage doc/ lib/bundler/man pkg rdoc spec/reports test/tmp test/version_tmp .idea/* ================================================ FILE: CHANGELOG.txt ================================================ CHANGELOG Version 0.6.0 * Improve support for inline elements * Add basic support for section and article elements Version 0.5.1 * Bugfix on ActionController::Renderers formats Version 0.5 * Better handling of external links * Better handling of inline elements * Better support for complex lists Version 0.4.4 * Update default template to remove language * Bug fixes on lists and tables having br elements * Bug fixes on lists having inline elements and div or p tags * Bug fix on corrupted docx files generated in Windows * Add basic support to 'pre' tag Version 0.4.3 * Add method that generates and saves the document to a file in a specific file path Version 0.4.2 * Fix broken default template * List support * Support table cells bordering in the base xslt * br tag support Version 0.4.1 * Include forgotten templates and xslt Version 0.4 * Do not render output to file, but rather to a buffer. This means that apps using the gem no longer needs write permissions on a filesystem. 
* Support of table borders using the extra xslt * Support of basic lists using the extra xslt * Use of blockquotes as a replacement for spacing using the extra xslt * Add a command line interface * Fixes on default template metadata Version 0.2 * Relocation of templates and xslt files * Enable gem configuration to set the location of default and custom word templates and xslt. * Allow the Htmltoword::Document.create to receive custom templates * Add .docx as a renderer when used with rails. Version 0.1.8 * Better alignment of text within a paragraph, div or table cell if it has a 'center', 'left', 'right', 'justify' class or by using the text-align property. * Support for table-bordered class on tables * Support for bold and italic text on table cells * Support for h1..h6 on table cells * Support for nested tables * Bugfix: thead without tr tag * Bugfix: nested bold and italic on paragraphs or div * Bugfix: allow
Some content inside a div
syntax * Bugfix: Headings in body tag are now correctly parsed Version 0.1.7 * Allow centering of text within a paragraph if the paragraph has a 'center' class, e.g.

Version 0.1.6: * Add tables support for borders and headers Version 0.1.5: * Bugfix: h5 & h6 also create w:p's so any wrapping divs dont need to. Version 0.1.4: * Details tag doesnt really work well for printing. By default its closed and not visible until you click on the summary so for now lets ignore it in the print out. Version 0.1.3: * Bugfix: Leaf i and b nodes with w:p creating siblings need to create their own w:p's. Version 0.1.2: * Bugfix: span.h (MS Word Highlights) with sibling block elements could result in invalid wordml. Version 0.1.1: * Nodes with neighbor nodes that create w:p's need to be wrapped in w:p themselves. * Fixed test suite. Version 0.1.0: * Rewrote xslt to support tables and updated rubyzip to v1. Version 0.0.1: * First basic implementation of htmltoword released. ================================================ FILE: Gemfile ================================================ source 'https://rubygems.org' # Specify your gem's dependencies in htmltoword.gemspec gemspec ================================================ FILE: LICENSE.txt ================================================ Copyright (c) 2013 Nicholas Frandsen MIT License Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.md ================================================ # Ruby Html to word Gem This simple gem allows you to create MS Word docx documents from simple html documents. This makes it easy to create dynamic reports and forms that can be downloaded by your users as simple MS Word docx files. Add this line to your application's Gemfile: gem 'htmltoword' And then execute: $ bundle Or install it yourself as: $ gem install htmltoword **Note:** Since version 0.4.0 the ```create``` method will return a string with the contents of the file. If you want to save the file please use ```create_and_save```. See the usage for more ### Security warnings In versions `0.7.0` and `1.0.0` we introduced a security vulnerability when allowing the use of local images since no check to the files was done, potentially exposing sensitive files in the output zipfile. Version `1.1.0` doesn't allow the use of local images but uses an insecure `open` ## Usage ### Standalone By default, the file will be saved at the specified location. In case you want to handle the contents of the file as a string and do what suits you best, you can specify that when calling the create function. Using the default word file as template ```ruby require 'htmltoword' my_html = '

Hello

' document = Htmltoword::Document.create(my_html) file = Htmltoword::Document.create_and_save(my_html, file_path) ``` Using your custom word file as a template, where you can setup your own style for normal text, h1,h2, etc. ```ruby require 'htmltoword' # Configure the location of your custom templates Htmltoword.config.custom_templates_path = 'some_path' my_html = '

Hello

' document = Htmltoword::Document.create(my_html, word_template_file_name) file = Htmltoword::Document.create_and_save(my_html, file_path, word_template_file_name) ``` The ```create``` function will return a string with the file, so you can do with it what you consider best. The ```create_and_save``` function will create the file in the specified file_path. ### With Rails **For htmltoword version >= 0.2** An action controller renderer has been defined, so there's no need to declare the mime-type and you can just respond to .docx format. It will look then for views with the extension ```.docx.erb``` which will provide the HTML that will be rendered in the Word file. ```ruby # On your controller. respond_to :docx # filename and word_template are optional. By default it will name the file as your action and use the default template provided by the gem. The use of the .docx in the filename and word_template is optional. def my_action # ... respond_with(@object, filename: 'my_file.docx', word_template: 'my_template.docx') # Alternatively, if you don't want to create the .docx.erb template you could respond_with(@object, content: '

Hello

', filename: 'my_file.docx') end def my_action2 # ... respond_to do |format| format.docx do render docx: 'my_view', filename: 'my_file.docx' # Alternatively, if you don't want to create the .docx.erb template you could render docx: 'my_file.docx', content: '

Hello

' end end end ``` Example of my_view.docx.erb ```

My custom template

<%= render partial: 'my_partial', collection: @objects, as: :item %> ``` Example of _my_partial.docx.erb ```

<%= item.title %>

My html for item <%= item.id %> goes here

``` **For htmltoword version <= 0.1.8** ```ruby # Add mime-type in /config/initializers/mime_types.rb: Mime::Type.register "application/vnd.openxmlformats-officedocument.wordprocessingml.document", :docx # Add docx responder in your controller def show respond_to do |format| format.docx do file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx" send_file file.path, :disposition => "attachment" end end end ``` ```javascript // OPTIONAL: Use a jquery click handler to store the markup in a hidden form field before the form is submitted. // Using this strategy makes it easy to allow users to dynamically edit the document that will be turned // into a docx file, for example by toggling sections of a document. $('#download-as-docx').on('click', function () { $('input[name="docx_html_source"]').val('\n' + $('.delivery').html()); }); ``` ### Configure templates and xslt paths From version 0.2 you can configure the location of default and custom templates and xslt files. By default templates are defined under ```lib/htmltoword/templates``` and xslt under ```lib/htmltoword/xslt``` ```ruby Htmltoword.configure do |config| config.custom_templates_path = 'path_for_custom_templates' # If you modify this path, there should be a 'default.docx' file in there config.default_templates_path = 'path_for_default_template' # If you modify this path, there should be a 'html_to_wordml.xslt' file in there config.default_xslt_path = 'some_path' # The use of additional custom xslt will come soon config.custom_xslt_path = 'some_path' end ``` ## Features All standard html elements are supported and will create the closest equivalent in wordml. For example, spans will create inline elements and divs will create block-like elements.
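As a rough illustration of the block/inline distinction, the sketch below builds a small HTML document with one `div` holding several `span` elements and hands it to `Htmltoword::Document.create_and_save`, as described in the usage section. The helpers `sample_html` and `save_docx` are hypothetical names introduced for this example only; they are not part of the gem.

```ruby
# Hypothetical helper (not part of htmltoword): builds a small HTML document
# with one block-level element (div) that holds several inline elements (span).
def sample_html(title, words)
  spans = words.map { |word| "<span>#{word}</span>" }.join(' ')
  "<html><body><h1>#{title}</h1><div>#{spans}</div></body></html>"
end

# Hypothetical wrapper: converts the HTML and writes the .docx file to `path`.
# Assumes the htmltoword gem is installed; it is only loaded when called.
def save_docx(html, path)
  require 'htmltoword'
  Htmltoword::Document.create_and_save(html, path)
end

html = sample_html('Report', %w[first second])
puts html
# To run the conversion, call: save_docx(html, 'report.docx')
```

With the gem installed, the `div` should end up as a paragraph in the generated document and the `span` elements as inline text within it, assuming the default template and stylesheet are used.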
### Highlighting text You can add highlighting to text by wrapping it in a span with class h and adding a data style with a color that wordml supports (http://www.schemacentral.com/sc/ooxml/t-w_ST_HighlightColor.html) ie: ```html This text will have a green highlight ``` ### Page breaks To create page breaks simply add a div with class -page-break ie: ```html
```` ### Images Support for images is very basic and is only possible for external images (i.e. accessed via URL). If the image's width and height are not correctly defined, it won't be included in the document. **Limitations:** - Images are external i.e. pictures accessed via URL, not stored within document - only sizing is customisable Examples: ```html ``` ## Contributing / Extending Word docx files are essentially just a zipped collection of xml files and resources. This gem contains a standard empty MS Word docx file and a stylesheet to transform arbitrary html into wordml. The basic functioning of this gem can be summarised as: 1. Transform the input html to wordml. 2. Unzip empty word docx file bundled with gem and replace its document.xml content with the new transformed result of step 1. 3. Zip up contents again into a resulting .docx file. For more info about WordML: http://rep.oio.dk/microsoft.com/officeschemas/wordprocessingml_article.htm Contributions would be very much appreciated. 1. Fork it 2. Create your feature branch (`git checkout -b my-new-feature`) 3. Commit your changes (`git commit -am 'Add some feature'`) 4. Push to the branch (`git push origin my-new-feature`) 5. 
Create new Pull Request ## License (The MIT License) Copyright © 2013: * Cristina Matonte * Nicholas Frandsen ================================================ FILE: Rakefile ================================================ require "bundler/gem_tasks" require 'rspec/core/rake_task' task :default => :spec RSpec::Core::RakeTask.new ================================================ FILE: bin/htmltoword ================================================ #!/usr/bin/env ruby require 'methadone' require 'rmultimarkdown' require_relative '../lib/htmltoword' include Methadone::Main include Methadone::CLILogging main do |input, output| puts "Converting #{input} to #{output}" if options[:verbose] markup = File.read input if options[:format] == 'markdown' markup = markdown2html(markup) end Htmltoword::Document.create_and_save(markup, output, options[:template_name], options[:extras]) puts "Done" if options[:verbose] end def markdown2html(text) MultiMarkdown.new(text.to_s).to_html end version Htmltoword::VERSION description 'Convert simple html input (or markdown) to MS Word (docx)' arg :input, :required arg :output, :required on('--verbose', '-v', 'Be verbose') on('--extras', '-e', 'Use extra formatting features') on('--template', '-t', 'Use custom word base template (.docx file)') on('-f FORMAT', '--format', 'Format', /markdown|html/) # options['ip-address'] = '127.0.0.1' # on('-i IP_ADDRESS', '--ip-address', 'IP Address', /^\d+\.\d+\.\d+\.\d+$/) go! ================================================ FILE: docs/styles.md ================================================ Courtesy of @fran-worley This list is not comprehensive or tested, but I have looked through the source code, and readme and come up with the following: ## Page formatting options: Check out: https://github.com/karnov/htmltoword/blob/master/lib/htmltoword/xslt/base.xslt ### Page Break (included in readme)
Returns a page break ### Highlighting (included in readme) Highlighted Text Here Will highlight text As per: http://www.schemacentral.com/sc/ooxml/t-w_ST_HighlightColor.html these are your color choices Valid value | Description -------------- | ---------------- black | Black Highlighting Color blue | Blue Highlighting Color cyan | Cyan Highlighting Color green | Green Highlighting Color magenta | Magenta Highlighting Color red | Red Highlighting Color yellow | Yellow Highlighting Color white | White Highlighting Color darkBlue | Dark Blue Highlighting Color darkCyan | Dark Cyan Highlighting Color darkGreen | Dark Green Highlighting Color darkMagenta | Dark Magenta Highlighting Color darkRed | Dark Red Highlighting Color darkYellow | Dark Yellow Highlighting Color darkGray | Dark Gray Highlighting Color lightGray | Light Gray Highlighting Color none | No Text Highlighting ### Text Align (not in readme) It looks like adding the following classes will align text in your word document appropriately. * left * right * center * justify ### HTML tags There is also support for the following HTML tags: * div * p * ul/ol, li * b/strong * i/em * u * br * pre * h1...h6 #### Images It appears that at this time images are not supported see #27 #### External links It appears that this isn't supported either yet, though there is an open pull request to add the functionality see #30 ### Table formatting options As per this: https://github.com/karnov/htmltoword/blob/master/lib/htmltoword/xslt/tables.xslt There is basic support for tables. 
You can use the following HTML tags: * thead * tbody * tr * th * td * h1...h6 And the following classes: * Text alignment: (as above) * Table borders: 'table-bordered' (Not configurable, the default is a black, single line border with a weight of 6) * Cell borders: 'ms-border-[location]-[value]-[color - default is 000000]-[size - default is 1]' * Cell fill: 'ms-fill-[color]' The following are documented here: http://officeopenxml.com/WPtableCellProperties-Borders.php I do not know exactly how much of these options work in the gem, but it should at least point you in the right direction #### [location] Specifies where to apply the border (see Elements table in officeopenxml docs above) Element | Description ---------- | --------------- top | Specifies the border displayed at the top of the cell. bottom | Specifies the border displayed at the bottom of a cell. start | Specifies the border displayed on the leading edge of the cell (left for left-to-right tables and right for right-to-left tables). *Note:* In the previous version of the standard, this element was *left*. end | Specifies the border displayed on the trailing edge of the cell (right for left-to-right tables and left for right-to-left tables). *Note:* In the previous version of the standard, this element was *right*. insideH | Specifies the border to be displayed on all interior horizontal edges of the cell. Note that this is mostly useless in the case of individual cells, as there is no concept of interior edge for individual cells. However, it is used to determine cell borders to apply to a specific group of cells as part of table conditional formatting in a table style, e.g., the inside edges on the set of cells in the first column. insideV | Specifies the border to be displayed on all interior vertical edges of the cell. Note that this is mostly useless in the case of individual cells, as there is no concept of interior edge for individual cells. 
However, it is used to determine cell borders to apply to a specific group of cells as part of table conditional formatting in a table style, e.g., the inside edges on the set of cells in the first column. tl2br | Specifies the border to be displayed on the top left side to bottom right diagonal within the cell. tr2bl | Specifies the border to be displayed on the top right to bottom left diagonal within the cell. #### [value] Specifies the style of the border. Possible values as per the Open XML docs are: Value | Description -------- | --------------- single | a single line dashDotStroked | a line with a series of alternating thin and thick strokes dashed | a dashed line dashSmallGap | a dashed line with small gaps dotDash | a line with alternating dots and dashes dotDotDash | a line with a repeating dot - dot - dash sequence dotted | a dotted line double | a double line doubleWave | a double wavy line inset | an inset set of lines nil | no border none | no border outset | an outset set of lines thick | a single line thickThinLargeGap | a thick line contained within a thin line with a large-sized intermediate gap thickThinMediumGap | a thick line contained within a thin line with a medium-sized intermediate gap thickThinSmallGap | a thick line contained within a thin line with a small intermediate gap thinThickLargeGap | a thin line contained within a thick line with a large-sized intermediate gap thinThickMediumGap | a thick line contained within a thin line with a medium-sized intermediate gap thinThickSmallGap | a thick line contained within a thin line with a small intermediate gap thinThickThinLargeGap | a thin-thick-thin line with a large gap thinThickThinMediumGap | a thin-thick-thin line with a medium gap thinThickThinSmallGap | a thin-thick-thin line with a small gap threeDEmboss | a three-staged gradient line, getting darker towards the paragraph threeDEngrave | a three-staged gradient line, getting darker away from the paragraph triple | a triple line wave | 
a wavy line #### [color] Specifies the color of the border. Values are given as hex values (in RRGGBB format). No #, unlike hex values in HTML/CSS. E.g., color="FFFF00". A value of auto is also permitted and will allow the consuming word processor to determine the color. #### [size] Specifies the width of the border. The gem seems to multiply whatever value you supply by 6. If nothing or 0 is entered then it uses 6. ================================================ FILE: docs/supported_elements.md ================================================ # Supported elements `` ## Unsupported elements The following elements are explicitly not supported (they will be removed before doing any transformation) - applet - area - audio - base - basefont - canvas - command - details/summary - font - iframe - img - isindex - map - noframes - noscript - object - param - script - source - style - video All form related elements (form, fieldset, input, label, etc.) have been neither implemented nor removed. The outcome is still unclear and it is highly probable that they will generate an invalid document. All other elements not specified below ## Supported elements/tags ### Inline [a](#a), [abbr](#abbr-acronym-bdo-bdi), [acronym](#abbr-acronym-bdo-bdi), [b](#b-strong), [bdi](#abbr-acronym-bdo-bdi), [bdo](#abbr-acronym-bdo-bdi), [big](#big-small), [cite](#cite-dfn-em-i), [code](#code-kbd-samp-tt-var), [del](#del-s-strike), [dfn](#cite-dfn-em-i), [em](#cite-dfn-em-i), [i](#cite-dfn-em-i), [ins](#ins-u), [kbd](#code-kbd-samp-tt-var), [mark](#mark), [q](#q), [s](#del-s-strike), [samp](#code-kbd-samp-tt-var), [small](#big-small), [span](#span), [strike](#del-s-strike), [strong](#b-strong), [sub](#sub-sup), [sup](#sub-sup), [tt](#code-kbd-samp-tt-var), [u](#ins-u), [var](#code-kbd-samp-tt-var) ### Blocks [article](#div) [div](#div) [p](#p) [section](#div) [table](#table) [ul/ol](#lists) #### a Links to anchors in the same document are displayed as normal text.
Links to references starting with `http://` or `https://` will be displayed and behave as links. #### abbr, acronym, bdo, bdi Displayed as normal text #### b, strong Displayed as normal text, styled bold #### big, small Displayed as normal text. Future implementation: Make the text bigger or smaller than the normal one. #### cite, dfn, em, i Displayed as normal text, styled italic #### code, kbd, samp, tt, var Displayed as normal text. Future implementation: Assign a monospaced style, most likely using *Courier New* since it's one of the few monospaced fonts shipped with Word #### del, s, strike Displayed as normal text with strikethrough effect. #### div Divs, sections and articles are displayed as paragraphs. The only style supported at the moment is alignment. See the wiki on [styles](https://github.com/karnov/htmltoword/wiki/Styles-and-classes) for more #### ins, u Displayed as normal text, using single underline style. #### lists List support is very basic. See the wiki on [styles](https://github.com/karnov/htmltoword/wiki/Styles-and-classes) for more. Tables inside lists are NOT supported #### mark Displayed as normal text highlighted in yellow. See [span](#span) for more colors. #### q Text within `text` will be wrapped with double quotes `"text"` #### span Displayed as normal text. When using `class="h" data-color="selected_color"` the text will be highlighted in the color specified. See the wiki on [styles](https://github.com/karnov/htmltoword/wiki/Styles-and-classes) for a list of colors #### sub, sup Displayed as normal text with subscript or superscript effect. #### p Displayed as paragraphs. The only style supported at the moment is alignment. See the wiki on [styles](https://github.com/karnov/htmltoword/wiki/Styles-and-classes) for more #### table Basic support for tables. 
See the wiki on [styles](https://github.com/karnov/htmltoword/wiki/Styles-and-classes) for more ================================================ FILE: htmltoword.gemspec ================================================ # coding: utf-8 lib = File.expand_path('../lib', __FILE__) $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib) require 'htmltoword/version' Gem::Specification.new do |spec| spec.name = "htmltoword" spec.version = Htmltoword::VERSION spec.authors = ["Nicholas Frandsen, Cristina Matonte"] spec.email = ["nick.rowe.frandsen@gmail.com, anitsirc1@gmail.com"] spec.description = %q{Convert html to word docx document.} spec.summary = %q{This simple gem allows you to create MS Word docx documents from simple html documents. This makes it easy to create dynamic reports and forms that can be downloaded by your users as simple MS Word docx files.} spec.homepage = "http://github.com/karnov/htmltoword" spec.license = "MIT" spec.files = Dir.glob("lib/**/*.{rb,xslt,docx}") + %w{ bin/htmltoword README.md Rakefile } spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) } spec.require_paths = ["lib"] spec.add_dependency "actionpack" spec.add_dependency "nokogiri" spec.add_dependency "rubyzip", ">= 1.0" spec.add_development_dependency "rspec" spec.add_development_dependency "bundler" spec.add_development_dependency "rake" spec.add_development_dependency "methadone" spec.add_development_dependency "rmultimarkdown" end ================================================ FILE: lib/htmltoword/configuration.rb ================================================ module Htmltoword class Configuration attr_accessor :default_templates_path, :custom_templates_path, :default_xslt_path, :custom_xslt_path def initialize @default_templates_path = File.join(File.expand_path('../', __FILE__), 'templates') @custom_templates_path = File.join(File.expand_path('../', __FILE__), 'templates') @default_xslt_path = File.join(File.expand_path('../', __FILE__), 'xslt') 
@custom_xslt_path = File.join(File.expand_path('../', __FILE__), 'xslt') end end end ================================================ FILE: lib/htmltoword/document.rb ================================================ module Htmltoword class Document include XSLTHelper class << self include TemplatesHelper def create(content, template_name = nil, extras = false) template_name += extension if template_name && !template_name.end_with?(extension) document = new(template_file(template_name)) document.replace_files(content, extras) document.generate end def create_and_save(content, file_path, template_name = nil, extras = false) File.open(file_path, 'wb') do |out| out << create(content, template_name, extras) end end def create_with_content(template, content, extras = false) template += extension unless template.end_with?(extension) document = new(template_file(template)) document.replace_files(content, extras) document.generate end def extension '.docx' end def doc_xml_file 'word/document.xml' end def numbering_xml_file 'word/numbering.xml' end def relations_xml_file 'word/_rels/document.xml.rels' end def content_types_xml_file '[Content_Types].xml' end end def initialize(template_path) @replaceable_files = {} @template_path = template_path @image_files = [] end # # Generate a string representing the contents of a docx file. # def generate Zip::File.open(@template_path) do |template_zip| buffer = Zip::OutputStream.write_buffer do |out| template_zip.each do |entry| out.put_next_entry entry.name if @replaceable_files[entry.name] && entry.name == Document.doc_xml_file source = entry.get_input_stream.read # Change only the body of document. TODO: Improve this... 
source = source.sub(/()((.|\n)*?)(\s+<')) source = xslt(stylesheet_name: 'cleanup').transform(original_source) transform_and_replace(source, xslt_path('numbering'), Document.numbering_xml_file) transform_and_replace(source, xslt_path('relations'), Document.relations_xml_file) transform_doc_xml(source, extras) local_images(source) end def transform_doc_xml(source, extras = false) transformed_source = xslt(stylesheet_name: 'cleanup').transform(source) transformed_source = xslt(stylesheet_name: 'inline_elements').transform(transformed_source) transform_and_replace(transformed_source, document_xslt(extras), Document.doc_xml_file, extras) end private def transform_and_replace(source, stylesheet_path, file, remove_ns = false) stylesheet = xslt(stylesheet_path: stylesheet_path) content = stylesheet.apply_to(source) content.gsub!(/\s*xmlns:(\w+)="(.*?)\s*"/, '') if remove_ns @replaceable_files[file] = content end #generates an array of hashes with filename and full url #for all images to be embeded in the word document def local_images(source) source.css('img').each_with_index do |image,i| filename = image['data-filename'] ? image['data-filename'] : image['src'].split("/").last ext = File.extname(filename).delete(".").downcase @image_files << { filename: "image#{i+1}.#{ext}", url: image['src'], ext: ext } end end #get extension from filename and clean to match content_types def content_type_from_extension(ext) ext == "jpg" ? "jpeg" : ext end #inject the required content_types into the [content_types].xml file... 
def inject_image_content_types(source) doc = Nokogiri::XML(source) #get a list of all extensions currently in content_types file existing_exts = doc.css("Default").map { |node| node.attribute("Extension").value }.compact #get a list of extensions we need for our images required_exts = @image_files.map{ |i| i[:ext] } #workout which required extensions are missing from the content_types file missing_exts = (required_exts - existing_exts).uniq #inject missing extensions into document missing_exts.each do |ext| doc.at_css("Types").add_child( "") end #return the amended source to be saved into the zip doc.to_s end end end ================================================ FILE: lib/htmltoword/helpers/templates_helper.rb ================================================ module Htmltoword module TemplatesHelper def template_file(template_file_name = nil) default_path = File.join(::Htmltoword.config.default_templates_path, 'default.docx') template_path = template_file_name.nil? ? '' : File.join(::Htmltoword.config.custom_templates_path, template_file_name) File.exist?(template_path) ? template_path : default_path end end end ================================================ FILE: lib/htmltoword/helpers/xslt_helper.rb ================================================ module Htmltoword module XSLTHelper def document_xslt(extras = false) file_name = extras ? 'htmltoword' : 'base' xslt_path(file_name) end def xslt_path(template_name) File.join(Htmltoword.config.default_xslt_path, "#{template_name}.xslt") end def xslt(stylesheet_name: nil, stylesheet_path: nil) return Nokogiri::XSLT(File.open(stylesheet_path)) if stylesheet_path Nokogiri::XSLT(File.open(xslt_path(stylesheet_name))) end end end ================================================ FILE: lib/htmltoword/railtie.rb ================================================ module Htmltoword class Railtie < ::Rails::Railtie initializer 'htmltoword.setup' do if defined?(Mime) and Mime[:docx].nil? 
        Mime::Type.register 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', :docx
      end

      ActionController::Renderers.add :docx do |file_name, options|
        Htmltoword::Renderer.send_file(self, file_name, options)
      end

      if defined? ActionController::Responder
        ActionController::Responder.class_eval do
          def to_docx
            if @default_response
              @default_response.call(options)
            else
              controller.render({ docx: controller.action_name }.merge(options))
            end
          end
        end
      end
    end
  end
end
================================================
FILE: lib/htmltoword/renderer.rb
================================================
module Htmltoword
  class Renderer
    class << self
      def send_file(context, filename, options = {})
        new(context, filename, options).send_file
      end
    end

    def initialize(context, filename, options)
      @word_template = options[:word_template].presence
      @disposition = options.fetch(:disposition, 'attachment')
      @use_extras = options.fetch(:extras, false)
      @file_name = file_name(filename, options)
      @context = context
      define_template(filename, options)
      @content = options[:content] || @context.render_to_string(options)
    end

    def send_file
      document = Htmltoword::Document.create(@content, @word_template, @use_extras)
      @context.send_data(document, filename: @file_name, type: Mime[:docx], disposition: @disposition)
    end

    private

    def define_template(filename, options)
      if options[:template] == @context.action_name
        if filename =~ %r{^([^\/]+)/(.+)$}
          options[:prefixes] ||= []
          options[:prefixes].unshift $1
          options[:template] = $2
        else
          options[:template] = filename
        end
      end
    end

    def file_name(filename, options)
      name = options[:filename].presence || filename
      name =~ /\.docx$/ ?
name : "#{name}.docx" end end end ================================================ FILE: lib/htmltoword/version.rb ================================================ module Htmltoword VERSION = '1.1.1' end ================================================ FILE: lib/htmltoword/xslt/base.xslt ================================================ KNOWN BUGS: div h2 div textnode (WONT BE WRAPPED IN A W:P) div table span text Divs should create a p if nothing above them has and nothing below them will In the following situation: div h2 span textnode span textnode p The div template will not create a w:p because the div contains a h2. Therefore we need to wrap the inline elements span|a|small in a p here. This template adds images. In the following situation: div h2 textnode p The div template will not create a w:p because the div contains a h2. Therefore we need to wrap the textnode in a p here. This template adds MS Word highlighting ability. magenta cyan darkYellow center right left both ================================================ FILE: lib/htmltoword/xslt/cleanup.xslt ================================================ ""
================================================ FILE: lib/htmltoword/xslt/extras.xslt ================================================ ================================================ FILE: lib/htmltoword/xslt/functions.xslt ================================================ rId ================================================ FILE: lib/htmltoword/xslt/htmltoword.xslt ================================================ ================================================ FILE: lib/htmltoword/xslt/image_functions.xslt ================================================ Unrecognized unit of measure: . 1 ================================================ FILE: lib/htmltoword/xslt/images.xslt ================================================ Picture ================================================ FILE: lib/htmltoword/xslt/inline_elements.xslt ================================================
================================================
FILE: lib/htmltoword/xslt/links.xslt
================================================
================================================
FILE: lib/htmltoword/xslt/numbering.xslt
================================================
lowerLetter upperLetter lowerRoman upperRoman none decimal bullet,● bullet,o bullet,■ none decimal bullet,● 1
================================================
FILE: lib/htmltoword/xslt/relations.xslt
================================================
External media/image.
================================================
FILE: lib/htmltoword/xslt/style2.xslt
================================================
Im block
================================================
FILE: lib/htmltoword/xslt/tables.xslt
================================================
6 0 none single 000000 6 0
================================================
FILE: lib/htmltoword.rb
================================================
# encoding: UTF-8
require 'nokogiri'
require 'zip'
require 'open-uri'
require_relative 'htmltoword/configuration'

module Htmltoword
  class << self
    def configure
      yield configuration
    end

    def configuration
      @configuration ||= Configuration.new
    end

    alias_method :config, :configuration
  end
end

require_relative 'htmltoword/version'
require_relative 'htmltoword/helpers/templates_helper'
require_relative 'htmltoword/helpers/xslt_helper'
require_relative 'htmltoword/document'

if defined?(Rails)
  require_relative 'htmltoword/renderer'
  require_relative 'htmltoword/railtie'
end
================================================
FILE: script/build-template
================================================
#!/bin/sh
set -e

cwd=$(pwd)
path="$(pwd)/lib/htmltoword/templates/default.docx"
rm -f $path
cd templates/default
# Zip options:
#   r - recursive
#   l - convert LF to CR LF line endings
#   D - Don't make entries for directories
#   T - test the resulting zip
#   9 - Moar compression
#   q - quiet
#   o - make the file as old as the latest entry
zip -r -l -D -T -9 -q -o $path *
cd $cwd
================================================
FILE: script/extract-template
================================================
#!/bin/sh
set -e

rm -rf templates/default
unzip lib/htmltoword/templates/default.docx -d templates/default
================================================
FILE: script/setup
================================================
#!/usr/bin/env sh

if ! which xmllint >/dev/null 2>&1; then
  echo "Please install xmllint"
  exit 1
fi

for f in clean smudge; do
  if ! git config --get "filter.xml-c14n.$f" >/dev/null 2>&1; then
    git config --add "filter.xml-c14n.$f" "xmllint --c14n11 -"
  fi
done
================================================
FILE: spec/document_spec.rb
================================================
require 'spec_helper'
require 'securerandom'

describe Htmltoword::Document do
  describe "local_images" do
    let(:html) do
      <<-EOL
      Hello!

      EOL
    end

    #in order to test if the image files are embedded correctly we have to create_and_save, then open the zip and look for them. Then ensure that we clean up the created file.
    #There is probably a better way of doing it as this seems very performance heavy but does have the advantage of testing the create method all the way through...
    it "should only embed local images" do
      filename = SecureRandom.urlsafe_base64
      begin
        docx = Htmltoword::Document.create_and_save(html, tmp_path(filename))
        Zip::File.open(tmp_path(filename)) do |zip_file|
          # Find specific entry
          expect(zip_file.glob('word/media/*').size).to eq 6
        end
      ensure
        File.delete(tmp_path(filename))
      end
    end
  end
end
================================================
FILE: spec/fixtures/complex/nestings.html
================================================

My document

Fake TOC

Section 1

Link

By Author

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis.

1. Sect

Etiam convallis ut felis a cursus. link 1 and 28, idet aktier

title 2

Lorem ipsum dolor sit amet, consectetur adipiscing elit. :[1]

1,1

1,2

1,3

= 1,4

1,5

1,6

100

100

Text

Lorem

ipsum

bla

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Something beautiful

1. Funny spans and links, in , here

Table list

With a p and no class*

Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis.

Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget.

Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae,

2. Div h1

lobortis nec dui. Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel.

Noter


Section 2

Placeholder Image

Sec 2

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus

• Fake item 1. Beløbet skal være indbetalt til den finansielle virksomhed

• Fake item 2, long one. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui. Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel.

“… Quoted text. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. …”.


[1] Jf. Ref 1 lorem ipsum, and Ref 2 Lorem ipsum dolor sit amet, consectetur adipiscing Italics: elit. Etiam convallis ut felis a cursus: Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Ref 3 ed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget.

 

[2] Jf. Ref 4 Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui

 


================================================ FILE: spec/fixtures/complex/nestings.xml ================================================ My document Fake TOC Section 1 (Sec1) Section 2 (Sec2) Section 1 Link By Author Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. 1. Sect Etiam convallis ut felis a cursus. link 1 and 28 , idet aktier title 2 Lorem ipsum dolor sit amet, consectetur adipiscing elit. : [1] 1,1 1,2 1,3 = 1,4 1,5 1,6 100 100 Text Lorem ipsum bla Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. 1. Funny spans and links , in , here Table list With a p and no class * Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, 2. Div h1 lobortis nec dui. Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel. Noter Section 2 Sec 2 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus • Fake item 1. Beløbet skal være indbetalt til den finansielle virksomhed • Fake item 2, long one. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui. Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel. 
“… Quoted text. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. …”. [1] Jf. Ref 1 lorem ipsum, and Ref 2 Lorem ipsum dolor sit amet, consectetur adipiscing Italics : elit. Etiam convallis ut felis a cursus : Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Ref 3 ed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget.   [2] Jf. Ref 4 Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui   ================================================ FILE: spec/fixtures/description_lists/test01.html ================================================
Firefox
A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers.
================================================ FILE: spec/fixtures/description_lists/test01.xml ================================================ Firefox A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers. ================================================ FILE: spec/fixtures/description_lists/test02.html ================================================
Firefox
Mozilla Firefox
Fx
A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers.
================================================ FILE: spec/fixtures/description_lists/test02.xml ================================================ Firefox Mozilla Firefox Fx A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers. ================================================ FILE: spec/fixtures/description_lists/test03.html ================================================
Firefox
A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers.
The Red Panda also known as the Lesser Panda, Wah, Bear Cat or Firefox, is a mostly herbivorous mammal, slightly larger than a domestic cat (60 cm long).
================================================ FILE: spec/fixtures/description_lists/test03.xml ================================================ Firefox A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers. The Red Panda also known as the Lesser Panda, Wah, Bear Cat or Firefox, is a mostly herbivorous mammal, slightly larger than a domestic cat (60 cm long). ================================================ FILE: spec/fixtures/description_lists/test04.html ================================================

Heading

Some text
Firefox
Mozilla Firefox
Fx
A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers.
The Red Panda also known as the Lesser Panda, Wah, Bear Cat or Firefox, is a mostly herbivorous mammal, slightly larger than a domestic cat (60 cm long).
================================================ FILE: spec/fixtures/description_lists/test04.xml ================================================ Heading Some text Firefox Mozilla Firefox Fx A free, open source, cross-platform, graphical web browser developed by the Mozilla Corporation and hundreds of volunteers. The Red Panda also known as the Lesser Panda, Wah, Bear Cat or Firefox, is a mostly herbivorous mammal, slightly larger than a domestic cat (60 cm long). ================================================ FILE: spec/fixtures/lists/lists_inline_elements.html ================================================
  • Some text in a span and some more text
    Text in a new line in div
  • Some text ( link ) and some more text
    Text in a new line in div
  • Some text in small and again normal
  • Some text in strong and again normal

    New paragraph

  • Some text in em and normal text

    New paragraph

  • Some text in italics and normal text

    New paragraph

  • Some text in bold and normal text

    New paragraph

  • Some text underlined and normal text

    New paragraph

================================================
FILE: spec/fixtures/lists/lists_inline_elements.xml
================================================
Some text in a span and some more text Text in a new line in div Some text ( link ) and some more text Text in a new line in div Some text in small and again normal Some text in strong and again normal New paragraph Some text in em and normal text New paragraph Some text in italics and normal text New paragraph Some text in bold and normal text New paragraph Some text underlined and normal text New paragraph
================================================
FILE: spec/spec_helper.rb
================================================
require 'rubygems'
require 'bundler/setup'
require 'htmltoword'

include Htmltoword::XSLTHelper
include Htmltoword::TemplatesHelper

def compare_transformed_files(test:, test_file_name:, extras: false)
  source = File.read(fixture_path(test, test_file_name, :html), encoding: 'utf-8')
  expected_content = File.read(fixture_path(test, test_file_name, :xml), encoding: 'utf-8')
  compare_resulting_wordml_with_expected(source, expected_content, extras: extras)
end

def compare_resulting_wordml_with_expected(html, resulting_wordml, extras: false)
  source = Nokogiri::HTML(html.gsub(/>\s+</, '><'))
  result = Htmltoword::Document.new(template_file(nil)).transform_doc_xml(source, extras)
  result.gsub!(/\s*<!--(.*?)-->\s*/m, '')
  result = remove_declaration(result)
  expect(remove_whitespace(result)).to eq(remove_whitespace(resulting_wordml))
end

def compare_numbering_xml(html, expected_xml)
  source = Nokogiri::HTML(html.gsub(/>\s+</, '><'))
  stylesheet = xslt(stylesheet_name: 'numbering')
  result = stylesheet.transform(source)
  result.xpath('//comment()').remove
  result = remove_declaration(result.to_s)
  expect(remove_whitespace(result.to_s)).to eq(remove_whitespace(expected_xml))
end

def compare_relations_xml(html, expected_xml)
  source = Nokogiri::HTML(html.gsub(/>\s+</, '><'))
  stylesheet = xslt(stylesheet_name: 'relations')
  result =
    stylesheet.transform(source)
  result.xpath('//comment()').remove
  result = remove_declaration(result.to_s)
  expect(remove_whitespace(result.to_s)).to eq(remove_whitespace(expected_xml))
end

def check_link_text(html, resulting_wordml, extras: false)
  source = Nokogiri::HTML(html.gsub(/>\s+</, '><'))
  result = Htmltoword::Document.new(template_file(nil)).transform_doc_xml(source, extras)
  result.gsub!(/\s*<!--(.*?)-->\s*/m, '')
  result = remove_declaration(result)
  expect(remove_whitespace_for_link_check(result)).to eq(remove_whitespace_for_link_check(resulting_wordml))
end

private

def fixture_path(folder, file_name, extension)
  File.join(File.dirname(__FILE__), 'fixtures', folder, "#{file_name}.#{extension}")
end

#used to temporarily save documents for testing
def tmp_path(filename)
  File.join(File.dirname(__FILE__), 'tmp', "#{filename}.docx")
end

def compare_content_of_body?(wordml)
  wordml !~ />)\s+|\s+(?<)/, '\k').strip
end

def remove_whitespace_for_link_check(wordml)
  wordml.gsub(/^\s+/, '')
end

def remove_declaration(wordml)
  wordml.sub(/<\?xml (.*?)>/, '').gsub(/\s*xmlns:(\w+)="(.*?)\s*"/, '')
end
================================================
FILE: spec/tmp/.gitignore
================================================
*
!.gitignore
================================================
FILE: spec/xslt_alignment_spec.rb
================================================
require 'spec_helper'

describe "XSLT to align div, p and td tags" do
  it "transforms a p element with the correct alignment." do
    html = <<-EOL

p using text-aligned center

p using text-aligned right

p using text-aligned left

Praesent commodo leo et tincidunt tincidunt. Aliquam vestibulum vehicula accumsan. In suscipit nunc vitae facilisis mattis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Proin fringilla, odio in rhoncus tincidunt, mauris lectus gravida nibh, ac consectetur est arcu a turpis. Proin sodales tellus imperdiet, auctor ante sed, pulvinar nisl. Aenean ultricies elementum leo, in mattis dolor dapibus feugiat. Nunc scelerisque nec purus ac tempus. Praesent at velit ac ipsum hendrerit auctor. Nam dui nunc, ultrices quis aliquet in, pellentesque quis diam. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.

p class center and strong

p class right and strong

p class left and strong

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse ornare sem at sapien accumsan, in pellentesque elit consectetur. Mauris quis dui a magna scelerisque ornare. Vivamus scelerisque sollicitudin ante, auctor dictum nunc iaculis sed. Nullam lobortis ligula odio, a dapibus augue malesuada eget. Nam ac justo nunc. Vestibulum tristique diam sit amet ornare maximus. Duis sit amet libero elit. Proin massa nunc, rutrum nec odio et, hendrerit egestas dui. Donec sit amet aliquam libero.

Just a p

EOL expected_wordml = <<-EOL p using text-aligned center p using text-aligned right p using text-aligned left Praesent commodo leo et tincidunt tincidunt. Aliquam vestibulum vehicula accumsan. In suscipit nunc vitae facilisis mattis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Proin fringilla, odio in rhoncus tincidunt, mauris lectus gravida nibh, ac consectetur est arcu a turpis. Proin sodales tellus imperdiet, auctor ante sed, pulvinar nisl. Aenean ultricies elementum leo, in mattis dolor dapibus feugiat. Nunc scelerisque nec purus ac tempus. Praesent at velit ac ipsum hendrerit auctor. Nam dui nunc, ultrices quis aliquet in, pellentesque quis diam. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. p class center and strong p class right and strong p class left and strong Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse ornare sem at sapien accumsan, in pellentesque elit consectetur. Mauris quis dui a magna scelerisque ornare. Vivamus scelerisque sollicitudin ante, auctor dictum nunc iaculis sed. Nullam lobortis ligula odio, a dapibus augue malesuada eget. Nam ac justo nunc. Vestibulum tristique diam sit amet ornare maximus. Duis sit amet libero elit. Proin massa nunc, rutrum nec odio et, hendrerit egestas dui. Donec sit amet aliquam libero. Just a p EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms a table having its cells properly aligned" do html = <<-EOL
Left aligned text Right aligned text Center aligned text Justify aligned text
To the left To the right Centered Justified
To the left To the right Centered Justified
EOL expected_wordml = <<-EOL Left aligned text Right aligned text Center aligned text Justify aligned text To the left To the right Centered Justified To the left To the right Centered Justified EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms div element with the correct alignment" do html = <<-EOL
div using text-aligned center
div using text-aligned right
div using text-aligned left
Praesent commodo leo et tincidunt tincidunt. Aliquam vestibulum vehicula accumsan. In suscipit nunc vitae facilisis mattis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Proin fringilla, odio in rhoncus tincidunt, mauris lectus gravida nibh, ac consectetur est arcu a turpis. Proin sodales tellus imperdiet, auctor ante sed, pulvinar nisl. Aenean ultricies elementum leo, in mattis dolor dapibus feugiat. Nunc scelerisque nec purus ac tempus. Praesent at velit ac ipsum hendrerit auctor. Nam dui nunc, ultrices quis aliquet in, pellentesque quis diam. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos.
div class center and strong
div class right and strong
div class left and strong
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse ornare sem at sapien accumsan, in pellentesque elit consectetur. Mauris quis dui a magna scelerisque ornare. Vivamus scelerisque sollicitudin ante, auctor dictum nunc iaculis sed. Nullam lobortis ligula odio, a dapibus augue malesuada eget. Nam ac justo nunc. Vestibulum tristique diam sit amet ornare maximus. Duis sit amet libero elit. Proin massa nunc, rutrum nec odio et, hendrerit egestas dui. Donec sit amet aliquam libero.
Just a div
EOL expected_wordml = <<-EOL div using text-aligned center div using text-aligned right div using text-aligned left Praesent commodo leo et tincidunt tincidunt. Aliquam vestibulum vehicula accumsan. In suscipit nunc vitae facilisis mattis. Interdum et malesuada fames ac ante ipsum primis in faucibus. Proin fringilla, odio in rhoncus tincidunt, mauris lectus gravida nibh, ac consectetur est arcu a turpis. Proin sodales tellus imperdiet, auctor ante sed, pulvinar nisl. Aenean ultricies elementum leo, in mattis dolor dapibus feugiat. Nunc scelerisque nec purus ac tempus. Praesent at velit ac ipsum hendrerit auctor. Nam dui nunc, ultrices quis aliquet in, pellentesque quis diam. Class aptent taciti sociosqu ad litora torquent per conubia nostra, per inceptos himenaeos. div class center and strong div class right and strong div class left and strong Lorem ipsum dolor sit amet, consectetur adipiscing elit. Suspendisse ornare sem at sapien accumsan, in pellentesque elit consectetur. Mauris quis dui a magna scelerisque ornare. Vivamus scelerisque sollicitudin ante, auctor dictum nunc iaculis sed. Nullam lobortis ligula odio, a dapibus augue malesuada eget. Nam ac justo nunc. Vestibulum tristique diam sit amet ornare maximus. Duis sit amet libero elit. Proin massa nunc, rutrum nec odio et, hendrerit egestas dui. Donec sit amet aliquam libero. Just a div EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms nested divs with proper alignment" do # TODO: Known bug, not implemented yet. #
Something
else
-> else won’t be centered. end it "transforms articles with proper alignment" do html = <<-EOL
article using text-aligned center
article using text-aligned right
article class left and strong
Just an article
EOL expected_wordml = <<-EOL article using text-aligned center article using text-aligned right article class left and strong Just an article EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms sections with proper alignment" do html = <<-EOL
section using text-aligned center
section using text-aligned right
section class left and strong
Just an section
    EOL
    expected_wordml = <<-EOL
section using text-aligned center section using text-aligned right section class left and strong Just an section
    EOL
    compare_resulting_wordml_with_expected(html, expected_wordml.strip)
  end
end
================================================
FILE: spec/xslt_breaks_spec.rb
================================================
require 'spec_helper'

describe "XSLT to transform br tags" do
  it "transforms br tags children of document body" do
    html = <<-EOL

Paragraph 1


Paragraph 2

Paragraph 3

Paragraph 4
EOL expected_wordml = <<-EOL Paragraph 1 Paragraph 2 Paragraph 3 Paragraph 4 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms br tags inside a br or p" do html = <<-EOL

Lorem ipsum 1

Lorem ipsum 2
Lorem ipsum 3

Lorem ipsum 4
Lorem ipsum 5
Lorem ipsum 6
Lorem ipsum 7

Lorem ipsum 6
Lorem ipsum 7


Lorem ipsum 8


Lorem ipsum 9
Lorem ipsum 10

EOL expected_wordml = <<-EOL Lorem ipsum 1 Lorem ipsum 2 Lorem ipsum 3 Lorem ipsum 4 Lorem ipsum 5 Lorem ipsum 6 Lorem ipsum 7 Lorem ipsum 6 Lorem ipsum 7 Lorem ipsum 8 Lorem ipsum 9 Lorem ipsum 10 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms br tags inside lists" do html = <<-EOL
  1. Text
    new line
  2. Text inside div
    new line
  3. Text inside p
    new line

  4. Some text
    Text in div
EOL expected_wordml = <<-EOL Text new line Text inside div new line Text inside p new line Some text Text in div EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms br tags inside table cells" do html = <<-EOL
Inside a div
in a cell
Inside a cell
no div
  • Text
    new line
  • Text inside div
    new line
  • Text inside p
    new line

Some text

Text inside p

EOL expected_wordml = <<-EOL Inside a div in a cell Inside a cell no div Text new line Text inside div new line Text inside p new line Some text Text inside p EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end end ================================================ FILE: spec/xslt_complex_spec.rb ================================================ require 'spec_helper' describe 'Bigger and a bit more complex documents' do it 'transforms documents with nested tables and normal lists and extra classes' do html = <<-EOL

My Document

Section 1

 

Subsection 1.1
Lorem
Ipsum: #
dolor: #
sit: #
amet: #
consectetur: #
adipiscing: #
elit
NULLAM AMET

Subsection 1.2

Subsection 1.2.1

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam amet.
Lorem 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. Quisque luctus amet. 
Fake list
(a) Lorem ipsum dolor sit amet, consectetur posuere. 
 
(b) Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi.  
 
(c) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehic 
List
  • Lorem ipsum
  • dolor sit amet,
  • consectetur adipiscing elit

 

Nullam ipsum in magna

 

Consectetus adipiscing elit.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. Quisque luctus amet.
  • (a) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi.
  • (b) Lorem ipsum dolor sit amet
  • (c) Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Nullam in magna ut nulla efficitur scelerisque.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque. Sed scelerisque, ante ac fringilla porta, mi ipsum rhoncus amet.
Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque.

 

æåø

 

ÁÑÇ ÉÍóü
Œ Æ
  • (a) Lorem ipsum dolor sit amet
  • (b) Lorem ipsum dolor sit amet, consectetur adipiscing elit.
  • (c) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque. Sed scelerisque, ante ac fringilla porta, mi ipsum rhoncus amet.
  • (d) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi. Vivamus sed id.
  • (e) Vivamus sed id.
  • (f) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi.
  • (g) Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi. Vivamus sed id.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa.

Clause

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Låneavtal # svenska kronor daterat den # 20#
1. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc suscipit gravida arcu, quis auctor turpis.
2. Lorem ipsum dolor sit amet, consectetur adipiscing elit. [Nunc suscipit gravida arcu].
3. Datum: #
4. Lorem ipsum dolor sit amet #
5. Nunc suscipit:
  • Lorem ipsum dolor sit amet: #
  • Nam volutpat: #
  • Consectetur adipiscing elit. Nam volutpat: #
  • Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam volutpat. #
  • Lorem ipsum dolor sit amet, [consectetur]: #
  • [Lorem ipsum] dolor sit amet. Nam volutpat: [ ]
Place/Date 
______________________ 
EOL expected_wordml = <<-EOL My Document Section 1   Subsection 1.1 Lorem Ipsum: # dolor: # sit: # amet: # consectetur: # adipiscing: # elit NULLAM AMET Subsection 1.2 Subsection 1.2.1 Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam amet. Lorem  Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. Quisque luctus amet.  Fake list (a) Lorem ipsum dolor sit amet, consectetur posuere.    (b) Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi.     (c) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehic  List Lorem ipsum dolor sit amet, consectetur adipiscing elit   Nullam ipsum in magna   Consectetus adipiscing elit. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. Quisque luctus amet. (a) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. (b) Lorem ipsum dolor sit amet (c) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nullam in magna ut nulla efficitur scelerisque. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Fusce volutpat nunc at nulla scelerisque vehicula. Vestibulum id enim nisi. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque. Sed scelerisque, ante ac fringilla porta, mi ipsum rhoncus amet. Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque.   æåø   ÁÑÇ ÉÍóü Œ Æ (a) Lorem ipsum dolor sit amet (b) Lorem ipsum dolor sit amet, consectetur adipiscing elit. (c) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent aliquam ornare augue. Nullam in magna ut nulla efficitur scelerisque. Sed scelerisque, ante ac fringilla porta, mi ipsum rhoncus amet. 
(d) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi. Vivamus sed id. (e) Vivamus sed id. (f) Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi. (g) Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Donec pretium, dui tincidunt vestibulum interdum, dui ex commodo turpis, eget scelerisque orci lacus id mi. Vivamus sed id. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis sit amet arcu augue. Quisque ultricies nec justo a blandit. Sed scelerisque turpis felis. Integer id mauris massa. Clause Lorem ipsum dolor sit amet, consectetur adipiscing elit. Låneavtal # svenska kronor daterat den # 20# 1. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc suscipit gravida arcu, quis auctor turpis. 2. Lorem ipsum dolor sit amet, consectetur adipiscing elit. [Nunc suscipit gravida arcu]. 3. Datum: # 4. Lorem ipsum dolor sit amet # 5. Nunc suscipit: Lorem ipsum dolor sit amet: # Nam volutpat: # Consectetur adipiscing elit. Nam volutpat: # Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nam volutpat. # Lorem ipsum dolor sit amet, [consectetur]: # [Lorem ipsum] dolor sit amet. Nam volutpat: [ ] Place/Date  ______________________  EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it 'transforms documents with multiple pages and highlight and nested things' do html = <<-EOL

trial bundle

Indholdsfortegnelse

Section 1

§ 2

Lorem ipsum: Link 1

Excepteur sint occaecat cupidatat non proident:

Consectetur adipiscing elit
  1. 1)sed do eiusmod tempor incididunt ut labore et dolore magna aliqua,
  2. 6)quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat Link. 2 Duis aute.

Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae.

Mauris felis massa, malesuada a aliquet sed, volutpat sed sapien. Sed porttitor ex in nisi commodo varius. Ut ligula enim, mollis at mi eget, elementum imperdiet augue.
Stk. 2. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui. ligningslovens § 16 H, stk. 6, Donec malesuada dictum sagittis Text on link 3 or Link 4, Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel.
Stk. 3. Pellentesque nisi tortor, fermentum nec iaculis non, efficitur eu ex. Donec volutpat felis at turpis accumsan, nec interdum sem porta. Sed sed.
Stk. 6. Quisque finibus purus urna, ac condimentum purus sollicitudin non. In egestas vel libero non pharetra. Donec malesuada dictum sagittis.

Section 2

Notes

My note
― By Cristina Matonte

§ 4

Lorem ipsum: Section 2

Datum:

Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel.
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus..
Stk. 3. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis.

Section 3

Highlights on p

Pinky


EOL expected_wordml = <<-EOL trial bundle Indholdsfortegnelse Section 1 (Text section) Section 2 (With highlights) Section 3 (Highlight only) Section 1 § 2 Lorem ipsum: Link 1 Excepteur sint occaecat cupidatat non proident: 2013-01-01 Consectetur adipiscing elit 1) sed do eiusmod tempor incididunt ut labore et dolore magna aliqua, 6) quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat Link. 2 Duis aute. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae. Mauris felis massa, malesuada a aliquet sed, volutpat sed sapien. Sed porttitor ex in nisi commodo varius. Ut ligula enim, mollis at mi eget, elementum imperdiet augue. Stk. 2. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Sed id consectetur orci. Phasellus ultrices laoreet lectus, a laoreet magna accumsan eget. Curabitur erat mi, congue non turpis non, faucibus dapibus magna. Phasellus a accumsan tortor. Quisque quam purus, vehicula a auctor vitae, lobortis nec dui. ligningslovens § 16 H, stk. 6 , Donec malesuada dictum sagittis Text on link 3 or Link 4 , Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel. Stk. 3. Pellentesque nisi tortor, fermentum nec iaculis non, efficitur eu ex. Donec volutpat felis at turpis accumsan, nec interdum sem porta. Sed sed. Stk. 6. Quisque finibus purus urna, ac condimentum purus sollicitudin non. In egestas vel libero non pharetra. Donec malesuada dictum sagittis. Section 2 Notes My note ― By Cristina Matonte § 4 Lorem ipsum: Section 2 Datum: 2015-01-01 Aenean rutrum elementum nulla, ut consequat velit accumsan vitae. Duis iaculis nulla elit, quis rhoncus mi malesuada vel. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam convallis ut felis a cursus. 
. Stk. 3. Etiam sodales quis nisl ac elementum. Suspendisse egestas hendrerit diam sit amet mattis. Section 3 Highlights on p Pinky EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it 'transforms other nestings' do compare_transformed_files( test: 'complex', test_file_name: 'nestings', extras: false ) end end ================================================ FILE: spec/xslt_description_lists_spec.rb ================================================ require 'spec_helper' RSpec.describe 'XSLT supporting description lists' do it 'transforms correctly a single term and description' do compare_transformed_files( test: 'description_lists', test_file_name: 'test01', extras: false ) end it 'transforms correctly multiple terms, single description' do compare_transformed_files( test: 'description_lists', test_file_name: 'test02', extras: false ) end it 'transforms correctly a single term, multiple descriptions' do compare_transformed_files( test: 'description_lists', test_file_name: 'test03', extras: false ) end it 'transforms correctly a description list nested in divs' do compare_transformed_files( test: 'description_lists', test_file_name: 'test04', extras: false ) end end ================================================ FILE: spec/xslt_heading_spec.rb ================================================ require 'spec_helper' describe "XSLT for Headings" do it "transforms heading tags in the body" do html = <<-EOL

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6
EOL expected_wordml = <<-EOL Heading 1 Heading 2 Heading 3 Heading 4 Heading 5 Heading 6 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms heading tags in a div" do html = <<-EOL

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5
Heading 6
EOL expected_wordml = <<-EOL Heading 1 Heading 2 Heading 3 Heading 4 Heading 5 Heading 6 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms heading tags in a table" do html = <<-EOL

Heading 1

normal text

Heading 2

normal text

Heading 3

normal text

Heading 4

normal text
Heading 5
normal text
Heading 6
normal text
EOL expected_wordml = <<-EOL Heading 1 normal text Heading 2 normal text Heading 3 normal text Heading 4 normal text Heading 5 normal text Heading 6 normal text EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables with empty headers" do html = <<-EOL
Header 2
Cell 1,1 Cell 1,2
EOL expected_wordml = <<-EOL Header 2 Cell 1,1 Cell 1,2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables without tag on " do html = <<-EOL
Header 1
Cell 1,1 Cell 1,2
EOL expected_wordml = <<-EOL Header 1 Cell 1,1 Cell 1,2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables with border attribute and table-bordered class" do html = <<-EOL
Hello World
Using table-bordered class
Hello world Part 2
EOL expected_wordml = <<-EOL Hello World Using table-bordered class Hello world Part 2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms nested elements inside table cells" do html = <<-EOL
Pre H1

This is a H1

Post H1
Text

A paragraph with Strong text

More text
Some content inside a div
Something

Inside a p strong and strong em

Text inside div
EOL expected_wordml = <<-EOL Pre H1 This is a H1 Post H1 Text A paragraph with Strong text More text Some content inside a div Something Inside a p strong and strong em Text inside div EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end end ================================================ FILE: spec/xslt_images_spec.rb ================================================ require 'spec_helper' describe "XSLT to include images" do it "generates correct image wordml from html" do html = <<-EOL

Fancy image description

EOL expected_wordml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "generates correct relations" do html = <<-EOL

Link text.

Other link text 2.

EOL expected_relations_xml = <<-EOL EOL compare_relations_xml(html, expected_relations_xml) end end ================================================ FILE: spec/xslt_links_spec.rb ================================================ require 'spec_helper' describe "XSLT for Links" do it "transforms links and their relations" do html = <<-EOL This is internal link. EOL expected_wordml = <<-EOL This is internal link. Link text. Other link text. This is other internal link. Other link text 2. Other link text 3. Some text First item: List link text. EOL expected_relations_xml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) compare_relations_xml(html, expected_relations_xml) check_link_text(html, expected_wordml) end end ================================================ FILE: spec/xslt_lists_spec.rb ================================================ require 'spec_helper' describe "XSLT supporting lists" do it "transforms a simple ol list" do html = <<-EOL
  1. Item 1
  2. Item 2
EOL expected_wordml = <<-EOL Item 1 Item 2 EOL expected_numbering_xml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) compare_numbering_xml(html, expected_numbering_xml) end it "transforms a simple ul list" do html = <<-EOL
  • Item 1
  • Item 2
EOL expected_wordml = <<-EOL Item 1 Item 2 EOL expected_numbering_xml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) compare_numbering_xml(html, expected_numbering_xml) end it "transforms nested lists with styles" do html = <<-EOL
  1. Item 1
    • Sub item 1
    • Sub item 2
  2. Item 2
    1. Item a
      1. Item i
      2. Item ii
  3. Item 3
EOL expected_wordml = <<-EOL Item 1 Sub item 1 Sub item 2 Item 2 Item a Item i Item ii Item 3 EOL expected_numbering_xml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) compare_numbering_xml(html, expected_numbering_xml) end it 'restarts the counter on new list' do html = <<-EOL
  1. Item 1
  2. Item 2
  1. New Item 1
  2. New Item 2
EOL expected_wordml = <<-EOL Item 1 Item 2 New Item 1 New Item 2 EOL expected_numbering_xml = <<-EOL EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) compare_numbering_xml(html, expected_numbering_xml) end it 'handles inline elements inside a div' do compare_transformed_files( test: 'lists', test_file_name: 'lists_inline_elements', extras: false ) end end ================================================ FILE: spec/xslt_simple_text_style_spec.rb ================================================ require 'spec_helper' describe "XSLT to make text bold or italic" do it "transforms simple b, strong, em and italic tags" do html = <<-EOL
Testing bold, italic, strong, em text within a div.

Testing bold, italic, strong, em text within a p.

Testing bold, italic, strong, em text within a span. EOL expected_wordml = <<-EOL Testing bold , italic , strong , em text within a div. Testing bold , italic , strong , em text within a p. Testing bold , italic , strong , em text within a span. EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms all combinations of b, strong, em and italic within div and p" do html = <<-EOL

Combinations in p tag: Bold italic, Italic bold, Bold em, Em bold, Strong italic, Italic strong, Strong em, Em strong. Should be ok

More on combinations : Just bold. Bold italic. Again just bold, Just italic. Italic bold. Again just italic, Just bold. Bold em Again just bold, Just em. Em bold. Again just em, Just Strong. Strong italic. Again just strong, Just italic. Italic strong. Again just italic , Just Strong. Strong em. Again just strong, Just em. Em strong. Again just em. Should be ok
EOL expected_wordml = <<-EOL Combinations in p tag: Bold italic , Italic bold , Bold em , Em bold , Strong italic , Italic strong , Strong em , Em strong . Should be ok More on combinations : Just bold. Bold italic. Again just bold , Just italic. Italic bold. Again just italic , Just bold. Bold em Again just bold , Just em. Em bold. Again just em , Just Strong. Strong italic. Again just strong , Just italic. Italic strong. Again just italic , Just Strong. Strong em. Again just strong , Just em. Em strong. Again just em . Should be ok EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms b, strong, em and italic within table cells" do html = <<-EOL
Column 1 Column 2 Column 3
Row 1 Em text I text
Row 2 Text: Strong Strong em Strong Bold tag Bold italic More bold. End Text: Just em Em strong text More em text Italic tag Italic bold More italic. End
Row 2

Text: Strong Strong em Strong Bold tag Bold italic More bold. End

Td
Div Strong italic
Bold Em
Td Span
A div
Strong em
Td
A div
Span em
Td
A div
Italic strong
EOL expected_wordml = <<-EOL Column 1 Column 2 Column 3 Row 1 Em text I text Row 2 Text: Strong Strong em Strong Bold tag Bold italic More bold. End Text: Just em Em strong text More em text Italic tag Italic bold More italic. End Row 2 Text: Strong Strong em Strong Bold tag Bold italic More bold. End Td Div Strong italic Bold Em Td Span A div Strong em Td A div Span em Td A div Italic strong EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end end ================================================ FILE: spec/xslt_spec.rb ================================================ require 'spec_helper' describe "XSLT" do it "transforms an empty html doc into an empty docx doc" do html = '' compare_resulting_wordml_with_expected(html, " ") end it "transforms a div into a docx block element." do html = '
Hello
' compare_resulting_wordml_with_expected(html, " Hello ") end context "transform a span" do it "into a docx block element if child of body." do html = 'Hello' compare_resulting_wordml_with_expected(html, " Hello ") end it "into a docx inline element if not child of body." do html = '
Hello
' compare_resulting_wordml_with_expected(html, " Hello ") end end it "transforms a p into a docx block element." do html = '

Hello

' compare_resulting_wordml_with_expected(html, " Hello ") end it "Should strip out details tags" do html = '

Hello

Title

Second

' compare_resulting_wordml_with_expected(html, " Hello ") end it "Should transform a pre" do html = '
Lorem ipsum 
Boldie... Not boldie
dolor sit amet Italic
consectetur
adipiscing
elit link 

Title

Pre 1
Pre 2
Pre 3
' expected = ' Lorem ipsum Boldie... Not boldie dolor sit amet Italic consectetur adipiscing elit link Title Pre 1 Pre 2 Pre 3 ' compare_resulting_wordml_with_expected(html, expected) end end ================================================ FILE: spec/xslt_tables_spec.rb ================================================ require 'spec_helper' describe "XSLT for tables" do it "transforms a table into a tbl element" do html = <<-EOL
Hello
EOL expected_wordml = <<-EOL Hello EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms a nested table" do html = <<-EOL
Cell 1,1
Nested Table
Cell 1,3
EOL expected_wordml = <<-EOL Cell 1,1 Nested Table Cell 1,3 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables with empty cells" do html = <<-EOL
Cell 1,1
EOL expected_wordml = <<-EOL Cell 1,1 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables with empty headers" do html = <<-EOL
Header 2
Cell 1,1 Cell 1,2
EOL expected_wordml = <<-EOL Header 2 Cell 1,1 Cell 1,2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables without tag on " do html = <<-EOL
Header 1
Cell 1,1 Cell 1,2
EOL expected_wordml = <<-EOL Header 1 Cell 1,1 Cell 1,2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms tables with border attribute and table-bordered class" do html = <<-EOL
Hello World
Using table-bordered class
Hello world Part 2
EOL expected_wordml = <<-EOL Hello World Using table-bordered class Hello world Part 2 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "transforms nested elements inside table cells" do html = <<-EOL
Pre H1

This is a H1

Post H1
Text

A paragraph with Strong text

More text
Some content inside a div
Something

Inside a p strong and strong em

Text inside div
EOL expected_wordml = <<-EOL Pre H1 This is a H1 Post H1 Text A paragraph with Strong text More text Some content inside a div Something Inside a p strong and strong em Text inside div EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end it "handles cell borders" do html = <<-EOL
Sum total : 1.000.000
EOL expected_wordml = <<-EOL Sum total : 1.000.000 EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip, extras: false) end it "transforms a colspan into a gridSpan table cell property" do html = <<-EOL
Hello
EOL expected_wordml = <<-EOL Hello EOL compare_resulting_wordml_with_expected(html, expected_wordml.strip) end end ================================================ FILE: templates/default/[Content_Types].xml ================================================ ================================================ FILE: templates/default/_rels/.rels ================================================ ================================================ FILE: templates/default/docProps/app.xml ================================================ ================================================ FILE: templates/default/docProps/core.xml ================================================ ================================================ FILE: templates/default/word/_rels/document.xml.rels ================================================ ================================================ FILE: templates/default/word/document.xml ================================================ Heading1 HEADING2 HEADING3 HEADING4 normal one two three normal one two three ================================================ FILE: templates/default/word/fontTable.xml ================================================ ================================================ FILE: templates/default/word/numbering.xml ================================================ ================================================ FILE: templates/default/word/settings.xml ================================================ ================================================ FILE: templates/default/word/styles.xml ================================================ ================================================ FILE: templates/default/word/stylesWithEffects.xml ================================================ ================================================ FILE: templates/default/word/theme/theme1.xml ================================================ ================================================ FILE: templates/default/word/webSettings.xml 
================================================