Repository: parallax/svg-animation-tools
Branch: master
Commit: 3cbad1696760
Files: 24
Total size: 351.2 KB

Directory structure:
gitextract__k8xnxid/

├── LICENSE.md
├── README.md
├── example/
│   ├── animation.html
│   ├── output/
│   │   └── animation.html
│   └── parallax_svg_tools/
│       ├── bs4/
│       │   ├── __init__.py
│       │   ├── builder/
│       │   │   ├── __init__.py
│       │   │   ├── _html5lib.py
│       │   │   ├── _htmlparser.py
│       │   │   └── _lxml.py
│       │   ├── dammit.py
│       │   ├── diagnose.py
│       │   └── element.py
│       ├── run.py
│       └── svg/
│           └── __init__.py
└── parallax_svg_tools/
    ├── bs4/
    │   ├── __init__.py
    │   ├── builder/
    │   │   ├── __init__.py
    │   │   ├── _html5lib.py
    │   │   ├── _htmlparser.py
    │   │   └── _lxml.py
    │   ├── dammit.py
    │   ├── diagnose.py
    │   └── element.py
    ├── run.py
    └── svg/
        └── __init__.py

================================================
FILE CONTENTS
================================================

================================================
FILE: LICENSE.md
================================================
Copyright 2017 Parallax Agency Ltd

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


================================================
FILE: README.md
================================================
# Parallax SVG Animation Tools

A simple set of Python functions to help with animated SVGs exported from Illustrator. More features coming soon!
We used it to create animations like this.

[Viva La Velo](https://parall.ax/viva-le-velo)

![Viva La Velo intro animation](vlv-intro-gif.gif)


## Overview

Part of animating with SVGs is getting references to elements in code and passing them to animation functions. For complicated animations this becomes difficult, and hand-editing SVG code is slow; the edits are also overwritten whenever your artwork updates. We decided to write a post-processor for SVGs produced by Illustrator to help speed this up. Layer names are used to create attributes, classes and IDs, making it far easier to select elements in JS or CSS.

This is what the SVG code looks like before and after the processing step.

```xml
<!-- Before post-processor -->
<svg id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 600">
  <rect id="_class_my-element_origin_144_234" data-name="#class=my-element, origin=144 234" x="144" y="234" width="148" height="148"/>
  <rect id="_id_my-unique-element" data-name="#id=my-unique-element" x="316" y="234" width="148" height="148" fill="#29abe2"/>
  <rect id="_class_my-element" data-name="#class=my-element" x="488" y="234" width="148" height="148" fill="#fbb03b"/>
</svg>

<!-- After post-processor -->
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 800 600">
  <rect id="my-unique-element" x="316" y="234" width="148" height="148" fill="#29abe2"/>
  <rect class="my-element" data-svg-origin="144 234" x="144" y="234" width="148" height="148"/>
  <rect class="my-element" x="488" y="234" width="148" height="148" fill="#fbb03b"/>
</svg>
```
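The mapping from layer name to attributes shown above can be sketched in a few lines of Python. This is a minimal illustration, not the code in `svg/__init__.py`: the function name `parse_layer_name` is hypothetical, and it assumes a layer name that starts with `#`, separates attributes with commas, and expands `origin` to `data-svg-origin` as in the example.

```python
def parse_layer_name(layer_name, expand_origin=True):
    """Turn an Illustrator layer name such as
    '#class=my-element, origin=144 234' into a dict of attributes.

    Hypothetical helper for illustration only.
    """
    # Layer names that do not start with '#' are left alone.
    if not layer_name.startswith('#'):
        return {}

    attributes = {}
    # Attributes are separated by commas; values may contain spaces.
    for part in layer_name[1:].split(','):
        if '=' not in part:
            continue
        key, value = part.split('=', 1)
        key, value = key.strip(), value.strip()
        # Assumed behaviour: 'origin' becomes GSAP's data-svg-origin.
        if expand_origin and key == 'origin':
            key = 'data-svg-origin'
        attributes[key] = value
    return attributes


print(parse_layer_name('#class=my-element, origin=144 234'))
```

Splitting on commas first is what lets a value such as `144 234` or `my-class my-other-class` keep its internal spaces.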

![Illustrator layers example](example-image.png)


## Quick Example

Download the [svg tools](parallax_svg_tools.zip) and unzip them into your project folder.

Create an Illustrator file, add an element and change its layer name to say `#class=my-element`. Export the SVG using the **File > Export > Export for Screens** option with the following settings. Call the svg `animation.svg`.

![Illustrator svg export settings](svg-settings.png)

Create an HTML file as below. The `//import` statement inlines the SVG into our HTML file so we don't have to do any copying and pasting. This is not strictly necessary but makes the workflow a little easier. Save it as `animation.html`.

```html
<!DOCTYPE html>
<html>
<head>
	<meta charset='utf-8'/>
</head>
<body>

//import processed_animation.svg

</body>
</html>
```


Open the file called `run.py`. Here you can edit how the SVGs will be processed. The default looks like this. The sections below describe what the various options do.

```python
from svg import *

compile_svg('animation.svg', 'processed_animation.svg', 
{
	'process_layer_names': True,
	'namespace' : 'example'
})

inline_svg('animation.html', 'output/animation.html')
```

Open the command line and navigate to your project folder. Call the script using `python parallax_svg_tools/run.py`. You should see a list of processed files (or just one in this case) printed to the console if everything worked correctly. Note that the script must be called from a directory that has access to the svg files.

There should now be a folder called `output` containing an `animation.html` file with your processed SVG in it. All that is left to do is animate it with your tool of choice (ours is [GSAP](https://greensock.com/)).


## Functions

### process\_svg(src\_path, dst\_path, options)
Processes a single SVG and places it in the supplied destination directory. The following options are available.

+ **process\_layer\_names:**
Converts layer names as defined in Illustrator into attributes. Begin the layer name with a '#' to indicate the layer should be parsed. 
For example `#id=my-id, class=my-class my-other-class, role=my-role` ...etc.
This is useful for fetching elements with JavaScript as well as marking up elements for accessibility - see this [CSS Tricks Accessible SVG](https://css-tricks.com/accessible-svgs/) article.
NOTE: Requires using commas to separate the attributes as that makes the parsing code a lot simpler :)

+ **expand_origin:**
Allows you to use `origin=100 100` to set origins for rotating/scaling with GSAP (expands to data-svg-origin). 

+ **namespace:** 
Appends a namespace to classes and IDs if one is provided. Useful for avoiding conflicts with other SVG files for things like masks and clipPaths.

+ **nowhitespace:**
Removes unneeded whitespace. We don't do anything fancier than that so as to not break animations. Use the excellent [SVGO](<https://github.com/svg/svgo>) if you need better minification.

+ **attributes:**
An object of key:value strings that will be applied as attributes to the root SVG element.

+ **title:**
Sets the title or removes it completely when set to `false`

+ **description:**
Sets the description or removes it completely when set to `false`

+ **convert_svg_text_to_html:**
Converts SVG text into HTML text via the `foreignObject` tag, reducing file bloat and allowing you to style it with CSS. Requires the text be grouped inside a rectangle with the layer name set to `#TEXT`. 

+ **spirit:**
Expands `#spirit=my-id` to `data-spirit-id` when set to `true` for use with the [Spirit animation app](<https://spiritapp.io/>)
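
As a rough illustration of the `namespace` option, the sketch below appends a namespace to the ID and to every class in a dict of attributes. The helper name `apply_namespace` and the `-` separator are assumptions made for this example; check `svg/__init__.py` for the format the tools actually produce.

```python
def apply_namespace(attributes, namespace, separator='-'):
    """Return a copy of attributes with the namespace appended to the
    id and to each class name. Hypothetical helper; the separator is
    an assumption, not taken from the tools."""
    if not namespace:
        return dict(attributes)

    result = dict(attributes)
    if 'id' in result:
        result['id'] = result['id'] + separator + namespace
    if 'class' in result:
        # A class attribute may hold several space-separated names.
        names = result['class'].split()
        result['class'] = ' '.join(
            name + separator + namespace for name in names)
    return result


print(apply_namespace({'id': 'mask', 'class': 'a b'}, 'example'))
```

If IDs are renamed this way, any reference to them (for example a `url(#mask)` value on a masked element) has to be renamed to match, otherwise the mask or clipPath stops applying.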


### inline\_svg(src\_path, dst\_path)
To animate SVGs, their markup needs to be placed inline in the HTML. This function looks through the source HTML file for `//import` statements and replaces each one with the contents of the SVG file it references.
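
The idea behind `//import` can be sketched with a regular expression. This is only an approximation under stated assumptions, not the implementation of `inline_svg`: the helper `inline_imports`, the pattern, and the `read_file` callback are all hypothetical, and the example reads from an in-memory dict rather than from disk so that it runs anywhere.

```python
import re

# Matches a whole line such as '//import processed_animation.svg'.
IMPORT_PATTERN = re.compile(
    r'^[ \t]*//import[ \t]+(\S+)[ \t]*$', re.MULTILINE)


def inline_imports(html, read_file):
    """Replace each '//import <path>' line in html with the text
    returned by read_file(path). Hypothetical helper for illustration."""
    def replace(match):
        return read_file(match.group(1))
    return IMPORT_PATTERN.sub(replace, html)


# Usage with an in-memory "file system" instead of real files.
files = {'processed_animation.svg': '<svg viewBox="0 0 800 600"></svg>'}
page = '<body>\n\n//import processed_animation.svg\n\n</body>'
print(inline_imports(page, files.__getitem__))
```

Passing a function to `sub` rather than a replacement string means backslashes in the SVG text are inserted literally instead of being treated as escape sequences.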

================================================
FILE: example/animation.html
================================================
<!DOCTYPE html>
<html>
<head>
	<meta charset='utf-8'/>
</head>
<body>

//import processed_animation.svg

</body>
</html>

================================================
FILE: example/output/animation.html
================================================
<!DOCTYPE html>
<html>
<head>
	<meta charset='utf-8'/>
</head>
<body>

<svg viewbox="0 0 800 600" xmlns="http://www.w3.org/2000/svg">
<title>animation</title>
<rect class="my-element" height="148" origin="144 234" width="148" x="144" y="234"/>
<rect fill="#29abe2" height="148" id="my-unique-element" width="148" x="316" y="234"/>
<rect class="my-element" fill="#fbb03b" height="148" width="148" x="488" y="234"/>
</svg>

</body>
</html>

================================================
FILE: example/parallax_svg_tools/bs4/__init__.py
================================================
"""Beautiful Soup
Elixir and Tonic
"The Screen-Scraper's Friend"
http://www.crummy.com/software/BeautifulSoup/

Beautiful Soup uses a pluggable XML or HTML parser to parse a
(possibly invalid) document into a tree representation. Beautiful Soup
provides methods and Pythonic idioms that make it easy to navigate,
search, and modify the parse tree.

Beautiful Soup works with Python 2.7 and up. It works better if lxml
and/or html5lib is installed.

For more than you ever wanted to know about Beautiful Soup, see the
documentation:
http://www.crummy.com/software/BeautifulSoup/bs4/doc/

"""

# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

__author__ = "Leonard Richardson (leonardr@segfault.org)"
__version__ = "4.5.1"
__copyright__ = "Copyright (c) 2004-2016 Leonard Richardson"
__license__ = "MIT"

__all__ = ['BeautifulSoup']

import os
import re
import traceback
import warnings

from .builder import builder_registry, ParserRejectedMarkup
from .dammit import UnicodeDammit
from .element import (
    CData,
    Comment,
    DEFAULT_OUTPUT_ENCODING,
    Declaration,
    Doctype,
    NavigableString,
    PageElement,
    ProcessingInstruction,
    ResultSet,
    SoupStrainer,
    Tag,
    )

# The very first thing we do is give a useful error if someone is
# running this code under Python 3 without converting it.
'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work.'<>'You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'

class BeautifulSoup(Tag):
    """
    This class defines the basic interface called by the tree builders.

    These methods will be called by the parser:
      reset()
      feed(markup)

    The tree builder may call these methods from its feed() implementation:
      handle_starttag(name, attrs) # See note about return value
      handle_endtag(name)
      handle_data(data) # Appends to the current data node
      endData(containerClass=NavigableString) # Ends the current data node

    No matter how complicated the underlying parser is, you should be
    able to build a tree using 'start tag' events, 'end tag' events,
    'data' events, and "done with data" events.

    If you encounter an empty-element tag (aka a self-closing tag,
    like HTML's <br> tag), call handle_starttag and then
    handle_endtag.
    """
    ROOT_TAG_NAME = u'[document]'

    # If the end-user gives no indication which tree builder they
    # want, look for one with these features.
    DEFAULT_BUILDER_FEATURES = ['html', 'fast']

    ASCII_SPACES = '\x20\x0a\x09\x0c\x0d'

    NO_PARSER_SPECIFIED_WARNING = "No parser was explicitly specified, so I'm using the best available %(markup_type)s parser for this system (\"%(parser)s\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n\nThe code that caused this warning is on line %(line_number)s of the file %(filename)s. To get rid of this warning, change code that looks like this:\n\n BeautifulSoup([your markup])\n\nto this:\n\n BeautifulSoup([your markup], \"%(parser)s\")\n"

    def __init__(self, markup="", features=None, builder=None,
                 parse_only=None, from_encoding=None, exclude_encodings=None,
                 **kwargs):
        """The Soup object is initialized as the 'root tag', and the
        provided markup (which can be a string or a file-like object)
        is fed into the underlying parser."""

        if 'convertEntities' in kwargs:
            warnings.warn(
                "BS4 does not respect the convertEntities argument to the "
                "BeautifulSoup constructor. Entities are always converted "
                "to Unicode characters.")

        if 'markupMassage' in kwargs:
            del kwargs['markupMassage']
            warnings.warn(
                "BS4 does not respect the markupMassage argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for any necessary markup massage.")

        if 'smartQuotesTo' in kwargs:
            del kwargs['smartQuotesTo']
            warnings.warn(
                "BS4 does not respect the smartQuotesTo argument to the "
                "BeautifulSoup constructor. Smart quotes are always converted "
                "to Unicode characters.")

        if 'selfClosingTags' in kwargs:
            del kwargs['selfClosingTags']
            warnings.warn(
                "BS4 does not respect the selfClosingTags argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for understanding self-closing tags.")

        if 'isHTML' in kwargs:
            del kwargs['isHTML']
            warnings.warn(
                "BS4 does not respect the isHTML argument to the "
                "BeautifulSoup constructor. Suggest you use "
                "features='lxml' for HTML and features='lxml-xml' for "
                "XML.")

        def deprecated_argument(old_name, new_name):
            if old_name in kwargs:
                warnings.warn(
                    'The "%s" argument to the BeautifulSoup constructor '
                    'has been renamed to "%s."' % (old_name, new_name))
                value = kwargs[old_name]
                del kwargs[old_name]
                return value
            return None

        parse_only = parse_only or deprecated_argument(
            "parseOnlyThese", "parse_only")

        from_encoding = from_encoding or deprecated_argument(
            "fromEncoding", "from_encoding")

        if from_encoding and isinstance(markup, unicode):
            warnings.warn("You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored.")
            from_encoding = None

        if len(kwargs) > 0:
            arg = kwargs.keys().pop()
            raise TypeError(
                "__init__() got an unexpected keyword argument '%s'" % arg)

        if builder is None:
            original_features = features
            if isinstance(features, basestring):
                features = [features]
            if features is None or len(features) == 0:
                features = self.DEFAULT_BUILDER_FEATURES
            builder_class = builder_registry.lookup(*features)
            if builder_class is None:
                raise FeatureNotFound(
                    "Couldn't find a tree builder with the features you "
                    "requested: %s. Do you need to install a parser library?"
                    % ",".join(features))
            builder = builder_class()
            if not (original_features == builder.NAME or
                    original_features in builder.ALTERNATE_NAMES):
                if builder.is_xml:
                    markup_type = "XML"
                else:
                    markup_type = "HTML"

                caller = traceback.extract_stack()[0]
                filename = caller[0]
                line_number = caller[1]
                warnings.warn(self.NO_PARSER_SPECIFIED_WARNING % dict(
                    filename=filename,
                    line_number=line_number,
                    parser=builder.NAME,
                    markup_type=markup_type))

        self.builder = builder
        self.is_xml = builder.is_xml
        self.known_xml = self.is_xml
        self.builder.soup = self

        self.parse_only = parse_only

        if hasattr(markup, 'read'):        # It's a file-type object.
            markup = markup.read()
        elif len(markup) <= 256 and (
                (isinstance(markup, bytes) and not b'<' in markup)
                or (isinstance(markup, unicode) and not u'<' in markup)
        ):
            # Print out warnings for a couple beginner problems
            # involving passing non-markup to Beautiful Soup.
            # Beautiful Soup will still parse the input as markup,
            # just in case that's what the user really wants.
            if (isinstance(markup, unicode)
                and not os.path.supports_unicode_filenames):
                possible_filename = markup.encode("utf8")
            else:
                possible_filename = markup
            is_file = False
            try:
                is_file = os.path.exists(possible_filename)
            except Exception, e:
                # This is almost certainly a problem involving
                # characters not valid in filenames on this
                # system. Just let it go.
                pass
            if is_file:
                if isinstance(markup, unicode):
                    markup = markup.encode("utf8")
                warnings.warn(
                    '"%s" looks like a filename, not markup. You should '
                    'probably open this file and pass the filehandle into '
                    'Beautiful Soup.' % markup)
            self._check_markup_is_url(markup)

        for (self.markup, self.original_encoding, self.declared_html_encoding,
         self.contains_replacement_characters) in (
             self.builder.prepare_markup(
                 markup, from_encoding, exclude_encodings=exclude_encodings)):
            self.reset()
            try:
                self._feed()
                break
            except ParserRejectedMarkup:
                pass

        # Clear out the markup and remove the builder's circular
        # reference to this object.
        self.markup = None
        self.builder.soup = None

    def __copy__(self):
        copy = type(self)(
            self.encode('utf-8'), builder=self.builder, from_encoding='utf-8'
        )

        # Although we encoded the tree to UTF-8, that may not have
        # been the encoding of the original markup. Set the copy's
        # .original_encoding to reflect the original object's
        # .original_encoding.
        copy.original_encoding = self.original_encoding
        return copy

    def __getstate__(self):
        # Frequently a tree builder can't be pickled.
        d = dict(self.__dict__)
        if 'builder' in d and not self.builder.picklable:
            d['builder'] = None
        return d

    @staticmethod
    def _check_markup_is_url(markup):
        """ 
        Check if markup looks like it's actually a url and raise a warning 
        if so. Markup can be unicode or str (py2) / bytes (py3).
        """
        if isinstance(markup, bytes):
            space = b' '
            cant_start_with = (b"http:", b"https:")
        elif isinstance(markup, unicode):
            space = u' '
            cant_start_with = (u"http:", u"https:")
        else:
            return

        if any(markup.startswith(prefix) for prefix in cant_start_with):
            if not space in markup:
                if isinstance(markup, bytes):
                    decoded_markup = markup.decode('utf-8', 'replace')
                else:
                    decoded_markup = markup
                warnings.warn(
                    '"%s" looks like a URL. Beautiful Soup is not an'
                    ' HTTP client. You should probably use an HTTP client like'
                    ' requests to get the document behind the URL, and feed'
                    ' that document to Beautiful Soup.' % decoded_markup
                )

    def _feed(self):
        # Convert the document to Unicode.
        self.builder.reset()

        self.builder.feed(self.markup)
        # Close out any unfinished strings and close all the open tags.
        self.endData()
        while self.currentTag.name != self.ROOT_TAG_NAME:
            self.popTag()

    def reset(self):
        Tag.__init__(self, self, self.builder, self.ROOT_TAG_NAME)
        self.hidden = 1
        self.builder.reset()
        self.current_data = []
        self.currentTag = None
        self.tagStack = []
        self.preserve_whitespace_tag_stack = []
        self.pushTag(self)

    def new_tag(self, name, namespace=None, nsprefix=None, **attrs):
        """Create a new tag associated with this soup."""
        return Tag(None, self.builder, name, namespace, nsprefix, attrs)

    def new_string(self, s, subclass=NavigableString):
        """Create a new NavigableString associated with this soup."""
        return subclass(s)

    def insert_before(self, successor):
        raise NotImplementedError("BeautifulSoup objects don't support insert_before().")

    def insert_after(self, successor):
        raise NotImplementedError("BeautifulSoup objects don't support insert_after().")

    def popTag(self):
        tag = self.tagStack.pop()
        if self.preserve_whitespace_tag_stack and tag == self.preserve_whitespace_tag_stack[-1]:
            self.preserve_whitespace_tag_stack.pop()
        #print "Pop", tag.name
        if self.tagStack:
            self.currentTag = self.tagStack[-1]
        return self.currentTag

    def pushTag(self, tag):
        #print "Push", tag.name
        if self.currentTag:
            self.currentTag.contents.append(tag)
        self.tagStack.append(tag)
        self.currentTag = self.tagStack[-1]
        if tag.name in self.builder.preserve_whitespace_tags:
            self.preserve_whitespace_tag_stack.append(tag)

    def endData(self, containerClass=NavigableString):
        if self.current_data:
            current_data = u''.join(self.current_data)
            # If whitespace is not preserved, and this string contains
            # nothing but ASCII spaces, replace it with a single space
            # or newline.
            if not self.preserve_whitespace_tag_stack:
                strippable = True
                for i in current_data:
                    if i not in self.ASCII_SPACES:
                        strippable = False
                        break
                if strippable:
                    if '\n' in current_data:
                        current_data = '\n'
                    else:
                        current_data = ' '

            # Reset the data collector.
            self.current_data = []

            # Should we add this string to the tree at all?
            if self.parse_only and len(self.tagStack) <= 1 and \
                   (not self.parse_only.text or \
                    not self.parse_only.search(current_data)):
                return

            o = containerClass(current_data)
            self.object_was_parsed(o)

    def object_was_parsed(self, o, parent=None, most_recent_element=None):
        """Add an object to the parse tree."""
        parent = parent or self.currentTag
        previous_element = most_recent_element or self._most_recent_element

        next_element = previous_sibling = next_sibling = None
        if isinstance(o, Tag):
            next_element = o.next_element
            next_sibling = o.next_sibling
            previous_sibling = o.previous_sibling
            if not previous_element:
                previous_element = o.previous_element

        o.setup(parent, previous_element, next_element, previous_sibling, next_sibling)

        self._most_recent_element = o
        parent.contents.append(o)

        if parent.next_sibling:
            # This node is being inserted into an element that has
            # already been parsed. Deal with any dangling references.
            index = len(parent.contents)-1
            while index >= 0:
                if parent.contents[index] is o:
                    break
                index -= 1
            else:
                raise ValueError(
                    "Error building tree: supposedly %r was inserted "
                    "into %r after the fact, but I don't see it!" % (
                        o, parent
                    )
                )
            if index == 0:
                previous_element = parent
                previous_sibling = None
            else:
                previous_element = previous_sibling = parent.contents[index-1]
            if index == len(parent.contents)-1:
                next_element = parent.next_sibling
                next_sibling = None
            else:
                next_element = next_sibling = parent.contents[index+1]

            o.previous_element = previous_element
            if previous_element:
                previous_element.next_element = o
            o.next_element = next_element
            if next_element:
                next_element.previous_element = o
            o.next_sibling = next_sibling
            if next_sibling:
                next_sibling.previous_sibling = o
            o.previous_sibling = previous_sibling
            if previous_sibling:
                previous_sibling.next_sibling = o

    def _popToTag(self, name, nsprefix=None, inclusivePop=True):
        """Pops the tag stack up to and including the most recent
        instance of the given tag. If inclusivePop is false, pops the tag
        stack up to but *not* including the most recent instance of
        the given tag."""
        #print "Popping to %s" % name
        if name == self.ROOT_TAG_NAME:
            # The BeautifulSoup object itself can never be popped.
            return

        most_recently_popped = None

        stack_size = len(self.tagStack)
        for i in range(stack_size - 1, 0, -1):
            t = self.tagStack[i]
            if (name == t.name and nsprefix == t.prefix):
                if inclusivePop:
                    most_recently_popped = self.popTag()
                break
            most_recently_popped = self.popTag()

        return most_recently_popped

    def handle_starttag(self, name, namespace, nsprefix, attrs):
        """Push a start tag on to the stack.

        If this method returns None, the tag was rejected by the
        SoupStrainer. You should proceed as if the tag had not occurred
        in the document. For instance, if this was a self-closing tag,
        don't call handle_endtag.
        """

        # print "Start tag %s: %s" % (name, attrs)
        self.endData()

        if (self.parse_only and len(self.tagStack) <= 1
            and (self.parse_only.text
                 or not self.parse_only.search_tag(name, attrs))):
            return None

        tag = Tag(self, self.builder, name, namespace, nsprefix, attrs,
                  self.currentTag, self._most_recent_element)
        if tag is None:
            return tag
        if self._most_recent_element:
            self._most_recent_element.next_element = tag
        self._most_recent_element = tag
        self.pushTag(tag)
        return tag

    def handle_endtag(self, name, nsprefix=None):
        #print "End tag: " + name
        self.endData()
        self._popToTag(name, nsprefix)

    def handle_data(self, data):
        self.current_data.append(data)

    def decode(self, pretty_print=False,
               eventual_encoding=DEFAULT_OUTPUT_ENCODING,
               formatter="minimal"):
        """Returns a string or Unicode representation of this document.
        To get Unicode, pass None for encoding."""

        if self.is_xml:
            # Print the XML declaration
            encoding_part = ''
            if eventual_encoding != None:
                encoding_part = ' encoding="%s"' % eventual_encoding
            prefix = u'<?xml version="1.0"%s?>\n' % encoding_part
        else:
            prefix = u''
        if not pretty_print:
            indent_level = None
        else:
            indent_level = 0
        return prefix + super(BeautifulSoup, self).decode(
            indent_level, eventual_encoding, formatter)

# Alias to make it easier to type import: 'from bs4 import _soup'
_s = BeautifulSoup
_soup = BeautifulSoup

class BeautifulStoneSoup(BeautifulSoup):
    """Deprecated interface to an XML parser."""

    def __init__(self, *args, **kwargs):
        kwargs['features'] = 'xml'
        warnings.warn(
            'The BeautifulStoneSoup class is deprecated. Instead of using '
            'it, pass features="xml" into the BeautifulSoup constructor.')
        super(BeautifulStoneSoup, self).__init__(*args, **kwargs)


class StopParsing(Exception):
    pass

class FeatureNotFound(ValueError):
    pass


#By default, act as an HTML pretty-printer.
if __name__ == '__main__':
    import sys
    soup = BeautifulSoup(sys.stdin)
    print soup.prettify()


================================================
FILE: example/parallax_svg_tools/bs4/builder/__init__.py
================================================
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

from collections import defaultdict
import itertools
import sys
from bs4.element import (
    CharsetMetaAttributeValue,
    ContentMetaAttributeValue,
    HTMLAwareEntitySubstitution,
    whitespace_re
    )

__all__ = [
    'HTMLTreeBuilder',
    'SAXTreeBuilder',
    'TreeBuilder',
    'TreeBuilderRegistry',
    ]

# Some useful features for a TreeBuilder to have.
FAST = 'fast'
PERMISSIVE = 'permissive'
STRICT = 'strict'
XML = 'xml'
HTML = 'html'
HTML_5 = 'html5'


class TreeBuilderRegistry(object):

    def __init__(self):
        self.builders_for_feature = defaultdict(list)
        self.builders = []

    def register(self, treebuilder_class):
        """Register a treebuilder based on its advertised features."""
        for feature in treebuilder_class.features:
            self.builders_for_feature[feature].insert(0, treebuilder_class)
        self.builders.insert(0, treebuilder_class)

    def lookup(self, *features):
        if len(self.builders) == 0:
            # There are no builders at all.
            return None

        if len(features) == 0:
            # They didn't ask for any features. Give them the most
            # recently registered builder.
            return self.builders[0]

        # Go down the list of features in order, and eliminate any builders
        # that don't match every feature.
        features = list(features)
        features.reverse()
        candidates = None
        candidate_set = None
        while len(features) > 0:
            feature = features.pop()
            we_have_the_feature = self.builders_for_feature.get(feature, [])
            if len(we_have_the_feature) > 0:
                if candidates is None:
                    candidates = we_have_the_feature
                    candidate_set = set(candidates)
                else:
                    # Eliminate any candidates that don't have this feature.
                    candidate_set = candidate_set.intersection(
                        set(we_have_the_feature))

        # The only valid candidates are the ones in candidate_set.
        # Go through the original list of candidates and pick the first one
        # that's in candidate_set.
        if candidate_set is None:
            return None
        for candidate in candidates:
            if candidate in candidate_set:
                return candidate
        return None
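
# Illustration only (not part of Beautiful Soup): a minimal, self-contained
# sketch of the selection rule that TreeBuilderRegistry.lookup() applies.
# The name _sketch_lookup and the (name, features) pairs are hypothetical
# stand-ins for real TreeBuilder classes and their `features` lists.
def _sketch_lookup(registered, *features):
    """Pick a builder name the way TreeBuilderRegistry.lookup() would.

    `registered` is a list of (name, features) pairs in registration order.
    """
    # register() inserts at index 0, so the newest registration comes first.
    newest_first = list(reversed(registered))
    if not newest_first:
        return None
    if not features:
        return newest_first[0][0]
    candidates = None
    for feature in features:
        have_it = [name for name, feats in newest_first if feature in feats]
        if not have_it:
            # A feature that no builder advertises is skipped, not fatal.
            continue
        if candidates is None:
            candidates = have_it
        else:
            # Keep the original ordering; drop anything lacking this feature.
            candidates = [name for name in candidates if name in have_it]
    if not candidates:
        return None
    return candidates[0]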

# The BeautifulSoup class will take feature lists from developers and use them
# to look up builders in this registry.
builder_registry = TreeBuilderRegistry()

class TreeBuilder(object):
    """Turn a document into a Beautiful Soup object tree."""

    NAME = "[Unknown tree builder]"
    ALTERNATE_NAMES = []
    features = []

    is_xml = False
    picklable = False
    preserve_whitespace_tags = set()
    empty_element_tags = None # A tag will be considered an empty-element
                              # tag when and only when it has no contents.

    # A value for these tag/attribute combinations is a space- or
    # comma-separated list of CDATA, rather than a single CDATA.
    cdata_list_attributes = {}


    def __init__(self):
        self.soup = None

    def reset(self):
        pass

    def can_be_empty_element(self, tag_name):
        """Might a tag with this name be an empty-element tag?

        The final markup may or may not actually present this tag as
        self-closing.

        For instance: an HTMLBuilder does not consider a <p> tag to be
        an empty-element tag (it's not in
        HTMLBuilder.empty_element_tags). This means an empty <p> tag
        will be presented as "<p></p>", not "<p />".

        The default implementation has no opinion about which tags are
        empty-element tags, so a tag will be presented as an
        empty-element tag if and only if it has no contents.
        "<foo></foo>" will become "<foo />", and "<foo>bar</foo>" will
        be left alone.
        """
        if self.empty_element_tags is None:
            return True
        return tag_name in self.empty_element_tags

    def feed(self, markup):
        raise NotImplementedError()

    def prepare_markup(self, markup, user_specified_encoding=None,
                       document_declared_encoding=None):
        return markup, None, None, False

    def test_fragment_to_document(self, fragment):
        """Wrap an HTML fragment to make it look like a document.

        Different parsers do this differently. For instance, lxml
        introduces an empty <head> tag, and html5lib
        doesn't. Abstracting this away lets us write simple tests
        which run HTML fragments through the parser and compare the
        results against other HTML fragments.

        This method should not be used outside of tests.
        """
        return fragment

    def set_up_substitutions(self, tag):
        return False

    def _replace_cdata_list_attribute_values(self, tag_name, attrs):
        """Replaces class="foo bar" with class=["foo", "bar"]

        Modifies its input in place.
        """
        if not attrs:
            return attrs
        if self.cdata_list_attributes:
            universal = self.cdata_list_attributes.get('*', [])
            tag_specific = self.cdata_list_attributes.get(
                tag_name.lower(), None)
            for attr in attrs.keys():
                if attr in universal or (tag_specific and attr in tag_specific):
                    # We have a "class"-type attribute whose string
                    # value is a whitespace-separated list of
                    # values. Split it into a list.
                    value = attrs[attr]
                    if isinstance(value, basestring):
                        values = whitespace_re.split(value)
                    else:
                        # html5lib sometimes calls setAttributes twice
                        # for the same tag when rearranging the parse
                        # tree. On the second call the attribute value
                        # here is already a list.  If this happens,
                        # leave the value alone rather than trying to
                        # split it again.
                        values = value
                    attrs[attr] = values
        return attrs
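
# Illustration only (not part of Beautiful Soup): a standalone sketch of the
# splitting step performed by _replace_cdata_list_attribute_values(). The
# function name and the `list_attrs` argument are hypothetical, and the
# pattern below is only assumed to behave like bs4.element.whitespace_re.
import re

_sketch_whitespace_re = re.compile(r"\s+")

def _sketch_split_list_attributes(tag_name, attrs, list_attrs):
    """Split whitespace-separated values in place, as for class="foo bar"."""
    universal = list_attrs.get('*', [])
    tag_specific = list_attrs.get(tag_name.lower(), [])
    # Iterate over a copy of the items so the dict can be updated safely.
    for attr, value in list(attrs.items()):
        if attr in universal or attr in tag_specific:
            # Values that are already lists are left alone.
            if not isinstance(value, list):
                attrs[attr] = _sketch_whitespace_re.split(value)
    return attrs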

class SAXTreeBuilder(TreeBuilder):
    """A Beautiful Soup treebuilder that listens for SAX events."""

    def feed(self, markup):
        raise NotImplementedError()

    def close(self):
        pass

    def startElement(self, name, attrs):
        attrs = dict((key[1], value) for key, value in list(attrs.items()))
        #print "Start %s, %r" % (name, attrs)
        self.soup.handle_starttag(name, attrs)

    def endElement(self, name):
        #print "End %s" % name
        self.soup.handle_endtag(name)

    def startElementNS(self, nsTuple, nodeName, attrs):
        # Throw away (ns, nodeName) for now.
        self.startElement(nodeName, attrs)

    def endElementNS(self, nsTuple, nodeName):
        # Throw away (ns, nodeName) for now.
        self.endElement(nodeName)
        #handler.endElementNS((ns, node.nodeName), node.nodeName)

    def startPrefixMapping(self, prefix, nodeValue):
        # Ignore the prefix for now.
        pass

    def endPrefixMapping(self, prefix):
        # Ignore the prefix for now.
        # handler.endPrefixMapping(prefix)
        pass

    def characters(self, content):
        self.soup.handle_data(content)

    def startDocument(self):
        pass

    def endDocument(self):
        pass


class HTMLTreeBuilder(TreeBuilder):
    """This TreeBuilder knows facts about HTML, such as which tags are
    empty-element tags.
    """

    preserve_whitespace_tags = HTMLAwareEntitySubstitution.preserve_whitespace_tags
    empty_element_tags = set(['br', 'hr', 'input', 'img', 'meta',
                              'spacer', 'link', 'frame', 'base'])

    # The HTML standard defines these attributes as containing a
    # space-separated list of values, not a single value. That is,
    # class="foo bar" means that the 'class' attribute has two values,
    # 'foo' and 'bar', not the single value 'foo bar'.  When we
    # encounter one of these attributes, we will parse its value into
    # a list of values if possible. Upon output, the list will be
    # converted back into a string.
    cdata_list_attributes = {
        "*" : ['class', 'accesskey', 'dropzone'],
        "a" : ['rel', 'rev'],
        "link" :  ['rel', 'rev'],
        "td" : ["headers"],
        "th" : ["headers"],
        "form" : ["accept-charset"],
        "object" : ["archive"],

        # These are HTML5 specific, as are *.accesskey and *.dropzone above.
        "area" : ["rel"],
        "icon" : ["sizes"],
        "iframe" : ["sandbox"],
        "output" : ["for"],
        }

    def set_up_substitutions(self, tag):
        # We are only interested in <meta> tags
        if tag.name != 'meta':
            return False

        http_equiv = tag.get('http-equiv')
        content = tag.get('content')
        charset = tag.get('charset')

        # We are interested in <meta> tags that say what encoding the
        # document was originally in. This means HTML 5-style <meta>
        # tags that provide the "charset" attribute. It also means
        # HTML 4-style <meta> tags that provide the "content"
        # attribute and have "http-equiv" set to "content-type".
        #
        # In both cases we will replace the value of the appropriate
        # attribute with a stand-in object that can take on any
        # encoding.
        meta_encoding = None
        if charset is not None:
            # HTML 5 style:
            # <meta charset="utf8">
            meta_encoding = charset
            tag['charset'] = CharsetMetaAttributeValue(charset)

        elif (content is not None and http_equiv is not None
              and http_equiv.lower() == 'content-type'):
            # HTML 4 style:
            # <meta http-equiv="content-type" content="text/html; charset=utf8">
            tag['content'] = ContentMetaAttributeValue(content)

        return (meta_encoding is not None)

def register_treebuilders_from(module):
    """Copy TreeBuilders from the given module into this module."""
    # I'm fairly sure this is not the best way to do this.
    this_module = sys.modules['bs4.builder']
    for name in module.__all__:
        obj = getattr(module, name)

        if issubclass(obj, TreeBuilder):
            setattr(this_module, name, obj)
            this_module.__all__.append(name)
            # Register the builder while we're at it.
            this_module.builder_registry.register(obj)

class ParserRejectedMarkup(Exception):
    pass

# Builders are registered in reverse order of priority, so that custom
# builder registrations will take precedence. In general, we want lxml
# to take precedence over html5lib, because it's faster. And we only
# want to use HTMLParser as a last resort.
from . import _htmlparser
register_treebuilders_from(_htmlparser)
try:
    from . import _html5lib
    register_treebuilders_from(_html5lib)
except ImportError:
    # They don't have html5lib installed.
    pass
try:
    from . import _lxml
    register_treebuilders_from(_lxml)
except ImportError:
    # They don't have lxml installed.
    pass


================================================
FILE: example/parallax_svg_tools/bs4/builder/_html5lib.py
================================================
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

__all__ = [
    'HTML5TreeBuilder',
    ]

import warnings
from bs4.builder import (
    PERMISSIVE,
    HTML,
    HTML_5,
    HTMLTreeBuilder,
    )
from bs4.element import (
    NamespacedAttribute,
    whitespace_re,
)
import html5lib
from html5lib.constants import namespaces
from bs4.element import (
    Comment,
    Doctype,
    NavigableString,
    Tag,
    )

try:
    # Pre-0.99999999
    from html5lib.treebuilders import _base as treebuilder_base
    new_html5lib = False
except ImportError as e:
    # 0.99999999 and up
    from html5lib.treebuilders import base as treebuilder_base
    new_html5lib = True

class HTML5TreeBuilder(HTMLTreeBuilder):
    """Use html5lib to build a tree."""

    NAME = "html5lib"

    features = [NAME, PERMISSIVE, HTML_5, HTML]

    def prepare_markup(self, markup, user_specified_encoding,
                       document_declared_encoding=None, exclude_encodings=None):
        # Store the user-specified encoding for use later on.
        self.user_specified_encoding = user_specified_encoding

        # document_declared_encoding and exclude_encodings aren't used
        # ATM because the html5lib TreeBuilder doesn't use
        # UnicodeDammit.
        if exclude_encodings:
            warnings.warn("You provided a value for exclude_encodings, but the html5lib tree builder doesn't support exclude_encodings.")
        yield (markup, None, None, False)

    # These methods are defined by Beautiful Soup.
    def feed(self, markup):
        if self.soup.parse_only is not None:
            warnings.warn("You provided a value for parse_only, but the html5lib tree builder doesn't support parse_only. The entire document will be parsed.")
        parser = html5lib.HTMLParser(tree=self.create_treebuilder)

        extra_kwargs = dict()
        if not isinstance(markup, unicode):
            if new_html5lib:
                extra_kwargs['override_encoding'] = self.user_specified_encoding
            else:
                extra_kwargs['encoding'] = self.user_specified_encoding
        doc = parser.parse(markup, **extra_kwargs)

        # Set the character encoding detected by the tokenizer.
        if isinstance(markup, unicode):
            # We need to special-case this because html5lib sets
            # charEncoding to UTF-8 if it gets Unicode input.
            doc.original_encoding = None
        else:
            original_encoding = parser.tokenizer.stream.charEncoding[0]
            if not isinstance(original_encoding, basestring):
                # In 0.99999999 and up, the encoding is an html5lib
                # Encoding object. We want to use a string for compatibility
                # with other tree builders.
                original_encoding = original_encoding.name
            doc.original_encoding = original_encoding

    def create_treebuilder(self, namespaceHTMLElements):
        self.underlying_builder = TreeBuilderForHtml5lib(
            self.soup, namespaceHTMLElements)
        return self.underlying_builder

    def test_fragment_to_document(self, fragment):
        """See `TreeBuilder`."""
        return u'<html><head></head><body>%s</body></html>' % fragment


class TreeBuilderForHtml5lib(treebuilder_base.TreeBuilder):

    def __init__(self, soup, namespaceHTMLElements):
        self.soup = soup
        super(TreeBuilderForHtml5lib, self).__init__(namespaceHTMLElements)

    def documentClass(self):
        self.soup.reset()
        return Element(self.soup, self.soup, None)

    def insertDoctype(self, token):
        name = token["name"]
        publicId = token["publicId"]
        systemId = token["systemId"]

        doctype = Doctype.for_name_and_ids(name, publicId, systemId)
        self.soup.object_was_parsed(doctype)

    def elementClass(self, name, namespace):
        tag = self.soup.new_tag(name, namespace)
        return Element(tag, self.soup, namespace)

    def commentClass(self, data):
        return TextNode(Comment(data), self.soup)

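    def fragmentClass(self):
        # Imported here rather than at module level to avoid a circular
        # import between bs4 and bs4.builder.
        from bs4 import BeautifulSoup
        self.soup = BeautifulSoup("")
        self.soup.name = "[document_fragment]"
        return Element(self.soup, self.soup, None)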

    def appendChild(self, node):
        # XXX This code is not covered by the BS4 tests.
        self.soup.append(node.element)

    def getDocument(self):
        return self.soup

    def getFragment(self):
        return treebuilder_base.TreeBuilder.getFragment(self).element

class AttrList(object):
    def __init__(self, element):
        self.element = element
        self.attrs = dict(self.element.attrs)
    def __iter__(self):
        return list(self.attrs.items()).__iter__()
    def __setitem__(self, name, value):
        # If this attribute is a multi-valued attribute for this element,
        # turn its value into a list.
        list_attr = HTML5TreeBuilder.cdata_list_attributes
        if (name in list_attr['*']
            or (self.element.name in list_attr
                and name in list_attr[self.element.name])):
            # A node that is being cloned may have already undergone
            # this procedure.
            if not isinstance(value, list):
                value = whitespace_re.split(value)
        self.element[name] = value
    def items(self):
        return list(self.attrs.items())
    def keys(self):
        return list(self.attrs.keys())
    def __len__(self):
        return len(self.attrs)
    def __getitem__(self, name):
        return self.attrs[name]
    def __contains__(self, name):
        return name in list(self.attrs.keys())


class Element(treebuilder_base.Node):
    def __init__(self, element, soup, namespace):
        treebuilder_base.Node.__init__(self, element.name)
        self.element = element
        self.soup = soup
        self.namespace = namespace

    def appendChild(self, node):
        string_child = child = None
        if isinstance(node, basestring):
            # Some other piece of code decided to pass in a string
            # instead of creating a TextElement object to contain the
            # string.
            string_child = child = node
        elif isinstance(node, Tag):
            # Some other piece of code decided to pass in a Tag
            # instead of creating an Element object to contain the
            # Tag.
            child = node
        elif node.element.__class__ == NavigableString:
            string_child = child = node.element
        else:
            child = node.element

        if not isinstance(child, basestring) and child.parent is not None:
            node.element.extract()

        if (string_child and self.element.contents
            and self.element.contents[-1].__class__ == NavigableString):
            # We are appending a string onto another string.
            # TODO This has O(n^2) performance, for input like
            # "a</a>a</a>a</a>..."
            old_element = self.element.contents[-1]
            new_element = self.soup.new_string(old_element + string_child)
            old_element.replace_with(new_element)
            self.soup._most_recent_element = new_element
        else:
            if isinstance(node, basestring):
                # Create a brand new NavigableString from this string.
                child = self.soup.new_string(node)

            # Tell Beautiful Soup to act as if it parsed this element
            # immediately after the parent's last descendant. (Or
            # immediately after the parent, if it has no children.)
            if self.element.contents:
                most_recent_element = self.element._last_descendant(False)
            elif self.element.next_element is not None:
                # Something from further ahead in the parse tree is
                # being inserted into this earlier element. This is
                # very annoying because it means an expensive search
                # for the last element in the tree.
                most_recent_element = self.soup._last_descendant()
            else:
                most_recent_element = self.element

            self.soup.object_was_parsed(
                child, parent=self.element,
                most_recent_element=most_recent_element)

    def getAttributes(self):
        return AttrList(self.element)

    def setAttributes(self, attributes):

        if attributes is not None and len(attributes) > 0:

            converted_attributes = []
            for name, value in list(attributes.items()):
                if isinstance(name, tuple):
                    new_name = NamespacedAttribute(*name)
                    del attributes[name]
                    attributes[new_name] = value

            self.soup.builder._replace_cdata_list_attribute_values(
                self.name, attributes)
            for name, value in attributes.items():
                self.element[name] = value

            # The attributes may contain variables that need substitution.
            # Call set_up_substitutions manually.
            #
            # The Tag constructor called this method when the Tag was created,
            # but we just set/changed the attributes, so call it again.
            self.soup.builder.set_up_substitutions(self.element)
    attributes = property(getAttributes, setAttributes)

    def insertText(self, data, insertBefore=None):
        if insertBefore:
            text = TextNode(self.soup.new_string(data), self.soup)
            self.insertBefore(text, insertBefore)
        else:
            self.appendChild(data)

    def insertBefore(self, node, refNode):
        index = self.element.index(refNode.element)
        if (node.element.__class__ == NavigableString and self.element.contents
            and self.element.contents[index-1].__class__ == NavigableString):
            # (See comments in appendChild)
            old_node = self.element.contents[index-1]
            new_str = self.soup.new_string(old_node + node.element)
            old_node.replace_with(new_str)
        else:
            self.element.insert(index, node.element)
            node.parent = self

    def removeChild(self, node):
        node.element.extract()

    def reparentChildren(self, new_parent):
        """Move all of this tag's children into another tag."""
        # print "MOVE", self.element.contents
        # print "FROM", self.element
        # print "TO", new_parent.element
        element = self.element
        new_parent_element = new_parent.element
        # Determine what this tag's next_element will be once all the children
        # are removed.
        final_next_element = element.next_sibling

        new_parents_last_descendant = new_parent_element._last_descendant(False, False)
        if len(new_parent_element.contents) > 0:
            # The new parent already contains children. We will be
            # appending this tag's children to the end.
            new_parents_last_child = new_parent_element.contents[-1]
            new_parents_last_descendant_next_element = new_parents_last_descendant.next_element
        else:
            # The new parent contains no children.
            new_parents_last_child = None
            new_parents_last_descendant_next_element = new_parent_element.next_element

        to_append = element.contents
        append_after = new_parent_element.contents
        if len(to_append) > 0:
            # Set the first child's previous_element and previous_sibling
            # to elements within the new parent
            first_child = to_append[0]
            if new_parents_last_descendant:
                first_child.previous_element = new_parents_last_descendant
            else:
                first_child.previous_element = new_parent_element
            first_child.previous_sibling = new_parents_last_child
            if new_parents_last_descendant:
                new_parents_last_descendant.next_element = first_child
            else:
                new_parent_element.next_element = first_child
            if new_parents_last_child:
                new_parents_last_child.next_sibling = first_child

            # Fix the last child's next_element and next_sibling
            last_child = to_append[-1]
            last_child.next_element = new_parents_last_descendant_next_element
            if new_parents_last_descendant_next_element:
                new_parents_last_descendant_next_element.previous_element = last_child
            last_child.next_sibling = None

        for child in to_append:
            child.parent = new_parent_element
            new_parent_element.contents.append(child)

        # Now that this element has no children, change its .next_element.
        element.contents = []
        element.next_element = final_next_element

        # print "DONE WITH MOVE"
        # print "FROM", self.element
        # print "TO", new_parent_element

    def cloneNode(self):
        tag = self.soup.new_tag(self.element.name, self.namespace)
        node = Element(tag, self.soup, self.namespace)
        for key,value in self.attributes:
            node.attributes[key] = value
        return node

    def hasContent(self):
        return self.element.contents

    def getNameTuple(self):
        if self.namespace is None:
            return namespaces["html"], self.name
        else:
            return self.namespace, self.name

    nameTuple = property(getNameTuple)

class TextNode(Element):
    def __init__(self, element, soup):
        treebuilder_base.Node.__init__(self, None)
        self.element = element
        self.soup = soup

    def cloneNode(self):
        raise NotImplementedError


================================================
FILE: example/parallax_svg_tools/bs4/builder/_htmlparser.py
================================================
"""Use the HTMLParser library to parse HTML files that aren't too bad."""

# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

__all__ = [
    'HTMLParserTreeBuilder',
    ]

from HTMLParser import HTMLParser

try:
    from HTMLParser import HTMLParseError
except ImportError as e:
    # HTMLParseError is removed in Python 3.5. Since it can never be
    # thrown in 3.5, we can just define our own class as a placeholder.
    class HTMLParseError(Exception):
        pass

import sys
import warnings

# Starting in Python 3.2, the HTMLParser constructor takes a 'strict'
# argument, which we'd like to set to False. Unfortunately,
# http://bugs.python.org/issue13273 makes strict=True a better bet
# before Python 3.2.3.
#
# At the end of this file, we monkeypatch HTMLParser so that
# strict=True works well on Python 3.2.2.
major, minor, release = sys.version_info[:3]
CONSTRUCTOR_TAKES_STRICT = major == 3 and minor == 2 and release >= 3
CONSTRUCTOR_STRICT_IS_DEPRECATED = major == 3 and minor == 3
CONSTRUCTOR_TAKES_CONVERT_CHARREFS = major == 3 and minor >= 4


from bs4.element import (
    CData,
    Comment,
    Declaration,
    Doctype,
    ProcessingInstruction,
    )
from bs4.dammit import EntitySubstitution, UnicodeDammit

from bs4.builder import (
    HTML,
    HTMLTreeBuilder,
    STRICT,
    )


HTMLPARSER = 'html.parser'

class BeautifulSoupHTMLParser(HTMLParser):
    def handle_starttag(self, name, attrs):
        # XXX namespace
        attr_dict = {}
        for key, value in attrs:
            # Change None attribute values to the empty string
            # for consistency with the other tree builders.
            if value is None:
                value = ''
            attr_dict[key] = value
        self.soup.handle_starttag(name, None, None, attr_dict)

    def handle_endtag(self, name):
        self.soup.handle_endtag(name)

    def handle_data(self, data):
        self.soup.handle_data(data)

    def handle_charref(self, name):
        # XXX workaround for a bug in HTMLParser. Remove this once
        # it's fixed in all supported versions.
        # http://bugs.python.org/issue13633
        if name.startswith('x'):
            real_name = int(name.lstrip('x'), 16)
        elif name.startswith('X'):
            real_name = int(name.lstrip('X'), 16)
        else:
            real_name = int(name)

        try:
            data = unichr(real_name)
        except (ValueError, OverflowError) as e:
            data = u"\N{REPLACEMENT CHARACTER}"

        self.handle_data(data)

    def handle_entityref(self, name):
        character = EntitySubstitution.HTML_ENTITY_TO_CHARACTER.get(name)
        if character is not None:
            data = character
        else:
            data = "&%s;" % name
        self.handle_data(data)

    def handle_comment(self, data):
        self.soup.endData()
        self.soup.handle_data(data)
        self.soup.endData(Comment)

    def handle_decl(self, data):
        self.soup.endData()
        if data.startswith("DOCTYPE "):
            data = data[len("DOCTYPE "):]
        elif data == 'DOCTYPE':
            # i.e. "<!DOCTYPE>"
            data = ''
        self.soup.handle_data(data)
        self.soup.endData(Doctype)

    def unknown_decl(self, data):
        if data.upper().startswith('CDATA['):
            cls = CData
            data = data[len('CDATA['):]
        else:
            cls = Declaration
        self.soup.endData()
        self.soup.handle_data(data)
        self.soup.endData(cls)

    def handle_pi(self, data):
        self.soup.endData()
        self.soup.handle_data(data)
        self.soup.endData(ProcessingInstruction)
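

# Illustration only (not part of Beautiful Soup): a standalone sketch of the
# numeric character reference decoding done in
# BeautifulSoupHTMLParser.handle_charref(). The names below are hypothetical.
# Unlike handle_charref(), this sketch slices off exactly one leading
# 'x' or 'X' instead of using lstrip().
try:
    _sketch_unichr = unichr  # Python 2
except NameError:
    _sketch_unichr = chr     # Python 3

def _sketch_decode_charref(name):
    """Decode the body of "&#NNN;" or "&#xHHH;" into one character."""
    if name[:1] in ('x', 'X'):
        code_point = int(name[1:], 16)
    else:
        code_point = int(name)
    try:
        return _sketch_unichr(code_point)
    except (ValueError, OverflowError):
        # Code points the interpreter cannot represent become U+FFFD.
        return u"\N{REPLACEMENT CHARACTER}"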


class HTMLParserTreeBuilder(HTMLTreeBuilder):

    is_xml = False
    picklable = True
    NAME = HTMLPARSER
    features = [NAME, HTML, STRICT]

    def __init__(self, *args, **kwargs):
        if CONSTRUCTOR_TAKES_STRICT and not CONSTRUCTOR_STRICT_IS_DEPRECATED:
            kwargs['strict'] = False
        if CONSTRUCTOR_TAKES_CONVERT_CHARREFS:
            kwargs['convert_charrefs'] = False
        self.parser_args = (args, kwargs)

    def prepare_markup(self, markup, user_specified_encoding=None,
                       document_declared_encoding=None, exclude_encodings=None):
        """
        :yield: A 4-tuple (markup, original encoding, encoding
        declared within markup, whether any characters had to be
        replaced with REPLACEMENT CHARACTER).
        """
        if isinstance(markup, unicode):
            yield (markup, None, None, False)
            return

        try_encodings = [user_specified_encoding, document_declared_encoding]
        dammit = UnicodeDammit(markup, try_encodings, is_html=True,
                               exclude_encodings=exclude_encodings)
        yield (dammit.markup, dammit.original_encoding,
               dammit.declared_html_encoding,
               dammit.contains_replacement_characters)

    def feed(self, markup):
        args, kwargs = self.parser_args
        parser = BeautifulSoupHTMLParser(*args, **kwargs)
        parser.soup = self.soup
        try:
            parser.feed(markup)
        except HTMLParseError as e:
            warnings.warn(RuntimeWarning(
                "Python's built-in HTMLParser cannot parse the given document. This is not a bug in Beautiful Soup. The best solution is to install an external parser (lxml or html5lib), and use Beautiful Soup with that parser. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser for help."))
            raise e

# Patch 3.2 versions of HTMLParser earlier than 3.2.3 to use some
# 3.2.3 code. This ensures they don't treat markup like <p></p> as a
# string.
#
# XXX This code can be removed once most Python 3 users are on 3.2.3.
if major == 3 and minor == 2 and not CONSTRUCTOR_TAKES_STRICT:
    import re
    attrfind_tolerant = re.compile(
        r'\s*((?<=[\'"\s])[^\s/>][^\s/=>]*)(\s*=+\s*'
        r'(\'[^\']*\'|"[^"]*"|(?![\'"])[^>\s]*))?')
    HTMLParserTreeBuilder.attrfind_tolerant = attrfind_tolerant

    locatestarttagend = re.compile(r"""
  <[a-zA-Z][-.a-zA-Z0-9:_]*          # tag name
  (?:\s+                             # whitespace before attribute name
    (?:[a-zA-Z_][-.:a-zA-Z0-9_]*     # attribute name
      (?:\s*=\s*                     # value indicator
        (?:'[^']*'                   # LITA-enclosed value
          |\"[^\"]*\"                # LIT-enclosed value
          |[^'\">\s]+                # bare value
         )
       )?
     )
   )*
  \s*                                # trailing whitespace
""", re.VERBOSE)
    BeautifulSoupHTMLParser.locatestarttagend = locatestarttagend

    from html.parser import tagfind, attrfind

    def parse_starttag(self, i):
        self.__starttag_text = None
        endpos = self.check_for_whole_start_tag(i)
        if endpos < 0:
            return endpos
        rawdata = self.rawdata
        self.__starttag_text = rawdata[i:endpos]

        # Now parse the data between i+1 and j into a tag and attrs
        attrs = []
        match = tagfind.match(rawdata, i+1)
        assert match, 'unexpected call to parse_starttag()'
        k = match.end()
        self.lasttag = tag = rawdata[i+1:k].lower()
        while k < endpos:
            if self.strict:
                m = attrfind.match(rawdata, k)
            else:
                m = attrfind_tolerant.match(rawdata, k)
            if not m:
                break
            attrname, rest, attrvalue = m.group(1, 2, 3)
            if not rest:
                attrvalue = None
            elif attrvalue[:1] == '\'' == attrvalue[-1:] or \
                 attrvalue[:1] == '"' == attrvalue[-1:]:
                attrvalue = attrvalue[1:-1]
            if attrvalue:
                attrvalue = self.unescape(attrvalue)
            attrs.append((attrname.lower(), attrvalue))
            k = m.end()

        end = rawdata[k:endpos].strip()
        if end not in (">", "/>"):
            lineno, offset = self.getpos()
            if "\n" in self.__starttag_text:
                lineno = lineno + self.__starttag_text.count("\n")
                offset = len(self.__starttag_text) \
                         - self.__starttag_text.rfind("\n")
            else:
                offset = offset + len(self.__starttag_text)
            if self.strict:
                self.error("junk characters in start tag: %r"
                           % (rawdata[k:endpos][:20],))
            self.handle_data(rawdata[i:endpos])
            return endpos
        if end.endswith('/>'):
            # XHTML-style empty tag: <span attr="value" />
            self.handle_startendtag(tag, attrs)
        else:
            self.handle_starttag(tag, attrs)
            if tag in self.CDATA_CONTENT_ELEMENTS:
                self.set_cdata_mode(tag)
        return endpos

    def set_cdata_mode(self, elem):
        self.cdata_elem = elem.lower()
        self.interesting = re.compile(r'</\s*%s\s*>' % self.cdata_elem, re.I)

    BeautifulSoupHTMLParser.parse_starttag = parse_starttag
    BeautifulSoupHTMLParser.set_cdata_mode = set_cdata_mode

    CONSTRUCTOR_TAKES_STRICT = True
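
Note on the attribute handling in parse_starttag above: a value is unquoted only when it starts and ends with the same quote character, and a bare attribute (one with no `=value` part) is stored as None. The block below is a minimal standalone sketch of that rule and is not part of the vendored module. `strip_matching_quotes` is a hypothetical name introduced here for illustration, the sketch assumes Python 3, and it leaves out the entity unescaping that the real method applies afterwards.

```python
def strip_matching_quotes(attrvalue):
    """Hypothetical helper mirroring the quote handling in parse_starttag.

    Returns None for a bare attribute, the inner text when the value is
    wrapped in a matching pair of quotes, and the value unchanged otherwise.
    """
    if attrvalue is None:
        return None
    # Chained comparison: first char is a quote AND equals the last char.
    if attrvalue[:1] == "'" == attrvalue[-1:] or \
       attrvalue[:1] == '"' == attrvalue[-1:]:
        return attrvalue[1:-1]
    return attrvalue


if __name__ == "__main__":
    for raw in ['"value"', "'value'", "value", "\"mixed'", None]:
        print(repr(raw), "->", repr(strip_matching_quotes(raw)))
```

Mismatched quotes are deliberately left alone, so a malformed value reaches the caller as written instead of being silently truncated.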


================================================
FILE: example/parallax_svg_tools/bs4/builder/_lxml.py
================================================
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
__all__ = [
    'LXMLTreeBuilderForXML',
    'LXMLTreeBuilder',
    ]

from io import BytesIO
from StringIO import StringIO
import collections
from lxml import etree
from bs4.element import (
    Comment,
    Doctype,
    NamespacedAttribute,
    ProcessingInstruction,
    XMLProcessingInstruction,
)
from bs4.builder import (
    FAST,
    HTML,
    HTMLTreeBuilder,
    PERMISSIVE,
    ParserRejectedMarkup,
    TreeBuilder,
    XML)
from bs4.dammit import EncodingDetector

LXML = 'lxml'

class LXMLTreeBuilderForXML(TreeBuilder):
    DEFAULT_PARSER_CLASS = etree.XMLParser

    is_xml = True
    processing_instruction_class = XMLProcessingInstruction

    NAME = "lxml-xml"
    ALTERNATE_NAMES = ["xml"]

    # Well, it's permissive by XML parser standards.
    features = [NAME, LXML, XML, FAST, PERMISSIVE]

    CHUNK_SIZE = 512

    # This namespace mapping is specified in the XML Namespace
    # standard.
    DEFAULT_NSMAPS = {'http://www.w3.org/XML/1998/namespace' : "xml"}

    def default_parser(self, encoding):
        # This can either return a parser object or a class, which
        # will be instantiated with default arguments.
        if self._default_parser is not None:
            return self._default_parser
        return etree.XMLParser(
            target=self, strip_cdata=False, recover=True, encoding=encoding)

    def parser_for(self, encoding):
        # Use the default parser.
        parser = self.default_parser(encoding)

        if isinstance(parser, collections.Callable):
            # Instantiate the parser with default arguments
            parser = parser(target=self, strip_cdata=False, encoding=encoding)
        return parser

    def __init__(self, parser=None, empty_element_tags=None):
        # TODO: Issue a warning if parser is present but not a
        # callable, since that means there's no way to create new
        # parsers for different encodings.
        self._default_parser = parser
        if empty_element_tags is not None:
            self.empty_element_tags = set(empty_element_tags)
        self.soup = None
        self.nsmaps = [self.DEFAULT_NSMAPS]

    def _getNsTag(self, tag):
        # Split the namespace URL out of a fully-qualified lxml tag
        # name. Copied from lxml's src/lxml/sax.py.
        if tag[0] == '{':
            return tuple(tag[1:].split('}', 1))
        else:
            return (None, tag)

    def prepare_markup(self, markup, user_specified_encoding=None,
                       exclude_encodings=None,
                       document_declared_encoding=None):
        """
        :yield: A series of 4-tuples.
         (markup, encoding, declared encoding,
          has undergone character replacement)

        Each 4-tuple represents a strategy for parsing the document.
        """
        # Instead of using UnicodeDammit to convert the bytestring to
        # Unicode using different encodings, use EncodingDetector to
        # iterate over the encodings, and tell lxml to try to parse
        # the document as each one in turn.
        is_html = not self.is_xml
        if is_html:
            self.processing_instruction_class = ProcessingInstruction
        else:
            self.processing_instruction_class = XMLProcessingInstruction

        if isinstance(markup, unicode):
            # We were given Unicode. Maybe lxml can parse Unicode on
            # this system?
            yield markup, None, document_declared_encoding, False

        if isinstance(markup, unicode):
            # No, apparently not. Convert the Unicode to UTF-8 and
            # tell lxml to parse it as UTF-8.
            yield (markup.encode("utf8"), "utf8",
                   document_declared_encoding, False)

        try_encodings = [user_specified_encoding, document_declared_encoding]
        detector = EncodingDetector(
            markup, try_encodings, is_html, exclude_encodings)
        for encoding in detector.encodings:
            yield (detector.markup, encoding, document_declared_encoding, False)

    def feed(self, markup):
        if isinstance(markup, bytes):
            markup = BytesIO(markup)
        elif isinstance(markup, unicode):
            markup = StringIO(markup)

        # Call feed() at least once, even if the markup is empty,
        # or the parser won't be initialized.
        data = markup.read(self.CHUNK_SIZE)
        try:
            self.parser = self.parser_for(self.soup.original_encoding)
            self.parser.feed(data)
            while len(data) != 0:
                # Now call feed() on the rest of the data, chunk by chunk.
                data = markup.read(self.CHUNK_SIZE)
                if len(data) != 0:
                    self.parser.feed(data)
            self.parser.close()
        except (UnicodeDecodeError, LookupError, etree.ParserError) as e:
            raise ParserRejectedMarkup(str(e))

    def close(self):
        self.nsmaps = [self.DEFAULT_NSMAPS]

    def start(self, name, attrs, nsmap={}):
        # Make sure attrs is a mutable dict--lxml may send an immutable dictproxy.
        attrs = dict(attrs)
        nsprefix = None
        # Invert each namespace map as it comes in.
        if len(nsmap) == 0 and len(self.nsmaps) > 1:
            # There are no new namespaces for this tag, but
            # non-default namespaces are in play, so we need a
            # separate tag stack to know when they end.
            self.nsmaps.append(None)
        elif len(nsmap) > 0:
            # A new namespace mapping has come into play.
            inverted_nsmap = dict((value, key) for key, value in nsmap.items())
            self.nsmaps.append(inverted_nsmap)
            # Also treat the namespace mapping as a set of attributes on the
            # tag, so we can recreate it later.
            attrs = attrs.copy()
            for prefix, namespace in nsmap.items():
                attribute = NamespacedAttribute(
                    "xmlns", prefix, "http://www.w3.org/2000/xmlns/")
                attrs[attribute] = namespace

        # Namespaces are in play. Find any attributes that came in
        # from lxml with namespaces attached to their names, and
        # turn them into NamespacedAttribute objects.
        new_attrs = {}
        for attr, value in attrs.items():
            namespace, attr = self._getNsTag(attr)
            if namespace is None:
                new_attrs[attr] = value
            else:
                nsprefix = self._prefix_for_namespace(namespace)
                attr = NamespacedAttribute(nsprefix, attr, namespace)
                new_attrs[attr] = value
        attrs = new_attrs

        namespace, name = self._getNsTag(name)
        nsprefix = self._prefix_for_namespace(namespace)
        self.soup.handle_starttag(name, namespace, nsprefix, attrs)

    def _prefix_for_namespace(self, namespace):
        """Find the currently active prefix for the given namespace."""
        if namespace is None:
            return None
        for inverted_nsmap in reversed(self.nsmaps):
            if inverted_nsmap is not None and namespace in inverted_nsmap:
                return inverted_nsmap[namespace]
        return None

    def end(self, name):
        self.soup.endData()
        completed_tag = self.soup.tagStack[-1]
        namespace, name = self._getNsTag(name)
        nsprefix = None
        if namespace is not None:
            for inverted_nsmap in reversed(self.nsmaps):
                if inverted_nsmap is not None and namespace in inverted_nsmap:
                    nsprefix = inverted_nsmap[namespace]
                    break
        self.soup.handle_endtag(name, nsprefix)
        if len(self.nsmaps) > 1:
            # This tag, or one of its parents, introduced a namespace
            # mapping, so pop it off the stack.
            self.nsmaps.pop()

    def pi(self, target, data):
        self.soup.endData()
        self.soup.handle_data(target + ' ' + data)
        self.soup.endData(self.processing_instruction_class)

    def data(self, content):
        self.soup.handle_data(content)

    def doctype(self, name, pubid, system):
        self.soup.endData()
        doctype = Doctype.for_name_and_ids(name, pubid, system)
        self.soup.object_was_parsed(doctype)

    def comment(self, content):
        "Handle comments as Comment objects."
        self.soup.endData()
        self.soup.handle_data(content)
        self.soup.endData(Comment)

    def test_fragment_to_document(self, fragment):
        """See `TreeBuilder`."""
        return u'<?xml version="1.0" encoding="utf-8"?>\n%s' % fragment


class LXMLTreeBuilder(HTMLTreeBuilder, LXMLTreeBuilderForXML):

    NAME = LXML
    ALTERNATE_NAMES = ["lxml-html"]

    features = ALTERNATE_NAMES + [NAME, HTML, FAST, PERMISSIVE]
    is_xml = False
    processing_instruction_class = ProcessingInstruction

    def default_parser(self, encoding):
        return etree.HTMLParser

    def feed(self, markup):
        encoding = self.soup.original_encoding
        try:
            self.parser = self.parser_for(encoding)
            self.parser.feed(markup)
            self.parser.close()
        except (UnicodeDecodeError, LookupError, etree.ParserError) as e:
            raise ParserRejectedMarkup(str(e))


    def test_fragment_to_document(self, fragment):
        """See `TreeBuilder`."""
        return u'<html><body>%s</body></html>' % fragment
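
Note on the namespace bookkeeping in LXMLTreeBuilderForXML above: lxml reports qualified names in `{namespace-uri}localname` form, and the builder keeps a stack of inverted maps (URI to prefix) that it searches from the innermost entry outwards. The block below is a minimal standalone sketch of those two steps. The function names are hypothetical stand-ins for `_getNsTag` and `_prefix_for_namespace`, the SVG namespace is only sample data, and the sketch assumes Python 3 with no lxml dependency.

```python
def split_ns_tag(tag):
    """Split '{uri}local' into (uri, local); plain names get (None, name)."""
    if tag[0] == '{':
        return tuple(tag[1:].split('}', 1))
    return (None, tag)


def prefix_for_namespace(nsmaps, namespace):
    """Return the innermost active prefix for a namespace URI, or None."""
    if namespace is None:
        return None
    for inverted_nsmap in reversed(nsmaps):
        # None entries are placeholders for tags that added no namespaces.
        if inverted_nsmap is not None and namespace in inverted_nsmap:
            return inverted_nsmap[namespace]
    return None


# Sample stack: the default map, one tag declaring a prefix, one placeholder.
SVG_NS = 'http://www.w3.org/2000/svg'
nsmaps = [
    {'http://www.w3.org/XML/1998/namespace': 'xml'},
    {SVG_NS: 'svg'},
    None,
]

if __name__ == "__main__":
    namespace, name = split_ns_tag('{%s}rect' % SVG_NS)
    print(name, prefix_for_namespace(nsmaps, namespace))
```

The placeholder entries keep the stack depth in step with the open tags, which is why `end()` can pop one entry per closing tag.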


================================================
FILE: example/parallax_svg_tools/bs4/dammit.py
================================================
# -*- coding: utf-8 -*-
"""Beautiful Soup bonus library: Unicode, Dammit

This library converts a bytestream to Unicode through any means
necessary. It is heavily based on code from Mark Pilgrim's Universal
Feed Parser. It works best on XML and HTML, but it does not rewrite the
XML or HTML to reflect a new encoding; that's the tree builder's job.
"""
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
__license__ = "MIT"

import codecs
from htmlentitydefs import codepoint2name
import re
import logging
import string

# Import a library to autodetect character encodings.
chardet_type = None
try:
    # First try the fast C implementation.
    #  PyPI package: cchardet
    import cchardet
    def chardet_dammit(s):
        return cchardet.detect(s)['encoding']
except ImportError:
    try:
        # Fall back to the pure Python implementation
        #  Debian package: python-chardet
        #  PyPI package: chardet
        import chardet
        def chardet_dammit(s):
            return chardet.detect(s)['encoding']
        #import chardet.constants
        #chardet.constants._debug = 1
    except ImportError:
        # No chardet available.
        def chardet_dammit(s):
            return None

# Available from http://cjkpython.i18n.org/.
try:
    import iconv_codec
except ImportError:
    pass

xml_encoding_re = re.compile(
    '^<\?.*encoding=[\'"](.*?)[\'"].*\?>'.encode(), re.I)
html_meta_re = re.compile(
    '<\s*meta[^>]+charset\s*=\s*["\']?([^>]*?)[ /;\'">]'.encode(), re.I)

class EntitySubstitution(object):

    """Substitute XML or HTML entities for the corresponding characters."""

    def _populate_class_variables():
        lookup = {}
        reverse_lookup = {}
        characters_for_re = []
        for codepoint, name in list(codepoint2name.items()):
            character = unichr(codepoint)
            if codepoint != 34:
                # There's no point in turning the quotation mark into
                # &quot;, unless it happens within an attribute value, which
                # is handled elsewhere.
                characters_for_re.append(character)
                lookup[character] = name
            # But we do want to turn &quot; into the quotation mark.
            reverse_lookup[name] = character
        re_definition = "[%s]" % "".join(characters_for_re)
        return lookup, reverse_lookup, re.compile(re_definition)
    (CHARACTER_TO_HTML_ENTITY, HTML_ENTITY_TO_CHARACTER,
     CHARACTER_TO_HTML_ENTITY_RE) = _populate_class_variables()

    CHARACTER_TO_XML_ENTITY = {
        "'": "apos",
        '"': "quot",
        "&": "amp",
        "<": "lt",
        ">": "gt",
        }

    BARE_AMPERSAND_OR_BRACKET = re.compile("([<>]|"
                                           "&(?!#\d+;|#x[0-9a-fA-F]+;|\w+;)"
                                           ")")

    AMPERSAND_OR_BRACKET = re.compile("([<>&])")

    @classmethod
    def _substitute_html_entity(cls, matchobj):
        entity = cls.CHARACTER_TO_HTML_ENTITY.get(matchobj.group(0))
        return "&%s;" % entity

    @classmethod
    def _substitute_xml_entity(cls, matchobj):
        """Used with a regular expression to substitute the
        appropriate XML entity for an XML special character."""
        entity = cls.CHARACTER_TO_XML_ENTITY[matchobj.group(0)]
        return "&%s;" % entity

    @classmethod
    def quoted_attribute_value(cls, value):
        """Make a value into a quoted XML attribute, possibly escaping it.

         Most strings will be quoted using double quotes.

          Bob's Bar -> "Bob's Bar"

         If a string contains double quotes, it will be quoted using
         single quotes.

          Welcome to "my bar" -> 'Welcome to "my bar"'

         If a string contains both single and double quotes, the
         double quotes will be escaped, and the string will be quoted
         using double quotes.

          Welcome to "Bob's Bar" -> "Welcome to &quot;Bob's Bar&quot;"
        """
        quote_with = '"'
        if '"' in value:
            if "'" in value:
                # The string contains both single and double
                # quotes.  Turn the double quotes into
                # entities. We quote the double quotes rather than
                # the single quotes because the entity name is
                # "&quot;" whether this is HTML or XML.  If we
                # quoted the single quotes, we'd have to decide
                # between &apos; and &squot;.
                replace_with = "&quot;"
                value = value.replace('"', replace_with)
            else:
                # There are double quotes but no single quotes.
                # We can use single quotes to quote the attribute.
                quote_with = "'"
        return quote_with + value + quote_with

    @classmethod
    def substitute_xml(cls, value, make_quoted_attribute=False):
        """Substitute XML entities for special XML characters.

        :param value: A string to be substituted. The less-than sign
          will become &lt;, the greater-than sign will become &gt;,
          and any ampersands will become &amp;. If you want ampersands
          that appear to be part of an entity definition to be left
          alone, use substitute_xml_containing_entities() instead.

        :param make_quoted_attribute: If True, then the string will be
         quoted, as befits an attribute value.
        """
        # Escape angle brackets and ampersands.
        value = cls.AMPERSAND_OR_BRACKET.sub(
            cls._substitute_xml_entity, value)

        if make_quoted_attribute:
            value = cls.quoted_attribute_value(value)
        return value

    @classmethod
    def substitute_xml_containing_entities(
        cls, value, make_quoted_attribute=False):
        """Substitute XML entities for special XML characters.

        :param value: A string to be substituted. The less-than sign will
          become &lt;, the greater-than sign will become &gt;, and any
          ampersands that are not part of an entity definition will
          become &amp;.

        :param make_quoted_attribute: If True, then the string will be
         quoted, as befits an attribute value.
        """
        # Escape angle brackets, and ampersands that aren't part of
        # entities.
        value = cls.BARE_AMPERSAND_OR_BRACKET.sub(
            cls._substitute_xml_entity, value)

        if make_quoted_attribute:
            value = cls.quoted_attribute_value(value)
        return value

    @classmethod
    def substitute_html(cls, s):
        """Replace certain Unicode characters with named HTML entities.

        This differs from data.encode(encoding, 'xmlcharrefreplace')
        in that the goal is to make the result more readable (to those
        with ASCII displays) rather than to recover from
        errors. There's absolutely nothing wrong with a UTF-8 string
        containing a LATIN SMALL LETTER E WITH ACUTE, but replacing that
        character with "&eacute;" will make it more readable to some
        people.
        """
        return cls.CHARACTER_TO_HTML_ENTITY_RE.sub(
            cls._substitute_html_entity, s)
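
Note on the quoting rules documented in quoted_attribute_value above: double quotes are the default delimiter, single quotes are used when the value contains only double quotes, and when both kinds appear the double quotes are escaped as `&quot;`. The block below is a minimal standalone sketch of that decision, not the vendored class. `quote_attribute_value` is a hypothetical free function written for illustration and assumes Python 3.

```python
def quote_attribute_value(value):
    """Hypothetical re-statement of the three quoting cases."""
    quote_with = '"'
    if '"' in value:
        if "'" in value:
            # Both quote kinds present: escape the double quotes and
            # keep double quotes as the delimiter.
            value = value.replace('"', '&quot;')
        else:
            # Only double quotes present: delimit with single quotes.
            quote_with = "'"
    return quote_with + value + quote_with


if __name__ == "__main__":
    for sample in ["Bob's Bar",
                   'Welcome to "my bar"',
                   'Welcome to "Bob\'s Bar"']:
        print(quote_attribute_value(sample))
```

Escaping the double quote rather than the apostrophe avoids having to choose an apostrophe entity, since `&quot;` is valid in both HTML and XML.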


class EncodingDetector:
    """Suggests a number of possible encodings for a bytestring.

    Order of precedence:

    1. Encodings you specifically tell EncodingDetector to try first
    (the override_encodings argument to the constructor).

    2. An encoding implied by a byte-order mark at the start of the
    bytestring.

    3. An encoding declared within the bytestring itself, either in an
    XML declaration (if the bytestring is to be interpreted as an XML
    document), or in a <meta> tag (if the bytestring is to be
    interpreted as an HTML document).

    4. An encoding detected through textual analysis by chardet,
    cchardet, or a similar external library.

    5. UTF-8.

    6. Windows-1252.
    """
    def __init__(self, markup, override_encodings=None, is_html=False,
                 exclude_encodings=None):
        self.override_encodings = override_encodings or []
        exclude_encodings = exclude_encodings or []
        self.exclude_encodings = set([x.lower() for x in exclude_encodings])
        self.chardet_encoding = None
        self.is_html = is_html
        self.declared_encoding = None

        # First order of business: strip a byte-order mark.
        self.markup, self.sniffed_encoding = self.strip_byte_order_mark(markup)

    def _usable(self, encoding, tried):
        if encoding is not None:
            encoding = encoding.lower()
            if encoding in self.exclude_encodings:
                return False
            if encoding not in tried:
                tried.add(encoding)
                return True
        return False

    @property
    def encodings(self):
        """Yield a number of encodings that might work for this markup."""
        tried = set()
        for e in self.override_encodings:
            if self._usable(e, tried):
                yield e

        # Did the document originally start with a byte-order mark
        # that indicated its encoding?
        if self._usable(self.sniffed_encoding, tried):
            yield self.sniffed_encoding

        # Look within the document for an XML or HTML encoding
        # declaration.
        if self.declared_encoding is None:
            self.declared_encoding = self.find_declared_encoding(
                self.markup, self.is_html)
        if self._usable(self.declared_encoding, tried):
            yield self.declared_encoding

        # Use third-party character set detection to guess at the
        # encoding.
        if self.chardet_encoding is None:
            self.chardet_encoding = chardet_dammit(self.markup)
        if self._usable(self.chardet_encoding, tried):
            yield self.chardet_encoding

        # As a last-ditch effort, try utf-8 and windows-1252.
        for e in ('utf-8', 'windows-1252'):
            if self._usable(e, tried):
                yield e

    @classmethod
    def strip_byte_order_mark(cls, data):
        """If a byte-order mark is present, strip it and return the encoding it implies."""
        encoding = None
        if isinstance(data, unicode):
            # Unicode data cannot have a byte-order mark.
            return data, encoding
        if (len(data) >= 4) and (data[:2] == b'\xfe\xff') \
               and (data[2:4] != b'\x00\x00'):
            encoding = 'utf-16be'
            data = data[2:]
        elif (len(data) >= 4) and (data[:2] == b'\xff\xfe') \
                 and (data[2:4] != b'\x00\x00'):
            encoding = 'utf-16le'
            data = data[2:]
        elif data[:3] == b'\xef\xbb\xbf':
            encoding = 'utf-8'
            data = data[3:]
        elif data[:4] == b'\x00\x00\xfe\xff':
            encoding = 'utf-32be'
            data = data[4:]
        elif data[:4] == b'\xff\xfe\x00\x00':
            encoding = 'utf-32le'
            data = data[4:]
        return data, encoding

    @classmethod
    def find_declared_encoding(cls, markup, is_html=False, search_entire_document=False):
        """Given a document, tries to find its declared encoding.

        An XML encoding is declared at the beginning of the document.

        An HTML encoding is declared in a <meta> tag, hopefully near the
        beginning of the document.
        """
        if search_entire_document:
            xml_endpos = html_endpos = len(markup)
        else:
            xml_endpos = 1024
            html_endpos = max(2048, int(len(markup) * 0.05))
            
        declared_encoding = None
        declared_encoding_match = xml_encoding_re.search(markup, endpos=xml_endpos)
        if not declared_encoding_match and is_html:
            declared_encoding_match = html_meta_re.search(markup, endpos=html_endpos)
        if declared_encoding_match is not None:
            declared_encoding = declared_encoding_match.groups()[0].decode(
                'ascii', 'replace')
        if declared_encoding:
            return declared_encoding.lower()
        return None
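
Note on the `encodings` property of EncodingDetector above: candidates are yielded in a fixed order (caller overrides, byte-order-mark encoding, declared encoding, detector guess, then UTF-8 and Windows-1252), skipping None, excluded names, and anything already tried, all compared case-insensitively. The block below is a minimal standalone sketch of that ordering. `candidate_encodings` and its parameters are hypothetical, it takes the sniffed, declared, and detected values as plain arguments instead of computing them lazily as the class does, and it assumes Python 3.

```python
def candidate_encodings(overrides, sniffed, declared, detected, exclude=()):
    """Yield encoding names in precedence order, without repeats."""
    excluded = set(e.lower() for e in exclude)
    tried = set()
    ordered = list(overrides) + [sniffed, declared, detected,
                                 'utf-8', 'windows-1252']
    for encoding in ordered:
        if encoding is None:
            continue
        key = encoding.lower()
        if key in excluded or key in tried:
            continue
        tried.add(key)
        # Yield the name as given; only the comparison is case-folded.
        yield encoding


if __name__ == "__main__":
    print(list(candidate_encodings(['ISO-8859-1', None], None,
                                   'utf-8', 'UTF-8')))
```

Because the fallbacks pass through the same filter, a caller can suppress UTF-8 or Windows-1252 entirely by naming them in `exclude`.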

class UnicodeDammit:
    """A class for detecting the encoding of a *ML document and
    converting it to a Unicode string. If the source encoding is
    windows-1252, can replace MS smart quotes with their HTML or XML
    equivalents."""

    # This dictionary maps commonly seen values for "charset" in HTML
    # meta tags to the corresponding Python codec names. It only covers
    # values that aren't in Python's aliases and can't be determined
    # by the heuristics in find_codec.
    CHARSET_ALIASES = {"macintosh": "mac-roman",
                       "x-sjis": "shift-jis"}

    ENCODINGS_WITH_SMART_QUOTES = [
        "windows-1252",
        "iso-8859-1",
        "iso-8859-2",
        ]

    def __init__(self, markup, override_encodings=[],
                 smart_quotes_to=None, is_html=False, exclude_encodings=[]):
        self.smart_quotes_to = smart_quotes_to
        self.tried_encodings = []
        self.contains_replacement_characters = False
        self.is_html = is_html
        self.log = logging.getLogger(__name__)
        self.detector = EncodingDetector(
            markup, override_encodings, is_html, exclude_encodings)

        # Short-circuit if the data is in Unicode to begin with.
        if isinstance(markup, unicode) or markup == '':
            self.markup = markup
            self.unicode_markup = unicode(markup)
            self.original_encoding = None
            return

        # The encoding detector may have stripped a byte-order mark.
        # Use the stripped markup from this point on.
        self.markup = self.detector.markup

        u = None
        for encoding in self.detector.encodings:
            markup = self.detector.markup
            u = self._convert_from(encoding)
            if u is not None:
                break

        if not u:
            # None of the encodings worked. As an absolute last resort,
            # try them again with character replacement.

            for encoding in self.detector.encodings:
                if encoding != "ascii":
                    u = self._convert_from(encoding, "replace")
                if u is not None:
                    self.log.warning(
                            "Some characters could not be decoded, and were "
                            "replaced with REPLACEMENT CHARACTER."
                    )
                    self.contains_replacement_characters = True
                    break

        # If none of that worked, we could at this point force it to
        # ASCII, but that would destroy so much data that I think
        # giving up is better.
        self.unicode_markup = u
        if not u:
            self.original_encoding = None

    def _sub_ms_char(self, match):
        """Changes a MS smart quote character to an XML or HTML
        entity, or an ASCII character."""
        orig = match.group(1)
        if self.smart_quotes_to == 'ascii':
            sub = self.MS_CHARS_TO_ASCII.get(orig).encode()
        else:
            sub = self.MS_CHARS.get(orig)
            if type(sub) == tuple:
                if self.smart_quotes_to == 'xml':
                    sub = '&#x'.encode() + sub[1].encode() + ';'.encode()
                else:
                    sub = '&'.encode() + sub[0].encode() + ';'.encode()
            else:
                sub = sub.encode()
        return sub

    def _convert_from(self, proposed, errors="strict"):
        proposed = self.find_codec(proposed)
        if not proposed or (proposed, errors) in self.tried_encodings:
            return None
        self.tried_encodings.append((proposed, errors))
        markup = self.markup
        # Convert smart quotes to HTML if coming from an encoding
        # that might have them.
        if (self.smart_quotes_to is not None
            and proposed in self.ENCODINGS_WITH_SMART_QUOTES):
            smart_quotes_re = b"([\x80-\x9f])"
            smart_quotes_compiled = re.compile(smart_quotes_re)
            markup = smart_quotes_compiled.sub(self._sub_ms_char, markup)

        try:
            #print "Trying to convert document to %s (errors=%s)" % (
            #    proposed, errors)
            u = self._to_unicode(markup, proposed, errors)
            self.markup = u
            self.original_encoding = proposed
        except Exception as e:
            #print "That didn't work!"
            #print e
            return None
        #print "Correct encoding: %s" % proposed
        return self.markup

    def _to_unicode(self, data, encoding, errors="strict"):
        '''Given a string and its encoding, decodes the string into Unicode.
        %encoding is a string recognized by encodings.aliases'''
        return unicode(data, encoding, errors)

    @property
    def declared_html_encoding(self):
        if not self.is_html:
            return None
        return self.detector.declared_encoding

    def find_codec(self, charset):
        value = (self._codec(self.CHARSET_ALIASES.get(charset, charset))
               or (charset and self._codec(charset.replace("-", "")))
               or (charset and self._codec(charset.replace("-", "_")))
               or (charset and charset.lower())
               or charset
                )
        if value:
            return value.lower()
        return None

    def _codec(self, charset):
        if not charset:
            return charset
        codec = None
        try:
            codecs.lookup(charset)
            codec = charset
        except (LookupError, ValueError):
            pass
        return codec


    # A partial mapping of Windows-1252 bytes (0x80-0x9f) to HTML named
    # entities and XML numeric (hexadecimal) entities.
    MS_CHARS = {b'\x80': ('euro', '20AC'),
                b'\x81': ' ',
                b'\x82': ('sbquo', '201A'),
                b'\x83': ('fnof', '192'),
                b'\x84': ('bdquo', '201E'),
                b'\x85': ('hellip', '2026'),
                b'\x86': ('dagger', '2020'),
                b'\x87': ('Dagger', '2021'),
                b'\x88': ('circ', '2C6'),
                b'\x89': ('permil', '2030'),
                b'\x8A': ('Scaron', '160'),
                b'\x8B': ('lsaquo', '2039'),
                b'\x8C': ('OElig', '152'),
                b'\x8D': '?',
                b'\x8E': ('#x17D', '17D'),
                b'\x8F': '?',
                b'\x90': '?',
                b'\x91': ('lsquo', '2018'),
                b'\x92': ('rsquo', '2019'),
                b'\x93': ('ldquo', '201C'),
                b'\x94': ('rdquo', '201D'),
                b'\x95': ('bull', '2022'),
                b'\x96': ('ndash', '2013'),
                b'\x97': ('mdash', '2014'),
                b'\x98': ('tilde', '2DC'),
                b'\x99': ('trade', '2122'),
                b'\x9a': ('scaron', '161'),
                b'\x9b': ('rsaquo', '203A'),
                b'\x9c': ('oelig', '153'),
                b'\x9d': '?',
                b'\x9e': ('#x17E', '17E'),
                b'\x9f': ('Yuml', '178'),}

    # A parochial partial mapping of ISO-Latin-1 to ASCII. Contains
    # horrors like stripping diacritical marks to turn á into a, but also
    # contains non-horrors like turning “ into ".
    MS_CHARS_TO_ASCII = {
        b'\x80' : 'EUR',
        b'\x81' : ' ',
        b'\x82' : ',',
        b'\x83' : 'f',
        b'\x84' : ',,',
        b'\x85' : '...',
        b'\x86' : '+',
        b'\x87' : '++',
        b'\x88' : '^',
        b'\x89' : '%',
        b'\x8a' : 'S',
        b'\x8b' : '<',
        b'\x8c' : 'OE',
        b'\x8d' : '?',
        b'\x8e' : 'Z',
        b'\x8f' : '?',
        b'\x90' : '?',
        b'\x91' : "'",
        b'\x92' : "'",
        b'\x93' : '"',
        b'\x94' : '"',
        b'\x95' : '*',
        b'\x96' : '-',
        b'\x97' : '--',
        b'\x98' : '~',
        b'\x99' : '(TM)',
        b'\x9a' : 's',
        b'\x9b' : '>',
        b'\x9c' : 'oe',
        b'\x9d' : '?',
        b'\x9e' : 'z',
        b'\x9f' : 'Y',
        b'\xa0' : ' ',
        b'\xa1' : '!',
        b'\xa2' : 'c',
        b'\xa3' : 'GBP',
        b'\xa4' : '$', #This approximation is especially parochial--this is the
                       #generic currency symbol.
        b'\xa5' : 'YEN',
        b'\xa6' : '|',
        b'\xa7' : 'S',
        b'\xa8' : '..',
        b'\xa9' : '',
        b'\xaa' : '(th)',
        b'\xab' : '<<',
        b'\xac' : '!',
        b'\xad' : ' ',
        b'\xae' : '(R)',
        b'\xaf' : '-',
        b'\xb0' : 'o',
        b'\xb1' : '+-',
        b'\xb2' : '2',
        b'\xb3' : '3',
        b'\xb4' : "'",
        b'\xb5' : 'u',
        b'\xb6' : 'P',
        b'\xb7' : '*',
        b'\xb8' : ',',
        b'\xb9' : '1',
        b'\xba' : '(th)',
        b'\xbb' : '>>',
        b'\xbc' : '1/4',
        b'\xbd' : '1/2',
        b'\xbe' : '3/4',
        b'\xbf' : '?',
        b'\xc0' : 'A',
        b'\xc1' : 'A',
        b'\xc2' : 'A',
        b'\xc3' : 'A',
        b'\xc4' : 'A',
        b'\xc5' : 'A',
        b'\xc6' : 'AE',
        b'\xc7' : 'C',
        b'\xc8' : 'E',
        b'\xc9' : 'E',
        b'\xca' : 'E',
        b'\xcb' : 'E',
        b'\xcc' : 'I',
        b'\xcd' : 'I',
        b'\xce' : 'I',
        b'\xcf' : 'I',
        b'\xd0' : 'D',
        b'\xd1' : 'N',
        b'\xd2' : 'O',
        b'\xd3' : 'O',
        b'\xd4' : 'O',
        b'\xd5' : 'O',
        b'\xd6' : 'O',
        b'\xd7' : '*',
        b'\xd8' : 'O',
        b'\xd9' : 'U',
        b'\xda' : 'U',
        b'\xdb' : 'U',
        b'\xdc' : 'U',
        b'\xdd' : 'Y',
        b'\xde' : 'b',
        b'\xdf' : 'B',
        b'\xe0' : 'a',
        b'\xe1' : 'a',
        b'\xe2' : 'a',
        b'\xe3' : 'a',
        b'\xe4' : 'a',
        b'\xe5' : 'a',
        b'\xe6' : 'ae',
        b'\xe7' : 'c',
        b'\xe8' : 'e',
        b'\xe9' : 'e',
        b'\xea' : 'e',
        b'\xeb' : 'e',
        b'\xec' : 'i',
        b'\xed' : 'i',
        b'\xee' : 'i',
        b'\xef' : 'i',
        b'\xf0' : 'o',
        b'\xf1' : 'n',
        b'\xf2' : 'o',
        b'\xf3' : 'o',
        b'\xf4' : 'o',
        b'\xf5' : 'o',
        b'\xf6' : 'o',
        b'\xf7' : '/',
        b'\xf8' : 'o',
        b'\xf9' : 'u',
        b'\xfa' : 'u',
        b'\xfb' : 'u',
        b'\xfc' : 'u',
        b'\xfd' : 'y',
        b'\xfe' : 'b',
        b'\xff' : 'y',
        }

    # A map used when removing rogue Windows-1252/ISO-8859-1
    # characters in otherwise UTF-8 documents.
    #
    # Note that \x81, \x8d, \x8f, \x90, and \x9d are undefined in
    # Windows-1252.
    WINDOWS_1252_TO_UTF8 = {
        0x80 : b'\xe2\x82\xac', # €
        0x82 : b'\xe2\x80\x9a', # ‚
        0x83 : b'\xc6\x92',     # ƒ
        0x84 : b'\xe2\x80\x9e', # „
        0x85 : b'\xe2\x80\xa6', # …
        0x86 : b'\xe2\x80\xa0', # †
        0x87 : b'\xe2\x80\xa1', # ‡
        0x88 : b'\xcb\x86',     # ˆ
        0x89 : b'\xe2\x80\xb0', # ‰
        0x8a : b'\xc5\xa0',     # Š
        0x8b : b'\xe2\x80\xb9', # ‹
        0x8c : b'\xc5\x92',     # Œ
        0x8e : b'\xc5\xbd',     # Ž
        0x91 : b'\xe2\x80\x98', # ‘
        0x92 : b'\xe2\x80\x99', # ’
        0x93 : b'\xe2\x80\x9c', # “
        0x94 : b'\xe2\x80\x9d', # ”
        0x95 : b'\xe2\x80\xa2', # •
        0x96 : b'\xe2\x80\x93', # –
        0x97 : b'\xe2\x80\x94', # —
        0x98 : b'\xcb\x9c',     # ˜
        0x99 : b'\xe2\x84\xa2', # ™
        0x9a : b'\xc5\xa1',     # š
        0x9b : b'\xe2\x80\xba', # ›
        0x9c : b'\xc5\x93',     # œ
        0x9e : b'\xc5\xbe',     # ž
        0x9f : b'\xc5\xb8',     # Ÿ
        0xa0 : b'\xc2\xa0',     #  
        0xa1 : b'\xc2\xa1',     # ¡
        0xa2 : b'\xc2\xa2',     # ¢
        0xa3 : b'\xc2\xa3',     # £
        0xa4 : b'\xc2\xa4',     # ¤
        0xa5 : b'\xc2\xa5',     # ¥
        0xa6 : b'\xc2\xa6',     # ¦
        0xa7 : b'\xc2\xa7',     # §
        0xa8 : b'\xc2\xa8',     # ¨
        0xa9 : b'\xc2\xa9',     # ©
        0xaa : b'\xc2\xaa',     # ª
        0xab : b'\xc2\xab',     # «
        0xac : b'\xc2\xac',     # ¬
        0xad : b'\xc2\xad',     # 
        0xae : b'\xc2\xae',     # ®
        0xaf : b'\xc2\xaf',     # ¯
        0xb0 : b'\xc2\xb0',     # °
        0xb1 : b'\xc2\xb1',     # ±
        0xb2 : b'\xc2\xb2',     # ²
        0xb3 : b'\xc2\xb3',     # ³
        0xb4 : b'\xc2\xb4',     # ´
        0xb5 : b'\xc2\xb5',     # µ
        0xb6 : b'\xc2\xb6',     # ¶
        0xb7 : b'\xc2\xb7',     # ·
        0xb8 : b'\xc2\xb8',     # ¸
        0xb9 : b'\xc2\xb9',     # ¹
        0xba : b'\xc2\xba',     # º
        0xbb : b'\xc2\xbb',     # »
        0xbc : b'\xc2\xbc',     # ¼
        0xbd : b'\xc2\xbd',     # ½
        0xbe : b'\xc2\xbe',     # ¾
        0xbf : b'\xc2\xbf',     # ¿
        0xc0 : b'\xc3\x80',     # À
        0xc1 : b'\xc3\x81',     # Á
        0xc2 : b'\xc3\x82',     # Â
        0xc3 : b'\xc3\x83',     # Ã
        0xc4 : b'\xc3\x84',     # Ä
        0xc5 : b'\xc3\x85',     # Å
        0xc6 : b'\xc3\x86',     # Æ
        0xc7 : b'\xc3\x87',     # Ç
        0xc8 : b'\xc3\x88',     # È
        0xc9 : b'\xc3\x89',     # É
        0xca : b'\xc3\x8a',     # Ê
        0xcb : b'\xc3\x8b',     # Ë
        0xcc : b'\xc3\x8c',     # Ì
        0xcd : b'\xc3\x8d',     # Í
        0xce : b'\xc3\x8e',     # Î
        0xcf : b'\xc3\x8f',     # Ï
        0xd0 : b'\xc3\x90',     # Ð
        0xd1 : b'\xc3\x91',     # Ñ
        0xd2 : b'\xc3\x92',     # Ò
        0xd3 : b'\xc3\x93',     # Ó
        0xd4 : b'\xc3\x94',     # Ô
        0xd5 : b'\xc3\x95',     # Õ
        0xd6 : b'\xc3\x96',     # Ö
        0xd7 : b'\xc3\x97',     # ×
        0xd8 : b'\xc3\x98',     # Ø
        0xd9 : b'\xc3\x99',     # Ù
        0xda : b'\xc3\x9a',     # Ú
        0xdb : b'\xc3\x9b',     # Û
        0xdc : b'\xc3\x9c',     # Ü
        0xdd : b'\xc3\x9d',     # Ý
        0xde : b'\xc3\x9e',     # Þ
        0xdf : b'\xc3\x9f',     # ß
        0xe0 : b'\xc3\xa0',     # à
        0xe1 : b'\xc3\xa1',     # á
        0xe2 : b'\xc3\xa2',     # â
        0xe3 : b'\xc3\xa3',     # ã
        0xe4 : b'\xc3\xa4',     # ä
        0xe5 : b'\xc3\xa5',     # å
        0xe6 : b'\xc3\xa6',     # æ
        0xe7 : b'\xc3\xa7',     # ç
        0xe8 : b'\xc3\xa8',     # è
        0xe9 : b'\xc3\xa9',     # é
        0xea : b'\xc3\xaa',     # ê
        0xeb : b'\xc3\xab',     # ë
        0xec : b'\xc3\xac',     # ì
        0xed : b'\xc3\xad',     # í
        0xee : b'\xc3\xae',     # î
        0xef : b'\xc3\xaf',     # ï
        0xf0 : b'\xc3\xb0',     # ð
        0xf1 : b'\xc3\xb1',     # ñ
        0xf2 : b'\xc3\xb2',     # ò
        0xf3 : b'\xc3\xb3',     # ó
        0xf4 : b'\xc3\xb4',     # ô
        0xf5 : b'\xc3\xb5',     # õ
        0xf6 : b'\xc3\xb6',     # ö
        0xf7 : b'\xc3\xb7',     # ÷
        0xf8 : b'\xc3\xb8',     # ø
        0xf9 : b'\xc3\xb9',     # ù
        0xfa : b'\xc3\xba',     # ú
        0xfb : b'\xc3\xbb',     # û
        0xfc : b'\xc3\xbc',     # ü
        0xfd : b'\xc3\xbd',     # ý
        0xfe : b'\xc3\xbe',     # þ
        }

    MULTIBYTE_MARKERS_AND_SIZES = [
        (0xc2, 0xdf, 2), # 2-byte characters start with a byte C2-DF
        (0xe0, 0xef, 3), # 3-byte characters start with E0-EF
        (0xf0, 0xf4, 4), # 4-byte characters start with F0-F4
        ]

    FIRST_MULTIBYTE_MARKER = MULTIBYTE_MARKERS_AND_SIZES[0][0]
    LAST_MULTIBYTE_MARKER = MULTIBYTE_MARKERS_AND_SIZES[-1][1]

    @classmethod
    def detwingle(cls, in_bytes, main_encoding="utf8",
                  embedded_encoding="windows-1252"):
        """Fix characters from one encoding embedded in some other encoding.

        Currently the only situation supported is Windows-1252 (or its
        subset ISO-8859-1), embedded in UTF-8.

        The input must be a bytestring. If you've already converted
        the document to Unicode, you're too late.

        The output is a bytestring in which `embedded_encoding`
        characters have been converted to their `main_encoding`
        equivalents.
        """
        if embedded_encoding.replace('_', '-').lower() not in (
            'windows-1252', 'windows_1252'):
            raise NotImplementedError(
                "Windows-1252 and ISO-8859-1 are the only currently supported "
                "embedded encodings.")

        if main_encoding.lower() not in ('utf8', 'utf-8'):
            raise NotImplementedError(
                "UTF-8 is the only currently supported main encoding.")

        byte_chunks = []

        chunk_start = 0
        pos = 0
        while pos < len(in_bytes):
            byte = in_bytes[pos]
            if not isinstance(byte, int):
                # Python 2.x
                byte = ord(byte)
            if (byte >= cls.FIRST_MULTIBYTE_MARKER
                and byte <= cls.LAST_MULTIBYTE_MARKER):
                # This is the start of a UTF-8 multibyte character. Skip
                # to the end.
                for start, end, size in cls.MULTIBYTE_MARKERS_AND_SIZES:
                    if byte >= start and byte <= end:
                        pos += size
                        break
            elif byte >= 0x80 and byte in cls.WINDOWS_1252_TO_UTF8:
                # We found a Windows-1252 character!
                # Save the string up to this point as a chunk.
                byte_chunks.append(in_bytes[chunk_start:pos])

                # Now translate the Windows-1252 character into UTF-8
                # and add it as another chunk (two or three bytes long).
                byte_chunks.append(cls.WINDOWS_1252_TO_UTF8[byte])
                pos += 1
                chunk_start = pos
            else:
                # Go on to the next character.
                pos += 1
        if chunk_start == 0:
            # The string is unchanged.
            return in_bytes
        else:
            # Store the final chunk.
            byte_chunks.append(in_bytes[chunk_start:])
        return b''.join(byte_chunks)
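
Note on detwingle: the method above walks the bytestring, skips anything that looks like the start of a UTF-8 multibyte sequence, and re-encodes stray Windows-1252 bytes. A minimal Python 3 sketch of the same idea follows. It is not the vendored implementation: the name detwingle_sketch is hypothetical, and the standard library's "windows-1252" codec stands in for the WINDOWS_1252_TO_UTF8 table, so coverage of individual bytes may differ slightly.

```python
# Lead-byte ranges for UTF-8 multibyte sequences: (first, last, total size).
MULTIBYTE_SIZES = [(0xC2, 0xDF, 2), (0xE0, 0xEF, 3), (0xF0, 0xF4, 4)]


def detwingle_sketch(in_bytes):
    """Re-encode stray Windows-1252 bytes inside a UTF-8 bytestring."""
    out = bytearray()
    pos = 0
    while pos < len(in_bytes):
        byte = in_bytes[pos]
        size = 1
        for start, end, width in MULTIBYTE_SIZES:
            if start <= byte <= end:
                size = width
                break
        if size > 1:
            # Assumed to be a UTF-8 sequence: copy it through untouched.
            out += in_bytes[pos:pos + size]
        elif byte >= 0x80:
            try:
                # Assumption: the stdlib codec replaces the lookup table.
                out += bytes([byte]).decode("windows-1252").encode("utf-8")
            except UnicodeDecodeError:
                # Undefined in Windows-1252 (e.g. 0x81): keep the raw byte.
                out.append(byte)
        else:
            out.append(byte)
        pos += size
    return bytes(out)


snowman = "\N{SNOWMAN}".encode("utf-8")
quoted = "\N{LEFT DOUBLE QUOTATION MARK}Hi\N{RIGHT DOUBLE QUOTATION MARK}".encode("windows-1252")
fixed = detwingle_sketch(snowman + quoted)
```

Decoding fixed as UTF-8 should then succeed, whereas decoding the raw concatenation would fail on the Windows-1252 quote bytes.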



================================================
FILE: example/parallax_svg_tools/bs4/diagnose.py
================================================
"""Diagnostic functions, mainly for use when doing tech support."""

# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
__license__ = "MIT"

import cProfile
from StringIO import StringIO
from HTMLParser import HTMLParser
import bs4
from bs4 import BeautifulSoup, __version__
from bs4.builder import builder_registry

import os
import pstats
import random
import tempfile
import time
import traceback
import sys

def diagnose(data):
    """Diagnostic suite for isolating common problems."""
    print "Diagnostic running on Beautiful Soup %s" % __version__
    print "Python version %s" % sys.version

    basic_parsers = ["html.parser", "html5lib", "lxml"]
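    # Iterate over a copy: removing from a list while looping over it
    # would skip the entry that follows each removed name.
    for name in list(basic_parsers):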
        for builder in builder_registry.builders:
            if name in builder.features:
                break
        else:
            basic_parsers.remove(name)
            print (
                "I noticed that %s is not installed. Installing it may help." %
                name)

    if 'lxml' in basic_parsers:
        basic_parsers.append(["lxml", "xml"])
        try:
            from lxml import etree
            print "Found lxml version %s" % ".".join(map(str,etree.LXML_VERSION))
        except ImportError, e:
            print (
                "lxml is not installed or couldn't be imported.")


    if 'html5lib' in basic_parsers:
        try:
            import html5lib
            print "Found html5lib version %s" % html5lib.__version__
        except ImportError, e:
            print (
                "html5lib is not installed or couldn't be imported.")

    if hasattr(data, 'read'):
        data = data.read()
    elif os.path.exists(data):
        print '"%s" looks like a filename. Reading data from the file.' % data
        with open(data) as fp:
            data = fp.read()
    elif data.startswith("http:") or data.startswith("https:"):
        print '"%s" looks like a URL. Beautiful Soup is not an HTTP client.' % data
        print "You need to use some other library to get the document behind the URL, and feed that document to Beautiful Soup."
        return
    print

    for parser in basic_parsers:
        print "Trying to parse your markup with %s" % parser
        success = False
        try:
            soup = BeautifulSoup(data, parser)
            success = True
        except Exception, e:
            print "%s could not parse the markup." % parser
            traceback.print_exc()
        if success:
            print "Here's what %s did with the markup:" % parser
            print soup.prettify()

        print "-" * 80

def lxml_trace(data, html=True, **kwargs):
    """Print out the lxml events that occur during parsing.

    This lets you see how lxml parses a document when no Beautiful
    Soup code is running.
    """
    from lxml import etree
    for event, element in etree.iterparse(StringIO(data), html=html, **kwargs):
        print("%s, %4s, %s" % (event, element.tag, element.text))

class AnnouncingParser(HTMLParser):
    """Announces HTMLParser parse events, without doing anything else."""

    def _p(self, s):
        print(s)

    def handle_starttag(self, name, attrs):
        self._p("%s START" % name)

    def handle_endtag(self, name):
        self._p("%s END" % name)

    def handle_data(self, data):
        self._p("%s DATA" % data)

    def handle_charref(self, name):
        self._p("%s CHARREF" % name)

    def handle_entityref(self, name):
        self._p("%s ENTITYREF" % name)

    def handle_comment(self, data):
        self._p("%s COMMENT" % data)

    def handle_decl(self, data):
        self._p("%s DECL" % data)

    def unknown_decl(self, data):
        self._p("%s UNKNOWN-DECL" % data)

    def handle_pi(self, data):
        self._p("%s PI" % data)

def htmlparser_trace(data):
    """Print out the HTMLParser events that occur during parsing.

    This lets you see how HTMLParser parses a document when no
    Beautiful Soup code is running.
    """
    parser = AnnouncingParser()
    parser.feed(data)

_vowels = "aeiou"
_consonants = "bcdfghjklmnpqrstvwxyz"

def rword(length=5):
    "Generate a random word-like string."
    s = ''
    for i in range(length):
        if i % 2 == 0:
            t = _consonants
        else:
            t = _vowels
        s += random.choice(t)
    return s

def rsentence(length=4):
    "Generate a random sentence-like string."
    return " ".join(rword(random.randint(4,9)) for i in range(length))
        
def rdoc(num_elements=1000):
    """Randomly generate an invalid HTML document."""
    tag_names = ['p', 'div', 'span', 'i', 'b', 'script', 'table']
    elements = []
    for i in range(num_elements):
        choice = random.randint(0,3)
        if choice == 0:
            # New tag.
            tag_name = random.choice(tag_names)
            elements.append("<%s>" % tag_name)
        elif choice == 1:
            elements.append(rsentence(random.randint(1,4)))
        elif choice == 2:
            # Close a tag.
            tag_name = random.choice(tag_names)
            elements.append("</%s>" % tag_name)
    return "<html>" + "\n".join(elements) + "</html>"

def benchmark_parsers(num_elements=100000):
    """Very basic head-to-head performance benchmark."""
    print "Comparative parser benchmark on Beautiful Soup %s" % __version__
    data = rdoc(num_elements)
    print "Generated a large invalid HTML document (%d bytes)." % len(data)
    
    for parser in ["lxml", ["lxml", "html"], "html5lib", "html.parser"]:
        success = False
        try:
            a = time.time()
            soup = BeautifulSoup(data, parser)
            b = time.time()
            success = True
        except Exception, e:
            print "%s could not parse the markup." % parser
            traceback.print_exc()
        if success:
            print "BS4+%s parsed the markup in %.2fs." % (parser, b-a)

    from lxml import etree
    a = time.time()
    etree.HTML(data)
    b = time.time()
    print "Raw lxml parsed the markup in %.2fs." % (b-a)

    import html5lib
    parser = html5lib.HTMLParser()
    a = time.time()
    parser.parse(data)
    b = time.time()
    print "Raw html5lib parsed the markup in %.2fs." % (b-a)

def profile(num_elements=100000, parser="lxml"):
    """Profile Beautiful Soup parsing a random document with cProfile."""

    filehandle = tempfile.NamedTemporaryFile()
    filename = filehandle.name

    data = rdoc(num_elements)
    vars = dict(bs4=bs4, data=data, parser=parser)
    cProfile.runctx('bs4.BeautifulSoup(data, parser)' , vars, vars, filename)

    stats = pstats.Stats(filename)
    # stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats('_html5lib|bs4', 50)

if __name__ == '__main__':
    diagnose(sys.stdin.read())
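
Note on AnnouncingParser: the class above targets Python 2's HTMLParser module and prints each event. Under Python 3 the same tracing idea can be sketched with the standard library's html.parser. The names RecordingParser and trace_events are hypothetical, and the sketch records events in a list instead of printing them so the result can be inspected.

```python
from html.parser import HTMLParser


class RecordingParser(HTMLParser):
    """Hypothetical Python 3 variant that records parse events."""

    def __init__(self):
        # convert_charrefs=False keeps entity references as separate events.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, name, attrs):
        self.events.append("%s START" % name)

    def handle_endtag(self, name):
        self.events.append("%s END" % name)

    def handle_data(self, data):
        self.events.append("%s DATA" % data)

    def handle_entityref(self, name):
        self.events.append("%s ENTITYREF" % name)

    def handle_comment(self, data):
        self.events.append("%s COMMENT" % data)


def trace_events(markup):
    """Feed markup to the parser and return the recorded events."""
    parser = RecordingParser()
    parser.feed(markup)
    parser.close()
    return parser.events


events = trace_events("<p>hi</p>")
```

Recording rather than printing is only a convenience for inspection; swapping the append calls for print calls would give the same behavior as the original class.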


================================================
FILE: example/parallax_svg_tools/bs4/element.py
================================================
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.
__license__ = "MIT"

import collections
import re
import shlex
import sys
import warnings
from bs4.dammit import EntitySubstitution

DEFAULT_OUTPUT_ENCODING = "utf-8"
PY3K = (sys.version_info[0] > 2)

whitespace_re = re.compile(r"\s+")

def _alias(attr):
    """Alias one attribute name to another for backward compatibility"""
    @property
    def alias(self):
        return getattr(self, attr)

    @alias.setter
    def alias(self, value):
        # The setter must accept the new value and forward it.
        return setattr(self, attr, value)
    return alias


class NamespacedAttribute(unicode):

    def __new__(cls, prefix, name, namespace=None):
        if name is None:
            obj = unicode.__new__(cls, prefix)
        elif prefix is None:
            # Not really namespaced.
            obj = unicode.__new__(cls, name)
        else:
            obj = unicode.__new__(cls, prefix + ":" + name)
        obj.prefix = prefix
        obj.name = name
        obj.namespace = namespace
        return obj

class AttributeValueWithCharsetSubstitution(unicode):
    """A stand-in object for a character encoding specified in HTML."""

class CharsetMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    """A generic stand-in for the value of a meta tag's 'charset' attribute.

    When Beautiful Soup parses the markup '<meta charset="utf8">', the
    value of the 'charset' attribute will be one of these objects.
    """

    def __new__(cls, original_value):
        obj = unicode.__new__(cls, original_value)
        obj.original_value = original_value
        return obj

    def encode(self, encoding):
        return encoding


class ContentMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    """A generic stand-in for the value of a meta tag's 'content' attribute.

    When Beautiful Soup parses the markup:
     <meta http-equiv="content-type" content="text/html; charset=utf8">

    The value of the 'content' attribute will be one of these objects.
    """

    CHARSET_RE = re.compile(r"((^|;)\s*charset=)([^;]*)", re.M)

    def __new__(cls, original_value):
        match = cls.CHARSET_RE.search(original_value)
        if match is None:
            # No substitution necessary.
            return unicode.__new__(unicode, original_value)

        obj = unicode.__new__(cls, original_value)
        obj.original_value = original_value
        return obj

    def encode(self, encoding):
        def rewrite(match):
            return match.group(1) + encoding
        return self.CHARSET_RE.sub(rewrite, self.original_value)

class HTMLAwareEntitySubstitution(EntitySubstitution):

    """Entity substitution rules that are aware of some HTML quirks.

    Specifically, the contents of <script> and <style> tags should not
    undergo entity substitution.

    Incoming NavigableString objects are checked to see if they're the
    direct children of a <script> or <style> tag.
    """

    cdata_containing_tags = set(["script", "style"])

    preformatted_tags = set(["pre"])

    preserve_whitespace_tags = set(['pre', 'textarea'])

    @classmethod
    def _substitute_if_appropriate(cls, ns, f):
        if (isinstance(ns, NavigableString)
            and ns.parent is not None
            and ns.parent.name in cls.cdata_containing_tags):
            # Do nothing.
            return ns
        # Substitute.
        return f(ns)

    @classmethod
    def substitute_html(cls, ns):
        return cls._substitute_if_appropriate(
            ns, EntitySubstitution.substitute_html)

    @classmethod
    def substitute_xml(cls, ns):
        return cls._substitute_if_appropriate(
            ns, EntitySubstitution.substitute_xml)

class PageElement(object):
    """Contains the navigational information for some part of the page
    (either a tag or a piece of text)"""

    # There are four possible values for the "formatter" argument passed in
    # to methods like encode() and prettify():
    #
    # "html" - All Unicode characters with corresponding HTML entities
    #   are converted to those entities on output.
    # "minimal" - Bare ampersands and angle brackets are converted to
    #   XML entities: &amp; &lt; &gt;
    # None - The null formatter. Unicode characters are never
    #   converted to entities.  This is not recommended, but it's
    #   faster than "minimal".
    # A function - This function will be called on every string that
    #  needs to undergo entity substitution.
    #

    # In an HTML document, the default "html" and "minimal" functions
    # will leave the contents of <script> and <style> tags alone. For
    # an XML document, all tags will be given the same treatment.

    HTML_FORMATTERS = {
        "html" : HTMLAwareEntitySubstitution.substitute_html,
        "minimal" : HTMLAwareEntitySubstitution.substitute_xml,
        None : None
        }

    XML_FORMATTERS = {
        "html" : EntitySubstitution.substitute_html,
        "minimal" : EntitySubstitution.substitute_xml,
        None : None
        }

    def format_string(self, s, formatter='minimal'):
        """Format the given string using the given formatter."""
        if not callable(formatter):
            formatter = self._formatter_for_name(formatter)
        if formatter is None:
            output = s
        else:
            output = formatter(s)
        return output

    @property
    def _is_xml(self):
        """Is this element part of an XML tree or an HTML tree?

        This is used when mapping a formatter name ("minimal") to an
        appropriate function (one that performs entity-substitution on
        the contents of <script> and <style> tags, or not). It can be
        inefficient, but it should be called very rarely.
        """
        if self.known_xml is not None:
            # Most of the time we will have determined this when the
            # document is parsed.
            return self.known_xml

        # Otherwise, it's likely that this element was created by
        # direct invocation of the constructor from within the user's
        # Python code.
        if self.parent is None:
            # This is the top-level object. It should have .known_xml set
            # from tree creation. If not, take a guess--BS is usually
            # used on HTML markup.
            return getattr(self, 'is_xml', False)
        return self.parent._is_xml

    def _formatter_for_name(self, name):
        "Look up a formatter function based on its name and the tree."
        if self._is_xml:
            return self.XML_FORMATTERS.get(
                name, EntitySubstitution.substitute_xml)
        else:
            return self.HTML_FORMATTERS.get(
                name, HTMLAwareEntitySubstitution.substitute_xml)

    def setup(self, parent=None, previous_element=None, next_element=None,
              previous_sibling=None, next_sibling=None):
        """Sets up the initial relations between this element and
        other elements."""
        self.parent = parent

        self.previous_element = previous_element
        if previous_element is not None:
            self.previous_element.next_element = self

        self.next_element = next_element
        if self.next_element:
            self.next_element.previous_element = self

        self.next_sibling = next_sibling
        if self.next_sibling:
            self.next_sibling.previous_sibling = self

        if (not previous_sibling
            and self.parent is not None and self.parent.contents):
            previous_sibling = self.parent.contents[-1]

        self.previous_sibling = previous_sibling
        if previous_sibling:
            self.previous_sibling.next_sibling = self

    nextSibling = _alias("next_sibling")  # BS3
    previousSibling = _alias("previous_sibling")  # BS3

    def replace_with(self, replace_with):
        if not self.parent:
            raise ValueError(
                "Cannot replace one element with another when the "
                "element to be replaced is not part of a tree.")
        if replace_with is self:
            return
        if replace_with is self.parent:
            raise ValueError("Cannot replace a Tag with its parent.")
        old_parent = self.parent
        my_index = self.parent.index(self)
        self.extract()
        old_parent.insert(my_index, replace_with)
        return self
    replaceWith = replace_with  # BS3

    def unwrap(self):
        my_parent = self.parent
        if not self.parent:
            raise ValueError(
                "Cannot replace an element with its contents when that "
                "element is not part of a tree.")
        my_index = self.parent.index(self)
        self.extract()
        for child in reversed(self.contents[:]):
            my_parent.insert(my_index, child)
        return self
    replace_with_children = unwrap
    replaceWithChildren = unwrap  # BS3

    def wrap(self, wrap_inside):
        me = self.replace_with(wrap_inside)
        wrap_inside.append(me)
        return wrap_inside

    def extract(self):
        """Destructively rips this element out of the tree."""
        if self.parent is not None:
            del self.parent.contents[self.parent.index(self)]

        #Find the two elements that would be next to each other if
        #this element (and any children) hadn't been parsed. Connect
        #the two.
        last_child = self._last_descendant()
        next_element = last_child.next_element

        if (self.previous_element is not None and
            self.previous_element is not next_element):
            self.previous_element.next_element = next_element
        if next_element is not None and next_element is not self.previous_element:
            next_element.previous_element = self.previous_element
        self.previous_element = None
        last_child.next_element = None

        self.parent = None
        if (self.previous_sibling is not None
            and self.previous_sibling is not self.next_sibling):
            self.previous_sibling.next_sibling = self.next_sibling
        if (self.next_sibling is not None
            and self.next_sibling is not self.previous_sibling):
            self.next_sibling.previous_sibling = self.previous_sibling
        self.previous_sibling = self.next_sibling = None
        return self

    def _last_descendant(self, is_initialized=True, accept_self=True):
        "Finds the last element beneath this object to be parsed."
        if is_initialized and self.next_sibling:
            last_child = self.next_sibling.previous_element
        else:
            last_child = self
            while isinstance(last_child, Tag) and last_child.contents:
                last_child = last_child.contents[-1]
        if not accept_self and last_child is self:
            last_child = None
        return last_child
    # BS3: Not part of the API!
    _lastRecursiveChild = _last_descendant

    def insert(self, position, new_child):
        if new_child is None:
            raise ValueError("Cannot insert None into a tag.")
        if new_child is self:
            raise ValueError("Cannot insert a tag into itself.")
        if (isinstance(new_child, basestring)
            and not isinstance(new_child, NavigableString)):
            new_child = NavigableString(new_child)

        position = min(position, len(self.contents))
        if hasattr(new_child, 'parent') and new_child.parent is not None:
            # We're 'inserting' an element that's already one
            # of this object's children.
            if new_child.parent is self:
                current_index = self.index(new_child)
                if current_index < position:
                    # We're moving this element further down the list
                    # of this object's children. That means that when
                    # we extract this element, our target index will
                    # jump down one.
                    position -= 1
            new_child.extract()

        new_child.parent = self
        previous_child = None
        if position == 0:
            new_child.previous_sibling = None
            new_child.previous_element = self
        else:
            previous_child = self.contents[position - 1]
            new_child.previous_sibling = previous_child
            new_child.previous_sibling.next_sibling = new_child
            new_child.previous_element = previous_child._last_descendant(False)
        if new_child.previous_element is not None:
            new_child.previous_element.next_element = new_child

        new_childs_last_element = new_child._last_descendant(False)

        if position >= len(self.contents):
            new_child.next_sibling = None

            parent = self
            parents_next_sibling = None
            while parents_next_sibling is None and parent is not None:
                parents_next_sibling = parent.next_sibling
                parent = parent.parent
                if parents_next_sibling is not None:
                    # We found the element that comes next in the document.
                    break
            if parents_next_sibling is not None:
                new_childs_last_element.next_element = parents_next_sibling
            else:
                # The last element of this tag is the last element in
                # the document.
                new_childs_last_element.next_element = None
        else:
            next_child = self.contents[position]
            new_child.next_sibling = next_child
            if new_child.next_sibling is not None:
                new_child.next_sibling.previous_sibling = new_child
            new_childs_last_element.next_element = next_child

        if new_childs_last_element.next_element is not None:
            new_childs_last_element.next_element.previous_element = new_childs_last_element
        self.contents.insert(position, new_child)

    def append(self, tag):
        """Appends the given tag to the contents of this tag."""
        self.insert(len(self.contents), tag)

    def insert_before(self, predecessor):
        """Makes the given element the immediate predecessor of this one.

        The two elements will have the same parent, and the given element
        will be immediately before this one.
        """
        if self is predecessor:
            raise ValueError("Can't insert an element before itself.")
        parent = self.parent
        if parent is None:
            raise ValueError(
                "Element has no parent, so 'before' has no meaning.")
        # Extract first so that the index won't be screwed up if they
        # are siblings.
        if isinstance(predecessor, PageElement):
            predecessor.extract()
        index = parent.index(self)
        parent.insert(index, predecessor)

    def insert_after(self, successor):
        """Makes the given element the immediate successor of this one.

        The two elements will have the same parent, and the given element
        will be immediately after this one.
        """
        if self is successor:
            raise ValueError("Can't insert an element after itself.")
        parent = self.parent
        if parent is None:
            raise ValueError(
                "Element has no parent, so 'after' has no meaning.")
        # Extract first so that the index won't be screwed up if they
        # are siblings.
        if isinstance(successor, PageElement):
            successor.extract()
        index = parent.index(self)
        parent.insert(index+1, successor)

    def find_next(self, name=None, attrs={}, text=None, **kwargs):
        """Returns the first item that matches the given criteria and
        appears after this Tag in the document."""
        return self._find_one(self.find_all_next, name, attrs, text, **kwargs)
    findNext = find_next  # BS3

    def find_all_next(self, name=None, attrs={}, text=None, limit=None,
                    **kwargs):
        """Returns all items that match the given criteria and appear
        after this Tag in the document."""
        return self._find_all(name, attrs, text, limit, self.next_elements,
                             **kwargs)
    findAllNext = find_all_next  # BS3

    def find_next_sibling(self, name=None, attrs={}, text=None, **kwargs):
        """Returns the closest sibling to this Tag that matches the
        given criteria and appears after this Tag in the document."""
        return self._find_one(self.find_next_siblings, name, attrs, text,
                             **kwargs)
    findNextSibling = find_next_sibling  # BS3

    def find_next_siblings(self, name=None, attrs={}, text=None, limit=None,
                           **kwargs):
        """Returns the siblings of this Tag that match the given
        criteria and appear after this Tag in the document."""
        return self._find_all(name, attrs, text, limit,
                              self.next_siblings, **kwargs)
    findNextSiblings = find_next_siblings   # BS3
    fetchNextSiblings = find_next_siblings  # BS2

    def find_previous(self, name=None, attrs={}, text=None, **kwargs):
        """Returns the first item that matches the given criteria and
        appears before this Tag in the document."""
        return self._find_one(
            self.find_all_previous, name, attrs, text, **kwargs)
    findPrevious = find_previous  # BS3

    def find_all_previous(self, name=None, attrs={}, text=None, limit=None,
                        **kwargs):
        """Returns all items that match the given criteria and appear
        before this Tag in the document."""
        return self._find_all(name, attrs, text, limit, self.previous_elements,
                           **kwargs)
    findAllPrevious = find_all_previous  # BS3
    fetchPrevious = find_all_previous    # BS2

    def find_previous_sibling(self, name=None, attrs={}, text=None, **kwargs):
        """Returns the closest sibling to this Tag that matches the
        given criteria and appears before this Tag in the document."""
        return self._find_one(self.find_previous_siblings, name, attrs, text,
                             **kwargs)
    findPreviousSibling = find_previous_sibling  # BS3

    def find_previous_siblings(self, name=None, attrs={}, text=None,
                               limit=None, **kwargs):
        """Returns the siblings of this Tag that match the given
        criteria and appear before this Tag in the document."""
        return self._find_all(name, attrs, text, limit,
                              self.previous_siblings, **kwargs)
    findPreviousSiblings = find_previous_siblings   # BS3
    fetchPreviousSiblings = find_previous_siblings  # BS2

    def find_parent(self, name=None, attrs={}, **kwargs):
        """Returns the closest parent of this Tag that matches the given
        criteria."""
        # NOTE: We can't use _find_one because find_parents takes no
        # 'text' argument.
        r = None
        l = self.find_parents(name, attrs, 1, **kwargs)
        if l:
            r = l[0]
        return r
    findParent = find_parent  # BS3

    def find_parents(self, name=None, attrs={}, limit=None, **kwargs):
        """Returns the parents of this Tag that match the given
        criteria."""

        return self._find_all(name, attrs, None, limit, self.parents,
                             **kwargs)
    findParents = find_parents   # BS3
    fetchParents = find_parents  # BS2

    @property
    def next(self):
        return self.next_element

    @property
    def previous(self):
        return self.previous_element

    #These methods do the real heavy lifting.

    def _find_one(self, method, name, attrs, text, **kwargs):
        r = None
        l = method(name, attrs, text, 1, **kwargs)
        if l:
            r = l[0]
        return r

    def _find_all(self, name, attrs, text, limit, generator, **kwargs):
        "Iterates over a generator looking for things that match."

        if text is None and 'string' in kwargs:
            text = kwargs['string']
            del kwargs['string']

        if isinstance(name, SoupStrainer):
            strainer = name
        else:
            strainer = SoupStrainer(name, attrs, text, **kwargs)

        if text is None and not limit and not attrs and not kwargs:
            if name is True or name is None:
                # Optimization to find all tags.
                result = (element for element in generator
                          if isinstance(element, Tag))
                return ResultSet(strainer, result)
            elif isinstance(name, basestring):
                # Optimization to find all tags with a given name.
                result = (element for element in generator
                          if isinstance(element, Tag)
                            and element.name == name)
                return ResultSet(strainer, result)
        results = ResultSet(strainer)
        while True:
            try:
                i = next(generator)
            except StopIteration:
                break
            if i:
                found = strainer.search(i)
                if found:
                    results.append(found)
                    if limit and len(results) >= limit:
                        break
        return results

    #These generators can be used to navigate starting from both
    #NavigableStrings and Tags.
    @property
    def next_elements(self):
        i = self.next_element
        while i is not None:
            yield i
            i = i.next_element

    @property
    def next_siblings(self):
        i = self.next_sibling
        while i is not None:
            yield i
            i = i.next_sibling

    @property
    def previous_elements(self):
        i = self.previous_element
        while i is not None:
            yield i
            i = i.previous_element

    @property
    def previous_siblings(self):
        i = self.previous_sibling
        while i is not None:
            yield i
            i = i.previous_sibling

    @property
    def parents(self):
        i = self.parent
        while i is not None:
            yield i
            i = i.parent

    # Methods for supporting CSS selectors.

    tag_name_re = re.compile('^[a-zA-Z0-9][-.a-zA-Z0-9:_]*$')

    # /^([a-zA-Z0-9][-.a-zA-Z0-9:_]*)\[(\w+)([=~\|\^\$\*]?)=?"?([^\]"]*)"?\]$/
    #   \---------------------------/  \---/\-------------/    \-------/
    #     |                              |         |               |
    #     |                              |         |           The value
    #     |                              |    ~,|,^,$,* or =
    #     |                           Attribute
    #    Tag
    attribselect_re = re.compile(
        r'^(?P<tag>[a-zA-Z0-9][-.a-zA-Z0-9:_]*)?\[(?P<attribute>[\w-]+)(?P<operator>[=~\|\^\$\*]?)' +
        r'=?"?(?P<value>[^\]"]*)"?\]$'
        )

    def _attr_value_as_string(self, value, default=None):
        """Force an attribute value into a string representation.

        A multi-valued attribute will be converted into a
        space-separated string.
        """
        value = self.get(value, default)
        if isinstance(value, list) or isinstance(value, tuple):
            value = " ".join(value)
        return value

    def _tag_name_matches_and(self, function, tag_name):
        if not tag_name:
            return function
        else:
            def _match(tag):
                return tag.name == tag_name and function(tag)
            return _match

    def _attribute_checker(self, operator, attribute, value=''):
        """Create a function that performs a CSS selector operation.

        Takes an operator, attribute and optional value. Returns a
        function that will return True for elements that match that
        combination.
        """
        if operator == '=':
            # string representation of `attribute` is equal to `value`
            return lambda el: el._attr_value_as_string(attribute) == value
        elif operator == '~':
            # space-separated list representation of `attribute`
            # contains `value`
            def _includes_value(element):
                attribute_value = element.get(attribute, [])
                if not isinstance(attribute_value, list):
                    attribute_value = attribute_value.split()
                return value in attribute_value
            return _includes_value
        elif operator == '^':
            # string representation of `attribute` starts with `value`
            return lambda el: el._attr_value_as_string(
                attribute, '').startswith(value)
        elif operator == '$':
            # string representation of `attribute` ends with `value`
            return lambda el: el._attr_value_as_string(
                attribute, '').endswith(value)
        elif operator == '*':
            # string representation of `attribute` contains `value`
            return lambda el: value in el._attr_value_as_string(attribute, '')
        elif operator == '|':
            # string representation of `attribute` is either exactly
            # `value` or starts with `value` and then a dash.
            def _is_or_starts_with_dash(element):
                attribute_value = element._attr_value_as_string(attribute, '')
                return (attribute_value == value or attribute_value.startswith(
                        value + '-'))
            return _is_or_starts_with_dash
        else:
            return lambda el: el.has_attr(attribute)

    # Old non-property versions of the generators, for backwards
    # compatibility with BS3.
    def nextGenerator(self):
        return self.next_elements

    def nextSiblingGenerator(self):
        return self.next_siblings

    def previousGenerator(self):
        return self.previous_elements

    def previousSiblingGenerator(self):
        return self.previous_siblings

    def parentGenerator(self):
        return self.parents


class NavigableString(unicode, PageElement):

    PREFIX = ''
    SUFFIX = ''

    # We can't tell just by looking at a string whether it's contained
    # in an XML document or an HTML document.

    known_xml = None

    def __new__(cls, value):
        """Create a new NavigableString.

        When unpickling a NavigableString, this method is called with
        the string in DEFAULT_OUTPUT_ENCODING. That encoding needs to be
        passed in to the superclass's __new__ or the superclass won't know
        how to handle non-ASCII characters.
        """
        if isinstance(value, unicode):
            u = unicode.__new__(cls, value)
        else:
            u = unicode.__new__(cls, value, DEFAULT_OUTPUT_ENCODING)
        u.setup()
        return u

    def __copy__(self):
        """A copy of a NavigableString has the same contents and class
        as the original, but it is not connected to the parse tree.
        """
        return type(self)(self)

    def __getnewargs__(self):
        return (unicode(self),)

    def __getattr__(self, attr):
        """text.string gives you text. This is for backwards
        compatibility for Navigable*String, but for CData* it lets you
        get the string without the CData wrapper."""
        if attr == 'string':
            return self
        else:
            raise AttributeError(
                "'%s' object has no attribute '%s'" % (
                    self.__class__.__name__, attr))

    def output_ready(self, formatter="minimal"):
        output = self.format_string(self, formatter)
        return self.PREFIX + output + self.SUFFIX

    @property
    def name(self):
        return None

    @name.setter
    def name(self, name):
        raise AttributeError("A NavigableString cannot be given a name.")

class PreformattedString(NavigableString):
    """A NavigableString not subject to the normal formatting rules.

    The string will be passed into the formatter (to trigger side effects),
    but the return value will be ignored.
    """

    def output_ready(self, formatter="minimal"):
        """CData strings are passed into the formatter.
        But the return value is ignored."""
        self.format_string(self, formatter)
        return self.PREFIX + self + self.SUFFIX

class CData(PreformattedString):

    PREFIX = u'<![CDATA['
    SUFFIX = u']]>'

class ProcessingInstruction(PreformattedString):
    """An SGML processing instruction."""

    PREFIX = u'<?'
    SUFFIX = u'>'

class XMLProcessingInstruction(ProcessingInstruction):
    """An XML processing instruction."""
    PREFIX = u'<?'
    SUFFIX = u'?>'

class Comment(PreformattedString):

    PREFIX = u'<!--'
    SUFFIX = u'-->'


class Declaration(PreformattedString):
    PREFIX = u'<?'
    SUFFIX = u'?>'


class Doctype(PreformattedString):

    @classmethod
    def for_name_and_ids(cls, name, pub_id, system_id):
        value = name or ''
        if pub_id is not None:
            value += ' PUBLIC "%s"' % pub_id
            if system_id is not None:
                value += ' "%s"' % system_id
        elif system_id is not None:
            value += ' SYSTEM "%s"' % system_id

        return Doctype(value)

    PREFIX = u'<!DOCTYPE '
    SUFFIX = u'>\n'


class Tag(PageElement):

    """Represents a found HTML tag with its attributes and contents."""

    def __init__(self, parser=None, builder=None, name=None, namespace=None,
                 prefix=None, attrs=None, parent=None, previous=None,
                 is_xml=None):
        "Basic constructor."

        if parser is None:
            self.parser_class = None
        else:
            # We don't actually store the parser object: that lets extracted
            # chunks be garbage-collected.
            self.parser_class = parser.__class__
        if name is None:
            raise ValueError("No value provided for new tag's name.")
        self.name = name
        self.namespace = namespace
        self.prefix = prefix
        if builder is not None:
            preserve_whitespace_tags = builder.preserve_whitespace_tags
        else:
            if is_xml:
                preserve_whitespace_tags = []
            else:
                preserve_whitespace_tags = HTMLAwareEntitySubstitution.preserve_whitespace_tags
        self.preserve_whitespace_tags = preserve_whitespace_tags
        if attrs is None:
            attrs = {}
        elif attrs:
            if builder is not None and builder.cdata_list_attributes:
                attrs = builder._replace_cdata_list_attribute_values(
                    self.name, attrs)
            else:
                attrs = dict(attrs)
        else:
            attrs = dict(attrs)

        # If possible, determine ahead of time whether this tag is an
        # XML tag.
        if builder:
            self.known_xml = builder.is_xml
        else:
            self.known_xml = is_xml
        self.attrs = attrs
        self.contents = []
        self.setup(parent, previous)
        self.hidden = False

        # Set up any substitutions, such as the charset in a META tag.
        if builder is not None:
            builder.set_up_substitutions(self)
            self.can_be_empty_element = builder.can_be_empty_element(name)
        else:
            self.can_be_empty_element = False

    parserClass = _alias("parser_class")  # BS3

    def __copy__(self):
        """A copy of a Tag is a new Tag, unconnected to the parse tree.
        Its contents are a copy of the old Tag's contents.
        """
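        # Tag.__init__ does not store the builder, and the namespace prefix
        # is kept in self.prefix. Reading self.builder or self.nsprefix would
        # fall through to __getattr__ and search for child tags instead.
        clone = type(self)(None, None, self.name, self.namespace,
                           self.prefix, self.attrs, is_xml=self._is_xml)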
        for attr in ('can_be_empty_element', 'hidden'):
            setattr(clone, attr, getattr(self, attr))
        for child in self.contents:
            clone.append(child.__copy__())
        return clone

    @property
    def is_empty_element(self):
        """Is this tag an empty-element tag? (aka a self-closing tag)

        A tag that has contents is never an empty-element tag.

        A tag that has no contents may or may not be an empty-element
        tag. It depends on the builder used to create the tag. If the
        builder has a designated list of empty-element tags, then only
        a tag whose name shows up in that list is considered an
        empty-element tag.

        If the builder has no designated list of empty-element tags,
        then any tag with no contents is an empty-element tag.
        """
        return len(self.contents) == 0 and self.can_be_empty_element
    isSelfClosing = is_empty_element  # BS3

    @property
    def string(self):
        """Convenience property to get the single string within this tag.

        :Return: If this tag has a single string child, return value
         is that string. If this tag has no children, or more than one
         child, return value is None. If this tag has one child tag,
         return value is the 'string' attribute of the child tag,
         recursively.
        """
        if len(self.contents) != 1:
            return None
        child = self.contents[0]
        if isinstance(child, NavigableString):
            return child
        return child.string

    @string.setter
    def string(self, string):
        self.clear()
        self.append(string.__class__(string))

    def _all_strings(self, strip=False, types=(NavigableString, CData)):
        """Yield all strings of certain classes, possibly stripping them.

        By default, yields only NavigableString and CData objects. So
        no comments, processing instructions, etc.
        """
        for descendant in self.descendants:
            if (
                (types is None and not isinstance(descendant, NavigableString))
                or
                (types is not None and type(descendant) not in types)):
                continue
            if strip:
                descendant = descendant.strip()
                if len(descendant) == 0:
                    continue
            yield descendant

    strings = property(_all_strings)

    @property
    def stripped_strings(self):
        for string in self._all_strings(True):
            yield string

    def get_text(self, separator=u"", strip=False,
                 types=(NavigableString, CData)):
        """
        Get all child strings, concatenated using the given separator.
        """
        return separator.join([s for s in self._all_strings(
                    strip, types=types)])
    getText = get_text
    text = property(get_text)

    def decompose(self):
        """Recursively destroys the contents of this tree."""
        self.extract()
        i = self
        while i is not None:
            next_element = i.next_element
            i.__dict__.clear()
            i.contents = []
            i = next_element

    def clear(self, decompose=False):
        """
        Extract all children. If decompose is True, decompose instead.
        """
        if decompose:
            for element in self.contents[:]:
                if isinstance(element, Tag):
                    element.decompose()
                else:
                    element.extract()
        else:
            for element in self.contents[:]:
                element.extract()

    def index(self, element):
        """
        Find the index of a child by identity, not value. Avoids issues with
        tag.contents.index(element) getting the index of equal elements.
        """
        for i, child in enumerate(self.contents):
            if child is element:
                return i
        raise ValueError("Tag.index: element not in tag")

    def get(self, key, default=None):
        """Returns the value of the 'key' attribute for the tag, or
        the value given for 'default' if it doesn't have that
        attribute."""
        return self.attrs.get(key, default)

    def has_attr(self, key):
        return key in self.attrs

    def __hash__(self):
        return str(self).__hash__()

    def __getitem__(self, key):
        """tag[key] returns the value of the 'key' attribute for the tag,
        and throws an exception if it's not there."""
        return self.attrs[key]

    def __iter__(self):
        "Iterating over a tag iterates over its contents."
        return iter(self.contents)

    def __len__(self):
        "The length of a tag is the length of its list of contents."
        return len(self.contents)

    def __contains__(self, x):
        return x in self.contents

    def __nonzero__(self):
        "A tag is non-None even if it has no contents."
        return True

    def __setitem__(self, key, value):
        """Setting tag[key] sets the value of the 'key' attribute for the
        tag."""
        self.attrs[key] = value

    def __delitem__(self, key):
        "Deleting tag[key] deletes all 'key' attributes for the tag."
        self.attrs.pop(key, None)

    def __call__(self, *args, **kwargs):
        """Calling a tag like a function is the same as calling its
        find_all() method. E.g. tag('a') returns a list of all the A tags
        found within this tag."""
        return self.find_all(*args, **kwargs)

    def __getattr__(self, tag):
        #print "Getattr %s.%s" % (self.__class__, tag)
        if len(tag) > 3 and tag.endswith('Tag'):
            # BS3: soup.aTag -> soup.find("a")
            tag_name = tag[:-3]
            warnings.warn(
                '.%sTag is deprecated, use .find("%s") instead.' % (
                    tag_name, tag_name))
            return self.find(tag_name)
        # We special case contents to avoid recursion.
        elif not tag.startswith("__") and not tag == "contents":
            return self.find(tag)
        raise AttributeError(
            "'%s' object has no attribute '%s'" % (self.__class__, tag))

    def __eq__(self, other):
        """Returns true iff this tag has the same name, the same attributes,
        and the same contents (recursively) as the given tag."""
        if self is other:
            return True
        if (not hasattr(other, 'name') or
            not hasattr(other, 'attrs') or
            not hasattr(other, 'contents') or
            self.name != other.name or
            self.attrs != other.attrs or
            len(self) != len(other)):
            return False
        for i, my_child in enumerate(self.contents):
            if my_child != other.contents[i]:
                return False
        return True

    def __ne__(self, other):
        """Returns true iff this tag is not identical to the other tag,
        as defined in __eq__."""
        return not self == other

    def __repr__(self, encoding="unicode-escape"):
        """Renders this tag as a string."""
        if PY3K:
            # "The return value must be a string object", i.e. Unicode
            return self.decode()
        else:
            # "The return value must be a string object", i.e. a bytestring.
            # By convention, the return value of __repr__ should also be
            # an ASCII string.
            return self.encode(encoding)

    def __unicode__(self):
        return self.decode()

    def __str__(self):
        if PY3K:
            return self.decode()
        else:
            return self.encode()

    if PY3K:
        __str__ = __repr__ = __unicode__

    def encode(self, encoding=DEFAULT_OUTPUT_ENCODING,
               indent_level=None, formatter="minimal",
               errors="xmlcharrefreplace"):
        # Turn the data structure into Unicode, then encode the
        # Unicode.
        u = self.decode(indent_level, encoding, formatter)
        return u.encode(encoding, errors)

    def _should_pretty_print(self, indent_level):
        """Should this tag be pretty-printed?"""

        return (
            indent_level is not None
            and self.name not in self.preserve_whitespace_tags
        )

    def decode(self, indent_level=None,
               eventual_encoding=DEFAULT_OUTPUT_ENCODING,
               formatter="minimal"):
        """Returns a Unicode representation of this tag and its contents.

        :param eventual_encoding: The tag is destined to be
           encoded into this encoding. This method is _not_
           responsible for performing that encoding. This information
           is passed in so that it can be substituted in if the
           document contains a <META> tag that mentions the document's
           encoding.
        """

        # First off, turn a string formatter into a function. This
        # will stop the lookup from happening over and over again.
        if not callable(formatter):
            formatter = self._formatter_for_name(formatter)

        attrs = []
        if self.attrs:
            for key, val in sorted(self.attrs.items()):
                if val is None:
                    decoded = key
                else:
                    if isinstance(val, list) or isinstance(val, tuple):
                        val = ' '.join(val)
                    elif not isinstance(val, basestring):
                        val = unicode(val)
                    elif (
                        isinstance(val, AttributeValueWithCharsetSubstitution)
                        and eventual_encoding is not None):
                        val = val.encode(eventual_encoding)

                    text = self.format_string(val, formatter)
                    decoded = (
                        unicode(key) + '='
                        + EntitySubstitution.quoted_attribute_value(text))
                attrs.append(decoded)
        close = ''
        closeTag = ''

        prefix = ''
        if self.prefix:
            prefix = self.prefix + ":"

        if self.is_empty_element:
            close = '/'
        else:
            closeTag = '</%s%s>' % (prefix, self.name)

        pretty_print = self._should_pretty_print(indent_level)
        space = ''
        indent_space = ''
        if indent_level is not None:
            indent_space = (' ' * (indent_level - 1))
        if pretty_print:
            space = indent_space
            indent_contents = indent_level + 1
        else:
            indent_contents = None
        contents = self.decode_contents(
            indent_contents, eventual_encoding, formatter)

        if self.hidden:
            # This is the 'document root' object.
            s = contents
        else:
            s = []
            attribute_string = ''
            if attrs:
                attribute_string = ' ' + ' '.join(attrs)
            if indent_level is not None:
                # Even if this particular tag is not pretty-printed,
                # we should indent up to the start of the tag.
                s.append(indent_space)
            s.append('<%s%s%s%s>' % (
                    prefix, self.name, attribute_string, close))
            if pretty_print:
                s.append("\n")
            s.append(contents)
            if pretty_print and contents and contents[-1] != "\n":
                s.append("\n")
            if pretty_print and closeTag:
                s.append(space)
            s.append(closeTag)
            if indent_level is not None and closeTag and self.next_sibling:
                # Even if this particular tag is not pretty-printed,
                # we're now done with the tag, and we should add a
                # newline if appropriate.
                s.append("\n")
            s = ''.join(s)
        return s

    def prettify(self, encoding=None, formatter="minimal"):
        if encoding is None:
            return self.decode(True, formatter=formatter)
        else:
            return self.encode(encoding, True, formatter=formatter)

    def decode_contents(self, indent_level=None,
                       eventual_encoding=DEFAULT_OUTPUT_ENCODING,
                       formatter="minimal"):
        """Renders the contents of this tag as a Unicode string.

        :param indent_level: Each line of the rendering will be
           indented this many spaces.

        :param eventual_encoding: The tag is destined to be
           encoded into this encoding. This method is _not_
           responsible for performing that encoding. This information
           is passed in so that it can be substituted in if the
           document contains a <META> tag that mentions the document's
           encoding.

        :param formatter: The output formatter responsible for converting
           entities to Unicode characters.
        """
        # First off, turn a string formatter into a function. This
        # will stop the lookup from happening over and over again.
        if not callable(formatter):
            formatter = self._formatter_for_name(formatter)

        pretty_print = (indent_level is not None)
        s = []
        for c in self:
            text = None
            if isinstance(c, NavigableString):
                text = c.output_ready(formatter)
            elif isinstance(c, Tag):
                s.append(c.decode(indent_level, eventual_encoding,
                                  formatter))
            if text and indent_level and not self.name == 'pre':
                text = text.strip()
            if text:
                if pretty_print and not self.name == 'pre':
                    s.append(" " * (indent_level - 1))
                s.append(text)
                if pretty_print and not self.name == 'pre':
                    s.append("\n")
        return ''.join(s)

    def encode_contents(
        self, indent_level=None, encoding=DEFAULT_OUTPUT_ENCODING,
        formatter="minimal"):
        """Renders the contents of this tag as a bytestring.

        :param indent_level: Each line of the rendering will be
           indented this many spaces.

        :param encoding: The bytestring will be in this encoding.

        :param formatter: The output formatter responsible for converting
           entities to Unicode characters.
        """

        contents = self.decode_contents(indent_level, encoding, formatter)
        return contents.encode(encoding)

    # Old method for BS3 compatibility
    def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
                       prettyPrint=False, indentLevel=0):
        if not prettyPrint:
            indentLevel = None
        return self.encode_contents(
            indent_level=indentLevel, encoding=encoding)

    #Soup methods

    def find(self, name=None, attrs={}, recursive=True, text=None,
             **kwargs):
        """Return only the first descendant of this Tag (or the first
        direct child, if recursive is False) matching the given criteria."""
        r = None
        l = self.find_all(name, attrs, recursive, text, 1, **kwargs)
        if l:
            r = l[0]
        return r
    findChild = find

    def find_all(self, name=None, attrs={}, recursive=True, text=None,
                 limit=None, **kwargs):
        """Extracts a list of Tag objects that match the given
        criteria.  You can specify the name of the Tag and any
        attributes you want the Tag to have.

        The value of a key-value pair in the 'attrs' map can be a
        string, a list of strings, a regular expression object, or a
        callable that takes a string and returns whether or not the
        string matches for some custom definition of 'matches'. The
        same is true of the tag name."""

        generator = self.descendants
        if not recursive:
            generator = self.children
        return self._find_all(name, attrs, text, limit, generator, **kwargs)
    findAll = find_all       # BS3
    findChildren = find_all  # BS2

    #Generator methods
    @property
    def children(self):
        # return iter() to make the purpose of the method clear
        return iter(self.contents)  # XXX This seems to be untested.

    @property
    def descendants(self):
        if not len(self.contents):
            return
        stopNode = self._last_descendant().next_element
        current = self.contents[0]
        while current is not stopNode:
            yield current
            current = current.next_element

    # CSS selector code

    _selector_combinators = ['>', '+', '~']
    _select_debug = False
    quoted_colon = re.compile('"[^"]*:[^"]*"')
    def select_one(self, selector):
        """Perform a CSS selection operation on the current element."""
        value = self.select(selector, limit=1)
        if value:
            return value[0]
        return None

    def select(self, selector, _candidate_generator=None, limit=None):
        """Perform a CSS selection operation on the current element."""

        # Handle grouping selectors if ',' exists, ie: p,a
        if ',' in selector:
            context = []
            for partial_selector in selector.split(','):
                partial_selector = partial_selector.strip()
                if partial_selector == '':
                    raise ValueError('Invalid group selection syntax: %s' % selector)
                candidates = self.select(partial_selector, limit=limit)
                for candidate in candidates:
                    if candidate not in context:
                        context.append(candidate)

                if limit and len(context) >= limit:
                    break
            return context
        tokens = shlex.split(selector)
        current_context = [self]

        if tokens[-1] in self._selector_combinators:
            raise ValueError(
                'Final combinator "%s" is missing an argument.' % tokens[-1])

        if self._select_debug:
            print 'Running CSS selector "%s"' % selector

        for index, token in enumerate(tokens):
            new_context = []
            new_context_ids = set([])

            if tokens[index-1] in self._selector_combinators:
                # This token was consumed by the previous combinator. Skip it.
                if self._select_debug:
                    print '  Token was consumed by the previous combinator.'
                continue

            if self._select_debug:
                print ' Considering token "%s"' % token
            recursive_candidate_generator = None
            tag_name = None

            # Each operation corresponds to a checker function, a rule
            # for determining whether a candidate matches the
            # selector. Candidates are generated by the active
            # iterator.
            checker = None

            m = self.attribselect_re.match(token)
            if m is not None:
                # Attribute selector
                tag_name, attribute, operator, value = m.groups()
                checker = self._attribute_checker(operator, attribute, value)

            elif '#' in token:
                # ID selector
                tag_name, tag_id = token.split('#', 1)
                def id_matches(tag):
                    return tag.get('id', None) == tag_id
                checker = id_matches

            elif '.' in token:
                # Class selector
                tag_name, klass = token.split('.', 1)
                classes = set(klass.split('.'))
                def classes_match(candidate):
                    return classes.issubset(candidate.get('class', []))
                checker = classes_match

            elif ':' in token and not self.quoted_colon.search(token):
                # Pseudo-class
                tag_name, pseudo = token.split(':', 1)
                if tag_name == '':
                    raise ValueError(
                        "A pseudo-class must be prefixed with a tag name.")
                pseudo_attributes = re.match(r'([a-zA-Z\d-]+)\(([a-zA-Z\d]+)\)', pseudo)
                found = []
                if pseudo_attributes is None:
                    pseudo_type = pseudo
                    pseudo_value = None
                else:
                    pseudo_type, pseudo_value = pseudo_attributes.groups()
                if pseudo_type == 'nth-of-type':
                    try:
                        pseudo_value = int(pseudo_value)
                    except (TypeError, ValueError):
                        raise NotImplementedError(
                            'Only numeric values are currently supported for the nth-of-type pseudo-class.')
                    if pseudo_value < 1:
                        raise ValueError(
                            'nth-of-type pseudo-class value must be at least 1.')
                    class Counter(object):
                        def __init__(self, destination):
                            self.count = 0
                            self.destination = destination

                        def nth_child_of_type(self, tag):
                            self.count += 1
                            if self.count == self.destination:
                                return True
                            else:
                                return False
                    checker = Counter(pseudo_value).nth_child_of_type
                else:
                    raise NotImplementedError(
                        'Only the following pseudo-classes are implemented: nth-of-type.')

            elif token == '*':
                # Star selector -- matches everything
                pass
            elif token == '>':
                # Run the next token as a CSS selector against the
                # direct children of each tag in the current context.
                recursive_candidate_generator = lambda tag: tag.children
            elif token == '~':
                # Run the next token as a CSS selector against the
                # siblings of each tag in the current context.
                recursive_candidate_generator = lambda tag: tag.next_siblings
            elif token == '+':
                # For each tag in the current context, run the next
                # token as a CSS selector against the tag's next
                # sibling that's a tag.
                def next_tag_sibling(tag):
                    yield tag.find_next_sibling(True)
                recursive_candidate_generator = next_tag_sibling

            elif self.tag_name_re.match(token):
                # Just a tag name.
                tag_name = token
            else:
                raise ValueError(
                    'Unsupported or invalid CSS selector: "%s"' % token)
            if recursive_candidate_generator:
                # This happens when the selector looks like  "> foo".
                #
                # The generator calls select() recursively on every
                # member of the current context, passing in a different
                # candidate generator and a different selector.
                #
                # In the case of "> foo", the candidate generator is
                # one that yields a tag's direct children (">"), and
                # the selector is "foo".
                next_token = tokens[index+1]
                def recursive_select(tag):
                    if self._select_debug:
                        print '    Calling select("%s") recursively on %s %s' % (next_token, tag.name, tag.attrs)
                        print '-' * 40
                    for i in tag.select(next_token, recursive_candidate_generator):
                        if self._select_debug:
                            print '(Recursive select picked up candidate %s %s)' % (i.name, i.attrs)
                        yield i
                    if self._select_debug:
                        print '-' * 40
                _use_candidate_generator = recursive_select
            elif _candidate_generator is None:
                # By default, a tag's candidates are all of its
                # children. If tag_name is defined, only yield tags
                # with that name.
                if self._select_debug:
                    if tag_name:
                        check = tag_name
                    else:
                        check = "[any]"
                    print '   Default candidate generator, tag name="%s"' % check
                if self._select_debug:
                    # This is redundant with later code, but it stops
                    # a bunch of bogus tags from cluttering up the
                    # debug log.
                    def default_candidate_generator(tag):
                        for child in tag.descendants:
                            if not isinstance(child, Tag):
                                continue
                            if tag_name and not child.name == tag_name:
                                continue
                            yield child
                    _use_candidate_generator = default_candidate_generator
                else:
                    _use_candidate_generator = lambda tag: tag.descendants
            else:
                _use_candidate_generator = _candidate_generator

            count = 0
            for tag in current_context:
                if self._select_debug:
                    print "    Running candidate generator on %s %s" % (
                        tag.name, repr(tag.attrs))
                for candidate in _use_candidate_generator(tag):
                    if not isinstance(candidate, Tag):
                        continue
                    if tag_name and candidate.name != tag_name:
                        continue
                    if checker is not None:
                        try:
                            result = checker(candidate)
                        except StopIteration:
                            # The checker has decided we should no longer
                            # run the generator.
                            break
                    if checker is None or result:
                        if self._select_debug:
                            print "     SUCCESS %s %s" % (candidate.name, repr(candidate.attrs))
                        if id(candidate) not in new_context_ids:
                            # If a tag matches a selector more than once,
                            # don't include it in the context more than once.
                            new_context.append(candidate)
                            new_context_ids.add(id(candidate))
                    elif self._select_debug:
                        print "     FAILURE %s %s" % (candidate.name, repr(candidate.attrs))

            current_context = new_context
        if limit and len(current_context) >= limit:
            current_context = current_context[:limit]

        if self._select_debug:
            print "Final verdict:"
            for i in current_context:
                print " %s %s" % (i.name, i.attrs)
        return current_context

    # Old names for backwards compatibility
    def childGenerator(self):
        return self.children

    def recursiveChildGenerator(self):
        return self.descendants

    def has_key(self, key):
        """Deprecated. has_key() checked attributes while the `in`
        operator (__contains__) checks contents, which was misleading.
        has_key() is gone in Python 3, anyway."""
        warnings.warn('has_key is deprecated. Use has_attr("%s") instead.' % (
                key))
        return self.has_attr(key)

# Next, a couple classes to represent queries and their results.
class SoupStrainer(object):
    """Encapsulates a number of ways of matching a markup element (tag or
    text)."""

    def __init__(self, name=None, attrs={}, text=None, **kwargs):
        self.name = self._normalize_search_value(name)
        if not isinstance(attrs, dict):
            # Treat a non-dict value for attrs as a search for the 'class'
            # attribute.
            kwargs['class'] = attrs
            attrs = None

        if 'class_' in kwargs:
            # Treat class_="foo" as a search for the 'class'
            # attribute, overriding any non-dict value for attrs.
            kwargs['class'] = kwargs['class_']
            del kwargs['class_']

        if kwargs:
            if attrs:
                attrs = attrs.copy()
                attrs.update(kwargs)
            else:
                attrs = kwargs
        normalized_attrs = {}
        for key, value in attrs.items():
            normalized_attrs[key] = self._normalize_search_value(value)

        self.attrs = normalized_attrs
        self.text = self._normalize_search_value(text)

    def _normalize_search_value(self, value):
        # Leave it alone if it's a Unicode string, a callable, a
        # regular expression, a boolean, or None.
        if (isinstance(value, unicode) or callable(value) or hasattr(value, 'match')
            or isinstance(value, bool) or value is None):
            return value

        # If it's a bytestring, convert it to Unicode, treating it as UTF-8.
        if isinstance(value, bytes):
            return value.decode("utf8")

        # If it's listlike, convert it into a list of strings.
        if hasattr(value, '__iter__'):
            new_value = []
            for v in value:
                if (hasattr(v, '__iter__') and not isinstance(v, bytes)
                    and not isinstance(v, unicode)):
                    # This is almost certainly the user's mistake. In the
                    # interests of avoiding infinite loops, we'll let
                    # it through as-is rather than doing a recursive call.
                    new_value.append(v)
                else:
                    new_value.append(self._normalize_search_value(v))
            return new_value

        # Otherwise, convert it into a Unicode string.
        # The unicode(str()) thing is so this will do the same thing on Python 2
        # and Python 3.
        return unicode(str(value))

    def __str__(self):
        if self.text:
            return self.text
        else:
            return "%s|%s" % (self.name, self.attrs)

    def search_tag(self, markup_name=None, markup_attrs={}):
        found = None
        markup = None
        if isinstance(markup_name, Tag):
            markup = markup_name
            markup_attrs = markup
        call_function_with_tag_data = (
            isinstance(self.name, collections.Callable)
            and not isinstance(markup_name, Tag))

        if ((not self.name)
            or call_function_with_tag_data
            or (markup and self._matches(markup, self.name))
            or (not markup and self._matches(markup_name, self.name))):
            if call_function_with_tag_data:
                match = self.name(markup_name, markup_attrs)
            else:
                match = True
                markup_attr_map = None
                for attr, match_against in list(self.attrs.items()):
                    if not markup_attr_map:
                        if hasattr(markup_attrs, 'get'):
                            markup_attr_map = markup_attrs
                        else:
                            markup_attr_map = {}
                            for k, v in markup_attrs:
                                markup_attr_map[k] = v
                    attr_value = markup_attr_map.get(attr)
                    if not self._matches(attr_value, match_against):
                        match = False
                        break
            if match:
                if markup:
                    found = markup
                else:
                    found = markup_name
        if found and self.text and not self._matches(found.string, self.text):
            found = None
        return found
    searchTag = search_tag

    def search(self, markup):
        # print 'looking for %s in %s' % (self, markup)
        found = None
        # If given a list of items, scan it for a text element that
        # matches.
        if hasattr(markup, '__iter__') and not isinstance(markup, (Tag, basestring)):
            for element in markup:
                if isinstance(element, NavigableString) \
                       and self.search(element):
                    found = element
                    break
        # If it's a Tag, make sure its name or attributes match.
        # Don't bother with Tags if we're searching for text.
        elif isinstance(markup, Tag):
            if not self.text or self.name or self.attrs:
                found = self.search_tag(markup)
        # If it's text, make sure the text matches.
        elif isinstance(markup, NavigableString) or \
                 isinstance(markup, basestring):
            if not self.name and not self.attrs and self._matches(markup, self.text):
                found = markup
        else:
            raise Exception(
                "I don't know how to match against a %s" % markup.__class__)
        return found

    def _matches(self, markup, match_against):
        # print u"Matching %s against %s" % (markup, match_against)
        result = False
        if isinstance(markup, list) or isinstance(markup, tuple):
            # This should only happen when searching a multi-valued attribute
            # like 'class'.
            for item in markup:
                if self._matches(item, match_against):
                    return True
            # We didn't match any particular value of the multivalue
            # attribute, but maybe we match the attribute value when
            # considered as a string.
            if self._matches(' '.join(markup), match_against):
                return True
            return False

        if match_against is True:
            # True matches any non-None value.
            return markup is not None

        if isinstance(match_against, collections.Callable):
            return match_against(markup)

        # Custom callables take the tag as an argument, but all
        # other ways of matching match the tag name as a string.
        if isinstance(markup, Tag):
            markup = markup.name

        # Ensure that `markup` is either a Unicode string, or None.
        markup = self._normalize_search_value(markup)

        if markup is None:
            # None matches None, False, an empty string, an empty list, and so on.
            return not match_against

        if isinstance(match_against, unicode):
            # Exact string match
            return markup == match_against

        if hasattr(match_against, 'match'):
            # Regexp match
            return match_against.search(markup)

        if hasattr(match_against, '__iter__'):
            # The markup must be an exact match against something
            # in the iterable.
            return markup in match_against


class ResultSet(list):
    """A ResultSet is just a list that keeps track of the SoupStrainer
    that created it."""
    def __init__(self, source, result=()):
        super(ResultSet, self).__init__(result)
        self.source = source
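
The matching rules in `SoupStrainer._matches` above can be summarized in a standalone sketch. This is a simplified Python 3 approximation that needs no bs4 install: the `matches` helper is a hypothetical name, it skips the `Tag` handling and value normalization, and it is not the library's own implementation.

```python
import re


def matches(value, match_against):
    """Simplified sketch of the matching rules (hypothetical helper)."""
    # Multi-valued attributes such as 'class': match any single value,
    # or fall back to the space-joined string.
    if isinstance(value, (list, tuple)):
        if any(matches(item, match_against) for item in value):
            return True
        return matches(' '.join(value), match_against)
    # True matches any value that is present at all.
    if match_against is True:
        return value is not None
    # Callables decide for themselves.
    if callable(match_against):
        return bool(match_against(value))
    # A missing value only matches a falsy criterion (None, '', [] ...).
    if value is None:
        return not match_against
    # Plain strings must match exactly.
    if isinstance(match_against, str):
        return value == match_against
    # Compiled regular expressions are searched, not anchored.
    if hasattr(match_against, 'search'):
        return match_against.search(value) is not None
    # Any other iterable: the value must equal one of its members.
    return value in match_against


exact = matches('logo', 'logo')
multi_valued = matches(['spin', 'fade'], 'fade')
joined = matches(['spin', 'fade'], 'spin fade')
regex = matches('spinner', re.compile('^spin'))
missing = matches(None, True)
```

Checking the callable before the string keeps custom predicates in control even when the value is missing, which mirrors the order of the checks in the method above.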


================================================
FILE: example/parallax_svg_tools/run.py
================================================
from svg import * 

compile_svg('animation.svg', 'processed_animation.svg', 
{
	'process_layer_names': True,
	'namespace': 'example'
})

inline_svg('animation.html', 'output/animation.html')
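
With `process_layer_names` enabled and a `namespace` set, Illustrator layer names that start with `#` are turned into element attributes by `parse_svg`. The sketch below illustrates that transformation in isolation, in Python 3 and without bs4. The function name `layer_name_to_attributes` is hypothetical and the code is an approximation of the behaviour, not the module's own code.

```python
def layer_name_to_attributes(layer_name, namespace=None):
    """Turn a layer name like '#id=logo, class=spin' into an attribute dict.

    Hypothetical helper: a sketch of the layer-name convention only.
    """
    prefix = namespace + '-' if namespace else ''
    attributes = {}
    # Layer names that do not start with '#' carry no attributes.
    if not layer_name.startswith('#'):
        return attributes
    for pair in layer_name.replace('#', '').split(','):
        parts = pair.strip().split('=')
        if len(parts) < 2:
            continue  # ignore fragments without a key=value shape
        key, value = parts[0], parts[1]
        # Only ids and classes are namespaced, to avoid clashes
        # between several SVGs inlined into one page.
        if key in ('id', 'class'):
            value = prefix + value
        attributes[key] = value
    return attributes


attrs = layer_name_to_attributes('#id=logo, class=spin', 'example')
```

Here `attrs` holds `{'id': 'example-logo', 'class': 'example-spin'}`, while an ordinary layer name such as `Layer 1` yields an empty dict.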

================================================
FILE: example/parallax_svg_tools/svg/__init__.py
================================================
# Super simple Illustrator SVG processor for animations. Uses the BeautifulSoup python xml library. 

import os
import errno
from bs4 import BeautifulSoup

def create_file(path, mode):
	directory = os.path.dirname(path)
	if directory != '' and not os.path.exists(directory):
		try:
			os.makedirs(directory)
		except OSError as e:
			if e.errno != errno.EEXIST:
				raise
	
	file = open(path, mode)
	return file

def parse_svg(path, namespace, options):
	#print(path)
	file = open(path,'r')
	file_string = file.read().decode('utf8')
	file.close()

	if namespace is None:
		namespace = ''
	else:
		namespace = namespace + '-'

	# BeautifulSoup can't parse attributes with dashes so we replace them with underscores instead		
	file_string = file_string.replace('data-name', 'data_name')

	# Expand origin to data-svg-origin as it's a pain in the ass to type
	if 'expand_origin' in options and options['expand_origin'] == True:
		file_string = file_string.replace('origin=', 'data-svg-origin=')
	
	# Add namespaces to ids
	if namespace:
		file_string = file_string.replace('id="', 'id="' + namespace)
		file_string = file_string.replace('url(#', 'url(#' + namespace)

	svg = BeautifulSoup(file_string, 'html.parser')

	# Strip layer names from symbols and namespace the references in <use> elements
	symbol_elements = svg.select('symbol')
	for element in symbol_elements:
		del element['data_name']

	use_elements = svg.select('use')
	for element in use_elements:
		if namespace:
			href = element['xlink:href']
			element['xlink:href'] = href.replace('#', '#' + namespace)

		del element['id']


	# remove titles
	if 'title' in options and options['title'] == False:
		titles = svg.select('title')
		for t in titles: t.extract()


	foreign_tags_to_add = []
	if 'convert_svg_text_to_html' in options and options['convert_svg_text_to_html'] == True:
		text_elements = svg.select('[data_name="#TEXT"]')
		for element in text_elements:

			area = element.rect
			if not area: 
				print('WARNING: Text areas require a rectangle to be in the same group as the text element')
				continue

			# find() returns None when nothing matches; select('text')[0] would raise IndexError
			text_element = element.find('text')
			if not text_element:
				print('WARNING: No text element found in text area')
				continue

			x = area['x']
			y = area['y']
			width = area['width']
			height = area['height']

			text_content = text_element.getText()
			text_tag = BeautifulSoup(text_content, 'html.parser')
			
			data_name = None
			if area.has_attr('data_name'): data_name = area['data_name']
			#print(data_name)
						
			area.extract()
			text_element.extract()
			
			foreign_object_tag = svg.new_tag('foreignObject')
			foreign_object_tag['requiredFeatures'] = "http://www.w3.org/TR/SVG11/feature#Extensibility"
			foreign_object_tag['transform'] = 'translate(' + x + ' ' + y + ')'
			foreign_object_tag['width'] = width + 'px'
			foreign_object_tag['height'] = height + 'px'

			if 'dont_overflow_text_areas' in options and options['dont_overflow_text_areas'] == True:
				foreign_object_tag['style'] = 'overflow:hidden'

			if data_name:
				val = data_name
				if not val.startswith('#'): continue
				val = val.replace('#', '')
				
				attributes = str.split(str(val), ',')
				for a in attributes:
					split = str.split(a.strip(), '=')
					if (len(split) < 2): continue
					key = split[0]
					value = split[1]
					if key == 'id': value = namespace + value
					foreign_object_tag[key] = value
			
			foreign_object_tag.append(text_tag)

			# Modifying the tree affects searches, so we defer it until the end
			foreign_tags_to_add.append({'element':element, 'tag':foreign_object_tag})
			

	if options.get('process_layer_names', True) == True:
		elements_with_data_names = svg.select('[data_name]')
		for element in elements_with_data_names:

			# remove any existing id attribute as we'll be making our own
			if element.has_attr('id'): del element.attrs['id']
			
			val = element['data_name']
			#print(val)
			del element['data_name']

			if not val.startswith('#'): continue
			val = val.replace('#', '')
			
			attributes = str.split(str(val), ',')
			for a in attributes:
				split = str.split(a.strip(), '=')
				if (len(split) < 2): continue
				key = split[0]
				value = split[1]
				if key == 'id' or key == 'class': value = namespace + value
				element[key] = value
	
	
	if 'remove_text_attributes' in options and options['remove_text_attributes'] == True:
		#Remove attributes from text tags
		text_elements = svg.select('text')
		for element in text_elements:
			if element.has_attr('font-size'): del element.attrs['font-size']
			if element.has_attr('font-family'): del element.attrs['font-family']
			if element.has_attr('font-weight'): del element.attrs['font-weight']
			if element.has_attr('fill'): del element.attrs['fill']

	# Do tree modifications here
	if 'convert_svg_text_to_html' in options and options['convert_svg_text_to_html'] == True:
		for t in foreign_tags_to_add:
			t['element'].append(t['tag'])
	

	return svg


def write_svg(svg, dst_path, options):
	
	result = str(svg)
	result = unicode(result, "utf8")	
	# Collapse empty shape elements into self-closing tags
	result = result.replace('></circle>','/>') 
	result = result.replace('></rect>','/>') 
	result = result.replace('></path>','/>') 
	result = result.replace('></polygon>','/>')

	if 'nowhitespace' in options and options['nowhitespace'] == True:
		result = result.replace('\n','')
	#else:
	#	result = svg.prettify()

	# html.parser lowercases tag names, so bs4 outputs clippath instead of clipPath
	result = result.replace('clippath', 'clipPath')
	result = result.encode('UTF8')

	result_file = create_file(dst_path, 'wb')
	result_file.write(result)
	result_file.close()



def compile_svg(src_path, dst_path, options):
	namespace = None

	if 'namespace' in options: 
		namespace = options['namespace']
	svg = parse_svg(src_path, namespace, options)

	if 'attributes' in options: 
		attrs = options['attributes']
		for k in attrs:
			svg.svg[k] = attrs[k]

	if 'description' in options:
		current_desc = svg.select('description')
		if current_desc:
			current_desc[0].string = options['description']
		else:
			desc_tag = svg.new_tag('description')
			desc_tag.string = options['description']
			svg.svg.append(desc_tag)
		
	write_svg(svg, dst_path, options)



def compile_master_svg(src_path, dst_path, options):
	print('\n')
	print(src_path)
	file = open(src_path)
	svg = BeautifulSoup(file, 'html.parser')
	file.close()

	master_viewbox = svg.svg.attrs['viewbox']

	import_tags = svg.select('[path]')
	for tag in import_tags:

		component_path = str(tag['path'])
		
		namespace = None
		if tag.has_attr('namespace'): namespace = tag['namespace']

		component = parse_svg(component_path, namespace, options)

		component_viewbox = component.svg.attrs['viewbox']
		if master_viewbox != component_viewbox:
			print('WARNING: Master viewbox: [' + master_viewbox + '] does not match component viewbox [' + component_viewbox + ']')
	
		# Moves the contents of the component svg file into the master svg
		for child in component.svg: tag.contents.append(child)

		# Remove redundant path and namespace attributes from the import element
		del tag.attrs['path']
		if namespace: del tag.attrs['namespace']


	if 'attributes' in options: 
		attrs = options['attributes']
		for k in attrs:
			print(k + ' = ' + attrs[k])
			svg.svg[k] = attrs[k]


	if 'title' in options and options['title'] is not False:
		current_title = svg.select('title')
		if current_title:
			current_title[0].string = options['title']
		else:
			title_tag = svg.new_tag('title')
			title_tag.string = options['title']
			svg.svg.append(title_tag)


	if 'description' in options:
		current_desc = svg.select('description')
		if current_desc:
			current_desc[0].string = options['description']
		else:
			desc_tag = svg.new_tag('description')
			desc_tag.string = options['description']
			svg.svg.append(desc_tag)


	write_svg(svg, dst_path, options)


# Super dumb / simple function that inlines svgs into html source files

def parse_markup(src_path, output):
	print(src_path)
	file = open(src_path, 'r')
	for line in file:
		if line.startswith('//import'):
			path = line.split('//import ')[1].rstrip('\n').rstrip('\r')
			parse_markup(path, output)
		else:
			output.append(line)

	file.close()

def inline_svg(src_path, dst_path):
	output = []

	file = create_file(dst_path, 'w')
	parse_markup(src_path, output)
	for line in output: file.write(line)
	file.close()
	print('')	
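
The `//import` mechanism used by `parse_markup` and `inline_svg` above can be sketched on its own. This is a minimal Python 3 illustration under stated assumptions: the helper name `inline_imports` and the temporary file names are hypothetical, and the sketch requires a space after `//import`, which is slightly stricter than the prefix check in the original.

```python
import os
import tempfile


def inline_imports(src_path, output):
    """Append the lines of src_path to output, expanding '//import <path>' lines.

    Hypothetical helper. Imported paths are opened as given, so relative
    paths resolve against the current working directory, not the including file.
    """
    with open(src_path, 'r') as source:
        for line in source:
            if line.startswith('//import '):
                # Recurse so that imported files may contain imports too.
                inline_imports(line[len('//import '):].strip(), output)
            else:
                output.append(line)
    return output


# Usage: an HTML file that pulls in one processed SVG.
with tempfile.TemporaryDirectory() as workdir:
    svg_path = os.path.join(workdir, 'logo.svg')
    html_path = os.path.join(workdir, 'page.html')
    with open(svg_path, 'w') as f:
        f.write('<svg></svg>\n')
    with open(html_path, 'w') as f:
        f.write('<body>\n//import ' + svg_path + '\n</body>\n')
    inlined = ''.join(inline_imports(html_path, []))
```

After the run, `inlined` contains the HTML with the `//import` line replaced by the contents of the SVG file. There is no guard against files that import each other, so a circular import would recurse until Python's recursion limit is hit.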

================================================
FILE: parallax_svg_tools/bs4/__init__.py
================================================
"""Beautiful Soup
Elixir and Tonic
"The Screen-Scraper's Friend"
http://www.crummy.com/software/BeautifulSoup/

Beautiful Soup uses a pluggable XML or HTML parser to parse a
(possibly invalid) document into a tree representation. Beautiful Soup
provides methods and Pythonic idioms that make it easy to navigate,
search, and modify the parse tree.

Beautiful Soup works with Python 2.7 and up. It works better if lxml
and/or html5lib is installed.

For more than you ever wanted to know about Beautiful Soup, see the
documentation:
http://www.crummy.com/software/BeautifulSoup/bs4/doc/

"""

# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

__author__ = "Leonard Richardson (leonardr@segfault.org)"
__version__ = "4.5.1"
__copyright__ = "Copyright (c) 2004-2016 Leonard Richardson"
__license__ = "MIT"

__all__ = ['BeautifulSoup']

import os
import re
import traceback
import warnings

from .builder import builder_registry, ParserRejectedMarkup
from .dammit import UnicodeDammit
from .element import (
    CData,
    Comment,
    DEFAULT_OUTPUT_ENCODING,
    Declaration,
    Doctype,
    NavigableString,
    PageElement,
    ProcessingInstruction,
    ResultSet,
    SoupStrainer,
    Tag,
    )

# The very first thing we do is give a useful error if someone is
# running this code under Python 3 without converting it.
'You are trying to run the Python 2 version of Beautiful Soup under Python 3. This will not work.'<>'You need to convert the code, either by installing it (`python setup.py install`) or by running 2to3 (`2to3 -w bs4`).'

class BeautifulSoup(Tag):
    """
    This class defines the basic interface called by the tree builders.

    These methods will be called by the parser:
      reset()
      feed(markup)

    The tree builder may call these methods from its feed() implementation:
      handle_starttag(name, attrs) # See note about return value
      handle_endtag(name)
      handle_data(data) # Appends to the current data node
      endData(containerClass=NavigableString) # Ends the current data node

    No matter how complicated the underlying parser is, you should be
    able to build a tree using 'start tag' events, 'end tag' events,
    'data' events, and "done with data" events.

    If you encounter an empty-element tag (aka a self-closing tag,
    like HTML's <br> tag), call handle_starttag and then
    handle_endtag.
    """
    ROOT_TAG_NAME = u'[document]'

    # If the end-user gives no indication which tree builder they
    # want, look for one with these features.
    DEFAULT_BUILDER_FEATURES = ['html', 'fast']

    ASCII_SPACES = '\x20\x0a\x09\x0c\x0d'

    NO_PARSER_SPECIFIED_WARNING = "No parser was explicitly specified, so I'm using the best available %(markup_type)s parser for this system (\"%(parser)s\"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.\n\nThe code that caused this warning is on line %(line_number)s of the file %(filename)s. To get rid of this warning, change code that looks like this:\n\n BeautifulSoup([your markup])\n\nto this:\n\n BeautifulSoup([your markup], \"%(parser)s\")\n"

    def __init__(self, markup="", features=None, builder=None,
                 parse_only=None, from_encoding=None, exclude_encodings=None,
                 **kwargs):
        """The Soup object is initialized as the 'root tag', and the
        provided markup (which can be a string or a file-like object)
        is fed into the underlying parser."""

        if 'convertEntities' in kwargs:
            del kwargs['convertEntities']
            warnings.warn(
                "BS4 does not respect the convertEntities argument to the "
                "BeautifulSoup constructor. Entities are always converted "
                "to Unicode characters.")

        if 'markupMassage' in kwargs:
            del kwargs['markupMassage']
            warnings.warn(
                "BS4 does not respect the markupMassage argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for any necessary markup massage.")

        if 'smartQuotesTo' in kwargs:
            del kwargs['smartQuotesTo']
            warnings.warn(
                "BS4 does not respect the smartQuotesTo argument to the "
                "BeautifulSoup constructor. Smart quotes are always converted "
                "to Unicode characters.")

        if 'selfClosingTags' in kwargs:
            del kwargs['selfClosingTags']
            warnings.warn(
                "BS4 does not respect the selfClosingTags argument to the "
                "BeautifulSoup constructor. The tree builder is responsible "
                "for understanding self-closing tags.")

        if 'isHTML' in kwargs:
            del kwargs['isHTML']
            warnings.warn(
                "BS4 does not respect the isHTML argument to the "
                "BeautifulSoup constructor. Suggest you use "
                "features='lxml' for HTML and features='lxml-xml' for "
                "XML.")

        def deprecated_argument(old_name, new_name):
            if old_name in kwargs:
                warnings.warn(
                    'The "%s" argument to the BeautifulSoup constructor '
                    'has been renamed to "%s."' % (old_name, new_name))
                value = kwargs[old_name]
                del kwargs[old_name]
                return value
            return None

        parse_only = parse_only or deprecated_argument(
            "parseOnlyThese", "parse_only")

        from_encoding = from_encoding or deprecated_argument(
            "fromEncoding", "from_encoding")

        if from_encoding and isinstance(markup, unicode):
            warnings.warn("You provided Unicode markup but also provided a value for from_encoding. Your from_encoding will be ignored.")
            from_encoding = None

        if len(kwargs) > 0:
            arg = kwargs.keys().pop()
            raise TypeError(
                "__init__() got an unexpected keyword argument '%s'" % arg)

        if builder is None:
            original_features = features
            if isinstance(features, basestring):
                features = [features]
            if features is None or len(features) == 0:
                features = self.DEFAULT_BUILDER_FEATURES
            builder_class = builder_registry.lookup(*features)
            if builder_class is None:
                raise FeatureNotFound(
                    "Couldn't find a tree builder with the features you "
                    "requested: %s. Do you need to install a parser library?"
                    % ",".join(features))
            builder = builder_class()
            if not (original_features == builder.NAME or
                    original_features in builder.ALTERNATE_NAMES):
                if builder.is_xml:
                    markup_type = "XML"
                else:
                    markup_type = "HTML"

                caller = traceback.extract_stack()[0]
                filename = caller[0]
                line_number = caller[1]
                warnings.warn(self.NO_PARSER_SPECIFIED_WARNING % dict(
                    filename=filename,
                    line_number=line_number,
                    parser=builder.NAME,
                    markup_type=markup_type))

        self.builder = builder
        self.is_xml = builder.is_xml
        self.known_xml = self.is_xml
        self.builder.soup = self

        self.parse_only = parse_only

        if hasattr(markup, 'read'):        # It's a file-type object.
            markup = markup.read()
        elif len(markup) <= 256 and (
                (isinstance(markup, bytes) and not b'<' in markup)
                or (isinstance(markup, unicode) and not u'<' in markup)
        ):
            # Print out warnings for a couple beginner problems
            # involving passing non-markup to Beautiful Soup.
            # Beautiful Soup will still parse the input as markup,
            # just in case that's what the user really wants.
            if (isinstance(markup, unicode)
                and not os.path.supports_unicode_filenames):
                possible_filename = markup.encode("utf8")
            else:
                possible_filename = markup
            is_file = False
            try:
                is_file = os.path.exists(possible_filename)
            except Exception:
                # This is almost certainly a problem involving
                # characters not valid in filenames on this
                # system. Just let it go.
                pass
            if is_file:
                if isinstance(markup, unicode):
                    markup = markup.encode("utf8")
                warnings.warn(
                    '"%s" looks like a filename, not markup. You should '
                    'probably open this file and pass the filehandle into '
                    'Beautiful Soup.' % markup)
            self._check_markup_is_url(markup)

        for (self.markup, self.original_encoding, self.declared_html_encoding,
         self.contains_replacement_characters) in (
             self.builder.prepare_markup(
                 markup, from_encoding, exclude_encodings=exclude_encodings)):
            self.reset()
            try:
                self._feed()
                break
            except ParserRejectedMarkup:
                pass

        # Clear out the markup and remove the builder's circular
        # reference to this object.
        self.markup = None
        self.builder.soup = None

    def __copy__(self):
        copy = type(self)(
            self.encode('utf-8'), builder=self.builder, from_encoding='utf-8'
        )

        # Although we encoded the tree to UTF-8, that may not have
        # been the encoding of the original markup. Set the copy's
        # .original_encoding to reflect the original object's
        # .original_encoding.
        copy.original_encoding = self.original_encoding
        return copy

    def __getstate__(self):
        # Frequently a tree builder can't be pickled.
        d = dict(self.__dict__)
        if 'builder' in d and not self.builder.picklable:
            d['builder'] = None
        return d

    @staticmethod
    def _check_markup_is_url(markup):
        """
        Check if markup looks like it's actually a URL and issue a warning
        if so. Markup can be unicode or str (py2) / bytes (py3).
        """
        if isinstance(markup, bytes):
            space = b' '
            cant_start_with = (b"http:", b"https:")
        elif isinstance(markup, unicode):
            space = u' '
            cant_start_with = (u"http:", u"https:")
        else:
            return

        if any(markup.startswith(prefix) for prefix in cant_start_with):
            if space not in markup:
                if isinstance(markup, bytes):
                    decoded_markup = markup.decode('utf-8', 'replace')
                else:
                    decoded_markup = markup
                warnings.warn(
                    '"%s" looks like a URL. Beautiful Soup is not an'
                    ' HTTP client. You should probably use an HTTP client like'
                    ' requests to get the document behind the URL, and feed'
                    ' that document to Beautiful Soup.' % decoded_markup
                )

    def _feed(self):
        # Convert the document to Unicode.
        self.builder.reset()

        self.builder.feed(self.markup)
        # Close out any unfinished strings and close all the open tags.
        self.endData()
        while self.currentTag.name != self.ROOT_TAG_NAME:
            self.popTag()

    def reset(self):
        Tag.__init__(self, self, self.builder, self.ROOT_TAG_NAME)
        self.hidden = 1
        self.builder.reset()
        self.current_data = []
        self.currentTag = None
        self.tagStack = []
        self.preserve_whitespace_tag_stack = []
        self.pushTag(self)

    def new_tag(self, name, namespace=None, nsprefix=None, **attrs):
        """Create a new tag associated with this soup."""
        return Tag(None, self.builder, name, namespace, nsprefix, attrs)

    def new_string(self, s, subclass=NavigableString):
        """Create a new NavigableString associated with this soup."""
        return subclass(s)

    def insert_before(self, successor):
        raise NotImplementedError("BeautifulSoup objects don't support insert_before().")

    def insert_after(self, successor):
        raise NotImplementedError("BeautifulSoup objects don't support insert_after().")

    def popTag(self):
        tag = self.tagStack.pop()
        if self.preserve_whitespace_tag_stack and tag == self.preserve_whitespace_tag_stack[-1]:
            self.preserve_whitespace_tag_stack.pop()
        #print "Pop", tag.name
        if self.tagStack:
            self.currentTag = self.tagStack[-1]
        return self.currentTag

    def pushTag(self, tag):
        #print "Push", tag.name
        if self.currentTag:
            self.currentTag.contents.append(tag)
        self.tagStack.append(tag)
        self.currentTag = self.tagStack[-1]
        if tag.name in self.builder.preserve_whitespace_tags:
            self.preserve_whitespace_tag_stack.append(tag)

    def endData(self, containerClass=NavigableString):
        if self.current_data:
            current_data = u''.join(self.current_data)
            # If whitespace is not preserved, and this string contains
            # nothing but ASCII spaces, replace it with a single space
            # or newline.
            if not self.preserve_whitespace_tag_stack:
                strippable = True
                for i in current_data:
                    if i not in self.ASCII_SPACES:
                        strippable = False
                        break
                if strippable:
                    if '\n' in current_data:
                        current_data = '\n'
                    else:
                        current_data = ' '

            # Reset the data collector.
            self.current_data = []

            # Should we add this string to the tree at all?
            if self.parse_only and len(self.tagStack) <= 1 and \
                   (not self.parse_only.text or \
                    not self.parse_only.search(current_data)):
                return

            o = containerClass(current_data)
            self.object_was_parsed(o)

    def object_was_parsed(self, o, parent=None, most_recent_element=None):
        """Add an object to the pa

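The `endData` method above joins the buffered text chunks and, unless a whitespace-preserving tag is open, collapses a whitespace-only run to a single character. The following is a minimal standalone sketch of that rule in Python 3, not the library's own code. `collapse_whitespace` is a hypothetical name, and the value of `ASCII_SPACES` is an assumption, since the constant's definition is not shown in this excerpt.

```python
# Assumption: stands in for the ASCII_SPACES constant referenced by endData
# (space, newline, tab, form feed, carriage return).
ASCII_SPACES = "\x20\x0a\x09\x0c\x0d"


def collapse_whitespace(chunks, preserve_whitespace=False):
    """Join buffered text chunks and collapse an all-whitespace result.

    Hypothetical helper that mirrors the rule in endData: text containing
    any non-space character is returned unchanged.
    """
    text = "".join(chunks)
    if not preserve_whitespace and all(ch in ASCII_SPACES for ch in text):
        # A whitespace-only run keeps a single character: a newline if the
        # run contained one, otherwise a single space.
        return "\n" if "\n" in text else " "
    return text


print(repr(collapse_whitespace(["  ", "\n", "\t"])))
print(repr(collapse_whitespace([" a ", " b"])))
```

Only strings made up entirely of whitespace are collapsed. Leading and trailing spaces around real text are left alone, which matches the loop over `current_data` in the original method.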
SYMBOL INDEX (614 symbols across 18 files)

FILE: example/parallax_svg_tools/bs4/__init__.py
  class BeautifulSoup (line 55) | class BeautifulSoup(Tag):
    method __init__ (line 87) | def __init__(self, markup="", features=None, builder=None,
    method __copy__ (line 238) | def __copy__(self):
    method __getstate__ (line 250) | def __getstate__(self):
    method _check_markup_is_url (line 258) | def _check_markup_is_url(markup):
    method _feed (line 285) | def _feed(self):
    method reset (line 295) | def reset(self):
    method new_tag (line 305) | def new_tag(self, name, namespace=None, nsprefix=None, **attrs):
    method new_string (line 309) | def new_string(self, s, subclass=NavigableString):
    method insert_before (line 313) | def insert_before(self, successor):
    method insert_after (line 316) | def insert_after(self, successor):
    method popTag (line 319) | def popTag(self):
    method pushTag (line 328) | def pushTag(self, tag):
    method endData (line 337) | def endData(self, containerClass=NavigableString):
    method object_was_parsed (line 367) | def object_was_parsed(self, o, parent=None, most_recent_element=None):
    method _popToTag (line 424) | def _popToTag(self, name, nsprefix=None, inclusivePop=True):
    method handle_starttag (line 447) | def handle_starttag(self, name, namespace, nsprefix, attrs):
    method handle_endtag (line 474) | def handle_endtag(self, name, nsprefix=None):
    method handle_data (line 479) | def handle_data(self, data):
    method decode (line 482) | def decode(self, pretty_print=False,
  class BeautifulStoneSoup (line 507) | class BeautifulStoneSoup(BeautifulSoup):
    method __init__ (line 510) | def __init__(self, *args, **kwargs):
  class StopParsing (line 518) | class StopParsing(Exception):
  class FeatureNotFound (line 521) | class FeatureNotFound(ValueError):
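
The `__init__` method listed above accepts renamed keyword arguments (for example `parseOnlyThese` for `parse_only`) by popping the old name out of `kwargs`, warning, and then rejecting anything left over with a `TypeError`. The sketch below shows that pattern on its own in Python 3. `pop_deprecated` and `make_parser` are hypothetical names used for illustration and are not part of the library.

```python
import warnings


def pop_deprecated(kwargs, old_name, new_name):
    """Remove old_name from kwargs, warn about the rename, return its value."""
    if old_name not in kwargs:
        return None
    warnings.warn(
        'The "%s" argument has been renamed to "%s".' % (old_name, new_name))
    return kwargs.pop(old_name)


def make_parser(parse_only=None, **kwargs):
    """Hypothetical stand-in for a constructor with one renamed argument."""
    # The old spelling is consulted only when the new one was not supplied.
    parse_only = parse_only or pop_deprecated(
        kwargs, "parseOnlyThese", "parse_only")

    # Anything still left in kwargs was never a supported argument.
    if kwargs:
        arg = next(iter(kwargs))
        raise TypeError(
            "make_parser() got an unexpected keyword argument '%s'" % arg)
    return parse_only


print(make_parser(parseOnlyThese="strainer"))
```

Because the leftover check runs after every known legacy name has been removed, a legacy argument that is only warned about but never deleted from `kwargs` would still trigger the `TypeError`.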

FILE: example/parallax_svg_tools/bs4/builder/__init__.py
  class TreeBuilderRegistry (line 30) | class TreeBuilderRegistry(object):
    method __init__ (line 32) | def __init__(self):
    method register (line 36) | def register(self, treebuilder_class):
    method lookup (line 42) | def lookup(self, *features):
  class TreeBuilder (line 84) | class TreeBuilder(object):
    method __init__ (line 102) | def __init__(self):
    method reset (line 105) | def reset(self):
    method can_be_empty_element (line 108) | def can_be_empty_element(self, tag_name):
    method feed (line 129) | def feed(self, markup):
    method prepare_markup (line 132) | def prepare_markup(self, markup, user_specified_encoding=None,
    method test_fragment_to_document (line 136) | def test_fragment_to_document(self, fragment):
    method set_up_substitutions (line 149) | def set_up_substitutions(self, tag):
    method _replace_cdata_list_attribute_values (line 152) | def _replace_cdata_list_attribute_values(self, tag_name, attrs):
  class SAXTreeBuilder (line 182) | class SAXTreeBuilder(TreeBuilder):
    method feed (line 185) | def feed(self, markup):
    method close (line 188) | def close(self):
    method startElement (line 191) | def startElement(self, name, attrs):
    method endElement (line 196) | def endElement(self, name):
    method startElementNS (line 200) | def startElementNS(self, nsTuple, nodeName, attrs):
    method endElementNS (line 204) | def endElementNS(self, nsTuple, nodeName):
    method startPrefixMapping (line 209) | def startPrefixMapping(self, prefix, nodeValue):
    method endPrefixMapping (line 213) | def endPrefixMapping(self, prefix):
    method characters (line 218) | def characters(self, content):
    method startDocument (line 221) | def startDocument(self):
    method endDocument (line 224) | def endDocument(self):
  class HTMLTreeBuilder (line 228) | class HTMLTreeBuilder(TreeBuilder):
    method set_up_substitutions (line 262) | def set_up_substitutions(self, tag):
  function register_treebuilders_from (line 295) | def register_treebuilders_from(module):
  class ParserRejectedMarkup (line 308) | class ParserRejectedMarkup(Exception):
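
The constructor in `bs4/__init__.py` calls `builder_registry.lookup(*features)` and raises `FeatureNotFound` when the result is `None`. As a rough illustration of what a feature-based lookup can look like, here is a small Python 3 sketch. It is an assumption-laden model, not the code of `TreeBuilderRegistry`: `SketchRegistry`, `HtmlBuilder`, `XmlBuilder`, and the feature strings are all hypothetical.

```python
class SketchRegistry:
    """Hypothetical registry: newest registration wins among matches."""

    def __init__(self):
        self.builders = []  # most recently registered first

    def register(self, builder_class):
        self.builders.insert(0, builder_class)

    def lookup(self, *features):
        # Return the first builder that advertises every requested feature,
        # or None so the caller can raise its own "not found" error.
        for builder_class in self.builders:
            if all(f in builder_class.features for f in features):
                return builder_class
        return None


class HtmlBuilder:
    NAME = "html.parser"
    features = ["html.parser", "html"]


class XmlBuilder:
    NAME = "xml-sketch"
    features = ["xml-sketch", "xml", "fast"]


registry = SketchRegistry()
registry.register(HtmlBuilder)
registry.register(XmlBuilder)

print(registry.lookup("html").NAME)
print(registry.lookup("html", "fast"))
```

Returning `None` instead of raising keeps the registry free of policy. The caller decides how to report a missing parser, as the constructor does with its "Do you need to install a parser library?" message.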

FILE: example/parallax_svg_tools/bs4/builder/_html5lib.py
  class HTML5TreeBuilder (line 37) | class HTML5TreeBuilder(HTMLTreeBuilder):
    method prepare_markup (line 44) | def prepare_markup(self, markup, user_specified_encoding,
    method feed (line 57) | def feed(self, markup):
    method create_treebuilder (line 84) | def create_treebuilder(self, namespaceHTMLElements):
    method test_fragment_to_document (line 89) | def test_fragment_to_document(self, fragment):
  class TreeBuilderForHtml5lib (line 94) | class TreeBuilderForHtml5lib(treebuilder_base.TreeBuilder):
    method __init__ (line 96) | def __init__(self, soup, namespaceHTMLElements):
    method documentClass (line 100) | def documentClass(self):
    method insertDoctype (line 104) | def insertDoctype(self, token):
    method elementClass (line 112) | def elementClass(self, name, namespace):
    method commentClass (line 116) | def commentClass(self, data):
    method fragmentClass (line 119) | def fragmentClass(self):
    method appendChild (line 124) | def appendChild(self, node):
    method getDocument (line 128) | def getDocument(self):
    method getFragment (line 131) | def getFragment(self):
  class AttrList (line 134) | class AttrList(object):
    method __init__ (line 135) | def __init__(self, element):
    method __iter__ (line 138) | def __iter__(self):
    method __setitem__ (line 140) | def __setitem__(self, name, value):
    method items (line 152) | def items(self):
    method keys (line 154) | def keys(self):
    method __len__ (line 156) | def __len__(self):
    method __getitem__ (line 158) | def __getitem__(self, name):
    method __contains__ (line 160) | def __contains__(self, name):
  class Element (line 164) | class Element(treebuilder_base.Node):
    method __init__ (line 165) | def __init__(self, element, soup, namespace):
    method appendChild (line 171) | def appendChild(self, node):
    method getAttributes (line 223) | def getAttributes(self):
    method setAttributes (line 226) | def setAttributes(self, attributes):
    method insertText (line 250) | def insertText(self, data, insertBefore=None):
    method insertBefore (line 257) | def insertBefore(self, node, refNode):
    method removeChild (line 269) | def removeChild(self, node):
    method reparentChildren (line 272) | def reparentChildren(self, new_parent):
    method cloneNode (line 331) | def cloneNode(self):
    method hasContent (line 338) | def hasContent(self):
    method getNameTuple (line 341) | def getNameTuple(self):
  class TextNode (line 349) | class TextNode(Element):
    method __init__ (line 350) | def __init__(self, element, soup):
    method cloneNode (line 355) | def cloneNode(self):

FILE: example/parallax_svg_tools/bs4/builder/_htmlparser.py
  class HTMLParseError (line 17) | class HTMLParseError(Exception):
  class BeautifulSoupHTMLParser (line 54) | class BeautifulSoupHTMLParser(HTMLParser):
    method handle_starttag (line 55) | def handle_starttag(self, name, attrs):
    method handle_endtag (line 67) | def handle_endtag(self, name):
    method handle_data (line 70) | def handle_data(self, data):
    method handle_charref (line 73) | def handle_charref(self, name):
    method handle_entityref (line 91) | def handle_entityref(self, name):
    method handle_comment (line 99) | def handle_comment(self, data):
    method handle_decl (line 104) | def handle_decl(self, data):
    method unknown_decl (line 114) | def unknown_decl(self, data):
    method handle_pi (line 124) | def handle_pi(self, data):
  class HTMLParserTreeBuilder (line 130) | class HTMLParserTreeBuilder(HTMLTreeBuilder):
    method __init__ (line 137) | def __init__(self, *args, **kwargs):
    method prepare_markup (line 144) | def prepare_markup(self, markup, user_specified_encoding=None,
    method feed (line 162) | def feed(self, markup):
  function parse_starttag (line 203) | def parse_starttag(self, i):
  function set_cdata_mode (line 258) | def set_cdata_mode(self, elem):
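
The `prepare_markup` and `feed` methods listed above are driven by a loop in `BeautifulSoup.__init__`: each candidate yielded by `prepare_markup` is fed to the parser, and a `ParserRejectedMarkup` exception moves on to the next candidate. The Python 3 sketch below models that retry loop with plain byte decoding. Every name in it (`ParserRejected`, `candidate_decodings`, `strict_feed`, `parse_with_fallback`) is hypothetical, and strict decoding merely stands in for a real parser.

```python
class ParserRejected(Exception):
    """Hypothetical stand-in for ParserRejectedMarkup."""


def candidate_decodings(data, encodings):
    """Yield (markup, encoding) pairs to try, in priority order."""
    for encoding in encodings:
        yield data, encoding


def strict_feed(data, encoding):
    """Pretend parser: rejects input it cannot decode strictly."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        raise ParserRejected(encoding)


def parse_with_fallback(data, encodings):
    """Return (text, encoding) for the first candidate that is accepted."""
    for markup, encoding in candidate_decodings(data, encodings):
        try:
            text = strict_feed(markup, encoding)
        except ParserRejected:
            continue  # try the next candidate
        return text, encoding
    return None, None


raw = "caf\u00e9".encode("latin-1")
print(parse_with_fallback(raw, ["ascii", "utf-8", "latin-1"]))
```

The original loop resets the soup object before each attempt, so a rejected candidate leaves no partial tree behind. The sketch avoids that concern by returning a fresh value from each attempt.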

FILE: example/parallax_svg_tools/bs4/builder/_lxml.py
  class LXMLTreeBuilderForXML (line 31) | class LXMLTreeBuilderForXML(TreeBuilder):
    method default_parser (line 49) | def default_parser(self, encoding):
    method parser_for (line 57) | def parser_for(self, encoding):
    method __init__ (line 66) | def __init__(self, parser=None, empty_element_tags=None):
    method _getNsTag (line 76) | def _getNsTag(self, tag):
    method prepare_markup (line 84) | def prepare_markup(self, markup, user_specified_encoding=None,
    method feed (line 121) | def feed(self, markup):
    method close (line 142) | def close(self):
    method start (line 145) | def start(self, name, attrs, nsmap={}):
    method _prefix_for_namespace (line 185) | def _prefix_for_namespace(self, namespace):
    method end (line 194) | def end(self, name):
    method pi (line 210) | def pi(self, target, data):
    method data (line 215) | def data(self, content):
    method doctype (line 218) | def doctype(self, name, pubid, system):
    method comment (line 223) | def comment(self, content):
    method test_fragment_to_document (line 229) | def test_fragment_to_document(self, fragment):
  class LXMLTreeBuilder (line 234) | class LXMLTreeBuilder(HTMLTreeBuilder, LXMLTreeBuilderForXML):
    method default_parser (line 243) | def default_parser(self, encoding):
    method feed (line 246) | def feed(self, markup):
    method test_fragment_to_document (line 256) | def test_fragment_to_document(self, fragment):

FILE: example/parallax_svg_tools/bs4/dammit.py
  function chardet_dammit (line 25) | def chardet_dammit(s):
  function chardet_dammit (line 33) | def chardet_dammit(s):
  function chardet_dammit (line 39) | def chardet_dammit(s):
  class EntitySubstitution (line 53) | class EntitySubstitution(object):
    method _populate_class_variables (line 57) | def _populate_class_variables():
    method _substitute_html_entity (line 91) | def _substitute_html_entity(cls, matchobj):
    method _substitute_xml_entity (line 96) | def _substitute_xml_entity(cls, matchobj):
    method quoted_attribute_value (line 103) | def quoted_attribute_value(self, value):
    method substitute_xml (line 140) | def substitute_xml(cls, value, make_quoted_attribute=False):
    method substitute_xml_containing_entities (line 161) | def substitute_xml_containing_entities(
    method substitute_html (line 183) | def substitute_html(cls, s):
  class EncodingDetector (line 198) | class EncodingDetector:
    method __init__ (line 218) | def __init__(self, markup, override_encodings=None, is_html=False,
    method _usable (line 230) | def _usable(self, encoding, tried):
    method encodings (line 241) | def encodings(self):
    method strip_byte_order_mark (line 274) | def strip_byte_order_mark(cls, data):
    method find_declared_encoding (line 300) | def find_declared_encoding(cls, markup, is_html=False, search_entire_d...
  class UnicodeDammit (line 325) | class UnicodeDammit:
    method __init__ (line 344) | def __init__(self, markup, override_encodings=[],
    method _sub_ms_char (line 394) | def _sub_ms_char(self, match):
    method _convert_from (line 411) | def _convert_from(self, proposed, errors="strict"):
    method _to_unicode (line 438) | def _to_unicode(self, data, encoding, errors="strict"):
    method declared_html_encoding (line 444) | def declared_html_encoding(self):
    method find_codec (line 449) | def find_codec(self, charset):
    method _codec (line 460) | def _codec(self, charset):
    method detwingle (line 781) | def detwingle(cls, in_bytes, main_encoding="utf8",

FILE: example/parallax_svg_tools/bs4/diagnose.py
  function diagnose (line 23) | def diagnose(data):
  function lxml_trace (line 84) | def lxml_trace(data, html=True, **kwargs):
  class AnnouncingParser (line 94) | class AnnouncingParser(HTMLParser):
    method _p (line 97) | def _p(self, s):
    method handle_starttag (line 100) | def handle_starttag(self, name, attrs):
    method handle_endtag (line 103) | def handle_endtag(self, name):
    method handle_data (line 106) | def handle_data(self, data):
    method handle_charref (line 109) | def handle_charref(self, name):
    method handle_entityref (line 112) | def handle_entityref(self, name):
    method handle_comment (line 115) | def handle_comment(self, data):
    method handle_decl (line 118) | def handle_decl(self, data):
    method unknown_decl (line 121) | def unknown_decl(self, data):
    method handle_pi (line 124) | def handle_pi(self, data):
  function htmlparser_trace (line 127) | def htmlparser_trace(data):
  function rword (line 139) | def rword(length=5):
  function rsentence (line 150) | def rsentence(length=4):
  function rdoc (line 154) | def rdoc(num_elements=1000):
  function benchmark_parsers (line 172) | def benchmark_parsers(num_elements=100000):
  function profile (line 204) | def profile(num_elements=100000, parser="lxml"):

FILE: example/parallax_svg_tools/bs4/element.py
  function _alias (line 17) | def _alias(attr):
  class NamespacedAttribute (line 29) | class NamespacedAttribute(unicode):
    method __new__ (line 31) | def __new__(cls, prefix, name, namespace=None):
  class AttributeValueWithCharsetSubstitution (line 44) | class AttributeValueWithCharsetSubstitution(unicode):
  class CharsetMetaAttributeValue (line 47) | class CharsetMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    method __new__ (line 54) | def __new__(cls, original_value):
    method encode (line 59) | def encode(self, encoding):
  class ContentMetaAttributeValue (line 63) | class ContentMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    method __new__ (line 74) | def __new__(cls, original_value):
    method encode (line 84) | def encode(self, encoding):
  class HTMLAwareEntitySubstitution (line 89) | class HTMLAwareEntitySubstitution(EntitySubstitution):
    method _substitute_if_appropriate (line 107) | def _substitute_if_appropriate(cls, ns, f):
    method substitute_html (line 117) | def substitute_html(cls, ns):
    method substitute_xml (line 122) | def substitute_xml(cls, ns):
  class PageElement (line 126) | class PageElement(object):
    method format_string (line 160) | def format_string(self, s, formatter='minimal'):
    method _is_xml (line 171) | def _is_xml(self):
    method _formatter_for_name (line 194) | def _formatter_for_name(self, name):
    method setup (line 203) | def setup(self, parent=None, previous_element=None, next_element=None,
    method replace_with (line 232) | def replace_with(self, replace_with):
    method unwrap (line 248) | def unwrap(self):
    method wrap (line 262) | def wrap(self, wrap_inside):
    method extract (line 267) | def extract(self):
    method _last_descendant (line 296) | def _last_descendant(self, is_initialized=True, accept_self=True):
    method insert (line 310) | def insert(self, position, new_child):
    method append (line 376) | def append(self, tag):
    method insert_before (line 380) | def insert_before(self, predecessor):
    method insert_after (line 399) | def insert_after(self, successor):
    method find_next (line 418) | def find_next(self, name=None, attrs={}, text=None, **kwargs):
    method find_all_next (line 424) | def find_all_next(self, name=None, attrs={}, text=None, limit=None,
    method find_next_sibling (line 432) | def find_next_sibling(self, name=None, attrs={}, text=None, **kwargs):
    method find_next_siblings (line 439) | def find_next_siblings(self, name=None, attrs={}, text=None, limit=None,
    method find_previous (line 448) | def find_previous(self, name=None, attrs={}, text=None, **kwargs):
    method find_all_previous (line 455) | def find_all_previous(self, name=None, attrs={}, text=None, limit=None,
    method find_previous_sibling (line 464) | def find_previous_sibling(self, name=None, attrs={}, text=None, **kwar...
    method find_previous_siblings (line 471) | def find_previous_siblings(self, name=None, attrs={}, text=None,
    method find_parent (line 480) | def find_parent(self, name=None, attrs={}, **kwargs):
    method find_parents (line 492) | def find_parents(self, name=None, attrs={}, limit=None, **kwargs):
    method next (line 502) | def next(self):
    method previous (line 506) | def previous(self):
    method _find_one (line 511) | def _find_one(self, method, name, attrs, text, **kwargs):
    method _find_all (line 518) | def _find_all(self, name, attrs, text, limit, generator, **kwargs):
    method next_elements (line 559) | def next_elements(self):
    method next_siblings (line 566) | def next_siblings(self):
    method previous_elements (line 573) | def previous_elements(self):
    method previous_siblings (line 580) | def previous_siblings(self):
    method parents (line 587) | def parents(self):
    method _attr_value_as_string (line 609) | def _attr_value_as_string(self, value, default=None):
    method _tag_name_matches_and (line 620) | def _tag_name_matches_and(self, function, tag_name):
    method _attribute_checker (line 628) | def _attribute_checker(self, operator, attribute, value=''):
    method nextGenerator (line 671) | def nextGenerator(self):
    method nextSiblingGenerator (line 674) | def nextSiblingGenerator(self):
    method previousGenerator (line 677) | def previousGenerator(self):
    method previousSiblingGenerator (line 680) | def previousSiblingGenerator(self):
    method parentGenerator (line 683) | def parentGenerator(self):
  class NavigableString (line 687) | class NavigableString(unicode, PageElement):
    method __new__ (line 697) | def __new__(cls, value):
    method __copy__ (line 712) | def __copy__(self):
    method __getnewargs__ (line 718) | def __getnewargs__(self):
    method __getattr__ (line 721) | def __getattr__(self, attr):
    method output_ready (line 732) | def output_ready(self, formatter="minimal"):
    method name (line 737) | def name(self):
    method name (line 741) | def name(self, name):
  class PreformattedString (line 744) | class PreformattedString(NavigableString):
    method output_ready (line 751) | def output_ready(self, formatter="minimal"):
  class CData (line 757) | class CData(PreformattedString):
  class ProcessingInstruction (line 762) | class ProcessingInstruction(PreformattedString):
  class XMLProcessingInstruction (line 768) | class XMLProcessingInstruction(ProcessingInstruction):
  class Comment (line 773) | class Comment(PreformattedString):
  class Declaration (line 779) | class Declaration(PreformattedString):
  class Doctype (line 784) | class Doctype(PreformattedString):
    method for_name_and_ids (line 787) | def for_name_and_ids(cls, name, pub_id, system_id):
  class Tag (line 802) | class Tag(PageElement):
    method __init__ (line 806) | def __init__(self, parser=None, builder=None, name=None, namespace=None,
    method __copy__ (line 861) | def __copy__(self):
    method is_empty_element (line 874) | def is_empty_element(self):
    method string (line 892) | def string(self):
    method string (line 909) | def string(self, string):
    method _all_strings (line 913) | def _all_strings(self, strip=False, types=(NavigableString, CData)):
    method stripped_strings (line 934) | def stripped_strings(self):
    method get_text (line 938) | def get_text(self, separator=u"", strip=False,
    method decompose (line 948) | def decompose(self):
    method clear (line 958) | def clear(self, decompose=False):
    method index (line 972) | def index(self, element):
    method get (line 982) | def get(self, key, default=None):
    method has_attr (line 988) | def has_attr(self, key):
    method __hash__ (line 991) | def __hash__(self):
    method __getitem__ (line 994) | def __getitem__(self, key):
    method __iter__ (line 999) | def __iter__(self):
    method __len__ (line 1003) | def __len__(self):
    method __contains__ (line 1007) | def __contains__(self, x):
    method __nonzero__ (line 1010) | def __nonzero__(self):
    method __setitem__ (line 1014) | def __setitem__(self, key, value):
    method __delitem__ (line 1019) | def __delitem__(self, key):
    method __call__ (line 1023) | def __call__(self, *args, **kwargs):
    method __getattr__ (line 1029) | def __getattr__(self, tag):
    method __eq__ (line 1044) | def __eq__(self, other):
    method __ne__ (line 1061) | def __ne__(self, other):
    method __repr__ (line 1066) | def __repr__(self, encoding="unicode-escape"):
    method __unicode__ (line 1077) | def __unicode__(self):
    method __str__ (line 1080) | def __str__(self):
    method encode (line 1089) | def encode(self, encoding=DEFAULT_OUTPUT_ENCODING,
    method _should_pretty_print (line 1097) | def _should_pretty_print(self, indent_level):
    method decode (line 1105) | def decode(self, indent_level=None,
    method prettify (line 1198) | def prettify(self, encoding=None, formatter="minimal"):
    method decode_contents (line 1204) | def decode_contents(self, indent_level=None,
    method encode_contents (line 1246) | def encode_contents(
    method renderContents (line 1264) | def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
    method find (line 1273) | def find(self, name=None, attrs={}, recursive=True, text=None,
    method find_all (line 1284) | def find_all(self, name=None, attrs={}, recursive=True, text=None,
    method children (line 1305) | def children(self):
    method descendants (line 1310) | def descendants(self):
    method select_one (line 1324) | def select_one(self, selector):
    method select (line 1331) | def select(self, selector, _candidate_generator=None, limit=None):
    method childGenerator (line 1552) | def childGenerator(self):
    method recursiveChildGenerator (line 1555) | def recursiveChildGenerator(self):
    method has_key (line 1558) | def has_key(self, key):
  class SoupStrainer (line 1567) | class SoupStrainer(object):
    method __init__ (line 1571) | def __init__(self, name=None, attrs={}, text=None, **kwargs):
    method _normalize_search_value (line 1598) | def _normalize_search_value(self, value):
    method __str__ (line 1628) | def __str__(self):
    method search_tag (line 1634) | def search_tag(self, markup_name=None, markup_attrs={}):
    method search (line 1675) | def search(self, markup):
    method _matches (line 1701) | def _matches(self, markup, match_against):
  class ResultSet (line 1750) | class ResultSet(list):
    method __init__ (line 1753) | def __init__(self, source, result=()):

FILE: example/parallax_svg_tools/svg/__init__.py
  function create_file (line 7) | def create_file(path, mode):
  function parse_svg (line 19) | def parse_svg(path, namespace, options):
  function write_svg (line 165) | def write_svg(svg, dst_path, options):
  function compile_svg (line 190) | def compile_svg(src_path, dst_path, options):
  function compile_master_svg (line 215) | def compile_master_svg(src_path, dst_path, options):
  function parse_markup (line 278) | def parse_markup(src_path, output):
  function inline_svg (line 291) | def inline_svg(src_path, dst_path):

FILE: parallax_svg_tools/bs4/__init__.py
  class BeautifulSoup (line 55) | class BeautifulSoup(Tag):
    method __init__ (line 87) | def __init__(self, markup="", features=None, builder=None,
    method __copy__ (line 238) | def __copy__(self):
    method __getstate__ (line 250) | def __getstate__(self):
    method _check_markup_is_url (line 258) | def _check_markup_is_url(markup):
    method _feed (line 285) | def _feed(self):
    method reset (line 295) | def reset(self):
    method new_tag (line 305) | def new_tag(self, name, namespace=None, nsprefix=None, **attrs):
    method new_string (line 309) | def new_string(self, s, subclass=NavigableString):
    method insert_before (line 313) | def insert_before(self, successor):
    method insert_after (line 316) | def insert_after(self, successor):
    method popTag (line 319) | def popTag(self):
    method pushTag (line 328) | def pushTag(self, tag):
    method endData (line 337) | def endData(self, containerClass=NavigableString):
    method object_was_parsed (line 367) | def object_was_parsed(self, o, parent=None, most_recent_element=None):
    method _popToTag (line 424) | def _popToTag(self, name, nsprefix=None, inclusivePop=True):
    method handle_starttag (line 447) | def handle_starttag(self, name, namespace, nsprefix, attrs):
    method handle_endtag (line 474) | def handle_endtag(self, name, nsprefix=None):
    method handle_data (line 479) | def handle_data(self, data):
    method decode (line 482) | def decode(self, pretty_print=False,
  class BeautifulStoneSoup (line 507) | class BeautifulStoneSoup(BeautifulSoup):
    method __init__ (line 510) | def __init__(self, *args, **kwargs):
  class StopParsing (line 518) | class StopParsing(Exception):
  class FeatureNotFound (line 521) | class FeatureNotFound(ValueError):

FILE: parallax_svg_tools/bs4/builder/__init__.py
  class TreeBuilderRegistry (line 30) | class TreeBuilderRegistry(object):
    method __init__ (line 32) | def __init__(self):
    method register (line 36) | def register(self, treebuilder_class):
    method lookup (line 42) | def lookup(self, *features):
  class TreeBuilder (line 84) | class TreeBuilder(object):
    method __init__ (line 102) | def __init__(self):
    method reset (line 105) | def reset(self):
    method can_be_empty_element (line 108) | def can_be_empty_element(self, tag_name):
    method feed (line 129) | def feed(self, markup):
    method prepare_markup (line 132) | def prepare_markup(self, markup, user_specified_encoding=None,
    method test_fragment_to_document (line 136) | def test_fragment_to_document(self, fragment):
    method set_up_substitutions (line 149) | def set_up_substitutions(self, tag):
    method _replace_cdata_list_attribute_values (line 152) | def _replace_cdata_list_attribute_values(self, tag_name, attrs):
  class SAXTreeBuilder (line 182) | class SAXTreeBuilder(TreeBuilder):
    method feed (line 185) | def feed(self, markup):
    method close (line 188) | def close(self):
    method startElement (line 191) | def startElement(self, name, attrs):
    method endElement (line 196) | def endElement(self, name):
    method startElementNS (line 200) | def startElementNS(self, nsTuple, nodeName, attrs):
    method endElementNS (line 204) | def endElementNS(self, nsTuple, nodeName):
    method startPrefixMapping (line 209) | def startPrefixMapping(self, prefix, nodeValue):
    method endPrefixMapping (line 213) | def endPrefixMapping(self, prefix):
    method characters (line 218) | def characters(self, content):
    method startDocument (line 221) | def startDocument(self):
    method endDocument (line 224) | def endDocument(self):
  class HTMLTreeBuilder (line 228) | class HTMLTreeBuilder(TreeBuilder):
    method set_up_substitutions (line 262) | def set_up_substitutions(self, tag):
  function register_treebuilders_from (line 295) | def register_treebuilders_from(module):
  class ParserRejectedMarkup (line 308) | class ParserRejectedMarkup(Exception):

FILE: parallax_svg_tools/bs4/builder/_html5lib.py
  class HTML5TreeBuilder (line 37) | class HTML5TreeBuilder(HTMLTreeBuilder):
    method prepare_markup (line 44) | def prepare_markup(self, markup, user_specified_encoding,
    method feed (line 57) | def feed(self, markup):
    method create_treebuilder (line 84) | def create_treebuilder(self, namespaceHTMLElements):
    method test_fragment_to_document (line 89) | def test_fragment_to_document(self, fragment):
  class TreeBuilderForHtml5lib (line 94) | class TreeBuilderForHtml5lib(treebuilder_base.TreeBuilder):
    method __init__ (line 96) | def __init__(self, soup, namespaceHTMLElements):
    method documentClass (line 100) | def documentClass(self):
    method insertDoctype (line 104) | def insertDoctype(self, token):
    method elementClass (line 112) | def elementClass(self, name, namespace):
    method commentClass (line 116) | def commentClass(self, data):
    method fragmentClass (line 119) | def fragmentClass(self):
    method appendChild (line 124) | def appendChild(self, node):
    method getDocument (line 128) | def getDocument(self):
    method getFragment (line 131) | def getFragment(self):
  class AttrList (line 134) | class AttrList(object):
    method __init__ (line 135) | def __init__(self, element):
    method __iter__ (line 138) | def __iter__(self):
    method __setitem__ (line 140) | def __setitem__(self, name, value):
    method items (line 152) | def items(self):
    method keys (line 154) | def keys(self):
    method __len__ (line 156) | def __len__(self):
    method __getitem__ (line 158) | def __getitem__(self, name):
    method __contains__ (line 160) | def __contains__(self, name):
  class Element (line 164) | class Element(treebuilder_base.Node):
    method __init__ (line 165) | def __init__(self, element, soup, namespace):
    method appendChild (line 171) | def appendChild(self, node):
    method getAttributes (line 223) | def getAttributes(self):
    method setAttributes (line 226) | def setAttributes(self, attributes):
    method insertText (line 250) | def insertText(self, data, insertBefore=None):
    method insertBefore (line 257) | def insertBefore(self, node, refNode):
    method removeChild (line 269) | def removeChild(self, node):
    method reparentChildren (line 272) | def reparentChildren(self, new_parent):
    method cloneNode (line 331) | def cloneNode(self):
    method hasContent (line 338) | def hasContent(self):
    method getNameTuple (line 341) | def getNameTuple(self):
  class TextNode (line 349) | class TextNode(Element):
    method __init__ (line 350) | def __init__(self, element, soup):
    method cloneNode (line 355) | def cloneNode(self):

FILE: parallax_svg_tools/bs4/builder/_htmlparser.py
  class HTMLParseError (line 17) | class HTMLParseError(Exception):
  class BeautifulSoupHTMLParser (line 54) | class BeautifulSoupHTMLParser(HTMLParser):
    method handle_starttag (line 55) | def handle_starttag(self, name, attrs):
    method handle_endtag (line 67) | def handle_endtag(self, name):
    method handle_data (line 70) | def handle_data(self, data):
    method handle_charref (line 73) | def handle_charref(self, name):
    method handle_entityref (line 91) | def handle_entityref(self, name):
    method handle_comment (line 99) | def handle_comment(self, data):
    method handle_decl (line 104) | def handle_decl(self, data):
    method unknown_decl (line 114) | def unknown_decl(self, data):
    method handle_pi (line 124) | def handle_pi(self, data):
  class HTMLParserTreeBuilder (line 130) | class HTMLParserTreeBuilder(HTMLTreeBuilder):
    method __init__ (line 137) | def __init__(self, *args, **kwargs):
    method prepare_markup (line 144) | def prepare_markup(self, markup, user_specified_encoding=None,
    method feed (line 162) | def feed(self, markup):
  function parse_starttag (line 203) | def parse_starttag(self, i):
  function set_cdata_mode (line 258) | def set_cdata_mode(self, elem):

FILE: parallax_svg_tools/bs4/builder/_lxml.py
  class LXMLTreeBuilderForXML (line 31) | class LXMLTreeBuilderForXML(TreeBuilder):
    method default_parser (line 49) | def default_parser(self, encoding):
    method parser_for (line 57) | def parser_for(self, encoding):
    method __init__ (line 66) | def __init__(self, parser=None, empty_element_tags=None):
    method _getNsTag (line 76) | def _getNsTag(self, tag):
    method prepare_markup (line 84) | def prepare_markup(self, markup, user_specified_encoding=None,
    method feed (line 121) | def feed(self, markup):
    method close (line 142) | def close(self):
    method start (line 145) | def start(self, name, attrs, nsmap={}):
    method _prefix_for_namespace (line 185) | def _prefix_for_namespace(self, namespace):
    method end (line 194) | def end(self, name):
    method pi (line 210) | def pi(self, target, data):
    method data (line 215) | def data(self, content):
    method doctype (line 218) | def doctype(self, name, pubid, system):
    method comment (line 223) | def comment(self, content):
    method test_fragment_to_document (line 229) | def test_fragment_to_document(self, fragment):
  class LXMLTreeBuilder (line 234) | class LXMLTreeBuilder(HTMLTreeBuilder, LXMLTreeBuilderForXML):
    method default_parser (line 243) | def default_parser(self, encoding):
    method feed (line 246) | def feed(self, markup):
    method test_fragment_to_document (line 256) | def test_fragment_to_document(self, fragment):

FILE: parallax_svg_tools/bs4/dammit.py
  function chardet_dammit (line 25) | def chardet_dammit(s):
  function chardet_dammit (line 33) | def chardet_dammit(s):
  function chardet_dammit (line 39) | def chardet_dammit(s):
  class EntitySubstitution (line 53) | class EntitySubstitution(object):
    method _populate_class_variables (line 57) | def _populate_class_variables():
    method _substitute_html_entity (line 91) | def _substitute_html_entity(cls, matchobj):
    method _substitute_xml_entity (line 96) | def _substitute_xml_entity(cls, matchobj):
    method quoted_attribute_value (line 103) | def quoted_attribute_value(self, value):
    method substitute_xml (line 140) | def substitute_xml(cls, value, make_quoted_attribute=False):
    method substitute_xml_containing_entities (line 161) | def substitute_xml_containing_entities(
    method substitute_html (line 183) | def substitute_html(cls, s):
  class EncodingDetector (line 198) | class EncodingDetector:
    method __init__ (line 218) | def __init__(self, markup, override_encodings=None, is_html=False,
    method _usable (line 230) | def _usable(self, encoding, tried):
    method encodings (line 241) | def encodings(self):
    method strip_byte_order_mark (line 274) | def strip_byte_order_mark(cls, data):
    method find_declared_encoding (line 300) | def find_declared_encoding(cls, markup, is_html=False, search_entire_d...
  class UnicodeDammit (line 325) | class UnicodeDammit:
    method __init__ (line 344) | def __init__(self, markup, override_encodings=[],
    method _sub_ms_char (line 394) | def _sub_ms_char(self, match):
    method _convert_from (line 411) | def _convert_from(self, proposed, errors="strict"):
    method _to_unicode (line 438) | def _to_unicode(self, data, encoding, errors="strict"):
    method declared_html_encoding (line 444) | def declared_html_encoding(self):
    method find_codec (line 449) | def find_codec(self, charset):
    method _codec (line 460) | def _codec(self, charset):
    method detwingle (line 781) | def detwingle(cls, in_bytes, main_encoding="utf8",

FILE: parallax_svg_tools/bs4/diagnose.py
  function diagnose (line 23) | def diagnose(data):
  function lxml_trace (line 84) | def lxml_trace(data, html=True, **kwargs):
  class AnnouncingParser (line 94) | class AnnouncingParser(HTMLParser):
    method _p (line 97) | def _p(self, s):
    method handle_starttag (line 100) | def handle_starttag(self, name, attrs):
    method handle_endtag (line 103) | def handle_endtag(self, name):
    method handle_data (line 106) | def handle_data(self, data):
    method handle_charref (line 109) | def handle_charref(self, name):
    method handle_entityref (line 112) | def handle_entityref(self, name):
    method handle_comment (line 115) | def handle_comment(self, data):
    method handle_decl (line 118) | def handle_decl(self, data):
    method unknown_decl (line 121) | def unknown_decl(self, data):
    method handle_pi (line 124) | def handle_pi(self, data):
  function htmlparser_trace (line 127) | def htmlparser_trace(data):
  function rword (line 139) | def rword(length=5):
  function rsentence (line 150) | def rsentence(length=4):
  function rdoc (line 154) | def rdoc(num_elements=1000):
  function benchmark_parsers (line 172) | def benchmark_parsers(num_elements=100000):
  function profile (line 204) | def profile(num_elements=100000, parser="lxml"):

FILE: parallax_svg_tools/bs4/element.py
  function _alias (line 17) | def _alias(attr):
  class NamespacedAttribute (line 29) | class NamespacedAttribute(unicode):
    method __new__ (line 31) | def __new__(cls, prefix, name, namespace=None):
  class AttributeValueWithCharsetSubstitution (line 44) | class AttributeValueWithCharsetSubstitution(unicode):
  class CharsetMetaAttributeValue (line 47) | class CharsetMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    method __new__ (line 54) | def __new__(cls, original_value):
    method encode (line 59) | def encode(self, encoding):
  class ContentMetaAttributeValue (line 63) | class ContentMetaAttributeValue(AttributeValueWithCharsetSubstitution):
    method __new__ (line 74) | def __new__(cls, original_value):
    method encode (line 84) | def encode(self, encoding):
  class HTMLAwareEntitySubstitution (line 89) | class HTMLAwareEntitySubstitution(EntitySubstitution):
    method _substitute_if_appropriate (line 107) | def _substitute_if_appropriate(cls, ns, f):
    method substitute_html (line 117) | def substitute_html(cls, ns):
    method substitute_xml (line 122) | def substitute_xml(cls, ns):
  class PageElement (line 126) | class PageElement(object):
    method format_string (line 160) | def format_string(self, s, formatter='minimal'):
    method _is_xml (line 171) | def _is_xml(self):
    method _formatter_for_name (line 194) | def _formatter_for_name(self, name):
    method setup (line 203) | def setup(self, parent=None, previous_element=None, next_element=None,
    method replace_with (line 232) | def replace_with(self, replace_with):
    method unwrap (line 248) | def unwrap(self):
    method wrap (line 262) | def wrap(self, wrap_inside):
    method extract (line 267) | def extract(self):
    method _last_descendant (line 296) | def _last_descendant(self, is_initialized=True, accept_self=True):
    method insert (line 310) | def insert(self, position, new_child):
    method append (line 376) | def append(self, tag):
    method insert_before (line 380) | def insert_before(self, predecessor):
    method insert_after (line 399) | def insert_after(self, successor):
    method find_next (line 418) | def find_next(self, name=None, attrs={}, text=None, **kwargs):
    method find_all_next (line 424) | def find_all_next(self, name=None, attrs={}, text=None, limit=None,
    method find_next_sibling (line 432) | def find_next_sibling(self, name=None, attrs={}, text=None, **kwargs):
    method find_next_siblings (line 439) | def find_next_siblings(self, name=None, attrs={}, text=None, limit=None,
    method find_previous (line 448) | def find_previous(self, name=None, attrs={}, text=None, **kwargs):
    method find_all_previous (line 455) | def find_all_previous(self, name=None, attrs={}, text=None, limit=None,
    method find_previous_sibling (line 464) | def find_previous_sibling(self, name=None, attrs={}, text=None, **kwar...
    method find_previous_siblings (line 471) | def find_previous_siblings(self, name=None, attrs={}, text=None,
    method find_parent (line 480) | def find_parent(self, name=None, attrs={}, **kwargs):
    method find_parents (line 492) | def find_parents(self, name=None, attrs={}, limit=None, **kwargs):
    method next (line 502) | def next(self):
    method previous (line 506) | def previous(self):
    method _find_one (line 511) | def _find_one(self, method, name, attrs, text, **kwargs):
    method _find_all (line 518) | def _find_all(self, name, attrs, text, limit, generator, **kwargs):
    method next_elements (line 559) | def next_elements(self):
    method next_siblings (line 566) | def next_siblings(self):
    method previous_elements (line 573) | def previous_elements(self):
    method previous_siblings (line 580) | def previous_siblings(self):
    method parents (line 587) | def parents(self):
    method _attr_value_as_string (line 609) | def _attr_value_as_string(self, value, default=None):
    method _tag_name_matches_and (line 620) | def _tag_name_matches_and(self, function, tag_name):
    method _attribute_checker (line 628) | def _attribute_checker(self, operator, attribute, value=''):
    method nextGenerator (line 671) | def nextGenerator(self):
    method nextSiblingGenerator (line 674) | def nextSiblingGenerator(self):
    method previousGenerator (line 677) | def previousGenerator(self):
    method previousSiblingGenerator (line 680) | def previousSiblingGenerator(self):
    method parentGenerator (line 683) | def parentGenerator(self):
  class NavigableString (line 687) | class NavigableString(unicode, PageElement):
    method __new__ (line 697) | def __new__(cls, value):
    method __copy__ (line 712) | def __copy__(self):
    method __getnewargs__ (line 718) | def __getnewargs__(self):
    method __getattr__ (line 721) | def __getattr__(self, attr):
    method output_ready (line 732) | def output_ready(self, formatter="minimal"):
    method name (line 737) | def name(self):
    method name (line 741) | def name(self, name):
  class PreformattedString (line 744) | class PreformattedString(NavigableString):
    method output_ready (line 751) | def output_ready(self, formatter="minimal"):
  class CData (line 757) | class CData(PreformattedString):
  class ProcessingInstruction (line 762) | class ProcessingInstruction(PreformattedString):
  class XMLProcessingInstruction (line 768) | class XMLProcessingInstruction(ProcessingInstruction):
  class Comment (line 773) | class Comment(PreformattedString):
  class Declaration (line 779) | class Declaration(PreformattedString):
  class Doctype (line 784) | class Doctype(PreformattedString):
    method for_name_and_ids (line 787) | def for_name_and_ids(cls, name, pub_id, system_id):
  class Tag (line 802) | class Tag(PageElement):
    method __init__ (line 806) | def __init__(self, parser=None, builder=None, name=None, namespace=None,
    method __copy__ (line 861) | def __copy__(self):
    method is_empty_element (line 874) | def is_empty_element(self):
    method string (line 892) | def string(self):
    method string (line 909) | def string(self, string):
    method _all_strings (line 913) | def _all_strings(self, strip=False, types=(NavigableString, CData)):
    method stripped_strings (line 934) | def stripped_strings(self):
    method get_text (line 938) | def get_text(self, separator=u"", strip=False,
    method decompose (line 948) | def decompose(self):
    method clear (line 958) | def clear(self, decompose=False):
    method index (line 972) | def index(self, element):
    method get (line 982) | def get(self, key, default=None):
    method has_attr (line 988) | def has_attr(self, key):
    method __hash__ (line 991) | def __hash__(self):
    method __getitem__ (line 994) | def __getitem__(self, key):
    method __iter__ (line 999) | def __iter__(self):
    method __len__ (line 1003) | def __len__(self):
    method __contains__ (line 1007) | def __contains__(self, x):
    method __nonzero__ (line 1010) | def __nonzero__(self):
    method __setitem__ (line 1014) | def __setitem__(self, key, value):
    method __delitem__ (line 1019) | def __delitem__(self, key):
    method __call__ (line 1023) | def __call__(self, *args, **kwargs):
    method __getattr__ (line 1029) | def __getattr__(self, tag):
    method __eq__ (line 1044) | def __eq__(self, other):
    method __ne__ (line 1061) | def __ne__(self, other):
    method __repr__ (line 1066) | def __repr__(self, encoding="unicode-escape"):
    method __unicode__ (line 1077) | def __unicode__(self):
    method __str__ (line 1080) | def __str__(self):
    method encode (line 1089) | def encode(self, encoding=DEFAULT_OUTPUT_ENCODING,
    method _should_pretty_print (line 1097) | def _should_pretty_print(self, indent_level):
    method decode (line 1105) | def decode(self, indent_level=None,
    method prettify (line 1198) | def prettify(self, encoding=None, formatter="minimal"):
    method decode_contents (line 1204) | def decode_contents(self, indent_level=None,
    method encode_contents (line 1246) | def encode_contents(
    method renderContents (line 1264) | def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
    method find (line 1273) | def find(self, name=None, attrs={}, recursive=True, text=None,
    method find_all (line 1284) | def find_all(self, name=None, attrs={}, recursive=True, text=None,
    method children (line 1305) | def children(self):
    method descendants (line 1310) | def descendants(self):
    method select_one (line 1324) | def select_one(self, selector):
    method select (line 1331) | def select(self, selector, _candidate_generator=None, limit=None):
    method childGenerator (line 1552) | def childGenerator(self):
    method recursiveChildGenerator (line 1555) | def recursiveChildGenerator(self):
    method has_key (line 1558) | def has_key(self, key):
  class SoupStrainer (line 1567) | class SoupStrainer(object):
    method __init__ (line 1571) | def __init__(self, name=None, attrs={}, text=None, **kwargs):
    method _normalize_search_value (line 1598) | def _normalize_search_value(self, value):
    method __str__ (line 1628) | def __str__(self):
    method search_tag (line 1634) | def search_tag(self, markup_name=None, markup_attrs={}):
    method search (line 1675) | def search(self, markup):
    method _matches (line 1701) | def _matches(self, markup, match_against):
  class ResultSet (line 1750) | class ResultSet(list):
    method __init__ (line 1753) | def __init__(self, source, result=()):

FILE: parallax_svg_tools/svg/__init__.py
  function create_file (line 7) | def create_file(path, mode):
  function parse_svg (line 19) | def parse_svg(path, namespace, options):
  function write_svg (line 173) | def write_svg(svg, dst_path, options):
  function compile_svg (line 198) | def compile_svg(src_path, dst_path, options):
  function compile_master_svg (line 223) | def compile_master_svg(src_path, dst_path, options):
  function parse_markup (line 286) | def parse_markup(src_path, output):
  function inline_svg (line 299) | def inline_svg(src_path, dst_path):


Condensed preview: 24 files, each showing path, character count, and a content snippet. The full structured content is about 376K characters.

[
  {
    "path": "LICENSE.md",
    "chars": 1059,
    "preview": "Copyright 2017 Parallax Agency Ltd\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of this"
  },
  {
    "path": "README.md",
    "chars": 5576,
    "preview": "# Parallax SVG Animation Tools\n\nA simple set of python functions to help working with animated SVGs exported from Illust"
  },
  {
    "path": "example/animation.html",
    "chars": 120,
    "preview": "<!DOCTYPE html>\n<html>\n<head>\n\t<meta charset='utf-8'/>\n</head>\n<body>\n\n//import processed_animation.svg\n\n</body>\n</html>"
  },
  {
    "path": "example/output/animation.html",
    "chars": 437,
    "preview": "<!DOCTYPE html>\n<html>\n<head>\n\t<meta charset='utf-8'/>\n</head>\n<body>\n\n<svg viewbox=\"0 0 800 600\" xmlns=\"http://www.w3.o"
  },
  {
    "path": "example/parallax_svg_tools/bs4/__init__.py",
    "chars": 20421,
    "preview": "\"\"\"Beautiful Soup\nElixir and Tonic\n\"The Screen-Scraper's Friend\"\nhttp://www.crummy.com/software/BeautifulSoup/\n\nBeautifu"
  },
  {
    "path": "example/parallax_svg_tools/bs4/builder/__init__.py",
    "chars": 11398,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n\nfrom collections "
  },
  {
    "path": "example/parallax_svg_tools/bs4/builder/_html5lib.py",
    "chars": 13672,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n\n__all__ = [\n    '"
  },
  {
    "path": "example/parallax_svg_tools/bs4/builder/_htmlparser.py",
    "chars": 9205,
    "preview": "\"\"\"Use the HTMLParser library to parse HTML files that aren't too bad.\"\"\"\n\n# Use of this source code is governed by a BS"
  },
  {
    "path": "example/parallax_svg_tools/bs4/builder/_lxml.py",
    "chars": 9468,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n__all__ = [\n    'L"
  },
  {
    "path": "example/parallax_svg_tools/bs4/dammit.py",
    "chars": 29796,
    "preview": "# -*- coding: utf-8 -*-\n\"\"\"Beautiful Soup bonus library: Unicode, Dammit\n\nThis library converts a bytestream to Unicode "
  },
  {
    "path": "example/parallax_svg_tools/bs4/diagnose.py",
    "chars": 6747,
    "preview": "\"\"\"Diagnostic functions, mainly for use when doing tech support.\"\"\"\n\n# Use of this source code is governed by a BSD-styl"
  },
  {
    "path": "example/parallax_svg_tools/bs4/element.py",
    "chars": 66681,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n__license__ = \"MIT"
  },
  {
    "path": "example/parallax_svg_tools/run.py",
    "chars": 198,
    "preview": "from svg import * \r\n\r\ncompile_svg('animation.svg', 'processed_animation.svg', \r\n{\r\n\t'process_layer_names': True,\r\n\t'name"
  },
  {
    "path": "example/parallax_svg_tools/svg/__init__.py",
    "chars": 8446,
    "preview": "# Super simple Illustrator SVG processor for animations. Uses the BeautifulSoup python xml library. \n\nimport os\nimport e"
  },
  {
    "path": "parallax_svg_tools/bs4/__init__.py",
    "chars": 20421,
    "preview": "\"\"\"Beautiful Soup\nElixir and Tonic\n\"The Screen-Scraper's Friend\"\nhttp://www.crummy.com/software/BeautifulSoup/\n\nBeautifu"
  },
  {
    "path": "parallax_svg_tools/bs4/builder/__init__.py",
    "chars": 11398,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n\nfrom collections "
  },
  {
    "path": "parallax_svg_tools/bs4/builder/_html5lib.py",
    "chars": 13672,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n\n__all__ = [\n    '"
  },
  {
    "path": "parallax_svg_tools/bs4/builder/_htmlparser.py",
    "chars": 9205,
    "preview": "\"\"\"Use the HTMLParser library to parse HTML files that aren't too bad.\"\"\"\n\n# Use of this source code is governed by a BS"
  },
  {
    "path": "parallax_svg_tools/bs4/builder/_lxml.py",
    "chars": 9468,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n__all__ = [\n    'L"
  },
  {
    "path": "parallax_svg_tools/bs4/dammit.py",
    "chars": 29796,
    "preview": "# -*- coding: utf-8 -*-\n\"\"\"Beautiful Soup bonus library: Unicode, Dammit\n\nThis library converts a bytestream to Unicode "
  },
  {
    "path": "parallax_svg_tools/bs4/diagnose.py",
    "chars": 6747,
    "preview": "\"\"\"Diagnostic functions, mainly for use when doing tech support.\"\"\"\n\n# Use of this source code is governed by a BSD-styl"
  },
  {
    "path": "parallax_svg_tools/bs4/element.py",
    "chars": 66681,
    "preview": "# Use of this source code is governed by a BSD-style license that can be\n# found in the LICENSE file.\n__license__ = \"MIT"
  },
  {
    "path": "parallax_svg_tools/run.py",
    "chars": 198,
    "preview": "from svg import * \r\n\r\ncompile_svg('animation.svg', 'processed_animation.svg', \r\n{\r\n\t'process_layer_names': True,\r\n\t'name"
  },
  {
    "path": "parallax_svg_tools/svg/__init__.py",
    "chars": 8799,
    "preview": "# Super simple Illustrator SVG processor for animations. Uses the BeautifulSoup python xml library. \n\nimport os\nimport e"
  }
]
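
The condensed preview above is a JSON array of records, each with `path`, `chars`, and `preview` keys. As a minimal sketch (not part of the repository or of the extraction tool), the following Python snippet shows one way such records could be totalled by top-level directory. The function name `summarize_by_top_level` is hypothetical, and the three sample records reuse `path` and `chars` values from the entries above with shortened `preview` strings.

```python
import json
from collections import defaultdict

# Three records taken from the condensed preview; the "preview" values
# are shortened here for illustration and are not the full snippets.
SAMPLE = """
[
  {"path": "LICENSE.md", "chars": 1059, "preview": "Copyright 2017 Parallax Agency Ltd"},
  {"path": "example/parallax_svg_tools/run.py", "chars": 198, "preview": "from svg import *"},
  {"path": "parallax_svg_tools/run.py", "chars": 198, "preview": "from svg import *"}
]
"""


def summarize_by_top_level(records):
    """Return a dict mapping each top-level path component to its total chars."""
    totals = defaultdict(int)
    for record in records:
        # "example/output/animation.html" -> "example"; "LICENSE.md" -> "LICENSE.md"
        top = record["path"].split("/", 1)[0]
        totals[top] += record["chars"]
    return dict(totals)


records = json.loads(SAMPLE)
totals = summarize_by_top_level(records)
for name in sorted(totals):
    print(name, totals[name])
```

Grouping on the first path component keeps the bundled `example/` copy of the tools separate from the top-level `parallax_svg_tools/` package, which is useful because the two trees contain nearly identical files.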

About this extraction

This page contains the full source code of the parallax/svg-animation-tools GitHub repository, extracted and formatted as plain text. The extraction includes 24 files (351.2 KB), approximately 82.3k tokens, and a symbol index with 614 extracted functions, classes, methods, constants, and types.

