Full Code of harelba/q for AI

Repository: harelba/q
Branch: master
Commit: 03e8b3950557
Files: 56
Total size: 643.0 KB

Directory structure:
gitextract_x4ti_kab/
├── .github/
│   ├── FUNDING.yml
│   └── workflows/
│       └── build-and-package.yaml
├── .gitignore
├── LICENSE
├── QSQL-NOTES.md
├── README.markdown
├── benchmark-config.sh
├── bin/
│   ├── .qrc
│   ├── __init__.py
│   ├── q.bat
│   └── q.py
├── conftest.py
├── dist/
│   ├── fpm-config
│   ├── test-rpm-inside-container.sh
│   ├── test-using-deb.sh
│   └── test-using-rpm.sh
├── doc/
│   ├── AUTHORS
│   ├── IMPLEMENTATION.markdown
│   ├── LICENSE
│   ├── RATIONALE.markdown
│   ├── THANKS
│   └── USAGE.markdown
├── examples/
│   ├── EXAMPLES.markdown
│   ├── exampledatafile
│   └── group-emails-example
├── mkdocs/
│   ├── README.md
│   ├── docs/
│   │   ├── about.md
│   │   ├── fsg9b9b1.txt
│   │   ├── google0efeb4ff0a886e81.html
│   │   ├── index.md
│   │   ├── index_cn.md
│   │   ├── js/
│   │   │   └── google-analytics.js
│   │   └── stylesheets/
│   │       └── extra.css
│   ├── generate-web-site.sh
│   ├── mkdocs.yml
│   ├── requirements.txt
│   └── theme/
│       └── main.html
├── prepare-benchmark-env
├── pyoxidizer.bzl
├── pytest.ini
├── requirements.txt
├── run-benchmark
├── run-coverage.sh
├── run-tests.sh
├── setup.py
├── test/
│   ├── BENCHMARK.md
│   ├── __init__.py
│   ├── benchmark-results/
│   │   └── source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/
│   │       └── 2020-09-17-v2.0.17/
│   │           ├── octosql_v0.3.0.benchmark-results
│   │           ├── q-benchmark-2.7.18.benchmark-results
│   │           ├── q-benchmark-3.6.4.benchmark-results
│   │           ├── q-benchmark-3.7.9.benchmark-results
│   │           ├── q-benchmark-3.8.5.benchmark-results
│   │           ├── summary.benchmark-results
│   │           └── textql_2.0.3.benchmark-results
│   └── test_suite.py
└── test-requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms

github: harelba


================================================
FILE: .github/workflows/build-and-package.yaml
================================================
name: BuildAndPackage

on:
  push:
    tags:
      - "v*"
    branches: master
  pull_request:
    branches: master
    paths-ignore:
      - "*.md"
      - "*.markdown"
      - "mkdocs/**/*"
    tags-ignore:
      - "*"

jobs:
  version_info:
    runs-on: ubuntu-18.04
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - id: vars
        run: |
          set -x -e

          echo "github event ref is ${{ github.ref }}"

          if [ "x${{ startsWith(github.ref, 'refs/tags/v') }}" == "xtrue" ]
          then
            echo "Trigger was a version tag - ${{ github.ref }}"
            echo ::set-output name=q_version::${GITHUB_REF#refs/tags/v}
            echo ::set-output name=is_release::true
          else
            # For testing version propagation inside the PR
            echo "Either a branch or a non-version tag - setting version to 0.0.0"
            echo ::set-output name=q_version::0.0.0
            echo ::set-output name=is_release::false
          fi

    outputs:
      q_version: ${{ steps.vars.outputs.q_version }}
      is_release: ${{ steps.vars.outputs.is_release }}

  check_version_info:
    runs-on: ubuntu-18.04
    needs: version_info
    steps:
      - name: test q_version
        run: |
          set -e -x

          echo "outputs: ${{ toJson(needs.version_info) }}"

  create-man:
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '2.6'
    - name: Create man page
      run: |
        set -x -e
        gem install ronn

        ronn doc/USAGE.markdown
        # Must be gzipped, otherwise Debian does not install it
        gzip doc/USAGE
    - name: Upload man page
      uses: actions/upload-artifact@v1.0.0
      with:
        name: q-man-page
        path: doc/USAGE.gz

  build-linux:
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Cache pyox
      uses: actions/cache@v2
      with:
        path: |
          ~/.cache/pyoxidizer
        key: ${{ runner.os }}-pyox
    - name: Install pyoxidizer
      run: |
        set -e -x

        sudo apt-get update
        sudo apt-get install -y zip sqlite3 rpm

        curl -o pyoxidizer.zip -L "https://github.com/indygreg/PyOxidizer/releases/download/pyoxidizer%2F0.17/pyoxidizer-0.17.0-linux_x86_64.zip"
        unzip pyoxidizer.zip
        chmod +x ./pyoxidizer
    - name: Create Q Executable - Linux
      run: |
        set -e -x

        ./pyoxidizer build --release

        export Q_EXECUTABLE=./build/x86_64-unknown-linux-gnu/release/install/q
        chmod 755 $Q_EXECUTABLE

        seq 1 100 | $Q_EXECUTABLE -c 1 "select sum(c1),count(*) from -" -S test.sqlite

        mkdir -p packages/linux/
        cp $Q_EXECUTABLE packages/linux/linux-q
    - name: Upload Linux Executable
      uses: actions/upload-artifact@v1.0.0
      with:
        name: linux-q
        path: packages/linux/linux-q

  test-linux:
    needs: build-linux
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Python for Testing
      uses: actions/setup-python@v2
      with:
        python-version: '3.8.12'
        architecture: 'x64'
    - name: Prepare Testing
      run: |
        set -e -x

        pip3 install -r test-requirements.txt
    - name: Download Linux Executable
      uses: actions/download-artifact@v2
      with:
        name: linux-q
    - name: Run Tests on Linux Executable
      run: |
        set -x -e

        find ./ -ls

        chmod 755 ./linux-q

        Q_EXECUTABLE=`pwd`/linux-q Q_SKIP_EXECUTABLE_VALIDATION=true ./run-tests.sh -v

  package-linux-deb:
    needs: [test-linux, create-man, version_info]
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '2.6'
    - name: Download man page
      uses: actions/download-artifact@v2
      with:
        name: q-man-page
    - name: Download Linux Executable
      uses: actions/download-artifact@v2
      with:
        name: linux-q
    - name: Build DEB Package
      run: |
        set -e -x

        mkdir -p packages/linux/

        find ./ -ls

        chmod 755 ./linux-q

        export q_version=${{ needs.version_info.outputs.q_version }}

        gem install fpm
        cp dist/fpm-config ~/.fpm
        fpm -s dir -t deb --deb-use-file-permissions -p packages/linux/q-text-as-data-${q_version}-1.x86_64.deb --version ${q_version} ./linux-q=/usr/bin/q USAGE.gz=/usr/share/man/man1/q.1.gz
    - name: Upload DEB Package
      uses: actions/upload-artifact@v1.0.0
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}-1.x86_64.deb
        path: packages/linux/q-text-as-data-${{ needs.version_info.outputs.q_version }}-1.x86_64.deb

  test-deb-packaging:
    runs-on: ubuntu-18.04
    needs: [package-linux-deb, version_info]
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Download DEB
      uses: actions/download-artifact@v2
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}-1.x86_64.deb
    - name: Install Python for Testing
      uses: actions/setup-python@v2
      with:
        python-version: '3.8.12'
        architecture: 'x64'
    - name: Prepare Testing
      run: |
        set -e -x

        pip3 install -r test-requirements.txt
    - name: Test DEB Package Installation
      run: ./dist/test-using-deb.sh ./q-text-as-data-${{ needs.version_info.outputs.q_version }}-1.x86_64.deb

  package-linux-rpm:
    needs: [test-linux, create-man, version_info]
    runs-on: ubuntu-18.04
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Ruby
      uses: ruby/setup-ruby@v1
      with:
        ruby-version: '2.6'
    - name: Download man page
      uses: actions/download-artifact@v2
      with:
        name: q-man-page
    - name: Download Linux Executable
      uses: actions/download-artifact@v2
      with:
        name: linux-q
    - name: Build RPM Package
      run: |
        set -e -x

        mkdir -p packages/linux


        chmod 755 ./linux-q

        export q_version=${{ needs.version_info.outputs.q_version }}

        gem install fpm
        cp dist/fpm-config ~/.fpm
        fpm -s dir -t rpm --rpm-use-file-permissions -p packages/linux/q-text-as-data-${q_version}.x86_64.rpm --version ${q_version} ./linux-q=/usr/bin/q USAGE.gz=/usr/share/man/man1/q.1.gz
    - name: Upload RPM Package
      uses: actions/upload-artifact@v1.0.0
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}.x86_64.rpm
        path: packages/linux/q-text-as-data-${{ needs.version_info.outputs.q_version }}.x86_64.rpm

  test-rpm-packaging:
    runs-on: ubuntu-18.04
    needs: [package-linux-rpm, version_info]
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Download RPM
      uses: actions/download-artifact@v2
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}.x86_64.rpm
    - name: Retest using RPM
      run: ./dist/test-using-rpm.sh ./q-text-as-data-${{ needs.version_info.outputs.q_version }}.x86_64.rpm

  build-mac:
    runs-on: macos-11
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Cache pyox
      uses: actions/cache@v2
      with:
        path: |
          ~/.cache/pyoxidizer
        key: ${{ runner.os }}-pyox
    - name: Install pyoxidizer
      run: |
        set -e -x

        curl -o pyoxidizer.zip -L "https://github.com/indygreg/PyOxidizer/releases/download/pyoxidizer%2F0.17/pyoxidizer-0.17.0-macos-universal.zip"
        unzip pyoxidizer.zip
        mv macos-universal/pyoxidizer ./pyoxidizer

        chmod +x ./pyoxidizer
    - name: Create Q Executable - Mac
      run: |
        set -e -x

        ./pyoxidizer build --release

        export Q_EXECUTABLE=./build/x86_64-apple-darwin/release/install/q
        chmod 755 $Q_EXECUTABLE

        seq 1 100 | $Q_EXECUTABLE -c 1 "select sum(c1),count(*) from -" -S test.sqlite

        mkdir -p packages/macos/
        cp $Q_EXECUTABLE packages/macos/macos-q
    - name: Upload MacOS Executable
      uses: actions/upload-artifact@v1.0.0
      with:
        name: macos-q
        path: packages/macos/macos-q

  test-mac:
    needs: build-mac
    runs-on: macos-11
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Python for Testing
      uses: actions/setup-python@v2
      with:
        python-version: '3.8.12'
        architecture: 'x64'
    - name: Prepare Testing
      run: |
        set -e -x

        pip3 install wheel

        pip3 install -r test-requirements.txt
    - name: Download MacOS Executable
      uses: actions/download-artifact@v2
      with:
        name: macos-q
    - name: Run Tests on MacOS Executable
      run: |
        set -e -x

        chmod 755 ./macos-q

        Q_EXECUTABLE=`pwd`/macos-q Q_SKIP_EXECUTABLE_VALIDATION=true ./run-tests.sh -v

  not-package-mac:
    # create-man is not needed, as it's generated inside the brew formula independently
    needs: [test-mac]
    runs-on: macos-11
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Not Packaging Mac
      run: |
        echo "The Homebrew package for macOS cannot be built from the source code itself, due to the package build process of Homebrew. See https://github.com/harelba/homebrew-q"

  not-test-mac-packaging:
    needs: not-package-mac
    runs-on: macos-11
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Not Testing Mac Packaging
      run: |
        echo "Homebrew packaging for macOS cannot be tested here, due to the package build process of Homebrew. See https://github.com/harelba/homebrew-q"

  build-windows:
    runs-on: windows-latest
    needs: version_info
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install MSVC build tools
      uses: ilammy/msvc-dev-cmd@v1
    - name: Install Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.8.10'
        architecture: 'x64'
    - name: Install pyoxidizer
      shell: bash
      run: |
        set -x -e

        python3 -V
        pip3 -V

        pip3 install pyoxidizer
    - name: Create Q Executable - Windows
      shell: bash
      run: |
        set -e -x

        pyoxidizer build --release --var Q_VERSION ${{ needs.version_info.outputs.q_version }}

        export Q_EXECUTABLE=./build/x86_64-pc-windows-msvc/release/install/q
        chmod 755 $Q_EXECUTABLE

        seq 1 100 | $Q_EXECUTABLE -c 1 "select sum(c1),count(*) from -" -S test.sqlite

        mkdir -p packages/windows/
        cp $Q_EXECUTABLE packages/windows/win-q.exe

        find ./ -ls
    - name: Upload Windows Executable
      uses: actions/upload-artifact@v1.0.0
      with:
        name: win-q.exe
        path: packages/windows/win-q.exe

  not-really-test-windows:
    needs: build-windows
    runs-on: windows-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install Python for Testing
      uses: actions/setup-python@v2
      with:
        python-version: '3.8'
        architecture: 'x64'
    - name: Download Windows Executable
      uses: actions/download-artifact@v2
      with:
        name: win-q.exe
    - name: Not-Really-Test Windows
      shell: bash
      continue-on-error: true
      run: |
        echo "Tests are not compatible with Windows (path separators, tmp folder names etc.). Only a sanity check will be run"

        chmod +x ./win-q.exe

        seq 1 10000 | ./win-q.exe -c 1 "select sum(c1),count(*) from -" -S some-db.sqlite

  package-windows:
    needs: [create-man, not-really-test-windows, version_info]
    runs-on: windows-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Install MSVC build tools
      uses: ilammy/msvc-dev-cmd@v1
    - name: Install Python
      uses: actions/setup-python@v2
      with:
        python-version: '3.8.10'
        architecture: 'x64'
    - name: Install pyoxidizer
      shell: bash
      run: |
        set -x -e

        python3 -V
        pip3 -V

        pip3 install pyoxidizer
    - name: Create Q MSI - Windows
      shell: bash
      run: |
        set -e -x

        pyoxidizer build --release msi_installer --var Q_VERSION ${{ needs.version_info.outputs.q_version }}

        export Q_MSI=./build/x86_64-pc-windows-msvc/release/msi_installer/q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi
        chmod 755 $Q_MSI

        mkdir -p packages/windows/
        cp $Q_MSI packages/windows/q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi

    - name: Upload Windows MSI
      uses: actions/upload-artifact@v1.0.0
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi
        path: packages/windows/q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi

  test-windows-packaging:
    needs: [package-windows, version_info]
    runs-on: windows-latest
    steps:
    - name: Checkout
      uses: actions/checkout@v2
    - name: Download Windows Package
      uses: actions/download-artifact@v2
      with:
        name: q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi
    - name: Test Install of MSI
      continue-on-error: true
      shell: powershell
      run: |
        $process = Start-Process msiexec.exe -ArgumentList "/i q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi -l* msi-install.log /norestart /quiet" -PassThru -Wait
        $process.ExitCode
        gc msi-install.log

        exit $process.ExitCode
    - name: Test Uninstall of MSI
      continue-on-error: true
      shell: powershell
      run: |
        $process = Start-Process msiexec.exe -ArgumentList "/x q-text-as-data-${{ needs.version_info.outputs.q_version }}.msi /norestart /quiet" -PassThru -Wait
        $process.ExitCode
        exit $process.ExitCode

  perform-prerelease:
    # We'd like artifacts to be uploaded regardless of whether the tests succeeded,
    # which is why this job does not depend on the test-X-packaging jobs
    needs: [package-linux-deb, package-linux-rpm, not-package-mac, package-windows, version_info]
    runs-on: ubuntu-latest
    if: needs.version_info.outputs.is_release == 'false'
    steps:
    - name: Download All Artifacts
      uses: actions/download-artifact@v2
      with:
        path: artifacts/
    - name: Timestamp pre-release
      run: |
        set -e -x

        echo "Workflow finished at $(date)" >> artifacts/workflow-finish-time.txt
    - name: Create pre-release
      uses: "marvinpinto/action-automatic-releases@v1.2.1"
      with:
        repo_token: "${{ secrets.GITHUB_TOKEN }}"
        automatic_release_tag: "latest"
        prerelease: true
        title: "Next Release Development Build"
        files: |
          artifacts/**/*

  perform-release:
    needs: [not-test-mac-packaging, test-deb-packaging, test-rpm-packaging, test-windows-packaging, version_info]
    runs-on: ubuntu-latest
    if: needs.version_info.outputs.is_release == 'true'
    steps:
    - name: Download All Artifacts
      uses: actions/download-artifact@v2
      with:
        path: artifacts/
    - uses: "marvinpinto/action-automatic-releases@v1.2.1"
      with:
        repo_token: "${{ secrets.GITHUB_TOKEN }}"
        prerelease: false
        files: |
          artifacts/**/*


================================================
FILE: .gitignore
================================================
build
q.spec
q.1
*.pyc
.vagrant
rpm_build_area
*.deb
setup.exe
win_output
win_build
packages
.idea/
dist/windows/
generated-site/
benchmark_data.tar.gz
_benchmark_data/
q.egg-info/
.pytest_cache/
*.qsql
htmlcov/
*.sqlite
*.tar.gz
.coverage
.DS_Store
*.egg


================================================
FILE: LICENSE
================================================
                    GNU GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The GNU General Public License is a free, copyleft license for
software and other kinds of works.

  The licenses for most software and other practical works are designed
to take away your freedom to share and change the works.  By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.  We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors.  You can apply it to
your programs, too.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

  To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights.  Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.

  For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received.  You must make sure that they, too, receive
or can get the source code.  And you must show them these terms so they
know their rights.

  Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.

  For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software.  For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.

  Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so.  This is fundamentally incompatible with the aim of
protecting users' freedom to change the software.  The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable.  Therefore, we
have designed this version of the GPL to prohibit the practice for those
products.  If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.

  Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary.  To prevent this, the GPL assures that
patents cannot be used to render the program non-free.

  The precise terms and conditions for copying, distribution and
modification follow.

                       TERMS AND CONDITIONS

  0. Definitions.

  "This License" refers to version 3 of the GNU General Public License.

  "Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

  "The Program" refers to any copyrightable work licensed under this
License.  Each licensee is addressed as "you".  "Licensees" and
"recipients" may be individuals or organizations.

  To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy.  The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

  A "covered work" means either the unmodified Program or a work based
on the Program.

  To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy.  Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

  To "convey" a work means any kind of propagation that enables other
parties to make or receive copies.  Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

  An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License.  If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

  1. Source Code.

  The "source code" for a work means the preferred form of the work
for making modifications to it.  "Object code" means any non-source
form of a work.

  A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

  The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form.  A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

  The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities.  However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work.  For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

  The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

  The Corresponding Source for a work in source code form is that
same work.

  2. Basic Permissions.

  All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met.  This License explicitly affirms your unlimited
permission to run the unmodified Program.  The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work.  This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

  You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force.  You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright.  Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

  Conveying under any other circumstances is permitted solely under
the conditions stated below.  Sublicensing is not allowed; section 10
makes it unnecessary.

  3. Protecting Users' Legal Rights From Anti-Circumvention Law.

  No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

  When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

  4. Conveying Verbatim Copies.

  You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

  You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

  You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7.  This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy.  This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged.  This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

  A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

  6. Conveying Non-Source Forms.

  You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

    a) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by the
    Corresponding Source fixed on a durable physical medium
    customarily used for software interchange.

    b) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by a
    written offer, valid for at least three years and valid for as
    long as you offer spare parts or customer support for that product
    model, to give anyone who possesses the object code either (1) a
    copy of the Corresponding Source for all the software in the
    product that is covered by this License, on a durable physical
    medium customarily used for software interchange, for a price no
    more than your reasonable cost of physically performing this
    conveying of source, or (2) access to copy the
    Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the
    written offer to provide the Corresponding Source.  This
    alternative is allowed only occasionally and noncommercially, and
    only if you received the object code with such an offer, in accord
    with subsection 6b.

    d) Convey the object code by offering access from a designated
    place (gratis or for a charge), and offer equivalent access to the
    Corresponding Source in the same way through the same place at no
    further charge.  You need not require recipients to copy the
    Corresponding Source along with the object code.  If the place to
    copy the object code is a network server, the Corresponding Source
    may be on a different server (operated by you or a third party)
    that supports equivalent copying facilities, provided you maintain
    clear directions next to the object code saying where to find the
    Corresponding Source.  Regardless of what server hosts the
    Corresponding Source, you remain obligated to ensure that it is
    available for as long as needed to satisfy these requirements.

    e) Convey the object code using peer-to-peer transmission, provided
    you inform other peers where the object code and Corresponding
    Source of the work are being offered to the general public at no
    charge under subsection 6d.

  A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

  A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling.  In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage.  For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product.  A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

  "Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source.  The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

  If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information.  But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

  The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed.  Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

  Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

  7. Additional Terms.

  "Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law.  If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

  When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it.  (Additional permissions may be written to require their own
removal in certain cases when you modify the work.)  You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

  Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the
    terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or
    author attributions in that material or in the Appropriate Legal
    Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or
    requiring that modified versions of such material be marked in
    reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or
    authors of the material; or

    e) Declining to grant rights under trademark law for use of some
    trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that
    material by anyone who conveys the material (or modified versions of
    it) with contractual assumptions of liability to the recipient, for
    any liability that these contractual assumptions directly impose on
    those licensors and authors.

  All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10.  If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term.  If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

  If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

  Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

  8. Termination.

  You may not propagate or modify a covered work except as expressly
provided under this License.  Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

  However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

  Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

  Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License.  If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

  9. Acceptance Not Required for Having Copies.

  You are not required to accept this License in order to receive or
run a copy of the Program.  Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance.  However,
nothing other than this License grants you permission to propagate or
modify any covered work.  These actions infringe copyright if you do
not accept this License.  Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

  10. Automatic Licensing of Downstream Recipients.

  Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License.  You are not responsible
for enforcing compliance by third parties with this License.

  An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations.  If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

  You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License.  For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

  11. Patents.

  A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based.  The
work thus licensed is called the contributor's "contributor version".

  A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version.  For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

  Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

  In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement).  To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

  If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients.  "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

  If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

  A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License.  You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

  Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

  12. No Surrender of Others' Freedom.

  If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all.  For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

  13. Use with the GNU Affero General Public License.

  Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work.  The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.

  14. Revised Versions of this License.

  The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time.  Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

  Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.

  If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

  Later license versions may give you additional or different
permissions.  However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

  15. Disclaimer of Warranty.

  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. Limitation of Liability.

  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

  17. Interpretation of Sections 15 and 16.

  If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    {one line to give the program's name and a brief idea of what it does.}
    Copyright (C) {year}  {name of author}

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

  If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:

    {project}  Copyright (C) {year}  {fullname}
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it
    under certain conditions; type `show c' for details.

The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License.  Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".

  You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<http://www.gnu.org/licenses/>.

  The GNU General Public License does not permit incorporating your program
into proprietary programs.  If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library.  If this is what you want to do, use the GNU Lesser General
Public License instead of this License.  But first, please read
<http://www.gnu.org/philosophy/why-not-lgpl.html>.


================================================
FILE: QSQL-NOTES.md
================================================

## Major changes and additions in the new `3.x` version
This is the list of new/changed functionality in this version. These are large changes, so please make sure to read the details if you're already using q.

* **Automatic Immutable Caching** - Automatic caching of data files (into `<my-csv-filename>.qsql` files), with huge speedups for medium/large files. Enabled through `-C readwrite` or `-C read`
* **Direct querying of standard sqlite databases** - Just use it as a table name in the query. Format is `select ... from <sqlitedb_filename>:::<table_name>`, or just `<sqlitedb_filename>` if the database contains only one table. Multiple separate sqlite databases are fully supported in the same query.
* **Direct querying of the `qsql` cache files** - The user can query directly from the `qsql` files, removing the need for the original files. Just use `select ... from <my-csv-filename>.qsql`. Please wait until the non-beta version is out before thinking about deleting any of your original files...
* **Revamped `.qrc` mechanism** - allows opting-in to caching without specifying it in every query. By default, caching is **disabled**, for backward compatibility and for finding usability issues.
* **Save-to-db is now reusable for queries** - `--save-db-to-disk` option (`-S`) has been enhanced to match the new capabilities. You can query the resulting file directly through q, using the method mentioned above (it's just a standard sqlite database).
* **Only python3 is supported from now on** - Shouldn't be an issue, since q is a self-contained binary executable which has its own python embedded in it. Internally, q is now packaged with Python 3.8. After everything cools down, I'll probably bump this to 3.9/3.10.
* **Minimal Linux Version Bumped** - Works with CentOS 8, Ubuntu 18.04+, Debian 10+. Currently only for x86_64. Depends on glibc version 2.25+. Haven't tested it on other architectures. Releasing builds for other architectures will be possible later on.
* **Completely revamped binary packaging** - Using [pyoxidizer](https://github.com/indygreg/PyOxidizer)

The following sections provide the details of each of the new functionalities in this major version.

## Automatic caching of data files
q speeds up subsequent reads from the same file by several orders of magnitude by automatically creating an immutable cache file for each tabular text file.

For example, reading a 0.9GB file with 1M rows and 100 columns without caching takes ~50 seconds. When the cache exists, querying the same file takes around 1-2 seconds. The cache can be used to perform any query, not just the original query that was used for creating the cache.

When caching is enabled, the cache is created on the first read of a file, and used automatically when reading it in other queries. A separate cache is being created for each file that is being used, allowing reuse in multiple use-cases. For example, if two csv files each have their own cache file from previous queries, then running a query that JOINs these two files would use the caches as well (without loading the data into memory), speeding it up considerably.

The tradeoff for using cache files is disk space - A new file with the postfix `.qsql` is created and automatically detected and used in queries as needed. This file is essentially a standard sqlite file (with some additional metadata tables), and can be used directly by any standard sqlite tool later on.
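
Since the cache is described as a standard sqlite file, any sqlite client should be able to open it. The following is a minimal sketch using Python's built-in `sqlite3` module. The file it inspects is a stand-in created on the spot (the table name `demo_data` is made up for the illustration); a real cache written by q would also contain q's own metadata tables.

```python
import os
import sqlite3
import tempfile


def list_tables(sqlite_path):
    """Return the names of all tables stored in an sqlite file."""
    conn = sqlite3.connect(sqlite_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Stand-in for a cache file: a plain sqlite database with a .qsql suffix.
# This is NOT produced by q, it only demonstrates the inspection step.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "myfile.csv.qsql")
conn = sqlite3.connect(demo_path)
conn.execute("CREATE TABLE demo_data (c1 TEXT, c2 INTEGER)")
conn.commit()
conn.close()

tables = list_tables(demo_path)
print(tables)
```

Pointing `list_tables` at a cache file created by q should list the data table along with the metadata tables mentioned above.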

For backward compatibility, the caching option is not turned on by default. You'd need to use the new `-C <mode>` to determine the caching mode. Available options are as follows:
* `none` - The default,  provides the original q's behaviour without caching
* `read` - Only reads cache files if they exist, but doesn't create any new ones
* `readwrite` - Uses cache files if they exist, or creates new ones if they don't. Writing new cache files doesn't interfere with the actual run of the query, so this option can be used in order to dynamically create the cache files if they don't exist
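
The three modes can be summarised as a small decision table. The sketch below is only an illustration of the rules listed above, not q's actual code; the function name `plan_cache_use` and its return shape are made up for the example.

```python
def plan_cache_use(mode, cache_exists):
    """Return (read_cache, write_cache) for a caching mode.

    Hypothetical helper that mirrors the documented behaviour of
    the `none`, `read` and `readwrite` modes.
    """
    if mode not in ("none", "read", "readwrite"):
        raise ValueError("unknown caching mode: %s" % mode)
    # An existing cache is used in both `read` and `readwrite` modes.
    read_cache = mode in ("read", "readwrite") and cache_exists
    # A new cache is only written in `readwrite` mode, when none exists yet.
    write_cache = mode == "readwrite" and not cache_exists
    return read_cache, write_cache


for mode in ("none", "read", "readwrite"):
    for cache_exists in (False, True):
        print(mode, cache_exists, plan_cache_use(mode, cache_exists))
```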

Content signatures are stored in the caches, allowing q to detect a state where the original file has been modified after the cache has been created. q will issue an error if this happens. For now, just delete the `.qsql` file in order to recreate the cache. In the future, another `-C` option would be added to automatically recreate the updated cache in such a case. Notice that the content signature contains various q flags which affect parsing, so make sure to use the same parameters to q when performing the queries, otherwise q will issue an error.
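
To illustrate the idea of a signature that covers both the file and the parsing flags, here is a minimal sketch. It is an assumption-laden illustration, not q's implementation: the function name `build_signature` and the fields it records (`file_size`, `last_modified`, `delimiter`, `skip_header`) are hypothetical, chosen only to show why a changed file or a changed flag invalidates a cache.

```python
import os
import tempfile


def build_signature(path, delimiter, skip_header):
    """Collect facts that must stay unchanged for a cache to remain valid.

    Hypothetical field names; the real signature layout is not shown here.
    """
    stat = os.stat(path)
    return {
        "file_size": stat.st_size,
        "last_modified": stat.st_mtime_ns,
        "delimiter": delimiter,
        "skip_header": skip_header,
    }


demo_dir = tempfile.mkdtemp()
data_path = os.path.join(demo_dir, "myfile.csv")
with open(data_path, "w") as handle:
    handle.write("a,b\n1,2\n")

# Signature recorded when the (imaginary) cache was created.
stored = build_signature(data_path, delimiter=",", skip_header=True)

# Same file and same flags: the signatures match.
same_flags_ok = stored == build_signature(data_path, ",", True)
# Same file but a different delimiter: the signatures differ.
other_flags_ok = stored == build_signature(data_path, "\t", True)

# Modifying the file changes its size, so the signatures differ.
with open(data_path, "a") as handle:
    handle.write("3,4\n")
after_edit_ok = stored == build_signature(data_path, ",", True)

print(same_flags_ok, other_flags_ok, after_edit_ok)
```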

Notice that when running with `-A`, the cache is not written, even when `-C` is set to `readwrite`. This is due to the fact that `-A` does not really read the entire content of the files. For now, if you'd like to just prepare the cache without running the actual query, you can run it with a `select 1` query or something, although in terms of speed it will mostly not matter. If there's demand for adding an explicit `prepare caches only` option, I'll consider adding it.

## Revamped `.qrc` mechanism
Adding `-C <mode>` for each query can be cumbersome at some point, so the `.qrc` file has been revamped for easy addition of default parameters. 

For example, if you want the caching behaviour to be `read` all the time, then just add a `~/.qrc` file, and set the following in it:
```
[options]
caching_mode=read
```

All other flags and parameters to q can be controlled by the `.qrc` file. To see the proper names for each parameter, run `q --dump-defaults` and it will dump a default `.qrc` file that contains all parameters to `stdout`.
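
The `.qrc` snippet above is INI-style text, so it can be read with Python's standard `configparser` module. The sketch below only demonstrates the file format; it does not claim to be how q itself loads the file. The fallback value `none` reflects the documented default caching mode.

```python
import configparser

# Same content as the ~/.qrc example above, inlined so the sketch is self-contained.
QRC_TEXT = """
[options]
caching_mode=read
"""

parser = configparser.ConfigParser()
parser.read_string(QRC_TEXT)

# Fall back to the documented default when the option is missing.
caching_mode = parser.get("options", "caching_mode", fallback="none")
print(caching_mode)
```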

## Direct querying of standard sqlite databases
q now supports direct querying of standard sqlite databases. The syntax for accessing a table inside an sqlite database is `<sqlite-filename>:::<table_name>`. A query can contain any mix of sqlite files, qsql files or regular delimited files.

For example, this command joins two tables from two separate sqlite databases:
```
$ q "select count(*) from mydatabase1.sqlite:::mytable1 a left join mydatabase2.sqlite:::mytable2 b on (a.c1 = b.c1)"
```

Running queries on sqlite databases does not usually entail loading the data into memory. Databases are attached to a virtual database and queried directly from disk. This means that querying speed is practically identical to standard sqlite access. This is also true when multiple sqlite databases are used in a single query. The same mechanism is being used by q whenever it uses a qsql file (either directly or as a cache of a delimited file).

sqlite itself does have a pre-compiled limit on the number of databases that can be attached simultaneously. If this limit is reached, then q will attach as many databases as possible, and then continue processing by loading additional tables into memory in order to execute the query. The standard limit in sqlite3 (unless compiled specifically with another limit) is 10 databases. This allows q to access as many as 8 user databases without having to load any data into memory (2 databases are always used for q's internal logic). Using more databases in a single query than this pre-compiled sqlite limit would slow things down, since some of the data would go into memory, but the query should still provide correct results.
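
The attach mechanism described above can be tried directly with Python's `sqlite3` module. This is a minimal sketch of plain sqlite `ATTACH DATABASE` usage, not q's internal code: the two database files are created on the spot, and the aliases `first_db` and `second_db` are arbitrary names picked for the example.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
db1 = os.path.join(workdir, "mydatabase1.sqlite")
db2 = os.path.join(workdir, "mydatabase2.sqlite")

# Create two separate on-disk databases, each with one small table.
for path, table, values in (
    (db1, "mytable1", [(1,), (2,), (3,)]),
    (db2, "mytable2", [(2,), (3,), (4,)]),
):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE %s (c1 INTEGER)" % table)
    conn.executemany("INSERT INTO %s VALUES (?)" % table, values)
    conn.commit()
    conn.close()

# Attach both files to one connection and join across them.
# The table data stays on disk; only the connection itself is in memory.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ? AS first_db", (db1,))
conn.execute("ATTACH DATABASE ? AS second_db", (db2,))
(row_count,) = conn.execute(
    "SELECT count(*) FROM first_db.mytable1 a "
    "LEFT JOIN second_db.mytable2 b ON (a.c1 = b.c1)"
).fetchone()
conn.close()

print(row_count)
```

The left join keeps every row of `mytable1`, so the count equals the number of rows in that table.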

Whenever the sqlite database file contains only one table, the table name part can be omitted, and the user can specify only the sqlite-filename as the table name. For example, querying an sqlite database `mydatabase.sqlite` that only has one table `mytable` is possible with `q "SELECT ... FROM mydatabase.sqlite"`. There's no need to specify the table name in this case.

Since `.qsql` files are also standard sqlite files, they can be queried directly as well. This allows the user to actually delete the original CSV file and use the caches as if they were the original files. For example:

```
$ q "select count(*) from myfile.csv.qsql"
```

Notice that there's no need to write the `:::<table-name>` as part of the table name, since `qsql` files that are created as caches contain only one table (i.e. the table matching the original file).

Running a query that uses an sqlite/qsql database without specifying a table name will fail if there is more than one table in the database, showing the list of existing tables. The same list is shown when the specified table name does not exist. This can be used in order to detect which tables exist in the database without resorting to other tools. For example:
```
$ q "select * from chinook.db:::blah"
Table blah could not be found in sqlite file chinook.db . Existing table names: albums,sqlite_sequence,artists,customers,employees,genres,invoices,invoice_items,media_types,playlists,playlist_track,tracks,sqlite_stat1
```
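
The table-name rules above (the `:::` separator, the single-table shortcut and the error listing existing tables) can be sketched in a few lines. This is an illustration under simplifying assumptions, not q's implementation: the helper names are made up, and the sketch does not model the metadata tables that real `qsql` files contain.

```python
import os
import sqlite3
import tempfile


def split_table_reference(reference):
    """Split '<file>:::<table>' into (file, table); table is None if absent."""
    filename, separator, table = reference.partition(":::")
    return filename, (table if separator else None)


def resolve_table(reference):
    """Return (file, table), inferring the table when the file has only one."""
    filename, table = split_table_reference(reference)
    conn = sqlite3.connect(filename)
    try:
        existing = [name for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
    finally:
        conn.close()
    names = ",".join(existing)
    if table is None:
        if len(existing) != 1:
            raise ValueError("Table name is required. Existing table names: " + names)
        return filename, existing[0]
    if table not in existing:
        raise ValueError(
            "Table %s could not be found. Existing table names: %s" % (table, names))
    return filename, table


# Demo database with a single table.
demo_dir = tempfile.mkdtemp()
single = os.path.join(demo_dir, "mydatabase.sqlite")
conn = sqlite3.connect(single)
conn.execute("CREATE TABLE mytable (c1 INTEGER)")
conn.commit()
conn.close()

resolved = resolve_table(single)  # table name inferred
try:
    resolve_table(single + ":::blah")
    error_message = None
except ValueError as error:
    error_message = str(error)

print(resolved[1])
print(error_message)
```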

## Storing source data into a disk database
The `-S` option (`--save-db-to-disk`) has been modified to match the new capabilities. It works with all types of input tables/files, and writes the output database as a standard sqlite database. I've considered making the output a multi-table `qsql` file (e.g. with the additional metadata that q uses), but some things still need to be ironed out in order to make these qsql files work seamlessly with all other aspects of q. This will probably happen in the next version.  

This database can be accessed directly by q later on, by providing `<sqlite-database>:::<table-name>` as the table name in the query. The table names that are chosen match the original file names, but go through the following process:
* The names are normalised in order to be compatible with sqlite restrictions (e.g. `x.csv` is normalised to `x_dot_csv`)
* Duplicate table names are de-duplicated by adding `_<sequence-number>` to their names (e.g. two different csv files in separate folders which both have the name `companies` will be written to the file as `companies` and `companies_2`)

This table-name normalisation happens also inside `.qsql` cache files, but in most cases there won't be any need to know these table names, since q automatically detects table names for databases which have a single-table.
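
The two naming steps can be sketched as follows. This is a rough illustration based only on the two examples given above, not q's actual normalisation code: the function names are hypothetical, only the `.` replacement is modelled, and any other characters that q may rewrite are left untouched.

```python
def normalise_table_name(filename):
    """Replace dots so the name is usable as an sqlite table name.

    Only the documented `x.csv` -> `x_dot_csv` case is covered here.
    """
    return filename.replace(".", "_dot_")


def assign_table_names(filenames):
    """Normalise each name and add `_<sequence-number>` to repeated names."""
    seen = {}
    result = []
    for filename in filenames:
        base = normalise_table_name(filename)
        seen[base] = seen.get(base, 0) + 1
        count = seen[base]
        # The first occurrence keeps the plain name, later ones get a suffix.
        result.append(base if count == 1 else "%s_%d" % (base, count))
    return result


print(assign_table_names(["x.csv", "companies", "companies"]))
```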

## File-concatenation and wildcard-matching features - Breaking change
File concatenation using '+' has been removed in this version, which is a breaking change.

This was a controversial feature anyway, and can be done using standard SQL relatively easily. It also complicated the caching implementation significantly, and it seemed that it was not worth it. If there's demand for bringing this feature back, please write to me and I'll consider re-adding it. 

If you have a case of using file concatenation, you can use the following SQL instead:
```
# Instead of writing
$ q "select * from myfile1+myfile2"
# Use the following:
$ q "select * from (select * from myfile1 UNION ALL select * from myfile2)"
```

This will provide the same results, but the error checking is a bit less robust, so be mindful of whether you're performing the right query on the right files.
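
Note that `UNION ALL`, not plain `UNION`, is what preserves concatenation semantics, since `UNION` removes duplicate rows. A small sketch with Python's standard `sqlite3` module shows the difference; the two in-memory tables are illustrative stand-ins for `myfile1` and `myfile2`, not real q inputs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two illustrative tables standing in for myfile1 and myfile2
conn.execute("create table myfile1 (c1 text)")
conn.execute("create table myfile2 (c1 text)")
conn.executemany("insert into myfile1 values (?)", [("a",), ("b",)])
conn.executemany("insert into myfile2 values (?)", [("b",), ("c",)])

# UNION ALL keeps every row from both tables, like concatenating the files
union_all_rows = conn.execute(
    "select * from (select * from myfile1 UNION ALL select * from myfile2)"
).fetchall()

# Plain UNION drops duplicate rows, which is not what concatenation would do
union_rows = conn.execute(
    "select * from (select * from myfile1 UNION select * from myfile2)"
).fetchall()
conn.close()

print(len(union_all_rows), len(union_rows))
```

Both sides of a `UNION ALL` must return the same number of columns, so files with different layouts need explicit column lists instead of `select *`.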

Conceptually, this is similar to wildcard matching (e.g. `select * from myfolder/myfile*`), but I have decided to leave wildcard matching intact, since it seems to be a more common use case. Cache creation and use are limited for now when using wildcards. If you want to make sure that caches are being used, use the `UNION ALL` method described above for file concatenation.

After this version is fully stabilised, I'll make more efforts to consolidate wildcard (and perhaps concatenation) to fully utilise caching seamlessly.

## Code runs only on python 3
Removed the dual py2/py3 support. Since q is packaged as a self-contained executable that bundles python 3.8 itself, dual support is not needed anymore.

Users who for some reason still run q's main source code file directly with python 2 need to stay with the 2.0.19 release. In a future version, q's code structure is going to change significantly anyway in order to become a standard python module, so using the main source code file directly will no longer be possible.

If you are such a user, and this decision hurts you considerably, please ping me.


================================================
FILE: README.markdown
================================================
[![Build and Package](https://github.com/harelba/q/workflows/BuildAndPackage/badge.svg?branch=master)](https://github.com/harelba/q/actions?query=branch%3Amaster)

# q - Text as Data
q's purpose is to bring SQL expressive power to the Linux command line and to provide easy access to text as actual data.

q allows the following:

* Performing SQL-like statements directly on tabular text data, auto-caching the data in order to accelerate additional querying on the same file. 
* Performing SQL statements directly on multi-file sqlite3 databases, without having to merge them or load them into memory

The following table shows the impact of using caching:

|    Rows   | Columns | File Size | Query time without caching | Query time with caching | Speed Improvement |
|:---------:|:-------:|:---------:|:--------------------------:|:-----------------------:|:-----------------:|
| 5,000,000 |   100   |   4.8GB   |    4 minutes, 47 seconds   |       1.92 seconds      |        x149       |
| 1,000,000 |   100   |   983MB   |        50.9 seconds        |      0.461 seconds      |        x110       |
| 1,000,000 |    50   |   477MB   |        27.1 seconds        |      0.272 seconds      |        x99        |
|  100,000  |   100   |    99MB   |         5.2 seconds        |      0.141 seconds      |        x36        |
|  100,000  |    50   |    48MB   |         2.7 seconds        |      0.105 seconds      |        x25        |

Note that in the current version, caching is **not enabled** by default, since the caches take disk space. Use `-C readwrite` or `-C read` to enable it for a query, or add `caching_mode` to `.qrc` to set a new default.
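
As a rough illustration of the `.qrc` route, the sketch below parses a sample configuration with Python's standard `configparser`. It assumes that `caching_mode` lives in the `[options]` section like the other `.qrc` options and accepts the same values as `-C`; the sample content is hypothetical and this is not q's actual loading code:

```python
import configparser

# Illustrative .qrc content; assumes caching_mode accepts the same values as -C
qrc_text = """
[options]
caching_mode: readwrite
"""

parser = configparser.ConfigParser()
parser.read_string(qrc_text)

# Read the option back the way any ini reader would
caching_mode = parser.get("options", "caching_mode")
print(caching_mode)
```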
 
q's web site is [https://harelba.github.io/q/](https://harelba.github.io/q/) or [https://q.textasdata.wiki](https://q.textasdata.wiki). It contains everything you need to download and use q immediately.


## Usage Examples
q treats ordinary files as database tables, and supports all SQL constructs, such as `WHERE`, `GROUP BY`, `JOIN`s, etc. It supports automatic column name and type detection, and provides full support for multiple character encodings.

Here are some example commands to get the idea:

```bash
$ q "SELECT COUNT(*) FROM ./clicks_file.csv WHERE c3 > 32.3"

$ ps -ef | q -H "SELECT UID, COUNT(*) cnt FROM - GROUP BY UID ORDER BY cnt DESC LIMIT 3"

$ q "select count(*) from some_db.sqlite3:::albums a left join another_db.sqlite3:::tracks t on (a.album_id = t.album_id)"
```

Detailed examples are [here](https://harelba.github.io/q/#examples).

## Installation
**New Major Version `3.1.6` is out with a lot of significant additions.**

Instructions for all OSs are [here](https://harelba.github.io/q/#installation).

The previous version `2.0.19` can still be downloaded from [here](https://github.com/harelba/q/releases/tag/2.0.19).

## Contact
Any feedback/suggestions/complaints regarding this tool would be much appreciated. Contributions are most welcome as well, of course.

Linkedin: [Harel Ben Attia](https://www.linkedin.com/in/harelba/)

Twitter: [@harelba](https://twitter.com/harelba)

Email: [harelba@gmail.com](mailto:harelba@gmail.com)

q on twitter: [#qtextasdata](https://twitter.com/hashtag/qtextasdata?src=hashtag_click)

Patreon: [harelba](https://www.patreon.com/harelba) - All the money received is donated to the [Center for the Prevention and Treatment of Domestic Violence](https://www.gov.il/he/departments/bureaus/molsa-almab-ramla) in my hometown - Ramla, Israel.




================================================
FILE: benchmark-config.sh
================================================
#!/bin/bash

BENCHMARK_PYTHON_VERSIONS=(3.8.5)


================================================
FILE: bin/.qrc
================================================
#
# q options ini file. Put either in your home folder as .qrc or in the working directory 
#   (both will be merged in that order)
#
# All options should reside in an [options] section
#
# Available options:
# * delimiter - escaped string (e.g. use \t for tab or \x20 for space)
# * output_delimiter - escaped string (e.g. use \t for tab or \x20 for space)
# * gzipped - boolean True or False
# * beautify - boolean True or False
# * skip_header - boolean True or False
# * formatting - regular string - post-query formatting - see docs for details
# * encoding - regular string - required encoding.
#
# All options have a matching command line option. See --help for details on defaults

[options]
#delimiter: \t
#output_delimiter: \t
#gzipped: False
#beautify: True
#skip_header: False
#formatting: 1=%4.3f,2=%4.3f
#encoding: UTF-8


================================================
FILE: bin/__init__.py
================================================
#!/usr/bin/env python



================================================
FILE: bin/q.bat
================================================
@echo off

setlocal
if exist "%~dp0..\python.exe" ( "%~dp0..\python" "%~dp0q" %* ) else ( python "%~dp0q" %* )
endlocal


================================================
FILE: bin/q.py
================================================
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

#   Copyright (C) 2012-2021 Harel Ben-Attia
#
#   This program is free software; you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation; either version 3, or (at your option)
#   any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details (doc/LICENSE contains
#   a copy of it)
#
#
# Name      : q (With respect to The Q Continuum)
# Author    : Harel Ben-Attia - harelba@gmail.com, harelba @ github, @harelba on twitter
#
#
# q allows performing SQL-like statements on tabular text data.
#
# Its purpose is to bring SQL expressive power to manipulating text data using the Linux command line.
#
# Full Documentation and details in https://harelba.github.io/q/
#
# Run with --help for command line details
#
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from collections import OrderedDict
from sqlite3.dbapi2 import OperationalError
from uuid import uuid4

q_version = '3.1.6'

#__all__ = [ 'QTextAsData' ]

import os
import sys
import sqlite3
import glob
from argparse import ArgumentParser
import codecs
import locale
import time
import re
from six.moves import configparser, range, filter
import traceback
import csv
import uuid
import math
import six
import io
import json
import datetime
import hashlib

if six.PY2:
    assert False, 'Python 2 is no longer supported by q'

long = int
unicode = six.text_type

DEBUG = bool(os.environ.get('Q_DEBUG', None)) or '-V' in sys.argv
SQL_DEBUG = False

if DEBUG:
    def xprint(*args,**kwargs):
        print(datetime.datetime.utcnow().isoformat()," DEBUG ",*args,file=sys.stderr,**kwargs)

    def iprint(*args,**kwargs):
        print(datetime.datetime.utcnow().isoformat()," INFO ",*args,file=sys.stderr,**kwargs)

    def sqlprint(*args,**kwargs):
        pass
else:
    def xprint(*args,**kwargs): pass
    def iprint(*args,**kwargs): pass
    def sqlprint(*args,**kwargs): pass

if SQL_DEBUG:
    def sqlprint(*args,**kwargs):
        print(datetime.datetime.utcnow().isoformat(), " SQL ", *args, file=sys.stderr, **kwargs)


def get_stdout_encoding(encoding_override=None):
    if encoding_override is not None and encoding_override != 'none':
        return encoding_override

    if sys.stdout.isatty():
        return sys.stdout.encoding
    else:
        return locale.getpreferredencoding()

SHOW_SQL = False

sha_algorithms = {
    1 : hashlib.sha1,
    224: hashlib.sha224,
    256: hashlib.sha256,
    384: hashlib.sha384,
    512: hashlib.sha512
}

def sha(data,algorithm,encoding):
    try:
        f = sha_algorithms[algorithm]
        return f(six.text_type(data).encode(encoding)).hexdigest()
    except Exception as e:
        print(e)

# For backward compatibility only (doesn't handle encoding well enough)
def sha1(data):
    return hashlib.sha1(six.text_type(data).encode('utf-8')).hexdigest()

# TODO Add caching of compiled regexps - Will be added after benchmarking capability is baked in
def regexp(regular_expression, data):
    if data is not None:
        if not isinstance(data, str) and not isinstance(data, unicode):
            data = str(data)
        return re.search(regular_expression, data) is not None
    else:
        return False

def regexp_extract(regular_expression, data,group_number):
    if data is not None:
        if not isinstance(data, str) and not isinstance(data, unicode):
            data = str(data)
        m = re.search(regular_expression, data)
        if m is not None:
            return m.groups()[group_number]
    else:
        return None

def md5(data,encoding):
    m = hashlib.md5()
    m.update(six.text_type(data).encode(encoding))
    return m.hexdigest()

def sqrt(data):
    return math.sqrt(data)

def power(data,p):
    return data**p

def file_ext(data):
    if data is None:
        return None

    return os.path.splitext(data)[1]

def file_folder(data):
    if data is None:
        return None
    return os.path.split(data)[0]

def file_basename(data):
    if data is None:
        return None
    return os.path.split(data)[1]
    
def file_basename_no_ext(data):
    if data is None:
        return None

    return os.path.split(os.path.splitext(data)[0])[-1]

def percentile(l, p):
    # TODO Alpha implementation, need to provide multiple interpolation methods, and add tests
    if not l:
        return None
    k = p*(len(l) - 1)
    f = math.floor(k)
    c = math.ceil(k)
    if c == f:
        return l[int(k)]
    return (c-k) * l[int(f)] + (k-f) * l[int(c)]

# TODO Streaming Percentile to prevent memory consumption blowup for large datasets
class StrictPercentile(object):
    def __init__(self):
        self.values = []
        self.p = None

    def step(self,value,p):
        if self.p is None:
            self.p = p
        self.values.append(value)

    def finalize(self):
        if len(self.values) == 0 or (self.p < 0 or self.p > 1):
            return None
        else:
            return percentile(sorted(self.values),self.p)

class StdevPopulation(object):
    def __init__(self):
        self.M = 0.0
        self.S = 0.0
        self.k = 0

    def step(self, value):
        try:
            # Ignore nulls
            if value is None:
                return
            val = float(value) # raises ValueError for non-numeric data, which is handled below
            tM = self.M
            self.k += 1
            self.M += ((val - tM) / self.k)
            self.S += ((val - tM) * (val - self.M))
        except ValueError:
            # TODO propagate udf errors to console
            raise Exception("Data is not numeric when calculating stddev (%s)" % value)

    def finalize(self):
        if self.k <= 1: # avoid division by zero
            return None
        else:
            return math.sqrt(self.S / (self.k))

class StdevSample(object):
    def __init__(self):
        self.M = 0.0
        self.S = 0.0
        self.k = 0

    def step(self, value):
        try:
            # Ignore nulls
            if value is None:
                return
            val = float(value) # raises ValueError for non-numeric data, which is handled below
            tM = self.M
            self.k += 1
            self.M += ((val - tM) / self.k)
            self.S += ((val - tM) * (val - self.M))
        except ValueError:
            # TODO propagate udf errors to console
            raise Exception("Data is not numeric when calculating stddev (%s)" % value)

    def finalize(self):
        if self.k <= 1: # avoid division by zero
            return None
        else:
            return math.sqrt(self.S / (self.k-1))

class FunctionType(object):
    REGULAR = 1
    AGG = 2

class UserFunctionDef(object):
    def __init__(self,func_type,name,usage,description,func_or_obj,param_count):
        self.func_type = func_type
        self.name = name
        self.usage = usage
        self.description = description
        self.func_or_obj = func_or_obj
        self.param_count = param_count

user_functions = [
    UserFunctionDef(FunctionType.REGULAR,
                    "regexp","regexp(<regular_expression>,<expr>) = <1|0>",
                    "Find regexp in string expression. Returns 1 if found or 0 if not",
                    regexp,
                    2),
    UserFunctionDef(FunctionType.REGULAR,
                    "regexp_extract","regexp_extract(<regular_expression>,<expr>,group_number) = <substring|null>",
                    "Get regexp capture group content",
                    regexp_extract,
                    3),
    UserFunctionDef(FunctionType.REGULAR,
                    "sha","sha(<expr>,<algorithm>,<encoding>) = <hex-string-of-sha>",
                    "Calculate sha of some expression. Algorithm can be one of 1,224,256,384,512. For now encoding must be manually provided. Will use the input encoding automatically in the future.",
                    sha,
                    3),
    UserFunctionDef(FunctionType.REGULAR,
                    "sha1","sha1(<expr>) = <hex-string-of-sha>",
                    "Exists for backward compatibility only, since it doesn't handle encoding properly. Calculates sha1 of some expression",
                    sha1,
                    1),
    UserFunctionDef(FunctionType.REGULAR,
                    "md5","md5(<expr>,<encoding>) = <hex-string-of-md5>",
                    "Calculate md5 of expression. Returns a hex-string of the result. Currently requires to manually provide the encoding of the data. Will be taken automatically from the input encoding in the future.",
                    md5,
                    2),
    UserFunctionDef(FunctionType.REGULAR,
                    "sqrt","sqrt(<expr>) = <square-root>",
                    "Calculate the square root of the expression",
                    sqrt,
                    1),
    UserFunctionDef(FunctionType.REGULAR,
                    "power","power(<expr1>,<expr2>) = <expr1-to-the-power-of-expr2>",
                    "Raise expr1 to the power of expr2",
                    power,
                    2),
    UserFunctionDef(FunctionType.REGULAR,
                    "file_ext","file_ext(<expr>) = <filename-extension-or-empty-string>",
                    "Get the extension of a filename",
                    file_ext,
                    1),
    UserFunctionDef(FunctionType.REGULAR,
                    "file_folder","file_folder(<expr>) = <folder-name-of-filename>",
                    "Get the folder part of a filename",
                    file_folder,
                    1),
    UserFunctionDef(FunctionType.REGULAR,
                    "file_basename","file_basename(<expr>) = <basename-of-filename-including-extension>",
                    "Get the basename of a filename, including extension if any",
                    file_basename,
                    1),
    UserFunctionDef(FunctionType.REGULAR,
                    "file_basename_no_ext","file_basename_no_ext(<expr>) = <basename-of-filename-without-extension>",
                    "Get the basename of a filename, without the extension if there is one",
                    file_basename_no_ext,
                    1),
    UserFunctionDef(FunctionType.AGG,
                    "percentile","percentile(<expr>,<percentile-in-the-range-0-to-1>) = <percentile-value>",
                    "Calculate the strict percentile of a set of values.",
                    StrictPercentile,
                    2),
    UserFunctionDef(FunctionType.AGG,
                    "stddev_pop","stddev_pop(<expr>) = <stddev-value>",
                    "Calculate the population standard deviation of a set of values",
                    StdevPopulation,
                    1),
    UserFunctionDef(FunctionType.AGG,
                    "stddev_sample","stddev_sample(<expr>) = <stddev-value>",
                    "Calculate the sample standard deviation of a set of values",
                    StdevSample,
                    1)
]

def print_user_functions():
    for udf in user_functions:
        print("Function: %s" % udf.name)
        print("     Usage: %s" % udf.usage)
        print("     Description: %s" % udf.description)

class Sqlite3DBResults(object):
    def __init__(self,query_column_names,results):
        self.query_column_names = query_column_names
        self.results = results

    def __str__(self):
        return "Sqlite3DBResults<result_count=%d,query_column_names=%s>" % (len(self.results),str(self.query_column_names))
    __repr__ = __str__

def get_sqlite_type_affinity(sqlite_type):
    sqlite_type = sqlite_type.upper()
    if 'INT' in sqlite_type:
        return 'INTEGER'
    elif 'CHAR' in sqlite_type or 'TEXT' in sqlite_type or 'CLOB' in sqlite_type:
        return 'TEXT'
    elif 'BLOB' in sqlite_type:
        return 'BLOB'
    elif 'REAL' in sqlite_type or 'FLOA' in sqlite_type or 'DOUB' in sqlite_type:
        return 'REAL'
    else:
        return 'NUMERIC'

def sqlite_type_to_python_type(sqlite_type):
    SQLITE_AFFINITY_TO_PYTHON_TYPE_NAMES = {
        'INTEGER': long,
        'TEXT': unicode,
        'BLOB': bytes,
        'REAL': float,
        'NUMERIC': float
    }
    return SQLITE_AFFINITY_TO_PYTHON_TYPE_NAMES[get_sqlite_type_affinity(sqlite_type)]


class Sqlite3DB(object):
    # TODO Add metadata table with qsql file version

    QCATALOG_TABLE_NAME = '_qcatalog'
    NUMERIC_COLUMN_TYPES =  {int, long, float}
    PYTHON_TO_SQLITE_TYPE_NAMES = { str: 'TEXT', int: 'INT', long : 'INT' , float: 'REAL', None: 'TEXT' }


    def __str__(self):
        return "Sqlite3DB<url=%s>" % self.sqlite_db_url
    __repr__ = __str__

    def __init__(self, db_id, sqlite_db_url, sqlite_db_filename, create_qcatalog, show_sql=SHOW_SQL):
        self.show_sql = show_sql
        self.create_qcatalog = create_qcatalog

        self.db_id = db_id
        # TODO Is this needed anymore?
        self.sqlite_db_filename = sqlite_db_filename
        self.sqlite_db_url = sqlite_db_url
        self.conn = sqlite3.connect(self.sqlite_db_url, uri=True)
        self.last_temp_table_id = 10000
        self.cursor = self.conn.cursor()
        self.add_user_functions()

        if create_qcatalog:
            self.create_qcatalog_table()
        else:
            xprint('Not creating qcatalog for db_id %s' % db_id)

    def retrieve_all_table_names(self):
        return [x[0] for x in self.execute_and_fetch("select tbl_name from sqlite_master where type='table'").results]

    def get_sqlite_table_info(self,table_name):
        return self.execute_and_fetch('PRAGMA table_info(%s)' % table_name).results

    def get_sqlite_database_list(self):
        return self.execute_and_fetch('pragma database_list').results

    def find_new_table_name(self,planned_table_name):
        existing_table_names = self.retrieve_all_table_names()

        possible_indices = range(1,1000)

        for index in possible_indices:
            if index == 1:
                suffix = ''
            else:
                suffix = '_%s' % index

            table_name_attempt = '%s%s' % (planned_table_name,suffix)

            if table_name_attempt not in existing_table_names:
                xprint("Found free table name %s in db %s for planned table name %s" % (table_name_attempt,self.db_id,planned_table_name))
                return table_name_attempt

        # TODO Add test for this
        raise Exception('Cannot find free table name in db %s for planned table name %s' % (self.db_id,planned_table_name))

    def create_qcatalog_table(self):
        if not self.qcatalog_table_exists():
            xprint("qcatalog table does not exist. Creating it")
            r = self.conn.execute("""CREATE TABLE %s ( 
                               qcatalog_entry_id text not null primary key,
                               content_signature_key text,
                               temp_table_name text,
                               content_signature text,
                               creation_time text,
                               source_type text,
                               source text)""" % self.QCATALOG_TABLE_NAME).fetchall()
        else:
            xprint("qcatalog table already exists. No need to create it")

    def qcatalog_table_exists(self):
        return sqlite_table_exists(self.conn,self.QCATALOG_TABLE_NAME)

    def calculate_content_signature_key(self,content_signature):
        assert type(content_signature) == OrderedDict
        pp = json.dumps(content_signature,sort_keys=True)
        xprint("Calculating content signature for:",pp,six.b(pp))
        return hashlib.sha1(six.b(pp)).hexdigest()

    def add_to_qcatalog_table(self, temp_table_name, content_signature, creation_time,source_type, source):
        assert source is not None
        assert source_type is not None
        content_signature_key = self.calculate_content_signature_key(content_signature)
        xprint("db_id: %s Adding to qcatalog table: %s. Calculated signature key %s" % (self.db_id, temp_table_name,content_signature_key))
        r = self.execute_and_fetch(
            'INSERT INTO %s (qcatalog_entry_id,content_signature_key, temp_table_name,content_signature,creation_time,source_type,source) VALUES (?,?,?,?,?,?,?)' % self.QCATALOG_TABLE_NAME,
                              (str(uuid4()),content_signature_key,temp_table_name,json.dumps(content_signature),creation_time,source_type,source))
        # Ensure transaction is completed
        self.conn.commit()

    def get_from_qcatalog(self, content_signature):
        content_signature_key = self.calculate_content_signature_key(content_signature)
        xprint("Finding table in db_id %s that matches content signature key %s" % (self.db_id,content_signature_key))

        field_names = ["content_signature_key", "temp_table_name", "content_signature", "creation_time","source_type","source","qcatalog_entry_id"]

        q = "SELECT %s FROM %s where content_signature_key = ?" % (",".join(field_names),self.QCATALOG_TABLE_NAME)
        r = self.execute_and_fetch(q,(content_signature_key,))

        if r is None:
            return None

        if len(r.results) == 0:
            return None

        if len(r.results) > 1:
            raise Exception("Bug - Exactly one result should have been provided: %s" % str(r.results))

        d = dict(zip(field_names,r.results[0]))
        return d

    def get_from_qcatalog_using_table_name(self, temp_table_name):
        xprint("getting from qcatalog using table name")

        field_names = ["content_signature", "temp_table_name","creation_time","source_type","source","content_signature_key","qcatalog_entry_id"]

        q = "SELECT %s FROM %s where temp_table_name = ?" % (",".join(field_names),self.QCATALOG_TABLE_NAME)
        xprint("Query from qcatalog %s params %s" % (q,str(temp_table_name,)))
        r = self.execute_and_fetch(q,(temp_table_name,))
        xprint("results: ",r.results)

        if r is None:
            return None

        if len(r.results) == 0:
            return None

        if len(r.results) > 1:
            raise Exception("Bug - Exactly one result should have been provided: %s" % str(r.results))

        d = dict(zip(field_names,r.results[0]))
        # content_signature should be the first in the list of field_names
        cs = OrderedDict(json.loads(r.results[0][0]))
        if self.calculate_content_signature_key(cs) != d['content_signature_key']:
            raise Exception('Table contains an invalid entry - content signature key is not matching the actual content signature')
        return d

    def get_all_from_qcatalog(self):
        xprint("getting all entries from qcatalog")

        field_names = ["temp_table_name", "content_signature", "creation_time","source_type","source","qcatalog_entry_id"]

        q = "SELECT %s FROM %s" % (",".join(field_names),self.QCATALOG_TABLE_NAME)
        xprint("Query from qcatalog %s" % q)
        r = self.execute_and_fetch(q)

        if r is None:
            return None

        def convert(res):
            d = dict(zip(field_names, res))
            cs = OrderedDict(json.loads(res[1]))
            d['content_signature_key'] = self.calculate_content_signature_key(cs)
            return d

        rr = [convert(row) for row in r.results]

        return rr

    def done(self):
        xprint("Closing database %s" % self.db_id)
        try:
            self.conn.commit()
            self.conn.close()
            xprint("Database %s closed" % self.db_id)
        except Exception as e:
            xprint("Could not close database %s" % self.db_id)
            raise

    def add_user_functions(self):
        for udf in user_functions:
            if type(udf.func_or_obj) == type(object):
                self.conn.create_aggregate(udf.name,udf.param_count,udf.func_or_obj)
            elif type(udf.func_or_obj) == type(md5):
                self.conn.create_function(udf.name,udf.param_count,udf.func_or_obj)
            else:
                raise Exception("Invalid user function definition %s" % str(udf))

    def is_numeric_type(self, column_type):
        return column_type in Sqlite3DB.NUMERIC_COLUMN_TYPES

    def update_many(self, sql, params):
        try:
            sqlprint(sql, " params: " + str(params))
            self.cursor.executemany(sql, params)
            _ = self.cursor.fetchall()
        finally:
            pass  # cursor.close()

    def execute_and_fetch(self, q,params = None):
        try:
            try:
                if self.show_sql:
                    print(repr(q))
                if params is None:
                    r = self.cursor.execute(q)
                else:
                    r = self.cursor.execute(q,params)
                if self.cursor.description is not None:
                    # we decode the column names, so they can be encoded to any output format later on
                    query_column_names = [c[0] for c in self.cursor.description]
                else:
                    query_column_names = None
                result = self.cursor.fetchall()
            finally:
                pass  # cursor.close()
        except OperationalError as e:
            raise SqliteOperationalErrorException("Failed executing sqlite query %s with params %s . error: %s" % (q,params,str(e)),e)
        return Sqlite3DBResults(query_column_names,result)

    def _get_as_list_str(self, l):
        return ",".join(['"%s"' % x.replace('"', '""') for x in l])

    def generate_insert_row(self, table_name, column_names):
        col_names_str = self._get_as_list_str(column_names)
        question_marks = ", ".join(["?" for i in range(0, len(column_names))])
        return 'INSERT INTO %s (%s) VALUES (%s)' % (table_name, col_names_str, question_marks)

    # Get a list of column names so order will be preserved (Could have used OrderedDict, but
    # then we would need python 2.7)
    def generate_create_table(self, table_name, column_names, column_dict):
        # Convert dict from python types to db types
        column_name_to_db_type = dict(
            (n, Sqlite3DB.PYTHON_TO_SQLITE_TYPE_NAMES[t]) for n, t in six.iteritems(column_dict))
        column_defs = ','.join(['"%s" %s' % (
            n.replace('"', '""'), column_name_to_db_type[n]) for n in column_names])
        return 'CREATE TABLE %s (%s)' % (table_name, column_defs)

    def generate_temp_table_name(self):
        # WTF - From my own past mutable-self
        self.last_temp_table_id += 1
        tn = "temp_table_%s" % self.last_temp_table_id
        return tn

    def generate_drop_table(self, table_name):
        return "DROP TABLE %s" % table_name

    def drop_table(self, table_name):
        return self.execute_and_fetch(self.generate_drop_table(table_name))

    def attach_and_copy_table(self, from_db, relevant_table,stop_after_analysis):
        xprint("Attaching %s into db %s and copying table %s into it" % (from_db,self,relevant_table))
        temp_db_id = 'temp_db_id'
        q = "attach '%s' as %s" % (from_db.sqlite_db_url,temp_db_id)
        xprint("Attach query: %s" % q)
        c = self.execute_and_fetch(q)

        new_temp_table_name = 'temp_table_%s' % (self.last_temp_table_id + 1)
        fully_qualified_table_name = '%s.%s' % (temp_db_id,relevant_table)

        if stop_after_analysis:
            limit = ' limit 100'
        else:
            limit = ''

        copy_query = 'create table %s as select * from %s %s' % (new_temp_table_name,fully_qualified_table_name,limit)
        copy_results = self.execute_and_fetch(copy_query)
        xprint("Copied %s.%s into %s in db_id %s. Results %s" % (temp_db_id,relevant_table,new_temp_table_name,self.db_id,copy_results))
        self.last_temp_table_id += 1

        xprint("Copied table into %s. Detaching db that was attached temporarily" % self.db_id)

        q = "detach database %s" % temp_db_id
        xprint("detach query: %s" % q)
        c = self.execute_and_fetch(q)
        xprint(c)
        return new_temp_table_name


class CouldNotConvertStringToNumericValueException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)

class SqliteOperationalErrorException(Exception):

    def __init__(self, msg,original_error):
        self.msg = msg
        self.original_error = original_error

    def __str__(self):
        return repr(self.msg) + "//" + repr(self.original_error)

class IncorrectDefaultValueException(Exception):

    def __init__(self, option_type,option,actual_value):
        self.option_type = option_type
        self.option = option
        self.actual_value = actual_value

    def __str__(self):
        return repr(self)

class NonExistentTableNameInQsql(Exception):

    def __init__(self, qsql_filename,table_name,existing_table_names):
        self.qsql_filename = qsql_filename
        self.table_name = table_name
        self.existing_table_names = existing_table_names

class NonExistentTableNameInSqlite(Exception):

    def __init__(self, qsql_filename,table_name,existing_table_names):
        self.qsql_filename = qsql_filename
        self.table_name = table_name
        self.existing_table_names = existing_table_names

class TooManyTablesInQsqlException(Exception):

    def __init__(self, qsql_filename,existing_table_names):
        self.qsql_filename = qsql_filename
        self.existing_table_names = existing_table_names

class NoTableInQsqlExcption(Exception):

    def __init__(self, qsql_filename):
        self.qsql_filename = qsql_filename

class TooManyTablesInSqliteException(Exception):

    def __init__(self, qsql_filename,existing_table_names):
        self.qsql_filename = qsql_filename
        self.existing_table_names = existing_table_names

class NoTablesInSqliteException(Exception):

    def __init__(self, sqlite_filename):
        self.sqlite_filename = sqlite_filename

class ColumnMaxLengthLimitExceededException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)

class CouldNotParseInputException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)

class BadHeaderException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)

class EncodedQueryException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)


class CannotUnzipDataStreamException(Exception):

    def __init__(self):
        pass

class UniversalNewlinesExistException(Exception):

    def __init__(self):
        pass

class EmptyDataException(Exception):

    def __init__(self):
        pass

class MissingHeaderException(Exception):

    def __init__(self,msg):
        self.msg = msg

class InvalidQueryException(Exception):

    def __init__(self,msg):
        self.msg = msg

class TooManyAttachedDatabasesException(Exception):

    def __init__(self,msg):
        self.msg = msg

class FileNotFoundException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)

class UnknownFileTypeException(Exception):

    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return repr(self.msg)


class ColumnCountMismatchException(Exception):

    def __init__(self, msg):
        self.msg = msg

class ContentSignatureNotFoundException(Exception):

    def __init__(self, msg):
        self.msg = msg

class StrictModeColumnCountMismatchException(Exception):

    def __init__(self,atomic_fn, expected_col_count,actual_col_count,lines_read):
        self.atomic_fn = atomic_fn
        self.expected_col_count = expected_col_count
        self.actual_col_count = actual_col_count
        self.lines_read = lines_read

class FluffyModeColumnCountMismatchException(Exception):

    def __init__(self,atomic_fn, expected_col_count,actual_col_count,lines_read):
        self.atomic_fn = atomic_fn
        self.expected_col_count = expected_col_count
        self.actual_col_count = actual_col_count
        self.lines_read = lines_read

class ContentSignatureDiffersException(Exception):

    def __init__(self,original_filename, other_filename, filenames_str,key,source_value,signature_value):
        self.original_filename = original_filename
        self.other_filename = other_filename
        self.filenames_str = filenames_str
        self.key = key
        self.source_value = source_value
        self.signature_value = signature_value


class ContentSignatureDataDiffersException(Exception):

    def __init__(self,msg):
        self.msg = msg


class InvalidQSqliteFileException(Exception):

    def __init__(self,msg):
        self.msg = msg


class MaximumSourceFilesExceededException(Exception):

    def __init__(self,msg):
        self.msg = msg



# Simplistic Sql "parsing" class... We'll eventually require a real SQL parser which will provide us with a parse tree
#
# A "qtable" is a filename which behaves like an SQL table...
class Sql(object):

    def __init__(self, sql, data_streams):
        # Currently supports only standard SELECT statements

        # Holds original SQL
        self.sql = sql
        # Holds sql parts
        self.sql_parts = sql.split()
        self.data_streams = data_streams

        self.qtable_metadata_dict = OrderedDict()

        # List of qtable names, in order of appearance (the same name may appear more than once)
        self.qtable_names = []
        # Dict from qtable names to their positions in sql_parts. Value here is a *list* of positions,
        # since it is possible that the same qtable_name (file) is referenced in multiple positions
        # and we don't want the database table to be recreated for each
        # reference
        self.qtable_name_positions = {}
        # Dict from qtable names to their effective (actual database) table
        # names
        self.qtable_name_effective_table_names = {}

        self.query_column_names = None

        # Go over all sql parts
        idx = 0
        while idx < len(self.sql_parts):
            # Get the part string
            part = self.sql_parts[idx]
            # If it's a FROM or a JOIN
            if part.upper() in ['FROM', 'JOIN']:
                # and there is nothing after it,
                if idx == len(self.sql_parts) - 1:
                    # Just fail
                    raise InvalidQueryException(
                        'FROM/JOIN is missing a table name after it')

                qtable_name = self.sql_parts[idx + 1]
                # Otherwise, the next part contains the qtable name. In most cases the next part will be only the qtable name.
                # We handle one special case here, where this is a subquery as a column: "SELECT (SELECT ... FROM qtable),100 FROM ...".
                # In that case, there will be a closing parenthesis as part of the name, and we want to handle this case gracefully.
                # This is obviously a hack of a hack :) Just until we have
                # complete parsing capabilities
                if ')' in qtable_name:
                    leftover = qtable_name[qtable_name.index(')'):]
                    self.sql_parts.insert(idx + 2, leftover)
                    qtable_name = qtable_name[:qtable_name.index(')')]
                    self.sql_parts[idx + 1] = qtable_name

                if qtable_name[0] != '(':
                    normalized_qtable_name = self.normalize_qtable_name(qtable_name)
                    xprint("Normalized qtable name for %s is %s" % (qtable_name,normalized_qtable_name))
                    self.qtable_names += [normalized_qtable_name]

                    if normalized_qtable_name not in self.qtable_name_positions.keys():
                        self.qtable_name_positions[normalized_qtable_name] = []

                    self.qtable_name_positions[normalized_qtable_name].append(idx + 1)
                    self.sql_parts[idx + 1] = normalized_qtable_name
                    idx += 2
                else:
                    idx += 1
            else:
                idx += 1
        xprint("Final sql parts: %s" % self.sql_parts)

    def normalize_qtable_name(self,qtable_name):
        if self.data_streams.is_data_stream(qtable_name):
            return qtable_name

        if ':::' in qtable_name:
            qsql_filename, table_name = qtable_name.split(":::", 1)
            return '%s:::%s' % (os.path.realpath(os.path.abspath(qsql_filename)),table_name)
        else:
            return os.path.realpath(os.path.abspath(qtable_name))

    def set_effective_table_name(self, qtable_name, effective_table_name):
        if qtable_name in self.qtable_name_effective_table_names.keys():
            if self.qtable_name_effective_table_names[qtable_name] != effective_table_name:
                raise Exception(
                    "Already set effective table name for qtable %s. Trying to change the effective table name from %s to %s" %
                    (qtable_name,self.qtable_name_effective_table_names[qtable_name],effective_table_name))

        xprint("Setting effective table name for %s - effective table name is set to %s" % (qtable_name,effective_table_name))
        self.qtable_name_effective_table_names[
            qtable_name] = effective_table_name

    def get_effective_sql(self,table_name_mapping=None):
        # Check the mapped values: iterating over the dict itself would only inspect its keys
        if any(v is None for v in self.qtable_name_effective_table_names.values()):
            assert False, 'There are qtables without effective tables'

        effective_sql = [x for x in self.sql_parts]

        xprint("Effective table names",self.qtable_name_effective_table_names)
        for qtable_name, positions in six.iteritems(self.qtable_name_positions):
            xprint("Positions for qtable name %s are %s" % (qtable_name,positions))
            for pos in positions:
                if table_name_mapping is not None:
                    x = self.qtable_name_effective_table_names[qtable_name]
                    effective_sql[pos] = table_name_mapping[x]
                else:
                    effective_sql[pos] = self.qtable_name_effective_table_names[qtable_name]

        return " ".join(effective_sql)

    def get_qtable_name_effective_table_names(self):
        return self.qtable_name_effective_table_names

    def execute_and_fetch(self, db):
        x = self.get_effective_sql()
        xprint("Final query: %s" % x)
        db_results_obj = db.execute_and_fetch(x)
        return db_results_obj

    def materialize_using(self,loaded_table_structures_dict):
        xprint("Materializing sql object: %s" % str(self.qtable_names))
        xprint("loaded table structures dict %s" % loaded_table_structures_dict)
        for qtable_name in self.qtable_names:
            table_structure = loaded_table_structures_dict[qtable_name]

            table_name_in_disk_db = table_structure.get_table_name_for_querying()

            effective_table_name = '%s.%s' % (table_structure.db_id, table_name_in_disk_db)

            # for a single file - no need to create a union, just use the table name
            self.set_effective_table_name(qtable_name, effective_table_name)
            xprint("Materialized filename %s to effective table name %s" % (qtable_name,effective_table_name))
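

# Illustrative sketch only, not used by q itself: a reduced, standalone version of the
# whitespace-token scan that Sql.__init__ performs to locate table names after FROM/JOIN.
# The name _sketch_find_qtable_positions is hypothetical. It assumes plain table tokens
# and leaves out the ')' splitting and the path normalization done by the real class.
def _sketch_find_qtable_positions(sql):
    parts = sql.split()
    positions = {}
    idx = 0
    while idx < len(parts):
        if parts[idx].upper() in ('FROM', 'JOIN'):
            if idx == len(parts) - 1:
                raise ValueError('FROM/JOIN is missing a table name after it')
            name = parts[idx + 1]
            # A token starting with '(' opens a subquery, so it is not a table name
            if not name.startswith('('):
                positions.setdefault(name, []).append(idx + 1)
                idx += 2
                continue
        idx += 1
    return positions

# Usage: _sketch_find_qtable_positions("select * from a.csv join b.csv on 1=1")
# maps each table token to its token positions: {'a.csv': [3], 'b.csv': [5]}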


class TableColumnInferer(object):

    def __init__(self, input_params):
        self.inferred = False
        self.mode = input_params.parsing_mode
        self.rows = []
        self.skip_header = input_params.skip_header
        self.header_row = None
        self.header_row_filename = None
        self.expected_column_count = input_params.expected_column_count
        self.input_delimiter = input_params.delimiter
        self.disable_column_type_detection = input_params.disable_column_type_detection

    def _generate_content_signature(self):
        return OrderedDict({
            "inferred": self.inferred,
            "mode": self.mode,
            "rows": "\n".join([",".join(x) for x in self.rows]),
            "skip_header": self.skip_header,
            "header_row": self.header_row,
            "expected_column_count": self.expected_column_count,
            "input_delimiter": self.input_delimiter,
            "disable_column_type_detection": self.disable_column_type_detection
        })

    def analyze(self, filename, col_vals):
        if self.inferred:
            assert False, "Already inferred columns"

        if self.skip_header and self.header_row is None:
            self.header_row = col_vals
            self.header_row_filename = filename
        else:
            self.rows.append(col_vals)

        if len(self.rows) < 100:
            return False

        self.do_analysis()
        return True

    def force_analysis(self):
        # This method is called whenever there is no more data, and an analysis needs
        # to be performed immediately, regardless of the amount of sample data that has
        # been collected
        self.do_analysis()

    def determine_type_of_value(self, value):
        if self.disable_column_type_detection:
            return str

        if value is not None:
            value = value.strip()
        if value == '' or value is None:
            return None

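        try:
            i = int(value)
            if type(i) == long:
                return long
            else:
                return int
        except ValueError:
            # Not an integer literal, fall through to the float check
            pass

        try:
            float(value)
            return float
        except ValueError:
            # Not a float literal either, so treat the value as text
            pass

        return str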

    def determine_type_of_value_list(self, value_list):
        type_list = [self.determine_type_of_value(v) for v in value_list]
        all_types = set(type_list)
        if len(all_types) == 1:
            # all the sample lines are of the same type
            return type_list[0]
        else:
            # check for the number of types without nulls,
            type_list_without_nulls = list(filter(
                lambda x: x is not None, type_list))
            # If all the sample lines are of the same type,
            if len(set(type_list_without_nulls)) == 1:
                # return it
                return type_list_without_nulls[0]
            else:
                # If there are only two types, one float and one int, then choose a float type
                if len(set(type_list_without_nulls)) == 2 and float in type_list_without_nulls and int in type_list_without_nulls:
                    return float
                return str

    def do_analysis(self):
        if self.mode == 'strict':
            self._do_strict_analysis()
        elif self.mode in ['relaxed']:
            self._do_relaxed_analysis()
        else:
            raise Exception('Unknown parsing mode %s' % self.mode)

        if self.column_count == 1 and self.expected_column_count != 1 and self.expected_column_count is not None:
            print(f"Warning: column count is one (expected column count is {self.expected_column_count}) - did you provide the correct delimiter?", file=sys.stderr)

        self.infer_column_types()
        self.infer_column_names()
        self.inferred = True

    def validate_column_names(self, value_list):
        column_name_errors = []
        for v in value_list:
            if v is None:
                # we allow column names to be None, in relaxed mode it'll be filled with default names.
                # RLRL
                continue
            if ',' in v:
                column_name_errors.append(
                    (v, "Column name cannot contain commas"))
                continue
            if self.input_delimiter in v:
                column_name_errors.append(
                    (v, "Column name cannot contain the input delimiter. Please make sure you've set the correct delimiter"))
                continue
            if '\n' in v:
                column_name_errors.append(
                    (v, "Column name cannot contain newline"))
                continue
            if v != v.strip():
                column_name_errors.append(
                    (v, "Column name contains leading/trailing spaces"))
                continue
            try:
                v.encode("utf-8", "strict").decode("utf-8")
            except UnicodeError:
                column_name_errors.append(
                    (v, "Column name must be UTF-8 Compatible"))
                continue
            # We're checking for column duplication for each field in order to be able to still provide it along with other errors
            if len(list(filter(lambda x: x == v,value_list))) > 1:
                entry = (v, "Column name is duplicated")
                # Don't duplicate the error report itself
                if entry not in column_name_errors:
                    column_name_errors.append(entry)
                continue
            nul_index = v.find("\x00")
            if nul_index >= 0:
                column_name_errors.append(
                    (v, "Column name cannot contain NUL"))
                continue
            t = self.determine_type_of_value(v)
            if t != str:
                column_name_errors.append((v, "Column name must be a string"))
        return column_name_errors

    def infer_column_names(self):
        if self.header_row is not None:
            column_name_errors = self.validate_column_names(self.header_row)
            if len(column_name_errors) > 0:
                raise BadHeaderException("Header must contain only strings and not numbers or empty strings: '%s'\n%s" % (
                    ",".join(self.header_row), "\n".join(["'%s': %s" % (x, y) for x, y in column_name_errors])))

            # use header row in order to name columns
            if len(self.header_row) < self.column_count:
                if self.mode == 'strict':
                    raise ColumnCountMismatchException("Strict mode. Header row contains less columns than expected column count(%s vs %s)" % (
                        len(self.header_row), self.column_count))
                elif self.mode in ['relaxed']:
                    # in relaxed mode, add columns to fill the missing ones
                    self.header_row = self.header_row + \
                        ['c%s' % (x + len(self.header_row) + 1)
                         for x in range(self.column_count - len(self.header_row))]
            elif len(self.header_row) > self.column_count:
                if self.mode == 'strict':
                    raise ColumnCountMismatchException("Strict mode. Header row contains more columns than expected column count (%s vs %s)" % (
                        len(self.header_row), self.column_count))
                elif self.mode in ['relaxed']:
                    # In relaxed mode, just cut the extra column names
                    self.header_row = self.header_row[:self.column_count]
            self.column_names = self.header_row
        else:
            # Column names are cX starting from 1
            self.column_names = ['c%s' % (i + 1)
                                 for i in range(self.column_count)]

    def _do_relaxed_analysis(self):
        column_count_list = [len(col_vals) for col_vals in self.rows]

        if len(self.rows) == 0:
            if self.header_row is None:
                self.column_count = 0
            else:
                self.column_count = len(self.header_row)
        else:
            if self.expected_column_count is not None:
                self.column_count = self.expected_column_count
            else:
                # If not specified, we'll take the largest row in the sample rows
                self.column_count = max(column_count_list)

    def get_column_count_summary(self, column_count_list):
        counts = {}
        for column_count in column_count_list:
            counts[column_count] = counts.get(column_count, 0) + 1
        return six.u(", ").join([six.u("{} rows with {} columns".format(v, k)) for k, v in six.iteritems(counts)])

    def _do_strict_analysis(self):
        column_count_list = [len(col_vals) for col_vals in self.rows]

        if len(set(column_count_list)) != 1:
            raise ColumnCountMismatchException('Strict mode. Column count is expected to be identical. Multiple column counts exist at the first part of the file. Try to check your delimiter, or change to relaxed mode. Details: %s' % (
                self.get_column_count_summary(column_count_list)))

        self.column_count = len(self.rows[0])

        if self.expected_column_count is not None and self.column_count != self.expected_column_count:
            raise ColumnCountMismatchException('Strict mode. Column count is expected to be %s but is %s' % (
                self.expected_column_count, self.column_count))

        self.infer_column_types()

    def infer_column_types(self):
        assert self.column_count > -1
        self.column_types = []
        self.column_types2 = []
        for column_number in range(self.column_count):
            column_value_list = [
                row[column_number] if column_number < len(row) else None for row in self.rows]
            column_type = self.determine_type_of_value_list(column_value_list)
            self.column_types.append(column_type)

            column_value_list2 = [row[column_number] if column_number < len(
                row) else None for row in self.rows[1:]]
            column_type2 = self.determine_type_of_value_list(
                column_value_list2)
            self.column_types2.append(column_type2)

        comparison = map(
            lambda x: x[0] == x[1], zip(self.column_types, self.column_types2))
        if False in comparison and not self.skip_header:
            number_of_column_types = len(set(self.column_types))
            if number_of_column_types == 1 and list(set(self.column_types))[0] == str:
                print('Warning - There seems to be a header line in the file, but -H has not been specified. All fields will be detected as text fields, and the header line will appear as part of the data', file=sys.stderr)

    def get_column_dict(self):
        return OrderedDict(zip(self.column_names, self.column_types))

    def get_column_count(self):
        return self.column_count

    def get_column_names(self):
        return self.column_names

    def get_column_types(self):
        return self.column_types
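

# Illustrative sketch only, not used by q itself: the type resolution rule applied by
# TableColumnInferer.determine_type_of_value_list, reduced to a standalone function.
# The name _sketch_infer_column_type is hypothetical. It assumes the sample values are
# strings or None, and it ignores the disable_column_type_detection switch.
def _sketch_infer_column_type(values):
    def type_of(value):
        # Empty or missing values carry no type information
        if value is None or value.strip() == '':
            return None
        for candidate in (int, float):
            try:
                candidate(value.strip())
                return candidate
            except ValueError:
                pass
        return str

    # Nulls are ignored when deciding the column type
    types = {type_of(v) for v in values} - {None}
    if not types:
        return None
    if len(types) == 1:
        return types.pop()
    # A mix of ints and floats is widened to float; any other mix falls back to text
    if types == {int, float}:
        return float
    return str

# Usage: _sketch_infer_column_type(['1', '2.5', None]) resolves to float,
# while _sketch_infer_column_type(['1', 'abc']) resolves to str.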


def py3_encoded_csv_reader(encoding, f, dialect,row_data_only=False,**kwargs):
    try:
        xprint("f is %s" % str(f))
        xprint("dialect is %s" % dialect)
        csv_reader = csv.reader(f, dialect, **kwargs)

        if row_data_only:
            for row in csv_reader:
                yield row
        else:
            for row in csv_reader:
                yield (f.filename(),f.isfirstline(),row)

    except UnicodeDecodeError as e1:
        raise CouldNotParseInputException(e1)
    except ValueError as e:
        # TODO Add test for this
        if str(e) is not None and str(e).startswith('could not convert string to'):
            raise CouldNotConvertStringToNumericValueException(str(e))
        else:
            raise CouldNotParseInputException(str(e))
    except Exception as e:
        if str(e).startswith("field larger than field limit"):
            raise ColumnMaxLengthLimitExceededException(str(e))
        elif 'universal-newline' in str(e):
            raise UniversalNewlinesExistException()
        else:
            raise

encoded_csv_reader = py3_encoded_csv_reader

def normalized_filename(filename):
    return filename

class TableCreatorState(object):
    INITIALIZED = 'INITIALIZED'
    ANALYZED = 'ANALYZED'
    FULLY_READ = 'FULLY_READ'

class MaterializedStateType(object):
    UNKNOWN = 'unknown'
    DELIMITED_FILE = 'delimited-file'
    QSQL_FILE = 'qsql-file'
    SQLITE_FILE = 'sqlite-file'
    DATA_STREAM = 'data-stream'

class TableSourceType(object):
    DELIMITED_FILE = 'file'
    DELIMITED_FILE_WITH_UNUSED_QSQL = 'file-with-unused-qsql'
    QSQL_FILE = 'qsql-file'
    QSQL_FILE_WITH_ORIGINAL = 'qsql-file-with-original'
    SQLITE_FILE = 'sqlite-file'
    DATA_STREAM = 'data-stream'

def skip_BOM(f):
    try:
        BOM = f.buffer.read(3)

        if BOM != six.b('\xef\xbb\xbf'):
            # TODO Add test for this (propagates to try:except)
            raise Exception('Value of BOM is not as expected - Value is "%s"' % str(BOM))
    except Exception as e:
        # TODO Add a test for this
        raise Exception('Tried to skip BOM for "utf-8-sig" encoding and failed. Error message is ' + str(e))

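def detect_qtable_name_source_info(qtable_name,data_streams,read_caching_enabled):
    # Returns a tuple of (MaterializedStateType, TableSourceType, source info tuple).
    # The source info tuple is (data_stream,) for data streams, and (filename, table_name or None) otherwise.
    # Raises FileNotFoundException or UnknownFileTypeException for 'file:::table' names that cannot be resolved.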
    data_stream = data_streams.get_for_filename(qtable_name)
    xprint("Found data stream %s" % data_stream)

    if data_stream is not None:
        return MaterializedStateType.DATA_STREAM, TableSourceType.DATA_STREAM,(data_stream,)

    if ':::' in qtable_name:
        qsql_filename, table_name = qtable_name.split(":::", 1)
        if not os.path.exists(qsql_filename):
            raise FileNotFoundException("Could not find file %s" % qsql_filename)

        if is_qsql_file(qsql_filename):
            return MaterializedStateType.QSQL_FILE, TableSourceType.QSQL_FILE, (qsql_filename, table_name,)
        if is_sqlite_file(qsql_filename):
            return MaterializedStateType.SQLITE_FILE, TableSourceType.SQLITE_FILE, (qsql_filename, table_name,)
        raise UnknownFileTypeException("Cannot detect the type of table %s" % qtable_name)
    else:
        if is_qsql_file(qtable_name):
            return MaterializedStateType.QSQL_FILE, TableSourceType.QSQL_FILE, (qtable_name, None)
        if is_sqlite_file(qtable_name):
            return MaterializedStateType.SQLITE_FILE, TableSourceType.SQLITE_FILE, (qtable_name, None)
        matching_qsql_file_candidate = qtable_name + '.qsql'

        table_source_type = TableSourceType.DELIMITED_FILE
        if is_qsql_file(matching_qsql_file_candidate):
            if read_caching_enabled:
                xprint("Found matching qsql file for original file %s (matching file %s) and read caching is enabled. Using it" % (qtable_name,matching_qsql_file_candidate))
                return MaterializedStateType.QSQL_FILE, TableSourceType.QSQL_FILE_WITH_ORIGINAL, (matching_qsql_file_candidate, None)
            else:
                xprint("Found matching qsql file for original file %s (matching file %s), but read caching is disabled. Not using it" % (qtable_name,matching_qsql_file_candidate))
                table_source_type = TableSourceType.DELIMITED_FILE_WITH_UNUSED_QSQL


        return MaterializedStateType.DELIMITED_FILE,table_source_type ,(qtable_name, None)


def is_sqlite_file(filename):
    if not os.path.exists(filename):
        return False

    # Read only the 16-byte magic header, and make sure the file is closed even if the read fails
    with open(filename,'rb') as f:
        magic = f.read(16)
    return magic == six.b("SQLite format 3\x00")
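

# Illustrative sketch only, not used by q itself: shows the 16-byte header check that
# is_sqlite_file relies on, applied to a freshly created sqlite database and to a plain
# text file. The name _sketch_sqlite_magic_demo and the file names are hypothetical.
def _sketch_sqlite_magic_demo():
    import os
    import sqlite3
    import tempfile

    def read_magic(path):
        with open(path, 'rb') as fh:
            return fh.read(16)

    with tempfile.TemporaryDirectory() as tmp_dir:
        db_path = os.path.join(tmp_dir, 'demo.sqlite')
        conn = sqlite3.connect(db_path)
        # Creating a table forces sqlite to write the database header to disk
        conn.execute('create table t (a int)')
        conn.commit()
        conn.close()

        txt_path = os.path.join(tmp_dir, 'demo.csv')
        with open(txt_path, 'w') as fh:
            fh.write('a,b\n1,2\n')

        expected = b'SQLite format 3\x00'
        return read_magic(db_path) == expected, read_magic(txt_path) == expected

# Usage: _sketch_sqlite_magic_demo() returns (True, False)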

def sqlite_table_exists(cursor,table_name):
    # Bind the table name as a parameter instead of interpolating it into the SQL text
    results = cursor.execute("select count(*) from sqlite_master where type='table' and tbl_name == ?", (table_name,)).fetchall()
    return results[0][0] == 1
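

# Illustrative sketch only, not used by q itself: a table-existence lookup against
# sqlite_master using a bound parameter, exercised on an in-memory database.
# The names _sketch_table_exists and _sketch_table_exists_demo are hypothetical.
def _sketch_table_exists(conn, table_name):
    row = conn.execute(
        "select count(*) from sqlite_master where type='table' and tbl_name = ?",
        (table_name,)).fetchone()
    return row[0] == 1

def _sketch_table_exists_demo():
    import sqlite3
    conn = sqlite3.connect(':memory:')
    try:
        conn.execute('create table people (name text)')
        # A name containing a quote is handled safely because it is bound, not interpolated
        return (_sketch_table_exists(conn, 'people'),
                _sketch_table_exists(conn, "no'such"))
    finally:
        conn.close()

# Usage: _sketch_table_exists_demo() returns (True, False)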

def is_qsql_file(filename):
    if not is_sqlite_file(filename):
        return False

    db = Sqlite3DB('check_qsql_db',filename,filename,create_qcatalog=False)
    qcatalog_exists = db.qcatalog_table_exists()
    db.done()
    return qcatalog_exists

def normalize_filename_to_table_name(filename):
    xprint("Normalizing filename %s" % filename)
    if filename[0].isdigit():
        xprint("Filename starts with a digit, adding prefix")
        filename = 't_%s' % filename
    if filename.lower().endswith(".qsql"):
        filename = filename[:-5]
    elif filename.lower().endswith('.sqlite'):
        filename = filename[:-7]
    elif filename.lower().endswith('.sqlite3'):
        filename = filename[:-8]
    return filename.replace("-","_dash_").replace(".","_dot_").replace('?','_qm_').replace("/","_slash_").replace("\\","_backslash_").replace(":","_colon_").replace(" ","_space_").replace("+","_plus_")

def validate_content_signature(original_filename, source_signature,other_filename, content_signature,scope=None,dump=False):
    if dump:
        xprint("Comparing: source value: %s target value: %s" % (source_signature,content_signature))

    s = "%s vs %s:" % (original_filename,other_filename)
    if scope is None:
        scope = []
    for k in source_signature:
        if type(source_signature[k]) == OrderedDict:
            validate_content_signature(original_filename, source_signature[k],other_filename, content_signature[k],scope + [k])
        else:
            if k not in content_signature:
                raise ContentSignatureDataDiffersException("%s Content Signatures differ. %s is missing from content signature" % (s,k))
            if source_signature[k] != content_signature[k]:
                if k == 'rows':
                    raise ContentSignatureDataDiffersException("%s Content Signatures differ at %s.%s (actual analysis data differs)" % (s,".".join(scope),k))
                else:
                    raise ContentSignatureDiffersException(original_filename, other_filename, original_filename,".".join(scope + [k]),source_signature[k],content_signature[k])
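

# Illustrative sketch only, not used by q itself: the recursive comparison idea behind
# validate_content_signature, reduced to a function that returns the dotted key path of
# the first difference instead of raising. The name _sketch_first_signature_difference is
# hypothetical. It assumes string keys and that nested values are dicts on both sides.
def _sketch_first_signature_difference(source, other, scope=()):
    for key, value in source.items():
        path = scope + (key,)
        if key not in other:
            return '.'.join(path)
        if isinstance(value, dict):
            nested = _sketch_first_signature_difference(value, other[key], path)
            if nested is not None:
                return nested
        elif value != other[key]:
            return '.'.join(path)
    return None

# Usage: comparing {'a': 1, 'b': {'c': 2}} with {'a': 1, 'b': {'c': 3}} returns 'b.c',
# and comparing a signature with an identical copy returns None.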

class DelimitedFileReader(object):
    def __init__(self,atomic_fns, input_params, dialect, f = None,external_f_name = None):
        if f is not None:
            assert len(atomic_fns) == 0

        self.atomic_fns = atomic_fns
        self.input_params = input_params
        self.dialect = dialect

        self.f = f
        self.lines_read = 0
        self.file_number = -1

        self.skipped_bom = False

        self.is_open = f is not None

        self.external_f = f is not None
        self.external_f_name = external_f_name

    def get_lines_read(self):
        return self.lines_read

    def get_size_hash(self):
        if self.atomic_fns is None or len(self.atomic_fns) == 0:
            return "data-stream-size"
        else:
            return ",".join(map(str,[os.stat(atomic_fn).st_size for atomic_fn in self.atomic_fns]))

    def get_last_modification_time_hash(self):
        if self.atomic_fns is None or len(self.atomic_fns) == 0:
            return "data stream-lmt"
        else:
            x = ",".join(map(lambda x: ':%s:' % x,[os.stat(x).st_mtime_ns for x in self.atomic_fns]))
            res = hashlib.sha1(six.b(x)).hexdigest() + '///' + x
            xprint("Hash of last modification time is %s" % res)
            return res

    def open_file(self):
        if self.external_f:
            xprint("External f has been provided. No need to open the file")
            return

        # TODO Support universal newlines for gzipped and stdin data as well

        xprint("XX Opening file %s" % ",".join(self.atomic_fns))
        import fileinput

        def q_openhook(filename, mode):
            if self.input_params.gzipped_input or filename.endswith('.gz'):
                import gzip
                f = gzip.open(filename,mode='rt',encoding=self.input_params.input_encoding)
            else:
                if six.PY3:
                    # newline=None already enables universal-newlines translation in Python 3,
                    # and the 'U' mode flag was removed in Python 3.11, so plain 'r' is used
                    # regardless of with_universal_newlines.
                    f = io.open(filename, 'r', newline=None, encoding=self.input_params.input_encoding)
                else:
                    if self.input_params.with_universal_newlines:
                        file_opening_mode = 'rbU'
                    else:
                        file_opening_mode = 'rb'
                    f = open(filename, file_opening_mode)

            if self.input_params.input_encoding == 'utf-8-sig' and not self.skipped_bom:
                skip_BOM(f)

            return f

        f = fileinput.input(self.atomic_fns,mode='rb',openhook=q_openhook)

        self.f = f
        self.is_open = True
        xprint("Actually opened file %s" % self.f)
        return f

    def close_file(self):
        if not self.is_open:
            # TODO Convert to assertion
            raise Exception("Bug - file should already be open: %s" % ",".join(self.atomic_fns))

        self.f.close()
        xprint("XX Closed file %s" % ",".join(self.atomic_fns))

    def generate_rows(self):
        csv_reader = encoded_csv_reader(self.input_params.input_encoding, self.f, dialect=self.dialect,row_data_only=self.external_f)
        try:
            # TODO Some order with regard to separating data-streams for actual files
            if self.external_f:
                for col_vals in csv_reader:
                    self.lines_read += 1
                    # lines_read has already been incremented, so the first line is the one where it equals 1
                    yield self.external_f_name,0, self.lines_read == 1, col_vals
            else:
                for file_name,is_first_line,col_vals in csv_reader:
                    if is_first_line:
                        self.file_number = self.file_number + 1
                    self.lines_read += 1
                    yield file_name,self.file_number,is_first_line,col_vals
        except ColumnMaxLengthLimitExceededException as e:
            msg = "Column length is larger than the maximum. Offending file is '%s' - Line is %s, counting from 1 (encoding %s). The line number is the raw line number of the file, ignoring whether there's a header or not" % (",".join(self.atomic_fns),self.lines_read + 1,self.input_params.input_encoding)
            raise ColumnMaxLengthLimitExceededException(msg)
        except UniversalNewlinesExistException as e2:
            # No need to translate the exception, but we want it to be explicitly defined here for clarity
            raise UniversalNewlinesExistException()

class MaterializedState(object):
    def __init__(self, table_source_type,qtable_name, engine_id):
        xprint("Creating new MS: %s %s" % (id(self), qtable_name))

        self.table_source_type = table_source_type

        self.qtable_name = qtable_name
        self.engine_id = engine_id

        self.db_to_use = None
        self.db_id = None

        self.source_type = None
        self.source = None

        self.mfs_structure = None

        self.start_time = None
        self.end_time = None
        self.duration = None

        self.effective_table_name = None


    def get_materialized_state_type(self):
        return MaterializedStateType.UNKNOWN

    def get_planned_table_name(self):
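        # Subclasses must override. Unlike an assert, this is not stripped when running with -O
        raise NotImplementedError('get_planned_table_name must be implemented by subclasses')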

    def autodetect_table_name(self):
        xprint("Autodetecting table name. db_to_use=%s" % self.db_to_use)
        existing_table_names = self.db_to_use.retrieve_all_table_names()
        xprint("Existing table names: %s" % existing_table_names)

        possible_indices = range(1,1000)

        for index in possible_indices:
            if index == 1:
                suffix = ''
            else:
                suffix = '_%s' % index

            table_name_attempt = '%s%s' % (self.get_planned_table_name(),suffix)
            xprint("Table name attempt: index=%s name=%s" % (index,table_name_attempt))

            if table_name_attempt not in existing_table_names:
                xprint("Found free table name %s for source type %s source %s" % (table_name_attempt,self.source_type,self.source))
                return table_name_attempt

        raise Exception('Cannot find free table name for source type %s source %s' % (self.source_type,self.source))

    def initialize(self):
        self.start_time = time.time()

    def finalize(self):
        self.end_time = time.time()
        self.duration = self.end_time - self.start_time

    def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=False):
        raise NotImplementedError('choose_db_to_use must be implemented by subclasses')

    def make_data_available(self,stop_after_analysis):
        raise NotImplementedError('make_data_available must be implemented by subclasses')

class MaterializedDelimitedFileState(MaterializedState):
    def __init__(self, table_source_type,qtable_name, input_params, dialect_id,engine_id,target_table_name=None):
        super().__init__(table_source_type,qtable_name,engine_id)

        self.input_params = input_params
        self.dialect_id = dialect_id
        self.target_table_name = target_table_name

        self.content_signature = None

        self.atomic_fns = None

        self.can_store_as_cached = None

    def get_materialized_state_type(self):
        return MaterializedStateType.DELIMITED_FILE

    def initialize(self):
        super(MaterializedDelimitedFileState, self).initialize()

        self.atomic_fns = self.materialize_file_list(self.qtable_name)
        self.delimited_file_reader = DelimitedFileReader(self.atomic_fns,self.input_params,self.dialect_id)

        self.source_type = self.table_source_type
        self.source = ",".join(self.atomic_fns)

        return

    def materialize_file_list(self,qtable_name):
        materialized_file_list = []

        unfound_files = []
        # First check whether the name exists as-is, without globbing, so that an existing file is always used literally
        if os.path.exists(qtable_name):
            # If it exists, then just use it
            found_files = [qtable_name]
        else:
            # If not, then try with globs (and sort for predictability)
            found_files = list(sorted(glob.glob(qtable_name)))
            # If the glob matched nothing, record the name as not found
            if len(found_files) == 0:
                unfound_files += [qtable_name]
        materialized_file_list += found_files

        # If nothing was found, report the missing name(s)
        if len(unfound_files) == 1:
            raise FileNotFoundException(
                "No files matching '%s' have been found" % unfound_files[0])
        elif len(unfound_files) > 1:
            # TODO Add test for this
            raise FileNotFoundException(
                "The following files have not been found for table %s: %s" % (qtable_name,",".join(unfound_files)))

        # Exclude .qsql files that the glob may have matched, so only non-qsql source files remain
        filtered_file_list = list(filter(lambda x: not x.endswith('.qsql'),materialized_file_list))
        xprint("Filtered qsql files from glob search. Original file count: %s new file count: %s" % (len(materialized_file_list),len(filtered_file_list)))

        file_count = len(filtered_file_list)
        # If this proves to be a problem for users in terms of usability, then we'll just materialize the files
        # into the adhoc db, as with the db attach limit of sqlite
        if file_count > 500:
            msg = "Maximum source files for table must be 500. Table name is %s Number of actual files is %s" % (qtable_name,file_count)
            raise MaximumSourceFilesExceededException(msg)

        absolute_path_list = [os.path.abspath(x) for x in filtered_file_list]
        return absolute_path_list

    def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=False):
        if forced_db_to_use is not None:
            self.db_id = forced_db_to_use.db_id
            self.db_to_use = forced_db_to_use
            self.can_store_as_cached = False
            assert self.target_table_name is None
            self.target_table_name = self.autodetect_table_name()
            return

        self.can_store_as_cached = True

        self.db_id = '%s' % self._generate_db_name(self.atomic_fns[0])
        xprint("Database id is %s" % self.db_id)
        self.db_to_use = Sqlite3DB(self.db_id, 'file:%s?mode=memory&cache=shared' % self.db_id, 'memory<%s>' % self.db_id,create_qcatalog=True)

        if self.target_table_name is None:
            self.target_table_name = self.autodetect_table_name()


    def __analyze_delimited_file(self,database_info):
        xprint("Analyzing delimited file")
        assert self.target_table_name is not None, 'target table name must be chosen before analysis'
        target_sqlite_table_name = self.target_table_name

        xprint("Target sqlite table name is %s" % target_sqlite_table_name)
        # Create the matching database table and populate it
        table_creator = TableCreator(self.qtable_name, self.delimited_file_reader,self.input_params, sqlite_db=database_info.sqlite_db,
                                     target_sqlite_table_name=target_sqlite_table_name)
        table_creator.perform_analyze(self.dialect_id)
        xprint("after perform_analyze")
        self.content_signature = table_creator._generate_content_signature()

        now = datetime.datetime.utcnow().isoformat()

        database_info.sqlite_db.add_to_qcatalog_table(target_sqlite_table_name,
                                          self.content_signature,
                                          now,
                                          self.source_type,
                                          self.source)
        return table_creator

    def _generate_disk_db_filename(self, filenames_str):
        fn = '%s.qsql' % (os.path.abspath(filenames_str).replace("+","__"))
        return fn


    def _get_should_read_from_cache(self, disk_db_filename):
        disk_db_file_exists = os.path.exists(disk_db_filename)

        should_read_from_cache = self.input_params.read_caching and disk_db_file_exists

        return should_read_from_cache

    def calculate_should_read_from_cache(self):
        # TODO cache filename is chosen according to first filename only, which makes multi-file (glob) caching difficult
        #  cache writing is blocked for now in these cases. Will be added in the future (see save_cache_to_disk_if_needed)
        disk_db_filename = self._generate_disk_db_filename(self.atomic_fns[0])
        should_read_from_cache = self._get_should_read_from_cache(disk_db_filename)
        xprint("should read from cache %s" % should_read_from_cache)
        return disk_db_filename,should_read_from_cache

    def get_planned_table_name(self):
        return normalize_filename_to_table_name(os.path.basename(self.atomic_fns[0]))

    def make_data_available(self,stop_after_analysis):
        xprint("In make_data_available. db_id %s db_to_use %s" % (self.db_id,self.db_to_use))
        assert self.db_id is not None

        disk_db_filename, should_read_from_cache = self.calculate_should_read_from_cache()
        xprint("disk_db_filename=%s should_read_from_cache=%s" % (disk_db_filename,should_read_from_cache))

        database_info = DatabaseInfo(self.db_id,self.db_to_use, needs_closing=True)
        xprint("db %s (%s) has been added to the database list" % (self.db_id, self.db_to_use))

        self.delimited_file_reader.open_file()

        table_creator = self.__analyze_delimited_file(database_info)

        self.mfs_structure = MaterializedStateTableStructure(self.qtable_name, self.atomic_fns, self.db_id,
                                                             table_creator.column_inferer.get_column_names(),
                                                             table_creator.column_inferer.get_column_types(),
                                                             None,
                                                             self.target_table_name,
                                                             self.source_type,
                                                             self.source,
                                                             self.get_planned_table_name())

        content_signature = table_creator.content_signature
        content_signature_key = self.db_to_use.calculate_content_signature_key(content_signature)
        xprint("table creator signature key: %s" % content_signature_key)

        relevant_table = self.db_to_use.get_from_qcatalog(content_signature)['temp_table_name']

        if not stop_after_analysis:
            table_creator.perform_read_fully(self.dialect_id)

            self.save_cache_to_disk_if_needed(disk_db_filename, table_creator)


        self.delimited_file_reader.close_file()

        return database_info, relevant_table

    def save_cache_to_disk_if_needed(self, disk_db_filename, table_creator):
        if len(self.atomic_fns) > 1:
            xprint("Cannot save cache for multi-files for now, deciding auto-naming for cache is challenging. Will be added in the future.")
            return

        effective_write_caching = self.input_params.write_caching
        if effective_write_caching:
            if self.can_store_as_cached:
                assert self.table_source_type != TableSourceType.DELIMITED_FILE_WITH_UNUSED_QSQL
                xprint("Going to write file cache for %s. Disk filename is %s" % (",".join(self.atomic_fns), disk_db_filename))
                self._store_qsql(table_creator.sqlite_db, disk_db_filename)
            else:
                xprint("Database has been provided externally. Skipping storing a cached version of the data")

    def _store_qsql(self, source_sqlite_db, disk_db_filename):
        xprint("Storing data as disk db")
        disk_db_conn = sqlite3.connect(disk_db_filename)
        with disk_db_conn:
            source_sqlite_db.conn.backup(disk_db_conn)
        xprint("Written db to disk: disk db filename %s" % (disk_db_filename))
        disk_db_conn.close()

    def _generate_db_name(self, qtable_name):
        return 'e_%s_fn_%s' % (self.engine_id,normalize_filename_to_table_name(qtable_name))


class MaterialiedDataStreamState(MaterializedDelimitedFileState):
    def __init__(self, table_source_type, qtable_name, input_params, dialect_id, engine_id, data_stream, stream_target_db): ## should pass adhoc_db
        assert data_stream is not None

        super().__init__(table_source_type, qtable_name, input_params, dialect_id, engine_id,target_table_name=None)

        self.data_stream = data_stream

        self.stream_target_db = stream_target_db

        self.target_table_name = None

    def get_planned_table_name(self):
        return 'data_stream_%s' % (normalize_filename_to_table_name(self.source))

    def get_materialized_state_type(self):
        return MaterializedStateType.DATA_STREAM

    def initialize(self):
        self.start_time = time.time()
        if self.input_params.gzipped_input:
            raise CannotUnzipDataStreamException()

        self.source_type = self.table_source_type
        self.source = self.data_stream.stream_id

        self.delimited_file_reader = DelimitedFileReader([], self.input_params, self.dialect_id, f=self.data_stream.stream,external_f_name=self.source)

    def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=False):
        assert forced_db_to_use is None

        self.db_id = self.stream_target_db.db_id
        self.db_to_use = self.stream_target_db

        self.target_table_name = self.autodetect_table_name()

        return

    def calculate_should_read_from_cache(self):
        # No disk_db_filename, and no reading from cache when reading a datastream
        return None, False

    def finalize(self):
        super(MaterialiedDataStreamState, self).finalize()

    def save_cache_to_disk_if_needed(self, disk_db_filename, table_creator):
        xprint("Saving to cache is disabled for data streams")
        return


class MaterializedSqliteState(MaterializedState):
    def __init__(self,table_source_type,qtable_name,sqlite_filename,table_name, engine_id):
        super(MaterializedSqliteState, self).__init__(table_source_type,qtable_name,engine_id)
        self.sqlite_filename = sqlite_filename
        self.table_name = table_name

        self.table_name_autodetected = None

    def initialize(self):
        super(MaterializedSqliteState, self).initialize()

        self.table_name_autodetected = False
        if self.table_name is None:
            self.table_name = self.autodetect_table_name()
            self.table_name_autodetected = True
            return

        self.validate_table_name()

    def get_planned_table_name(self):
        if self.table_name_autodetected:
            return normalize_filename_to_table_name(os.path.basename(self.qtable_name))
        else:
            return self.table_name


    def autodetect_table_name(self):
        db = Sqlite3DB('temp_db','file:%s?immutable=1' % self.sqlite_filename,self.sqlite_filename,create_qcatalog=False)
        try:
            table_names = list(sorted(db.retrieve_all_table_names()))
            if len(table_names) == 1:
                return table_names[0]
            elif len(table_names) == 0:
                raise NoTablesInSqliteException(self.sqlite_filename)
            else:
                raise TooManyTablesInSqliteException(self.sqlite_filename,table_names)
        finally:
            db.done()

    def validate_table_name(self):
        db = Sqlite3DB('temp_db', 'file:%s?immutable=1' % self.sqlite_filename, self.sqlite_filename,
                       create_qcatalog=False)
        try:
            table_names = list(db.retrieve_all_table_names())
            if self.table_name.lower() not in map(lambda x:x.lower(),table_names):
                raise NonExistentTableNameInSqlite(self.sqlite_filename, self.table_name, table_names)
        finally:
            db.done()

    def finalize(self):
        super(MaterializedSqliteState, self).finalize()

    def get_materialized_state_type(self):
        return MaterializedStateType.SQLITE_FILE

    def _generate_qsql_only_db_name__temp(self, filenames_str):
        return 'e_%s_fn_%s' % (self.engine_id,hashlib.sha1(six.b(filenames_str)).hexdigest())

    def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=False):
        self.source = self.sqlite_filename
        self.source_type = self.table_source_type

        self.db_id = '%s' % self._generate_qsql_only_db_name__temp(self.qtable_name)

        x = 'file:%s?immutable=1' % self.sqlite_filename
        self.db_to_use = Sqlite3DB(self.db_id, x, self.sqlite_filename,create_qcatalog=False)

        if forced_db_to_use:
            xprint("Forced sqlite db_to_use %s" % forced_db_to_use)
            new_table_name = forced_db_to_use.attach_and_copy_table(self.db_to_use,self.table_name,stop_after_analysis)
            self.table_name = new_table_name
            self.db_id = forced_db_to_use.db_id
            self.db_to_use = forced_db_to_use

        return

    def make_data_available(self,stop_after_analysis):
        xprint("db %s (%s) has been added to the database list" % (self.db_id, self.db_to_use))

        database_info,relevant_table = DatabaseInfo(self.db_id,self.db_to_use, needs_closing=True), self.table_name

        column_names, column_types, sqlite_column_types = self._extract_information()

        self.mfs_structure = MaterializedStateTableStructure(self.qtable_name, [self.qtable_name], self.db_id,
                                                             column_names, column_types, sqlite_column_types,
                                                             self.table_name,
                                                             self.source_type,self.source,
                                                             self.get_planned_table_name())
        return database_info, relevant_table

    def _extract_information(self):
        table_list = self.db_to_use.retrieve_all_table_names()
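        if len(table_list) == 1:
            # retrieve_all_table_names() returns plain table-name strings (see autodetect_table_name),
            # so take the single entry as-is; indexing it again would yield only its first character
            table_name = table_list[0]
            xprint("Only one table in sqlite database, choosing it: %s" % table_name)
        else:
            # self.table_name has either been autodetected, or validated as an existing table up the stack
            table_name = self.table_name
            xprint("Multiple tables in sqlite file. Using provided table name %s" % self.table_name)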

        table_info = self.db_to_use.get_sqlite_table_info(table_name)
        xprint('Table info is %s' % table_info)
        column_names = list(map(lambda x: x[1], table_info))
        sqlite_column_types = list(map(lambda x: x[2].lower(),table_info))
        column_types = list(map(lambda x: sqlite_type_to_python_type(x[2]), table_info))
        xprint("Column names and types for table %s: %s" % (table_name, list(zip(column_names, zip(sqlite_column_types,column_types)))))
        self.content_signature = OrderedDict()

        return column_names, column_types, sqlite_column_types


class MaterializedQsqlState(MaterializedState):
    def __init__(self,table_source_type,qtable_name,qsql_filename,table_name, engine_id,input_params,dialect_id):
        super(MaterializedQsqlState, self).__init__(table_source_type,qtable_name,engine_id)
        self.qsql_filename = qsql_filename
        self.table_name = table_name

        # These are for cases where the qsql file is just a cache and the original is still there, used for content
        # validation
        self.input_params = input_params
        self.dialect_id = dialect_id

        self.table_name_autodetected = None

    def initialize(self):
        super(MaterializedQsqlState, self).initialize()

        self.table_name_autodetected = False
        if self.table_name is None:
            self.table_name = self.autodetect_table_name()
            self.table_name_autodetected = True
            return

        self.validate_table_name()

    def get_planned_table_name(self):
        if self.table_name_autodetected:
            return normalize_filename_to_table_name(os.path.basename(self.qtable_name))
        else:
            return self.table_name


    def autodetect_table_name(self):
        db = Sqlite3DB('temp_db','file:%s?immutable=1' % self.qsql_filename,self.qsql_filename,create_qcatalog=False)
        try:
            # Checked inside the try block so the temporary db is closed even if the assertion fails
            assert db.qcatalog_table_exists()
            qcatalog_entries = db.get_all_from_qcatalog()
            if len(qcatalog_entries) == 0:
                raise NoTableInQsqlExcption(self.qsql_filename)
            elif len(qcatalog_entries) == 1:
                return qcatalog_entries[0]['temp_table_name']
            else:
                # TODO Add a test for this
                table_names = list(sorted([x['temp_table_name'] for x in qcatalog_entries]))
                raise TooManyTablesInQsqlException(self.qsql_filename,table_names)
        finally:
            db.done()

    def validate_table_name(self):
        db = Sqlite3DB('temp_db', 'file:%s?immutable=1' % self.qsql_filename, self.qsql_filename,
                       create_qcatalog=False)
        try:
            # Checked inside the try block so the temporary db is closed even if the assertion fails
            assert db.qcatalog_table_exists()
            entry = db.get_from_qcatalog_using_table_name(self.table_name)
            if entry is None:
                qcatalog_entries = db.get_all_from_qcatalog()
                table_names = list(sorted([x['temp_table_name'] for x in qcatalog_entries]))
                raise NonExistentTableNameInQsql(self.qsql_filename,self.table_name,table_names)
        finally:
            db.done()

    def finalize(self):
        super(MaterializedQsqlState, self).finalize()

    def get_materialized_state_type(self):
        return MaterializedStateType.QSQL_FILE

    def _generate_qsql_only_db_name__temp(self, filenames_str):
        return 'e_%s_fn_%s' % (self.engine_id,hashlib.sha1(six.b(filenames_str)).hexdigest())

    def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=False):
        self.source = self.qsql_filename
        self.source_type = self.table_source_type

        self.db_id = '%s' % self._generate_qsql_only_db_name__temp(self.qtable_name)

        x = 'file:%s?immutable=1' % self.qsql_filename
        self.db_to_use = Sqlite3DB(self.db_id, x, self.qsql_filename,create_qcatalog=False)

        if forced_db_to_use:
            xprint("Forced qsql to use forced_db: %s" % forced_db_to_use)

            # TODO RLRL Move query to Sqlite3DB
            all_table_names = [(x[0],x[1]) for x in self.db_to_use.execute_and_fetch("select content_signature_key,temp_table_name from %s" % self.db_to_use.QCATALOG_TABLE_NAME).results]
            csk,t = list(filter(lambda x: x[1] == self.table_name,all_table_names))[0]
            xprint("Copying table %s from db_id %s" % (t,self.db_id))
            d = self.db_to_use.get_from_qcatalog_using_table_name(t)

            new_table_name = forced_db_to_use.attach_and_copy_table(self.db_to_use,self.table_name,stop_after_analysis)

            xprint("CS",d['content_signature'])
            cs = OrderedDict(json.loads(d['content_signature']))
            forced_db_to_use.add_to_qcatalog_table(new_table_name, cs, d['creation_time'],
                                    d['source_type'], d['source'])

            self.table_name = new_table_name
            self.db_id = forced_db_to_use.db_id
            self.db_to_use = forced_db_to_use

        return

    def make_data_available(self,stop_after_analysis):
        xprint("db %s (%s) has been added to the database list" % (self.db_id, self.db_to_use))

        database_info,relevant_table = self._read_table_from_cache(stop_after_analysis)

        column_names, column_types, sqlite_column_types = self._extract_information()

        self.mfs_structure = MaterializedStateTableStructure(self.qtable_name, [self.qtable_name], self.db_id,
                                                             column_names, column_types, sqlite_column_types,
                                                             self.table_name,
                                                             self.source_type,self.source,
                                                             self.get_planned_table_name())
        return database_info, relevant_table

    def _extract_information(self):
        assert self.db_to_use.qcatalog_table_exists()
        table_info = self.db_to_use.get_sqlite_table_info(self.table_name)
        xprint('table_name=%s Table info is %s' % (self.table_name,table_info))

        qcatalog_entry = self.db_to_use.get_from_qcatalog_using_table_name(self.table_name)

        column_names = list(map(lambda x: x[1], table_info))
        sqlite_column_types = list(map(lambda x: x[2].lower(),table_info))
        column_types = list(map(lambda x: sqlite_type_to_python_type(x[2]), table_info))
        self.content_signature = OrderedDict(
            **json.loads(qcatalog_entry['content_signature']))
        xprint('Inferred column names and types from qsql: %s' % list(zip(column_names, zip(sqlite_column_types,column_types))))

        return column_names, column_types, sqlite_column_types

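    def _backing_original_file_exists(self):
        # Name-based check only, the filesystem is not consulted: true when the qsql file is
        # named '<qtable_name>.qsql'
        return '%s.qsql' % self.qtable_name == self.qsql_filename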

    def _read_table_from_cache(self, stop_after_analysis):
        if self._backing_original_file_exists():
            xprint("Found a matching source file for qsql file with qtable name %s. Checking content signature by creating a temp MDFS + analysis" % self.qtable_name)
            mdfs = MaterializedDelimitedFileState(TableSourceType.DELIMITED_FILE,self.qtable_name,self.input_params,self.dialect_id,self.engine_id,target_table_name=None)
            mdfs.initialize()
            mdfs.choose_db_to_use(forced_db_to_use=None,stop_after_analysis=stop_after_analysis)
            _,_ = mdfs.make_data_available(stop_after_analysis=True)

            original_file_content_signature = mdfs.content_signature
            original_file_content_signature_key = self.db_to_use.calculate_content_signature_key(original_file_content_signature)

            qcatalog_entry = self.db_to_use.get_from_qcatalog_using_table_name(self.table_name)

            if qcatalog_entry is None:
                raise Exception('missing content signature!')

            xprint("Actual Signature Key: %s Expected Signature Key: %s" % (qcatalog_entry['content_signature_key'],original_file_content_signature_key))
            actual_content_signature = json.loads(qcatalog_entry['content_signature'])

            xprint("Validating content signatures: original %s vs qsql %s" % (original_file_content_signature,actual_content_signature))
            validate_content_signature(self.qtable_name, original_file_content_signature, self.qsql_filename, actual_content_signature,dump=True)
            mdfs.finalize()
        return DatabaseInfo(self.db_id,self.db_to_use, needs_closing=True), self.table_name


class MaterializedStateTableStructure(object):
    def __init__(self,qtable_name, atomic_fns, db_id, column_names, python_column_types, sqlite_column_types, table_name_for_querying,source_type,source,planned_table_name):
        self.qtable_name = qtable_name
        self.atomic_fns = atomic_fns
        self.db_id = db_id
        self.column_names = column_names
        self.python_column_types = python_column_types
        self.table_name_for_querying = table_name_for_querying
        self.source_type = source_type
        self.source = source
        self.planned_table_name = planned_table_name

        if sqlite_column_types is not None:
            self.sqlite_column_types = sqlite_column_types
        else:
            self.sqlite_column_types = [Sqlite3DB.PYTHON_TO_SQLITE_TYPE_NAMES[t].lower() for t in python_column_types]

    def get_table_name_for_querying(self):
        return self.table_name_for_querying

    def __str__(self):
        return "MaterializedStateTableStructure<%s>" % self.__dict__
    __repr__ = __str__

class TableCreator(object):
    def __str__(self):
        # Formatting str(self) here would call __str__ again and recurse without end
        return "TableCreator<%s>" % self.qtable_name
    __repr__ = __str__

    def __init__(self, qtable_name, delimited_file_reader,input_params,sqlite_db=None,target_sqlite_table_name=None):

        self.qtable_name = qtable_name
        self.delimited_file_reader = delimited_file_reader

        self.db_id = sqlite_db.db_id

        self.sqlite_db = sqlite_db
        self.target_sqlite_table_name = target_sqlite_table_name

        self.skip_header = input_params.skip_header
        self.gzipped = input_params.gzipped_input
        self.table_created = False

        self.encoding = input_params.input_encoding
        self.mode = input_params.parsing_mode
        self.expected_column_count = input_params.expected_column_count
        self.input_delimiter = input_params.delimiter
        self.with_universal_newlines = input_params.with_universal_newlines

        self.column_inferer = TableColumnInferer(input_params)

        self.pre_creation_rows = []
        self.buffered_inserts = []
        self.effective_column_names = None

        # Column type indices for columns that contain numeric types. Lazily initialized
        # so column inferer can do its work before this information is needed
        self.numeric_column_indices = None

        self.state = TableCreatorState.INITIALIZED

        self.content_signature = None

    def _generate_content_signature(self):
        if self.state != TableCreatorState.ANALYZED:
            # TODO Change to assertion
            raise Exception('Bug - Wrong state %s. Table needs to be analyzed before a content signature can be calculated' % self.state)

        size = self.delimited_file_reader.get_size_hash()
        last_modification_time = self.delimited_file_reader.get_last_modification_time_hash()

        m = OrderedDict({
            "_signature_version": "v1",
            "skip_header": self.skip_header,
            "gzipped": self.gzipped,
            "with_universal_newlines": self.with_universal_newlines,
            "encoding": self.encoding,
            "mode": self.mode,
            "expected_column_count": self.expected_column_count,
            "input_delimiter": self.input_delimiter,
            "inferer": self.column_inferer._generate_content_signature(),
            "original_file_size": size,
            "last_modification_time": last_modification_time
        })

        return m

    def validate_extra_header_if_needed(self, file_number, filename,col_vals):
        xprint("HHX validate",file_number,filename,col_vals)
        if not self.skip_header:
            xprint("No need to validate header")
            return False

        if file_number == 0:
            xprint("First file, no need to validate extra header")
            return False

        header_already_exists = self.column_inferer.header_row is not None

        if header_already_exists:
            xprint("Validating extra header")
            if tuple(self.column_inferer.header_row) != tuple(col_vals):
                raise BadHeaderException("Extra header '{}' in file '{}' mismatches original header '{}' from file '{}'. Table name is '{}'".format(
                    ",".join(col_vals),filename,
                    ",".join(self.column_inferer.header_row),
                    self.column_inferer.header_row_filename,
                    self.qtable_name))
            xprint("header already exists: %s" % self.column_inferer.header_row)
        else:
            xprint("Header doesn't already exist")

        return header_already_exists

    def _populate(self,dialect,stop_after_analysis=False):
        total_data_lines_read = 0
        try:
            try:
                for file_name,file_number,is_first_line,col_vals in self.delimited_file_reader.generate_rows():
                    if is_first_line:
                        if self.validate_extra_header_if_needed(file_number,file_name,col_vals):
                            continue
                    self._insert_row(file_name, col_vals)
                    if stop_after_analysis:
                        if self.column_inferer.inferred:
                            xprint("Stopping after analysis")
                            return
                if self.delimited_file_reader.get_lines_read() == 0 and self.skip_header:
                    raise MissingHeaderException("Header line is expected but missing in file %s" % ",".join(self.delimited_file_reader.atomic_fns))

                total_data_lines_read += self.delimited_file_reader.lines_read - (1 if self.skip_header else 0)
                xprint("Total Data lines read %s" % total_data_lines_read)
            except StrictModeColumnCountMismatchException as e:
                raise ColumnCountMismatchException(
                    'Strict mode - Expected %s columns instead of %s columns in file %s row %s. Either use relaxed modes or check your delimiter' % (
                    e.expected_col_count, e.actual_col_count, normalized_filename(e.atomic_fn), e.lines_read))
            except FluffyModeColumnCountMismatchException as e:
                raise ColumnCountMismatchException(
                    'Deprecated fluffy mode - Too many columns in file %s row %s (%s fields instead of %s fields). Consider moving to either relaxed or strict mode' % (
                    normalized_filename(e.atomic_fn), e.lines_read, e.actual_col_count, e.expected_col_count))
        finally:
            self._flush_inserts()

        if not self.table_created:
            self.column_inferer.force_analysis()
            self._do_create_table(self.qtable_name)

        self.sqlite_db.conn.commit()

    def perform_analyze(self, dialect):
        xprint("Analyzing... %s" % dialect)
        if self.state == TableCreatorState.INITIALIZED:
            self._populate(dialect,stop_after_analysis=True)
            self.state = TableCreatorState.ANALYZED

            self.content_signature = self._generate_content_signature()
            content_signature_key = self.sqlite_db.calculate_content_signature_key(self.content_signature)
            xprint("Setting content signature after analysis: %s" % content_signature_key)
        else:
            # TODO Convert to assertion
            raise Exception('Bug - Wrong state %s' % self.state)

    def perform_read_fully(self, dialect):
        if self.state == TableCreatorState.ANALYZED:
            self._populate(dialect,stop_after_analysis=False)
            self.state = TableCreatorState.FULLY_READ
        else:
            # TODO Convert to assertion
            raise Exception('Bug - Wrong state %s' % self.state)

    def _flush_pre_creation_rows(self, filename):
        for i, col_vals in enumerate(self.pre_creation_rows):
            if self.skip_header and i == 0:
                # skip header line
                continue
            self._insert_row(filename, col_vals)
        self._flush_inserts()
        self.pre_creation_rows = []

    def _insert_row(self, filename, col_vals):
        # If table has not been created yet
        if not self.table_created:
            # Try to create it along with another "example" line of data
            self.try_to_create_table(filename, col_vals)

        # If the table is still not created, then we don't have enough data, just
        # store the data and return
        if not self.table_created:
            self.pre_creation_rows.append(col_vals)
            return


        # The table already exists, so we can just add a new row
        self._insert_row_i(col_vals)

    def initialize_numeric_column_indices_if_needed(self):
        # Lazy initialization of numeric column indices
        if self.numeric_column_indices is None:
            column_types = self.column_inferer.get_column_types()
            self.numeric_column_indices = [idx for idx, column_type in enumerate(
                column_types) if self.sqlite_db.is_numeric_type(column_type)]

    def nullify_values_if_needed(self, col_vals):
        new_vals = col_vals[:]
        col_count = len(col_vals)
        for i in self.numeric_column_indices:
            if i >= col_count:
                continue
            v = col_vals[i]
            if v == '':
                new_vals[i] = None
        return new_vals

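    def normalize_col_vals(self, col_vals):
        # Adjusts a row to the expected column count according to the parsing mode:
        # strict mode raises on any mismatch, while relaxed mode pads missing columns
        # with None and merges surplus columns into the last one.
        # Make sure that numeric column indices are initialized
        self.initialize_numeric_column_indices_if_needed()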

        col_vals = self.nullify_values_if_needed(col_vals)

        expected_col_count = self.column_inferer.get_column_count()
        actual_col_count = len(col_vals)
        if self.mode == 'strict':
            if actual_col_count != expected_col_count:
                raise StrictModeColumnCountMismatchException(",".join(self.delimited_file_reader.atomic_fns), expected_col_count,actual_col_count,self.delimited_file_reader.get_lines_read())
            return col_vals

        # In all non-strict modes, missing trailing columns are padded with None

        if actual_col_count < expected_col_count:
            col_vals = col_vals + \
                [None for x in range(expected_col_count - actual_col_count)]

        # in relaxed mode, we merge all extra columns to the last column value
        if self.mode == 'relaxed':
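            if actual_col_count > expected_col_count:
                # Join the surplus values back into the last expected column,
                # treating None as an empty string
                merged_last_value = self.input_delimiter.join(
                    [v if v is not None else '' for v in col_vals[expected_col_count - 1:]])
                return col_vals[:expected_col_count - 1] + [merged_last_value]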
            else:
                return col_vals

        assert False, "Unidentified parsing mode %s" % self.mode

    def _insert_row_i(self, col_vals):
        col_vals = self.normalize_col_vals(col_vals)

        if self.effective_column_names is None:
            self.effective_column_names = self.column_inferer.column_names[:len(col_vals)]

        if len(self.effective_column_names) > 0:
            self.buffered_inserts.append(col_vals)
        else:
            self.buffered_inserts.append([""])

        if len(self.buffered_inserts) < 5000:
            return
        self._flush_inserts()

    def _flush_inserts(self):
        # If the table is still not created, then we don't have enough data
        if not self.table_created:
            return

        if len(self.buffered_inserts) > 0:
            insert_row_stmt = self.sqlite_db.generate_insert_row(
                self.target_sqlite_table_name, self.effective_column_names)

            self.sqlite_db.update_many(insert_row_stmt, self.buffered_inserts)
        self.buffered_inserts = []

    def try_to_create_table(self, filename, col_vals):
        if self.table_created:
            # TODO Convert to assertion
            raise Exception('Table is already created')

        # Add that line to the column inferer
        result = self.column_inferer.analyze(filename, col_vals)
        # If inferer succeeded,
        if result:
            self._do_create_table(filename)
        else:
            pass  # We don't have enough information for creating the table yet

    def _do_create_table(self,filename):
        # Get the column definition dict from the inferer
        column_dict = self.column_inferer.get_column_dict()

        # Guard against empty tables (instead of preventing the creation, just create with a dummy column)
        if len(column_dict) == 0:
            column_dict = { 'dummy_column_for_empty_tables' : str }
            ordered_column_names = [ 'dummy_column_for_empty_tables' ]
        else:
            ordered_column_names = self.column_inferer.get_column_names()

        # Create the CREATE TABLE statement
        create_table_stmt = self.sqlite_db.generate_create_table(
            self.target_sqlite_table_name, ordered_column_names, column_dict)
        # And create the table itself
        self.sqlite_db.execute_and_fetch(create_table_stmt)
        # Mark the table as created
        self.table_created = True
        self._flush_pre_creation_rows(filename)


def determine_max_col_lengths(m,output_field_quoting_func,output_delimiter):
    if len(m) == 0:
        return []
    max_lengths = [0 for x in range(0, len(m[0]))]
    for row_index in range(0, len(m)):
        for col_index in range(0, len(m[0])):
            # TODO Optimize this
            new_len = len("{}".format(output_field_quoting_func(output_delimiter,m[row_index][col_index])))
            if new_len > max_lengths[col_index]:
                max_lengths[col_index] = new_len
    return max_lengths

def print_credentials():
    print("q version %s" % q_version, file=sys.stderr)
    print("Python: %s" % " // ".join([str(x).strip() for x in sys.version.split("\n")]), file=sys.stderr)
    print("Copyright (C) 2012-2021 Harel Ben-Attia (harelba@gmail.com, @harelba on twitter)", file=sys.stderr)
    print("https://harelba.github.io/q/", file=sys.stderr)
    print(file=sys.stderr)

class QWarning(object):
    def __init__(self,exception,msg):
        self.exception = exception
        self.msg = msg

class QError(object):
    def __init__(self,exception,msg,errorcode):
        self.exception = exception
        self.msg = msg
        self.errorcode = errorcode
        self.traceback = traceback.format_exc()

    def __str__(self):
        return "QError<errorcode=%s,msg=%s,exception=%s,traceback=%s>" % (self.errorcode,self.msg,self.exception,str(self.traceback))
    __repr__ = __str__

class QMetadata(object):
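    def __init__(self,table_structures=None,new_table_structures=None,output_column_name_list=None):
        # Avoid mutable default arguments, which would be shared between instances
        self.table_structures = table_structures if table_structures is not None else {}
        self.new_table_structures = new_table_structures if new_table_structures is not None else {}
        self.output_column_name_list = output_column_name_list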

    def __str__(self):
        return "QMetadata<%s>" % (self.__dict__,)
    __repr__ = __str__

class QOutput(object):
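    def __init__(self,data=None,metadata=None,warnings=None,error=None):
        self.data = data
        self.metadata = metadata

        # Avoid a mutable default argument, which would be shared between instances
        self.warnings = warnings if warnings is not None else []
        self.error = error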
        if error is None:
            self.status = 'ok'
        else:
            self.status = 'error'

    def __str__(self):
        s = []
        s.append('status=%s' % self.status)
        if self.error is not None:
            s.append("error=%s" % self.error.msg)
        if len(self.warnings) > 0:
            s.append("warning_count=%s" % len(self.warnings))
        if self.data is not None:
            s.append("row_count=%s" % len(self.data))
        else:
            s.append("row_count=None")
        if self.metadata is not None:
            s.append("metadata=<%s>" % self.metadata)
        else:
            s.append("metadata=None")
        return "QOutput<%s>" % ",".join(s)
    __repr__ = __str__

class QInputParams(object):
    def __init__(self,skip_header=False,
            delimiter=' ',input_encoding='UTF-8',gzipped_input=False,with_universal_newlines=False,parsing_mode='relaxed',
            expected_column_count=None,keep_leading_whitespace_in_values=False,
            disable_double_double_quoting=False,disable_escaped_double_quoting=False,
            disable_column_type_detection=False,
            input_quoting_mode='minimal',stdin_file=None,stdin_filename='-',
            max_column_length_limit=131072,
            read_caching=False,
            write_caching=False,
            max_attached_sqlite_databases = 10):
        self.skip_header = skip_header
        self.delimiter = delimiter
        self.input_encoding = input_encoding
        self.gzipped_input = gzipped_input
        self.with_universal_newlines = with_universal_newlines
        self.parsing_mode = parsing_mode
        self.expected_column_count = expected_column_count
        self.keep_leading_whitespace_in_values = keep_leading_whitespace_in_values
        self.disable_double_double_quoting = disable_double_double_quoting
        self.disable_escaped_double_quoting = disable_escaped_double_quoting
        self.input_quoting_mode = input_quoting_mode
        self.disable_column_type_detection = disable_column_type_detection
        self.max_column_length_limit = max_column_length_limit
        self.read_caching = read_caching
        self.write_caching = write_caching
        self.max_attached_sqlite_databases = max_attached_sqlite_databases

    def merged_with(self,input_params):
        params = QInputParams(**self.__dict__)
        if input_params is not None:
            params.__dict__.update(**input_params.__dict__)
        return params

    def __str__(self):
        return "QInputParams<%s>" % str(self.__dict__)

    def __repr__(self):
        return "QInputParams(...)"

class DataStream(object):
    # TODO Can stream-id be removed?
    def __init__(self,stream_id,filename,stream):
        self.stream_id = stream_id
        self.filename = filename
        self.stream = stream

    def __str__(self):
        return "QDataStream<stream_id=%s,filename=%s,stream=%s>" % (self.stream_id,self.filename,self.stream)
    __repr__ = __str__


class DataStreams(object):
    def __init__(self, data_streams_dict):
        assert type(data_streams_dict) == dict
        self.validate(data_streams_dict)
        self.data_streams_dict = data_streams_dict

    def validate(self,d):
        for k in d:
            v = d[k]
            if type(k) != str or type(v) != DataStream:
                raise Exception('Bug - Invalid dict: %s' % str(d))

    def get_for_filename(self, filename):
        xprint("Data streams dict is %s. Trying to find %s" % (self.data_streams_dict,filename))
        x = self.data_streams_dict.get(filename)
        return x

    def is_data_stream(self,filename):
        return filename in self.data_streams_dict

class DatabaseInfo(object):
    def __init__(self,db_id,sqlite_db,needs_closing):
        self.db_id = db_id
        self.sqlite_db = sqlite_db
        self.needs_closing = needs_closing

    def __str__(self):
        return "DatabaseInfo<sqlite_db=%s,needs_closing=%s>" % (self.sqlite_db,self.needs_closing)
    __repr__ = __str__

class QTextAsData(object):
    def __init__(self,default_input_params=QInputParams(),data_streams_dict=None):
        self.engine_id = str(uuid.uuid4()).replace("-","_")

        self.default_input_params = default_input_params
        xprint("Default input params: %s" % self.default_input_params)

        self.loaded_table_structures_dict = OrderedDict()
        self.databases = OrderedDict()

        if data_streams_dict is not None:
            self.data_streams = DataStreams(data_streams_dict)
        else:
            self.data_streams = DataStreams({})

        # Create DB object
        self.query_level_db_id = 'query_e_%s' % self.engine_id
        self.query_level_db = Sqlite3DB(self.query_level_db_id,
                                        'file:%s?mode=memory&cache=shared' % self.query_level_db_id,'<query-level-db>',create_qcatalog=True)
        self.adhoc_db_id = 'adhoc_e_%s' % self.engine_id
        self.adhoc_db_name = 'file:%s?mode=memory&cache=shared' % self.adhoc_db_id
        self.adhoc_db = Sqlite3DB(self.adhoc_db_id,self.adhoc_db_name,'<adhoc-db>',create_qcatalog=True)
        self.query_level_db.conn.execute("attach '%s' as %s" % (self.adhoc_db_name,self.adhoc_db_id))

        self.add_db_to_database_list(DatabaseInfo(self.query_level_db_id,self.query_level_db,needs_closing=True))
        self.add_db_to_database_list(DatabaseInfo(self.adhoc_db_id,self.adhoc_db,needs_closing=True))

    def done(self):
        xprint("Inside done: Database list is %s" % self.databases)
        for db_id in reversed(self.databases.keys()):
            database_info = self.databases[db_id]
            if database_info.needs_closing:
                xprint("Gonna close database %s - %s" % (db_id,self.databases[db_id]))
                self.databases[db_id].sqlite_db.done()
                xprint("Database %s has been closed" % db_id)
            else:
                xprint("No need to close database %s" % db_id)
        xprint("Closed all databases")

    input_quoting_modes = {   'minimal' : csv.QUOTE_MINIMAL,
                        'all' : csv.QUOTE_ALL,
                        # nonnumeric is not supported for input quoting modes, since we determine the data types
                        # ourselves instead of letting the csv module try to identify the types
                        'none' : csv.QUOTE_NONE }

    def determine_proper_dialect(self,input_params):

        input_quoting_mode_csv_numeral = QTextAsData.input_quoting_modes[input_params.input_quoting_mode]

        if input_params.keep_leading_whitespace_in_values:
            skip_initial_space = False
        else:
            skip_initial_space = True

        dialect = {'skipinitialspace': skip_initial_space,
                    'delimiter': input_params.delimiter, 'quotechar': '"' }
        dialect['quoting'] = input_quoting_mode_csv_numeral
        dialect['doublequote'] = input_params.disable_double_double_quoting

        if input_params.disable_escaped_double_quoting:
            dialect['escapechar'] = '\\'

        return dialect

    def get_dialect_id(self,filename):
        return 'q_dialect_%s' % filename

    def _open_files_and_get_mfss(self,qtable_name,input_params,dialect):
        materialized_file_dict = OrderedDict()

        materialized_state_type,table_source_type,source_info = detect_qtable_name_source_info(qtable_name,self.data_streams,read_caching_enabled=input_params.read_caching)
        xprint("Detected source type %s source info %s" % (materialized_state_type,source_info))

        if materialized_state_type == MaterializedStateType.DATA_STREAM:
            (data_stream,) = source_info
            ms = MaterialiedDataStreamState(table_source_type,qtable_name,input_params,dialect,self.engine_id,data_stream,stream_target_db=self.adhoc_db)
            effective_qtable_name = data_stream.stream_id
        elif materialized_state_type == MaterializedStateType.QSQL_FILE:
            (qsql_filename,table_name) = source_info
            ms = MaterializedQsqlState(table_source_type,qtable_name, qsql_filename=qsql_filename, table_name=table_name,
                                       engine_id=self.engine_id, input_params=input_params, dialect_id=dialect)
            effective_qtable_name = '%s:::%s' % (qsql_filename, table_name)
        elif materialized_state_type == MaterializedStateType.SQLITE_FILE:
            (sqlite_filename,table_name) = source_info
            ms = MaterializedSqliteState(table_source_type,qtable_name, sqlite_filename=sqlite_filename, table_name=table_name,
                                       engine_id=self.engine_id)
            effective_qtable_name = '%s:::%s' % (sqlite_filename, table_name)
        elif materialized_state_type == MaterializedStateType.DELIMITED_FILE:
            (source_qtable_name,_) = source_info
            ms = MaterializedDelimitedFileState(table_source_type,source_qtable_name, input_params, dialect, self.engine_id)
            effective_qtable_name = source_qtable_name
        else:
            assert False, "Unknown file type for qtable %s should have exited with an exception" % (qtable_name)

        assert effective_qtable_name not in materialized_file_dict
        materialized_file_dict[effective_qtable_name] = ms

        xprint("MS dict: %s" % str(materialized_file_dict))

        return list(materialized_file_dict.values())

    def _load_mfs(self,mfs,input_params,dialect_id,stop_after_analysis):
        xprint("Loading MFS:", mfs)

        materialized_state_type = mfs.get_materialized_state_type()
        xprint("Detected materialized state type for %s: %s" % (mfs.qtable_name,materialized_state_type))

        mfs.initialize()

        if not materialized_state_type in [MaterializedStateType.DATA_STREAM]:
            if stop_after_analysis or self.should_copy_instead_of_attach(input_params):
                xprint("Should copy instead of attaching. Forcing db to use to adhoc db")
                forced_db_to_use = self.adhoc_db
            else:
                forced_db_to_use = None
        else:
            forced_db_to_use = None

        mfs.choose_db_to_use(forced_db_to_use,stop_after_analysis)
        xprint("Chosen db to use: source %s source_type %s db_id %s db_to_use %s" % (mfs.source,mfs.source_type,mfs.db_id,mfs.db_to_use))

        database_info,relevant_table = mfs.make_data_available(stop_after_analysis)

        if not self.is_adhoc_db(mfs.db_to_use) and not self.should_copy_instead_of_attach(input_params):
            if not self.already_attached_to_query_level_db(mfs.db_to_use):
                self.attach_to_db(mfs.db_to_use, self.query_level_db)
                self.add_db_to_database_list(database_info)
            else:
                xprint("DB %s is already attached to query level db. No need to attach it again." % mfs.db_to_use.db_id)

        mfs.finalize()

        xprint("MFS Loaded")

        return mfs.source,mfs.source_type

    def add_db_to_database_list(self,database_info):
        db_id = database_info.db_id
        assert db_id is not None
        assert database_info.sqlite_db is not None
        if db_id in self.databases:
            # TODO Convert to assertion
            if id(database_info.sqlite_db) != id(self.databases[db_id].sqlite_db):
                raise Exception('Bug - database already in database list: db_id %s: old %s new %s' % (db_id,self.databases[db_id],database_info))
            else:
                return
        self.databases[db_id] = database_info

    def is_adhoc_db(self,db_to_use):
        return db_to_use.db_id == self.adhoc_db_id

    def should_copy_instead_of_attach(self,input_params):
        attached_database_count = len(self.query_level_db.get_sqlite_database_list())
        x = attached_database_count >= input_params.max_attached_sqlite_databases
        xprint("should_copy_instead_of_attach: attached_database_count=%s should_copy=%s" % (attached_database_count,x))
        return x

    def _load_data(self,qtable_name,input_params=QInputParams(),stop_after_analysis=False):
        xprint("Attempting to load data for materialized file names %s" % qtable_name)

        q_dialect = self.determine_proper_dialect(input_params)
        xprint("Dialect is %s" % q_dialect)
        dialect_id = self.get_dialect_id(qtable_name)
        csv.register_dialect(dialect_id, **q_dialect)

        xprint("qtable metadata for loading is %s" % qtable_name)
        mfss = self._open_files_and_get_mfss(qtable_name,
                                             input_params,
                                             dialect_id)
        assert len(mfss) == 1, "one MS now encapsulated an entire table"
        mfs = mfss[0]

        xprint("MFS to load: %s" % mfs)

        if qtable_name in self.loaded_table_structures_dict.keys():
            xprint("Atomic filename %s found. no need to load" % qtable_name)
            return None

        xprint("qtable %s not found - loading" % qtable_name)


        self._load_mfs(mfs, input_params, dialect_id, stop_after_analysis)
        xprint("Loaded: source-type %s source %s mfs_structure %s" % (mfs.source_type, mfs.source, mfs.mfs_structure))

        assert qtable_name not in self.loaded_table_structures_dict, "loaded_table_structures_dict has been changed to have a non-list value"
        self.loaded_table_structures_dict[qtable_name] = mfs.mfs_structure

        return mfs.mfs_structure

    def already_attached_to_query_level_db(self,db_to_attach):
        attached_dbs = list(map(lambda x:x[1],self.query_level_db.get_sqlite_database_list()))
        return db_to_attach.db_id in attached_dbs

    def attach_to_db(self, target_db, source_db):
        q = "attach '%s' as %s" % (target_db.sqlite_db_url,target_db.db_id)
        xprint("Attach query: %s" % q)
        try:
            source_db.execute_and_fetch(q)
        except SqliteOperationalErrorException as e:
            if 'too many attached databases' in str(e):
                raise TooManyAttachedDatabasesException('There are too many attached databases. Use a proper --max-attached-sqlite-databases parameter which is below the maximum. Original error: %s' % str(e))
            # Any other operational error is propagated instead of being silently swallowed
            raise

    def detach_from_db(self, target_db, source_db):
        q = "detach %s" % (target_db.db_id)
        xprint("Detach query: %s" % q)
        source_db.execute_and_fetch(q)

    def load_data(self,filename,input_params=QInputParams(),stop_after_analysis=False):
        return self._load_data(filename,input_params,stop_after_analysis=stop_after_analysis)

    def _ensure_data_is_loaded_for_sql(self,sql_object,input_params,data_streams=None,stop_after_analysis=False):
        xprint("Ensuring Data load")
        new_table_structures = OrderedDict()

        # For each "table name"
        for qtable_name in sql_object.qtable_names:
            tss = self._load_data(qtable_name,input_params,stop_after_analysis=stop_after_analysis)
            if tss is not None:
                xprint("New Table Structures:",new_table_structures)
                assert qtable_name not in new_table_structures, "new_table_structures was changed not to contain a list as a value"
                new_table_structures[qtable_name] = tss

        return new_table_structures

    def materialize_query_level_db(self,save_db_to_disk_filename,sql_object):
        # TODO More robust creation - Create the file in a separate folder and move it to the target location only after success

        materialized_db = Sqlite3DB("materialized","file:%s" % save_db_to_disk_filename,save_db_to_disk_filename,create_qcatalog=False)
        table_name_mapping = OrderedDict()

        # For each table in the query
        effective_table_names = sql_object.get_qtable_name_effective_table_names()

        for i, qtable_name in enumerate(effective_table_names):
            # table name, in the format db_id.table_name
            effective_table_name_for_qtable_name = effective_table_names[qtable_name]

            source_db_id, actual_table_name_in_db = effective_table_name_for_qtable_name.split(".", 1)
            # The DatabaseInfo instance for this db
            source_database = self.databases[source_db_id]
            if source_db_id != self.query_level_db_id:
                self.attach_to_db(source_database.sqlite_db,materialized_db)

            ts = self.loaded_table_structures_dict[qtable_name]
            proposed_new_table_name = ts.planned_table_name
            xprint("Proposed table name is %s" % proposed_new_table_name)

            new_table_name = materialized_db.find_new_table_name(proposed_new_table_name)

            xprint("Materializing",source_db_id,actual_table_name_in_db,"as",new_table_name)
            # Copy the table into the materialized database
            xx = materialized_db.execute_and_fetch('CREATE TABLE %s AS SELECT * FROM %s' % (new_table_name,effective_table_name_for_qtable_name))

            table_name_mapping[effective_table_name_for_qtable_name] = new_table_name

            # TODO RLRL Preparation for writing materialized database as a qsql file
            # if source_database.sqlite_db.qcatalog_table_exists():
            #     qcatalog_entry = source_database.sqlite_db.get_from_qcatalog_using_table_name(actual_table_name_in_db)
            #     # TODO RLRL Encapsulate dictionary transform inside qcatalog access methods
            #     materialized_db.add_to_qcatalog_table(new_table_name,OrderedDict(json.loads(qcatalog_entry['content_signature'])),
            #                                           qcatalog_entry['creation_time'],
            #                                           qcatalog_entry['source_type'],
            #                                           qcatalog_entry['source_type'])
            #     xprint("PQX Added to qcatalog",source_db_id,actual_table_name_in_db,'as',new_table_name)
            # else:
            #     xprint("PQX Skipped adding to qcatalog",source_db_id,actual_table_name_in_db)

            # Detach only if the database was attached above (compare ids, not the db object)
            if source_db_id != self.query_level_db_id:
                self.detach_from_db(source_database.sqlite_db,materialized_db)

        return table_name_mapping

    def validate_query(self,sql_object,table_structures):

        for qtable_name in sql_object.qtable_names:
            relevant_table_structures = [table_structures[qtable_name]]

            column_names = None
            column_types = None
            for ts in relevant_table_structures:
                names = ts.column_names
                types = ts.python_column_types
                xprint("Comparing column names: %s with %s" % (column_names,names))
                if column_names is None:
                    column_names = names
                else:
                    if column_names != names:
                        raise BadHeaderException("Column names differ for table %s: %s vs %s" % (
                            qtable_name, ",".join(column_names), ",".join(names)))

                xprint("Comparing column types: %s with %s" % (column_types,types))
                if column_types is None:
                    column_types = types
                else:
                    if column_types != types:
                        raise BadHeaderException("Column types differ for table %s: %s vs %s" % (
                        qtable_name, ",".join(column_types), ",".join(types)))

                xprint("All column names match for qtable name %s: column names: %s column types: %s" % (ts.qtable_name,column_names,column_types))

        xprint("Query validated")

    def _execute(self,query_str,input_params=None,data_streams=None,stop_after_analysis=False,save_db_to_disk_filename=None):
        warnings = []
        error = None
        table_structures = []

        db_results_obj = None

        effective_input_params = self.default_input_params.merged_with(input_params)

        if type(query_str) != unicode:
            try:
                # Heuristic attempt to auto convert the query to unicode before failing
                query_str = query_str.decode('utf-8')
            except Exception:
                error = QError(EncodedQueryException(''),"Query should be in unicode. Please make sure to provide a unicode literal string or decode it using the proper character encoding.",91)
                return QOutput(error = error)


        try:
            # Create SQL statement
            sql_object = Sql('%s' % query_str, self.data_streams)

            load_start_time = time.time()
            iprint("Going to ensure data is loaded. Currently loaded tables: %s" % str(self.loaded_table_structures_dict))
            new_table_structures = self._ensure_data_is_loaded_for_sql(sql_object,effective_input_params,data_streams,stop_after_analysis=stop_after_analysis)
            iprint("Ensured data is loaded. loaded tables: %s" % self.loaded_table_structures_dict)

            self.validate_query(sql_object,self.loaded_table_structures_dict)

            iprint("Query validated")

            sql_object.materialize_using(self.loaded_table_structures_dict)

            iprint("Materialized sql object")

            if save_db_to_disk_filename is not None:
                xprint("Saving query data to disk")
                dump_start_time = time.time()
                table_name_mapping = self.materialize_query_level_db(save_db_to_disk_filename,sql_object)
                print("Data has been saved into %s . Saving has taken %4.3f seconds" % (save_db_to_disk_filename,time.time()-dump_start_time), file=sys.stderr)
                effective_sql = sql_object.get_effective_sql(table_name_mapping)
                print("Query to run on the database: %s;" % effective_sql, file=sys.stderr)
                command_line = 'echo "%s" | sqlite3 %s' % (effective_sql,save_db_to_disk_filename)
                print("You can run the query directly from the command line using the following command: %s" % command_line, file=sys.stderr)

                # TODO Propagate dump results using a different output class instead of an empty one
                return QOutput()

            # Ensure that adhoc db is not in the middle of a transaction
            self.adhoc_db.conn.commit()

            all_databases = self.query_level_db.get_sqlite_database_list()
            xprint("Query level db: databases %s" % all_databases)

            # Execute the query and fetch the data
            db_results_obj = sql_object.execute_and_fetch(self.query_level_db)
            iprint("Query executed")

            if len(db_results_obj.results) == 0:
                warnings.append(QWarning(None, "Warning - data is empty"))

            return QOutput(
                data = db_results_obj.results,
                metadata = QMetadata(
                    table_structures=self.loaded_table_structures_dict,
                    new_table_structures=new_table_structures,
                    output_column_name_list=db_results_obj.query_column_names),
                warnings = warnings,
                error = error)
        except InvalidQueryException as e:
            error = QError(e,str(e),118)
        except MissingHeaderException as e:
            error = QError(e,e.msg,117)
        except FileNotFoundException as e:
            error = QError(e,e.msg,30)
        except SqliteOperationalErrorException as e:
            xprint("Sqlite Operational error: %s" % e)
            msg = str(e.original_error)
            error = QError(e,"query error: %s" % msg,1)
            if "no such column" in msg and effective_input_params.skip_header:
                warnings.append(QWarning(e,'Warning - There seems to be a "no such column" error, and -H (header line) exists. Please make sure that you are using the column names from the header line and not the default (cXX) column names. Another issue might be that the file contains a BOM. Files that are encoded with UTF8 and contain a BOM can be read by specifying `-e utf-8-sig` in the command line. Support for non-UTF8 encoding will be provided in the future.'))
        except ColumnCountMismatchException as e:
            error = QError(e,e.msg,2)
        except (UnicodeDecodeError, UnicodeError) as e:
            error = QError(e,"Cannot decode data. Try to change the encoding by setting it using the -e parameter. Error:%s" % e,3)
        except BadHeaderException as e:
            error = QError(e,"Bad header row: %s" % e.msg,35)
        except CannotUnzipDataStreamException as e:
            error = QError(e,"Cannot decompress standard input. Pipe the input through zcat in order to decompress.",36)
        except UniversalNewlinesExistException as e:
            error = QError(e,"Data contains universal newlines. Run q with -U to use universal newlines. Please note that q still doesn't support universal newlines for .gz files or for stdin. Route the data through a regular file to use -U.",103)
        # Deprecated, and shouldn't be used:  error = QError(e,"Standard Input must be provided in order to use it as a table",61)
        except CouldNotConvertStringToNumericValueException as e:
            error = QError(e,"Could not convert string to a numeric value. Did you use `-w nonnumeric` with unquoted string values? Error: %s" % e.msg,58)
        except CouldNotParseInputException as e:
            error = QError(e,"Could not parse the input. Please make sure to set the proper -w input-wrapping parameter for your input, and that you use the proper input encoding (-e). Error: %s" % e.msg,59)
        except ColumnMaxLengthLimitExceededException as e:
            error = QError(e,e.msg,31)
        # Deprecated, and shouldn't be used: error = QError(e,e.msg,79)
        except ContentSignatureDiffersException as e:
            error = QError(e,"%s vs %s: Content Signatures for table %s differ at %s (source value '%s' disk signature value '%s')" %
                           (e.original_filename,e.other_filename,e.filenames_str,e.key,e.source_value,e.signature_value),80)
        except ContentSignatureDataDiffersException as e:
            error = QError(e,e.msg,81)
        except MaximumSourceFilesExceededException as e:
            error = QError(e,e.msg,82)
        except ContentSignatureNotFoundException as e:
            error = QError(e,e.msg,83)
        except NonExistentTableNameInQsql as e:
            msg = "Table %s could not be found in qsql file %s . Existing table names: %s" % (e.table_name,e.qsql_filename,",".join(e.existing_table_names))
            error = QError(e,msg,84)
        except NonExistentTableNameInSqlite as e:
            msg = "Table %s could not be found in sqlite file %s . Existing table names: %s" % (e.table_name,e.qsql_filename,",".join(e.existing_table_names))
            error = QError(e,msg,85)
        except TooManyTablesInQsqlException as e:
            msg = "Could not autodetect table name in qsql file. Existing Tables %s" % ",".join(e.existing_table_names)
            error = QError(e,msg,86)
        except NoTableInQsqlExcption as e:
            msg = "Could not autodetect table name in qsql file. File contains no record of a table"
            error = QError(e,msg,97)
        except TooManyTablesInSqliteException as e:
            msg = "Could not autodetect table name in sqlite file %s . Existing tables: %s" % (e.qsql_filename,",".join(e.existing_table_names))
            error = QError(e,msg,87)
        except NoTablesInSqliteException as e:
            msg = "sqlite file %s has no tables" % e.sqlite_filename
            error = QError(e,msg,88)
        except TooManyAttachedDatabasesException as e:
            msg = str(e)
            error = QError(e,msg,89)
        except UnknownFileTypeException as e:
            msg = str(e)
            error = QError(e,msg,95)
        except KeyboardInterrupt as e:
            warnings.append(QWarning(e,"Interrupted"))
        except Exception as e:
            global DEBUG
            if DEBUG:
                xprint(traceback.format_exc())
            error = QError(e,repr(e),199)

        return QOutput(data=None,warnings = warnings,error = error , metadata=QMetadata(table_structures=self.loaded_table_structures_dict,new_table_structures=self.loaded_table_structures_dict,output_column_name_list=[]))

    def execute(self,query_str,input_params=None,save_db_to_disk_filename=None):
        r = self._execute(query_str,input_params,stop_after_analysis=False,save_db_to_disk_filename=save_db_to_disk_filename)
        return r

    def unload(self):
        # TODO This would fail, since table structures are just value objects now. Will be fixed as part of making q a full python module
        for qtable_name,table_creator in six.iteritems(self.loaded_table_structures_dict):
            try:
                table_creator.drop_table()
            except:
                # Support no-table select queries
                pass
        self.loaded_table_structures_dict = OrderedDict()

    def analyze(self,query_str,input_params=None,data_streams=None):
        q_output = self._execute(query_str,input_params,data_streams=data_streams,stop_after_analysis=True)

        return q_output

def escape_double_quotes_if_needed(v):
    # Escape embedded double quotes by doubling them (CSV-style)
    x = v.replace(six.u('"'), six.u('""'))
    return x

def quote_none_func(output_delimiter,v):
    # 'none' mode: never quote
    return v

def quote_minimal_func(output_delimiter,v):
    # 'minimal' mode: quote only strings that contain the delimiter, a newline or a double quote
    if v is None:
        return v
    t = type(v)
    if (t == str or t == unicode) and ((output_delimiter in v) or ('\n' in v) or ('"' in v)):
        return six.u('"{}"').format(escape_double_quotes_if_needed(v))
    return v

def quote_nonnumeric_func(output_delimiter,v):
    # 'nonnumeric' mode: quote every string value, leave other types untouched
    if v is None:
        return v
    if type(v) == str or type(v) == unicode:
        return six.u('"{}"').format(escape_double_quotes_if_needed(v))
    return v

def quote_all_func(output_delimiter,v):
    # 'all' mode: quote every value; non-string values are formatted and wrapped in quotes as well
    if type(v) == str or type(v) == unicode:
        return six.u('"{}"').format(escape_double_quotes_if_needed(v))
    else:
        return six.u('"{}"').format(v)

class QOutputParams(object):
    def __init__(self,
            delimiter=' ',
            beautify=False,
            output_quoting_mode='minimal',
            formatting=None,
            output_header=False,
            encoding=None):
        self.delimiter = delimiter
        self.beautify = beautify
        self.output_quoting_mode = output_quoting_mode
        self.formatting = formatting
        self.output_header = output_header
        self.encoding = encoding

    def __str__(self):
        return "QOutputParams<%s>" % str(self.__dict__)

    def __repr__(self):
        return "QOutputParams(...)"

class QOutputPrinter(object):
    output_quoting_modes = {   'minimal' : quote_minimal_func,
                        'all' : quote_all_func,
                        'nonnumeric' : quote_nonnumeric_func,
                        'none' : quote_none_func }

    def __init__(self,output_params,show_tracebacks=False):
        self.output_params = output_params
        self.show_tracebacks = show_tracebacks

        self.output_field_quoting_func = QOutputPrinter.output_quoting_modes[output_params.output_quoting_mode]

    def print_errors_and_warnings(self,f,results):
        if results.status == 'error':
            error = results.error
            print(error.msg, file=f)
            if self.show_tracebacks:
                print(error.traceback, file=f)

        for warning in results.warnings:
            print("%s" % warning.msg, file=f)

    def print_analysis(self,f_out,f_err,results):
        self.print_errors_and_warnings(f_err,results)

        if results.metadata is None:
            return

        if results.metadata.table_structures is None:
            return

SYMBOL INDEX (620 symbols across 3 files)

FILE: bin/q.py
  function xprint (line 72) | def xprint(*args,**kwargs):
  function iprint (line 75) | def iprint(*args,**kwargs):
  function sqlprint (line 78) | def sqlprint(*args,**kwargs):
  function xprint (line 81) | def xprint(*args,**kwargs): pass
  function iprint (line 82) | def iprint(*args,**kwargs): pass
  function sqlprint (line 83) | def sqlprint(*args,**kwargs): pass
  function sqlprint (line 86) | def sqlprint(*args,**kwargs):
  function get_stdout_encoding (line 90) | def get_stdout_encoding(encoding_override=None):
  function sha (line 109) | def sha(data,algorithm,encoding):
  function sha1 (line 117) | def sha1(data):
  function regexp (line 121) | def regexp(regular_expression, data):
  function regexp_extract (line 129) | def regexp_extract(regular_expression, data,group_number):
  function md5 (line 139) | def md5(data,encoding):
  function sqrt (line 144) | def sqrt(data):
  function power (line 147) | def power(data,p):
  function file_ext (line 150) | def file_ext(data):
  function file_folder (line 156) | def file_folder(data):
  function file_basename (line 161) | def file_basename(data):
  function file_basename_no_ext (line 166) | def file_basename_no_ext(data):
  function percentile (line 172) | def percentile(l, p):
  class StrictPercentile (line 184) | class StrictPercentile(object):
    method __init__ (line 185) | def __init__(self):
    method step (line 189) | def step(self,value,p):
    method finalize (line 194) | def finalize(self):
  class StdevPopulation (line 200) | class StdevPopulation(object):
    method __init__ (line 201) | def __init__(self):
    method step (line 206) | def step(self, value):
    method finalize (line 220) | def finalize(self):
  class StdevSample (line 226) | class StdevSample(object):
    method __init__ (line 227) | def __init__(self):
    method step (line 232) | def step(self, value):
    method finalize (line 246) | def finalize(self):
  class FunctionType (line 252) | class FunctionType(object):
  class UserFunctionDef (line 256) | class UserFunctionDef(object):
    method __init__ (line 257) | def __init__(self,func_type,name,usage,description,func_or_obj,param_c...
  function print_user_functions (line 338) | def print_user_functions():
  class Sqlite3DBResults (line 344) | class Sqlite3DBResults(object):
    method __init__ (line 345) | def __init__(self,query_column_names,results):
    method __str__ (line 349) | def __str__(self):
  function get_sqlite_type_affinity (line 353) | def get_sqlite_type_affinity(sqlite_type):
  function sqlite_type_to_python_type (line 366) | def sqlite_type_to_python_type(sqlite_type):
  class Sqlite3DB (line 377) | class Sqlite3DB(object):
    method __str__ (line 385) | def __str__(self):
    method __init__ (line 389) | def __init__(self, db_id, sqlite_db_url, sqlite_db_filename, create_qc...
    method retrieve_all_table_names (line 407) | def retrieve_all_table_names(self):
    method get_sqlite_table_info (line 410) | def get_sqlite_table_info(self,table_name):
    method get_sqlite_database_list (line 413) | def get_sqlite_database_list(self):
    method find_new_table_name (line 416) | def find_new_table_name(self,planned_table_name):
    method create_qcatalog_table (line 436) | def create_qcatalog_table(self):
    method qcatalog_table_exists (line 450) | def qcatalog_table_exists(self):
    method calculate_content_signature_key (line 453) | def calculate_content_signature_key(self,content_signature):
    method add_to_qcatalog_table (line 459) | def add_to_qcatalog_table(self, temp_table_name, content_signature, cr...
    method get_from_qcatalog (line 470) | def get_from_qcatalog(self, content_signature):
    method get_from_qcatalog_using_table_name (line 491) | def get_from_qcatalog_using_table_name(self, temp_table_name):
    method get_all_from_qcatalog (line 517) | def get_all_from_qcatalog(self):
    method done (line 539) | def done(self):
    method add_user_functions (line 549) | def add_user_functions(self):
    method is_numeric_type (line 558) | def is_numeric_type(self, column_type):
    method update_many (line 561) | def update_many(self, sql, params):
    method execute_and_fetch (line 569) | def execute_and_fetch(self, q,params = None):
    method _get_as_list_str (line 590) | def _get_as_list_str(self, l):
    method generate_insert_row (line 593) | def generate_insert_row(self, table_name, column_names):
    method generate_create_table (line 600) | def generate_create_table(self, table_name, column_names, column_dict):
    method generate_temp_table_name (line 608) | def generate_temp_table_name(self):
    method generate_drop_table (line 614) | def generate_drop_table(self, table_name):
    method drop_table (line 617) | def drop_table(self, table_name):
    method attach_and_copy_table (line 620) | def attach_and_copy_table(self, from_db, relevant_table,stop_after_ana...
  class CouldNotConvertStringToNumericValueException (line 649) | class CouldNotConvertStringToNumericValueException(Exception):
    method __init__ (line 651) | def __init__(self, msg):
    method __str (line 654) | def __str(self):
  class SqliteOperationalErrorException (line 657) | class SqliteOperationalErrorException(Exception):
    method __init__ (line 659) | def __init__(self, msg,original_error):
    method __str (line 663) | def __str(self):
  class IncorrectDefaultValueException (line 666) | class IncorrectDefaultValueException(Exception):
    method __init__ (line 668) | def __init__(self, option_type,option,actual_value):
    method __str__ (line 673) | def __str__(self):
  class NonExistentTableNameInQsql (line 676) | class NonExistentTableNameInQsql(Exception):
    method __init__ (line 678) | def __init__(self, qsql_filename,table_name,existing_table_names):
  class NonExistentTableNameInSqlite (line 683) | class NonExistentTableNameInSqlite(Exception):
    method __init__ (line 685) | def __init__(self, qsql_filename,table_name,existing_table_names):
  class TooManyTablesInQsqlException (line 690) | class TooManyTablesInQsqlException(Exception):
    method __init__ (line 692) | def __init__(self, qsql_filename,existing_table_names):
  class NoTableInQsqlExcption (line 696) | class NoTableInQsqlExcption(Exception):
    method __init__ (line 698) | def __init__(self, qsql_filename):
  class TooManyTablesInSqliteException (line 701) | class TooManyTablesInSqliteException(Exception):
    method __init__ (line 703) | def __init__(self, qsql_filename,existing_table_names):
  class NoTablesInSqliteException (line 707) | class NoTablesInSqliteException(Exception):
    method __init__ (line 709) | def __init__(self, sqlite_filename):
  class ColumnMaxLengthLimitExceededException (line 712) | class ColumnMaxLengthLimitExceededException(Exception):
    method __init__ (line 714) | def __init__(self, msg):
    method __str (line 717) | def __str(self):
  class CouldNotParseInputException (line 720) | class CouldNotParseInputException(Exception):
    method __init__ (line 722) | def __init__(self, msg):
    method __str (line 725) | def __str(self):
  class BadHeaderException (line 728) | class BadHeaderException(Exception):
    method __init__ (line 730) | def __init__(self, msg):
    method __str (line 733) | def __str(self):
  class EncodedQueryException (line 736) | class EncodedQueryException(Exception):
    method __init__ (line 738) | def __init__(self, msg):
    method __str (line 741) | def __str(self):
  class CannotUnzipDataStreamException (line 745) | class CannotUnzipDataStreamException(Exception):
    method __init__ (line 747) | def __init__(self):
  class UniversalNewlinesExistException (line 750) | class UniversalNewlinesExistException(Exception):
    method __init__ (line 752) | def __init__(self):
  class EmptyDataException (line 755) | class EmptyDataException(Exception):
    method __init__ (line 757) | def __init__(self):
  class MissingHeaderException (line 760) | class MissingHeaderException(Exception):
    method __init__ (line 762) | def __init__(self,msg):
  class InvalidQueryException (line 765) | class InvalidQueryException(Exception):
    method __init__ (line 767) | def __init__(self,msg):
  class TooManyAttachedDatabasesException (line 770) | class TooManyAttachedDatabasesException(Exception):
    method __init__ (line 772) | def __init__(self,msg):
  class FileNotFoundException (line 775) | class FileNotFoundException(Exception):
    method __init__ (line 777) | def __init__(self, msg):
    method __str (line 780) | def __str(self):
  class UnknownFileTypeException (line 783) | class UnknownFileTypeException(Exception):
    method __init__ (line 785) | def __init__(self, msg):
    method __str (line 788) | def __str(self):
  class ColumnCountMismatchException (line 792) | class ColumnCountMismatchException(Exception):
    method __init__ (line 794) | def __init__(self, msg):
  class ContentSignatureNotFoundException (line 797) | class ContentSignatureNotFoundException(Exception):
    method __init__ (line 799) | def __init__(self, msg):
  class StrictModeColumnCountMismatchException (line 802) | class StrictModeColumnCountMismatchException(Exception):
    method __init__ (line 804) | def __init__(self,atomic_fn, expected_col_count,actual_col_count,lines...
  class FluffyModeColumnCountMismatchException (line 810) | class FluffyModeColumnCountMismatchException(Exception):
    method __init__ (line 812) | def __init__(self,atomic_fn, expected_col_count,actual_col_count,lines...
  class ContentSignatureDiffersException (line 818) | class ContentSignatureDiffersException(Exception):
    method __init__ (line 820) | def __init__(self,original_filename, other_filename, filenames_str,key...
  class ContentSignatureDataDiffersException (line 829) | class ContentSignatureDataDiffersException(Exception):
    method __init__ (line 831) | def __init__(self,msg):
  class InvalidQSqliteFileException (line 835) | class InvalidQSqliteFileException(Exception):
    method __init__ (line 837) | def __init__(self,msg):
  class MaximumSourceFilesExceededException (line 841) | class MaximumSourceFilesExceededException(Exception):
    method __init__ (line 843) | def __init__(self,msg):
  class Sql (line 851) | class Sql(object):
    method __init__ (line 853) | def __init__(self, sql, data_streams):
    method normalize_qtable_name (line 919) | def normalize_qtable_name(self,qtable_name):
    method set_effective_table_name (line 929) | def set_effective_table_name(self, qtable_name, effective_table_name):
    method get_effective_sql (line 940) | def get_effective_sql(self,table_name_mapping=None):
    method get_qtable_name_effective_table_names (line 958) | def get_qtable_name_effective_table_names(self):
    method execute_and_fetch (line 961) | def execute_and_fetch(self, db):
    method materialize_using (line 967) | def materialize_using(self,loaded_table_structures_dict):
  class TableColumnInferer (line 982) | class TableColumnInferer(object):
    method __init__ (line 984) | def __init__(self, input_params):
    method _generate_content_signature (line 995) | def _generate_content_signature(self):
    method analyze (line 1007) | def analyze(self, filename, col_vals):
    method force_analysis (line 1023) | def force_analysis(self):
    method determine_type_of_value (line 1029) | def determine_type_of_value(self, value):
    method determine_type_of_value_list (line 1055) | def determine_type_of_value_list(self, value_list):
    method do_analysis (line 1075) | def do_analysis(self):
    method validate_column_names (line 1090) | def validate_column_names(self, value_list):
    method infer_column_names (line 1136) | def infer_column_names(self):
    method _do_relaxed_analysis (line 1166) | def _do_relaxed_analysis(self):
    method get_column_count_summary (line 1181) | def get_column_count_summary(self, column_count_list):
    method _do_strict_analysis (line 1187) | def _do_strict_analysis(self):
    method infer_column_types (line 1202) | def infer_column_types(self):
    method get_column_dict (line 1225) | def get_column_dict(self):
    method get_column_count (line 1228) | def get_column_count(self):
    method get_column_names (line 1231) | def get_column_names(self):
    method get_column_types (line 1234) | def get_column_types(self):
  function py3_encoded_csv_reader (line 1238) | def py3_encoded_csv_reader(encoding, f, dialect,row_data_only=False,**kw...
  function normalized_filename (line 1269) | def normalized_filename(filename):
  class TableCreatorState (line 1272) | class TableCreatorState(object):
  class MaterializedStateType (line 1277) | class MaterializedStateType(object):
  class TableSourceType (line 1284) | class TableSourceType(object):
  function skip_BOM (line 1292) | def skip_BOM(f):
  function detect_qtable_name_source_info (line 1303) | def detect_qtable_name_source_info(qtable_name,data_streams,read_caching...
  function is_sqlite_file (line 1340) | def is_sqlite_file(filename):
  function sqlite_table_exists (line 1349) | def sqlite_table_exists(cursor,table_name):
  function is_qsql_file (line 1353) | def is_qsql_file(filename):
  function normalize_filename_to_table_name (line 1362) | def normalize_filename_to_table_name(filename):
  function validate_content_signature (line 1375) | def validate_content_signature(original_filename, source_signature,other...
  class DelimitedFileReader (line 1394) | class DelimitedFileReader(object):
    method __init__ (line 1395) | def __init__(self,atomic_fns, input_params, dialect, f = None,external...
    method get_lines_read (line 1414) | def get_lines_read(self):
    method get_size_hash (line 1417) | def get_size_hash(self):
    method get_last_modification_time_hash (line 1423) | def get_last_modification_time_hash(self):
    method open_file (line 1432) | def open_file(self):
    method close_file (line 1471) | def close_file(self):
    method generate_rows (line 1479) | def generate_rows(self):
  class MaterializedState (line 1500) | class MaterializedState(object):
    method __init__ (line 1501) | def __init__(self, table_source_type,qtable_name, engine_id):
    method get_materialized_state_type (line 1524) | def get_materialized_state_type(self):
    method get_planned_table_name (line 1527) | def get_planned_table_name(self):
    method autodetect_table_name (line 1530) | def autodetect_table_name(self):
    method initialize (line 1552) | def initialize(self):
    method finalize (line 1555) | def finalize(self):
    method choose_db_to_use (line 1559) | def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=Fa...
    method make_data_available (line 1562) | def make_data_available(self,stop_after_analysis):
  class MaterializedDelimitedFileState (line 1565) | class MaterializedDelimitedFileState(MaterializedState):
    method __init__ (line 1566) | def __init__(self, table_source_type,qtable_name, input_params, dialec...
    method get_materialized_state_type (line 1579) | def get_materialized_state_type(self):
    method initialize (line 1582) | def initialize(self):
    method materialize_file_list (line 1593) | def materialize_file_list(self,qtable_name):
    method choose_db_to_use (line 1632) | def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=Fa...
    method __analyze_delimited_file (line 1651) | def __analyze_delimited_file(self,database_info):
    method _generate_disk_db_filename (line 1675) | def _generate_disk_db_filename(self, filenames_str):
    method _get_should_read_from_cache (line 1680) | def _get_should_read_from_cache(self, disk_db_filename):
    method calculate_should_read_from_cache (line 1687) | def calculate_should_read_from_cache(self):
    method get_planned_table_name (line 1695) | def get_planned_table_name(self):
    method make_data_available (line 1698) | def make_data_available(self,stop_after_analysis):
    method save_cache_to_disk_if_needed (line 1737) | def save_cache_to_disk_if_needed(self, disk_db_filename, table_creator):
    method _store_qsql (line 1751) | def _store_qsql(self, source_sqlite_db, disk_db_filename):
    method _generate_db_name (line 1759) | def _generate_db_name(self, qtable_name):
  class MaterialiedDataStreamState (line 1763) | class MaterialiedDataStreamState(MaterializedDelimitedFileState):
    method __init__ (line 1764) | def __init__(self, table_source_type, qtable_name, input_params, diale...
    method get_planned_table_name (line 1775) | def get_planned_table_name(self):
    method get_materialized_state_type (line 1778) | def get_materialized_state_type(self):
    method initialize (line 1781) | def initialize(self):
    method choose_db_to_use (line 1791) | def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=Fa...
    method calculate_should_read_from_cache (line 1801) | def calculate_should_read_from_cache(self):
    method finalize (line 1805) | def finalize(self):
    method save_cache_to_disk_if_needed (line 1808) | def save_cache_to_disk_if_needed(self, disk_db_filename, table_creator):
  class MaterializedSqliteState (line 1813) | class MaterializedSqliteState(MaterializedState):
    method __init__ (line 1814) | def __init__(self,table_source_type,qtable_name,sqlite_filename,table_...
    method initialize (line 1821) | def initialize(self):
    method get_planned_table_name (line 1832) | def get_planned_table_name(self):
    method autodetect_table_name (line 1839) | def autodetect_table_name(self):
    method validate_table_name (line 1852) | def validate_table_name(self):
    method finalize (line 1862) | def finalize(self):
    method get_materialized_state_type (line 1865) | def get_materialized_state_type(self):
    method _generate_qsql_only_db_name__temp (line 1868) | def _generate_qsql_only_db_name__temp(self, filenames_str):
    method choose_db_to_use (line 1871) | def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=Fa...
    method make_data_available (line 1889) | def make_data_available(self,stop_after_analysis):
    method _extract_information (line 1903) | def _extract_information(self):
  class MaterializedQsqlState (line 1924) | class MaterializedQsqlState(MaterializedState):
    method __init__ (line 1925) | def __init__(self,table_source_type,qtable_name,qsql_filename,table_na...
    method initialize (line 1937) | def initialize(self):
    method get_planned_table_name (line 1948) | def get_planned_table_name(self):
    method autodetect_table_name (line 1955) | def autodetect_table_name(self):
    method validate_table_name (line 1971) | def validate_table_name(self):
    method finalize (line 1984) | def finalize(self):
    method get_materialized_state_type (line 1987) | def get_materialized_state_type(self):
    method _generate_qsql_only_db_name__temp (line 1990) | def _generate_qsql_only_db_name__temp(self, filenames_str):
    method choose_db_to_use (line 1993) | def choose_db_to_use(self,forced_db_to_use=None,stop_after_analysis=Fa...
    method make_data_available (line 2024) | def make_data_available(self,stop_after_analysis):
    method _extract_information (line 2038) | def _extract_information(self):
    method _backing_original_file_exists (line 2054) | def _backing_original_file_exists(self):
    method _read_table_from_cache (line 2057) | def _read_table_from_cache(self, stop_after_analysis):
  class MaterializedStateTableStructure (line 2082) | class MaterializedStateTableStructure(object):
    method __init__ (line 2083) | def __init__(self,qtable_name, atomic_fns, db_id, column_names, python...
    method get_table_name_for_querying (line 2099) | def get_table_name_for_querying(self):
    method __str__ (line 2102) | def __str__(self):
  class TableCreator (line 2106) | class TableCreator(object):
    method __str__ (line 2107) | def __str__(self):
    method __init__ (line 2111) | def __init__(self, qtable_name, delimited_file_reader,input_params,sql...
    method _generate_content_signature (line 2145) | def _generate_content_signature(self):
    method validate_extra_header_if_needed (line 2169) | def validate_extra_header_if_needed(self, file_number, filename,col_va...
    method _populate (line 2195) | def _populate(self,dialect,stop_after_analysis=False):
    method perform_analyze (line 2230) | def perform_analyze(self, dialect):
    method perform_read_fully (line 2243) | def perform_read_fully(self, dialect):
    method _flush_pre_creation_rows (line 2251) | def _flush_pre_creation_rows(self, filename):
    method _insert_row (line 2260) | def _insert_row(self, filename, col_vals):
    method initialize_numeric_column_indices_if_needed (line 2276) | def initialize_numeric_column_indices_if_needed(self):
    method nullify_values_if_needed (line 2283) | def nullify_values_if_needed(self, col_vals):
    method normalize_col_vals (line 2294) | def normalize_col_vals(self, col_vals):
    method _insert_row_i (line 2325) | def _insert_row_i(self, col_vals):
    method _flush_inserts (line 2340) | def _flush_inserts(self):
    method try_to_create_table (line 2352) | def try_to_create_table(self, filename, col_vals):
    method _do_create_table (line 2365) | def _do_create_table(self,filename):
  function determine_max_col_lengths (line 2386) | def determine_max_col_lengths(m,output_field_quoting_func,output_delimit...
  function print_credentials (line 2398) | def print_credentials():
  class QWarning (line 2405) | class QWarning(object):
    method __init__ (line 2406) | def __init__(self,exception,msg):
  class QError (line 2410) | class QError(object):
    method __init__ (line 2411) | def __init__(self,exception,msg,errorcode):
    method __str__ (line 2417) | def __str__(self):
  class QMetadata (line 2421) | class QMetadata(object):
    method __init__ (line 2422) | def __init__(self,table_structures={},new_table_structures={},output_c...
    method __str__ (line 2427) | def __str__(self):
  class QOutput (line 2431) | class QOutput(object):
    method __init__ (line 2432) | def __init__(self,data=None,metadata=None,warnings=[],error=None):
    method __str__ (line 2443) | def __str__(self):
  class QInputParams (line 2461) | class QInputParams(object):
    method __init__ (line 2462) | def __init__(self,skip_header=False,
    method merged_with (line 2489) | def merged_with(self,input_params):
    method __str__ (line 2495) | def __str__(self):
    method __repr__ (line 2498) | def __repr__(self):
  class DataStream (line 2501) | class DataStream(object):
    method __init__ (line 2503) | def __init__(self,stream_id,filename,stream):
    method __str__ (line 2508) | def __str__(self):
  class DataStreams (line 2513) | class DataStreams(object):
    method __init__ (line 2514) | def __init__(self, data_streams_dict):
    method validate (line 2519) | def validate(self,d):
    method get_for_filename (line 2525) | def get_for_filename(self, filename):
    method is_data_stream (line 2530) | def is_data_stream(self,filename):
  class DatabaseInfo (line 2533) | class DatabaseInfo(object):
    method __init__ (line 2534) | def __init__(self,db_id,sqlite_db,needs_closing):
    method __str__ (line 2539) | def __str__(self):
  class QTextAsData (line 2543) | class QTextAsData(object):
    method __init__ (line 2544) | def __init__(self,default_input_params=QInputParams(),data_streams_dic...
    method done (line 2570) | def done(self):
    method determine_proper_dialect (line 2588) | def determine_proper_dialect(self,input_params):
    method get_dialect_id (line 2607) | def get_dialect_id(self,filename):
    method _open_files_and_get_mfss (line 2610) | def _open_files_and_get_mfss(self,qtable_name,input_params,dialect):
    method _load_mfs (line 2644) | def _load_mfs(self,mfs,input_params,dialect_id,stop_after_analysis):
    method add_db_to_database_list (line 2679) | def add_db_to_database_list(self,database_info):
    method is_adhoc_db (line 2691) | def is_adhoc_db(self,db_to_use):
    method should_copy_instead_of_attach (line 2694) | def should_copy_instead_of_attach(self,input_params):
    method _load_data (line 2700) | def _load_data(self,qtable_name,input_params=QInputParams(),stop_after...
    method already_attached_to_query_level_db (line 2732) | def already_attached_to_query_level_db(self,db_to_attach):
    method attach_to_db (line 2736) | def attach_to_db(self, target_db, source_db):
    method detach_from_db (line 2747) | def detach_from_db(self, target_db, source_db):
    method load_data (line 2755) | def load_data(self,filename,input_params=QInputParams(),stop_after_ana...
    method _ensure_data_is_loaded_for_sql (line 2758) | def _ensure_data_is_loaded_for_sql(self,sql_object,input_params,data_s...
    method materialize_query_level_db (line 2772) | def materialize_query_level_db(self,save_db_to_disk_filename,sql_object):
    method validate_query (line 2820) | def validate_query(self,sql_object,table_structures):
    method _execute (line 2850) | def _execute(self,query_str,input_params=None,data_streams=None,stop_a...
    method execute (line 2992) | def execute(self,query_str,input_params=None,save_db_to_disk_filename=...
    method unload (line 2996) | def unload(self):
    method analyze (line 3006) | def analyze(self,query_str,input_params=None,data_streams=None):
  function escape_double_quotes_if_needed (line 3011) | def escape_double_quotes_if_needed(v):
  function quote_none_func (line 3015) | def quote_none_func(output_delimiter,v):
  function quote_minimal_func (line 3018) | def quote_minimal_func(output_delimiter,v):
  function quote_nonnumeric_func (line 3026) | def quote_nonnumeric_func(output_delimiter,v):
  function quote_all_func (line 3033) | def quote_all_func(output_delimiter,v):
  class QOutputParams (line 3039) | class QOutputParams(object):
    method __init__ (line 3040) | def __init__(self,
    method __str__ (line 3054) | def __str__(self):
    method __repr__ (line 3057) | def __repr__(self):
  class QOutputPrinter (line 3060) | class QOutputPrinter(object):
    method __init__ (line 3066) | def __init__(self,output_params,show_tracebacks=False):
    method print_errors_and_warnings (line 3072) | def print_errors_and_warnings(self,f,results):
    method print_analysis (line 3082) | def print_analysis(self,f_out,f_err,results):
    method print_output (line 3101) | def print_output(self,f_out,f_err,results):
    method _print_output (line 3117) | def _print_output(self,f_out,f_err,results):
  function get_option_with_default (line 3185) | def get_option_with_default(p, option_type, option, default):
  function dump_default_values_as_qrc (line 3207) | def dump_default_values_as_qrc(parser,exclusions):
  function run_standalone (line 3249) | def run_standalone():
  function dump_version_and_stop__if_needed (line 3273) | def dump_version_and_stop__if_needed(options):
  function dump_defaults_and_stop__if_needed (line 3279) | def dump_defaults_and_stop__if_needed(options, parser):
  function execute_queries (line 3285) | def execute_queries(STDOUT, options, q_engine, q_output_printer, query_s...
  function initialize_command_line_parser (line 3298) | def initialize_command_line_parser(p, qrc_filename):
  function parse_qrc_file (line 3442) | def parse_qrc_file():
  function initialize_default_data_streams (line 3461) | def initialize_default_data_streams():
  function parse_options (line 3468) | def parse_options(args, options):

FILE: mkdocs/docs/js/google-analytics.js
  function GAizeDownloadLink (line 6) | function GAizeDownloadLink(a) {
  function GAizeTOCLink (line 29) | function GAizeTOCLink(l) {

FILE: test/test_suite.py
  function batch (line 68) | def batch(iterable, n=1):
  function partition (line 75) | def partition(pred, iterable):
  function run_command (line 79) | def run_command(cmd_to_run,env_to_inject=None):
  function generate_sample_data_with_header (line 144) | def generate_sample_data_with_header(header):
  function one_column_warning (line 189) | def one_column_warning(e):
  function sqlite_dict_factory (line 192) | def sqlite_dict_factory(cursor, row):
  class AbstractQTestCase (line 198) | class AbstractQTestCase(unittest.TestCase):
    method create_file_with_data (line 200) | def create_file_with_data(self, data, encoding=None,prefix=None,suffix...
    method generate_tmpfile_name (line 210) | def generate_tmpfile_name(self,prefix=None,suffix=None):
    method arrays_to_csv_file_content (line 215) | def arrays_to_csv_file_content(self,delimiter,header_row_list,cell_list):
    method create_qsql_file_with_content_and_return_filename (line 219) | def create_qsql_file_with_content_and_return_filename(self, header_row...
    method arrays_to_qsql_file_content (line 232) | def arrays_to_qsql_file_content(self, header_row,cell_list):
    method write_file (line 249) | def write_file(self,filename,data):
    method create_folder_with_files (line 254) | def create_folder_with_files(self,filename_to_content_dict,prefix, suf...
    method cleanup_folder (line 265) | def cleanup_folder(self,tmpfolder):
    method cleanup (line 273) | def cleanup(self, tmpfile):
    method random_tmp_filename (line 278) | def random_tmp_filename(self,prefix,postfix):
  function get_sqlite_table_list (line 285) | def get_sqlite_table_list(c,exclude_qcatalog=True):
  class SaveToSqliteTests (line 293) | class SaveToSqliteTests(AbstractQTestCase):
    method generate_files_in_folder (line 296) | def generate_files_in_folder(self,batch_size, file_count):
    method test_save_glob_files_to_sqlite (line 311) | def test_save_glob_files_to_sqlite(self):
    method test_save_multiple_files_to_sqlite (line 337) | def test_save_multiple_files_to_sqlite(self):
    method test_save_multiple_files_to_sqlite_without_duplicates (line 367) | def test_save_multiple_files_to_sqlite_without_duplicates(self):
    method test_sqlite_file_is_not_created_if_some_table_does_not_exist (line 402) | def test_sqlite_file_is_not_created_if_some_table_does_not_exist(self):
    method test_recurring_glob_and_separate_files_in_same_query_when_writing_to_sqlite (line 426) | def test_recurring_glob_and_separate_files_in_same_query_when_writing_...
    method test_empty_sqlite_handling (line 464) | def test_empty_sqlite_handling(self):
    method test_storing_to_disk_too_many_qsql_files (line 480) | def test_storing_to_disk_too_many_qsql_files(self):
    method test_storing_to_disk_too_many_sqlite_files (line 531) | def test_storing_to_disk_too_many_sqlite_files(self):
    method test_storing_to_disk_too_many_sqlite_files__over_the_sqlite_limit (line 590) | def test_storing_to_disk_too_many_sqlite_files__over_the_sqlite_limit(...
    method test_qtable_name_normalization__starting_with_a_digit (line 632) | def test_qtable_name_normalization__starting_with_a_digit(self):
    method test_qtable_name_normalization (line 660) | def test_qtable_name_normalization(self):
    method test_qtable_name_normalization2 (line 678) | def test_qtable_name_normalization2(self):
    method test_qtable_name_normalization3 (line 686) | def test_qtable_name_normalization3(self):
    method test_save_multiple_files_to_sqlite_while_caching_them (line 695) | def test_save_multiple_files_to_sqlite_while_caching_them(self):
    method test_globs_ignore_matching_qsql_files (line 796) | def test_globs_ignore_matching_qsql_files(self):
    method test_error_on_reading_from_multi_table_sqlite_without_explicit_table_name (line 821) | def test_error_on_reading_from_multi_table_sqlite_without_explicit_tab...
    method test_error_on_trying_to_specify_an_explicit_non_existent_qsql_file (line 848) | def test_error_on_trying_to_specify_an_explicit_non_existent_qsql_file...
    method test_error_on_providing_a_non_qsql_file_when_specifying_an_explicit_table (line 857) | def test_error_on_providing_a_non_qsql_file_when_specifying_an_explici...
    method test_error_on_providing_a_non_qsql_file_when_not_specifying_an_explicit_table (line 872) | def test_error_on_providing_a_non_qsql_file_when_not_specifying_an_exp...
  class OldSaveDbToDiskTests (line 887) | class OldSaveDbToDiskTests(AbstractQTestCase):
    method test_join_with_stdin_and_save (line 889) | def test_join_with_stdin_and_save(self):
    method test_join_with_qsql_file (line 940) | def test_join_with_qsql_file(self):
    method test_join_with_qsql_file_and_save (line 979) | def test_join_with_qsql_file_and_save(self):
    method test_saving_to_db_with_same_basename_files (line 1013) | def test_saving_to_db_with_same_basename_files(self):
    method test_error_when_not_specifying_table_name_in_multi_table_qsql (line 1055) | def test_error_when_not_specifying_table_name_in_multi_table_qsql(self):
    method test_error_when_not_specifying_table_name_in_multi_table_sqlite (line 1092) | def test_error_when_not_specifying_table_name_in_multi_table_sqlite(se...
    method test_querying_from_multi_table_sqlite_using_explicit_table_name (line 1109) | def test_querying_from_multi_table_sqlite_using_explicit_table_name(se...
    method test_error_when_specifying_nonexistent_table_name_in_multi_table_qsql (line 1139) | def test_error_when_specifying_nonexistent_table_name_in_multi_table_q...
    method test_querying_multi_table_qsql_file (line 1177) | def test_querying_multi_table_qsql_file(self):
    method test_preventing_db_overwrite (line 1222) | def test_preventing_db_overwrite(self):
  class BasicTests (line 1239) | class BasicTests(AbstractQTestCase):
    method test_basic_aggregation (line 1241) | def test_basic_aggregation(self):
    method test_select_one_column (line 1251) | def test_select_one_column(self):
    method test_column_separation (line 1265) | def test_column_separation(self):
    method test_header_exception_on_numeric_header_data (line 1280) | def test_header_exception_on_numeric_header_data(self):
    method test_different_header_in_second_file (line 1295) | def test_different_header_in_second_file(self):
    method test_data_with_header (line 1308) | def test_data_with_header(self):
    method test_output_header_when_input_header_exists (line 1319) | def test_output_header_when_input_header_exists(self):
    method test_generated_column_name_warning_when_header_line_exists (line 1333) | def test_generated_column_name_warning_when_header_line_exists(self):
    method test_empty_data (line 1348) | def test_empty_data(self):
    method test_empty_data_with_header_param (line 1361) | def test_empty_data_with_header_param(self):
    method test_one_row_of_data_without_header_param (line 1375) | def test_one_row_of_data_without_header_param(self):
    method test_one_row_of_data_with_header_param (line 1388) | def test_one_row_of_data_with_header_param(self):
    method test_dont_leading_keep_whitespace_in_values (line 1401) | def test_dont_leading_keep_whitespace_in_values(self):
    method test_keep_leading_whitespace_in_values (line 1416) | def test_keep_leading_whitespace_in_values(self):
    method test_no_impact_of_keeping_leading_whitespace_on_integers (line 1431) | def test_no_impact_of_keeping_leading_whitespace_on_integers(self):
    method test_spaces_in_header_row (line 1458) | def test_spaces_in_header_row(self):
    method test_no_query_in_command_line (line 1474) | def test_no_query_in_command_line(self):
    method test_empty_query_in_command_line (line 1484) | def test_empty_query_in_command_line(self):
    method test_failure_in_query_stops_processing_queries (line 1494) | def test_failure_in_query_stops_processing_queries(self):
    method test_multiple_queries_in_command_line (line 1504) | def test_multiple_queries_in_command_line(self):
    method test_literal_calculation_query (line 1517) | def test_literal_calculation_query(self):
    method test_literal_calculation_query_float_result (line 1527) | def test_literal_calculation_query_float_result(self):
    method test_use_query_file (line 1537) | def test_use_query_file(self):
    method test_use_query_file_with_incorrect_query_encoding (line 1555) | def test_use_query_file_with_incorrect_query_encoding(self):
    method test_output_header_with_non_ascii_names (line 1571) | def test_output_header_with_non_ascii_names(self):
    method test_use_query_file_with_query_encoding (line 1592) | def test_use_query_file_with_query_encoding(self):
    method test_use_query_file_and_command_line (line 1612) | def test_use_query_file_and_command_line(self):
    method test_select_output_encoding (line 1628) | def test_select_output_encoding(self):
    method test_select_failed_output_encoding (line 1647) | def test_select_failed_output_encoding(self):
    method test_use_query_file_with_empty_query (line 1664) | def test_use_query_file_with_empty_query(self):
    method test_use_non_existent_query_file (line 1678) | def test_use_non_existent_query_file(self):
    method test_nonexistent_file (line 1688) | def test_nonexistent_file(self):
    method test_default_column_max_length_parameter__short_enough (line 1699) | def test_default_column_max_length_parameter__short_enough(self):
    method test_default_column_max_length_parameter__too_long (line 1717) | def test_default_column_max_length_parameter__too_long(self):
    method test_column_max_length_parameter (line 1737) | def test_column_max_length_parameter(self):
    method test_invalid_column_max_length_parameter (line 1763) | def test_invalid_column_max_length_parameter(self):
    method test_duplicate_column_name_detection (line 1778) | def test_duplicate_column_name_detection(self):
    method test_join_with_stdin (line 1794) | def test_join_with_stdin(self):
    method test_concatenated_files (line 1815) | def test_concatenated_files(self):
    method test_out_of_range_expected_column_count (line 1843) | def test_out_of_range_expected_column_count(self):
    method test_out_of_range_expected_column_count__with_explicit_limit (line 1852) | def test_out_of_range_expected_column_count__with_explicit_limit(self):
    method test_other_out_of_range_expected_column_count__with_explicit_limit (line 1861) | def test_other_out_of_range_expected_column_count__with_explicit_limit...
    method test_explicit_limit_of_columns__data_is_ok (line 1870) | def test_explicit_limit_of_columns__data_is_ok(self):
  class ManyOpenFilesTests (line 1884) | class ManyOpenFilesTests(AbstractQTestCase):
    method test_multi_file_header_skipping (line 1887) | def test_multi_file_header_skipping(self):
    method test_that_globs_dont_max_out_sqlite_attached_database_limits (line 1912) | def test_that_globs_dont_max_out_sqlite_attached_database_limits(self):
    method test_maxing_out_max_attached_database_limits__regular_files (line 1937) | def test_maxing_out_max_attached_database_limits__regular_files(self):
    method test_maxing_out_max_attached_database_limits__with_qsql_files_below_attached_limit (line 1963) | def test_maxing_out_max_attached_database_limits__with_qsql_files_belo...
    method test_maxing_out_max_attached_database_limits__with_qsql_files_above_attached_limit (line 2000) | def test_maxing_out_max_attached_database_limits__with_qsql_files_abov...
    method test_maxing_out_max_attached_database_limits__with_directly_using_qsql_files (line 2047) | def test_maxing_out_max_attached_database_limits__with_directly_using_...
    method test_too_many_open_files_for_one_table (line 2085) | def test_too_many_open_files_for_one_table(self):
    method test_many_open_files_for_one_table (line 2114) | def test_many_open_files_for_one_table(self):
    method test_many_open_files_for_two_tables (line 2141) | def test_many_open_files_for_two_tables(self):
  class GzippingTests (line 2171) | class GzippingTests(AbstractQTestCase):
    method test_gzipped_file (line 2173) | def test_gzipped_file(self):
  class DelimiterTests (line 2190) | class DelimiterTests(AbstractQTestCase):
    method test_delimition_mistake_with_header (line 2192) | def test_delimition_mistake_with_header(self):
    method test_tab_delimition_parameter (line 2207) | def test_tab_delimition_parameter(self):
    method test_pipe_delimition_parameter (line 2222) | def test_pipe_delimition_parameter(self):
    method test_tab_delimition_parameter__with_manual_override_attempt (line 2237) | def test_tab_delimition_parameter__with_manual_override_attempt(self):
    method test_pipe_delimition_parameter__with_manual_override_attempt (line 2253) | def test_pipe_delimition_parameter__with_manual_override_attempt(self):
    method test_output_delimiter (line 2269) | def test_output_delimiter(self):
    method test_output_delimiter_tab_parameter (line 2284) | def test_output_delimiter_tab_parameter(self):
    method test_output_delimiter_pipe_parameter (line 2299) | def test_output_delimiter_pipe_parameter(self):
    method test_output_delimiter_tab_parameter__with_manual_override_attempt (line 2314) | def test_output_delimiter_tab_parameter__with_manual_override_attempt(...
    method test_output_delimiter_pipe_parameter__with_manual_override_attempt (line 2330) | def test_output_delimiter_pipe_parameter__with_manual_override_attempt...
  class AnalysisTests (line 2347) | class AnalysisTests(AbstractQTestCase):
    method test_analyze_result (line 2349) | def test_analyze_result(self):
    method test_analyze_result_with_data_stream (line 2368) | def test_analyze_result_with_data_stream(self):
    method test_column_analysis (line 2387) | def test_column_analysis(self):
    method test_column_analysis_with_mixed_ints_and_floats (line 2404) | def test_column_analysis_with_mixed_ints_and_floats(self):
    method test_column_analysis_with_mixed_ints_and_floats_and_nulls (line 2424) | def test_column_analysis_with_mixed_ints_and_floats_and_nulls(self):
    method test_column_analysis_no_header (line 2444) | def test_column_analysis_no_header(self):
    method test_column_analysis_with_unexpected_header (line 2459) | def test_column_analysis_with_unexpected_header(self):
    method test_column_analysis_for_spaces_in_header_row (line 2481) | def test_column_analysis_for_spaces_in_header_row(self):
    method test_column_analysis_with_header (line 2501) | def test_column_analysis_with_header(self):
  class StdInTests (line 2524) | class StdInTests(AbstractQTestCase):
    method test_stdin_input (line 2526) | def test_stdin_input(self):
    method test_attempt_to_unzip_stdin (line 2538) | def test_attempt_to_unzip_stdin(self):
  class QuotingTests (line 2553) | class QuotingTests(AbstractQTestCase):
    method test_non_quoted_values_in_quoted_data (line 2554) | def test_non_quoted_values_in_quoted_data(self):
    method test_regular_quoted_values_in_quoted_data (line 2572) | def test_regular_quoted_values_in_quoted_data(self):
    method test_double_double_quoted_values_in_quoted_data (line 2589) | def test_double_double_quoted_values_in_quoted_data(self):
    method test_escaped_double_quoted_values_in_quoted_data (line 2606) | def test_escaped_double_quoted_values_in_quoted_data(self):
    method test_none_input_quoting_mode_in_relaxed_mode (line 2623) | def test_none_input_quoting_mode_in_relaxed_mode(self):
    method test_none_input_quoting_mode_in_strict_mode (line 2638) | def test_none_input_quoting_mode_in_strict_mode(self):
    method test_minimal_input_quoting_mode (line 2652) | def test_minimal_input_quoting_mode(self):
    method test_all_input_quoting_mode (line 2667) | def test_all_input_quoting_mode(self):
    method test_incorrect_input_quoting_mode (line 2682) | def test_incorrect_input_quoting_mode(self):
    method test_none_output_quoting_mode (line 2697) | def test_none_output_quoting_mode(self):
    method test_minimal_output_quoting_mode__without_need_to_quote_in_output (line 2712) | def test_minimal_output_quoting_mode__without_need_to_quote_in_output(...
    method test_minimal_output_quoting_mode__with_need_to_quote_in_output_due_to_delimiter (line 2727) | def test_minimal_output_quoting_mode__with_need_to_quote_in_output_due...
    method test_minimal_output_quoting_mode__with_need_to_quote_in_output_due_to_newline (line 2743) | def test_minimal_output_quoting_mode__with_need_to_quote_in_output_due...
    method test_nonnumeric_output_quoting_mode (line 2761) | def test_nonnumeric_output_quoting_mode(self):
    method test_all_output_quoting_mode (line 2776) | def test_all_output_quoting_mode(self):
    method _internal_test_consistency_of_chaining_output_to_input (line 2791) | def _internal_test_consistency_of_chaining_output_to_input(self,input_...
    method test_consistency_of_chaining_minimal_wrapping_to_minimal_wrapping (line 2808) | def test_consistency_of_chaining_minimal_wrapping_to_minimal_wrapping(...
    method test_consistency_of_chaining_all_wrapping_to_all_wrapping (line 2812) | def test_consistency_of_chaining_all_wrapping_to_all_wrapping(self):
    method test_input_field_quoting_and_data_types_with_encoding (line 2816) | def test_input_field_quoting_and_data_types_with_encoding(self):
    method test_multiline_double_double_quoted_values_in_quoted_data (line 2853) | def test_multiline_double_double_quoted_values_in_quoted_data(self):
    method test_multiline_escaped_double_quoted_values_in_quoted_data (line 2871) | def test_multiline_escaped_double_quoted_values_in_quoted_data(self):
    method test_disable_double_double_quoted_data_flag__values (line 2889) | def test_disable_double_double_quoted_data_flag__values(self):
    method test_disable_escaped_double_quoted_data_flag__values (line 2927) | def test_disable_escaped_double_quoted_data_flag__values(self):
    method test_combined_quoted_data_flags__number_of_columns_detected (line 2965) | def test_combined_quoted_data_flags__number_of_columns_detected(self):
  class EncodingTests (line 3009) | class EncodingTests(AbstractQTestCase):
    method test_utf8_with_bom_encoding (line 3011) | def test_utf8_with_bom_encoding(self):
  class QrcTests (line 3029) | class QrcTests(AbstractQTestCase):
    method test_explicit_qrc_filename_not_found (line 3031) | def test_explicit_qrc_filename_not_found(self):
    method test_explicit_qrc_filename_that_exists (line 3042) | def test_explicit_qrc_filename_that_exists(self):
    method test_all_default_options (line 3057) | def test_all_default_options(self):
    method test_caching_readwrite_using_qrc_file (line 3144) | def test_caching_readwrite_using_qrc_file(self):
  class QsqlUsageTests (line 3184) | class QsqlUsageTests(AbstractQTestCase):
    method test_concatenate_same_qsql_file_with_single_table (line 3186) | def test_concatenate_same_qsql_file_with_single_table(self):
    method test_query_qsql_with_single_table (line 3201) | def test_query_qsql_with_single_table(self):
    method test_query_qsql_with_single_table_with_explicit_non_existent_tablename (line 3216) | def test_query_qsql_with_single_table_with_explicit_non_existent_table...
    method test_query_qsql_with_single_table_with_explicit_table_name (line 3236) | def test_query_qsql_with_single_table_with_explicit_table_name(self):
    method test_query_multi_qsql_with_single_table (line 3256) | def test_query_multi_qsql_with_single_table(self):
    method test_query_concatenated_qsqls_each_with_single_table (line 3273) | def test_query_concatenated_qsqls_each_with_single_table(self):
    method test_concatenated_qsql_and_data_stream__column_names_mismatch (line 3290) | def test_concatenated_qsql_and_data_stream__column_names_mismatch(self):
    method test_concatenated_qsql_and_data_stream (line 3312) | def test_concatenated_qsql_and_data_stream(self):
    method test_concatenated_qsql_and_data_stream__explicit_table_name (line 3334) | def test_concatenated_qsql_and_data_stream__explicit_table_name(self):
    method test_write_to_qsql__check_chosen_table_name (line 3358) | def test_write_to_qsql__check_chosen_table_name(self):
    method test_concatenated_mixes_qsql_with_single_table_and_csv (line 3374) | def test_concatenated_mixes_qsql_with_single_table_and_csv(self):
    method test_analysis_of_concatenated_mixes_qsql_with_single_table_and_csv (line 3430) | def test_analysis_of_concatenated_mixes_qsql_with_single_table_and_csv...
    method test_mixed_qsql_with_single_table_and_csv__missing_header_parameter_for_csv (line 3518) | def test_mixed_qsql_with_single_table_and_csv__missing_header_paramete...
    method test_qsql_with_multiple_tables_direct_use (line 3536) | def test_qsql_with_multiple_tables_direct_use(self):
    method test_direct_use_of_sqlite_db_with_one_table (line 3569) | def test_direct_use_of_sqlite_db_with_one_table(self):
    method test_direct_use_of_sqlite_db_with_one_table__nonexistent_table (line 3594) | def test_direct_use_of_sqlite_db_with_one_table__nonexistent_table(self):
    method test_qsql_creation_and_direct_use (line 3612) | def test_qsql_creation_and_direct_use(self):
    method test_analysis_of_qsql_direct_usage (line 3643) | def test_analysis_of_qsql_direct_usage(self):
    method test_analysis_of_qsql_direct_usage2 (line 3679) | def test_analysis_of_qsql_direct_usage2(self):
    method test_direct_qsql_usage_for_single_table_qsql_file (line 3715) | def test_direct_qsql_usage_for_single_table_qsql_file(self):
    method test_direct_qsql_usage_for_single_table_qsql_file__nonexistent_table (line 3731) | def test_direct_qsql_usage_for_single_table_qsql_file__nonexistent_tab...
    method test_direct_qsql_usage_from_written_data_stream (line 3747) | def test_direct_qsql_usage_from_written_data_stream(self):
    method test_direct_qsql_self_join (line 3763) | def test_direct_qsql_self_join(self):
  class CachingTests (line 3783) | class CachingTests(AbstractQTestCase):
    method test_cache_empty_file (line 3785) | def test_cache_empty_file(self):
    method test_reading_the_wrong_cache__original_file_having_different_data (line 3823) | def test_reading_the_wrong_cache__original_file_having_different_data(...
    method test_reading_the_wrong_cache__original_file_having_different_delimiter (line 3857) | def test_reading_the_wrong_cache__original_file_having_different_delim...
    method test_rename_cache_and_read_from_it (line 3891) | def test_rename_cache_and_read_from_it(self):
    method test_reading_the_wrong_cache__qsql_file_not_having_a_matching_content_signature (line 3924) | def test_reading_the_wrong_cache__qsql_file_not_having_a_matching_cont...
    method test_reading_the_wrong_cache__qsql_file_not_having_any_content_signature (line 3978) | def test_reading_the_wrong_cache__qsql_file_not_having_any_content_sig...
    method test_cache_full_flow (line 4015) | def test_cache_full_flow(self):
    method test_cache_full_flow_with_concatenated_files (line 4082) | def test_cache_full_flow_with_concatenated_files(self):
    method test_analyze_result_with_cache_file (line 4114) | def test_analyze_result_with_cache_file(self):
    method test_partial_caching_exists (line 4172) | def test_partial_caching_exists(self):
  class UserFunctionTests (line 4230) | class UserFunctionTests(AbstractQTestCase):
    method test_regexp_int_data_handling (line 4231) | def test_regexp_int_data_handling(self):
    method test_percentile_func (line 4245) | def test_percentile_func(self):
    method test_regexp_null_data_handling (line 4266) | def test_regexp_null_data_handling(self):
    method test_md5_function (line 4280) | def test_md5_function(self):
    method test_stddev_functions (line 4293) | def test_stddev_functions(self):
    method test_sqrt_function (line 4307) | def test_sqrt_function(self):
    method test_power_function (line 4321) | def test_power_function(self):
    method test_file_functions (line 4335) | def test_file_functions(self):
    method test_sha1_function (line 4360) | def test_sha1_function(self):
    method test_regexp_extract_function (line 4373) | def test_regexp_extract_function(self):
    method test_sha_function (line 4392) | def test_sha_function(self):
  class MultiHeaderTests (line 4406) | class MultiHeaderTests(AbstractQTestCase):
    method test_output_header_when_multiple_input_headers_exist (line 4407) | def test_output_header_when_multiple_input_headers_exist(self):
    method test_output_header_when_extra_header_column_names_are_different__concatenation_replacement (line 4433) | def test_output_header_when_extra_header_column_names_are_different__c...
    method test_output_header_when_extra_header_has_different_number_of_columns (line 4456) | def test_output_header_when_extra_header_has_different_number_of_colum...
  class ParsingModeTests (line 4480) | class ParsingModeTests(AbstractQTestCase):
    method test_strict_mode_column_count_mismatch_error (line 4482) | def test_strict_mode_column_count_mismatch_error(self):
    method test_strict_mode_too_large_specific_column_count (line 4495) | def test_strict_mode_too_large_specific_column_count(self):
    method test_strict_mode_too_small_specific_column_count (line 4509) | def test_strict_mode_too_small_specific_column_count(self):
    method test_relaxed_mode_missing_columns_in_header (line 4523) | def test_relaxed_mode_missing_columns_in_header(self):
    method test_strict_mode_missing_columns_in_header (line 4543) | def test_strict_mode_missing_columns_in_header(self):
    method test_output_delimiter_with_missing_fields (line 4558) | def test_output_delimiter_with_missing_fields(self):
    method test_handling_of_null_integers (line 4573) | def test_handling_of_null_integers(self):
    method test_empty_integer_values_converted_to_null (line 4586) | def test_empty_integer_values_converted_to_null(self):
    method test_empty_string_values_not_converted_to_null (line 4599) | def test_empty_string_values_not_converted_to_null(self):
    method test_relaxed_mode_detected_columns (line 4614) | def test_relaxed_mode_detected_columns(self):
    method test_relaxed_mode_detected_columns_with_specific_column_count (line 4637) | def test_relaxed_mode_detected_columns_with_specific_column_count(self):
    method test_relaxed_mode_last_column_data_with_specific_column_count (line 4660) | def test_relaxed_mode_last_column_data_with_specific_column_count(self):
    method test_1_column_warning_in_relaxed_mode (line 4676) | def test_1_column_warning_in_relaxed_mode(self):
    method test_1_column_warning_in_strict_mode (line 4690) | def test_1_column_warning_in_strict_mode(self):
    method test_1_column_warning_suppression_in_relaxed_mode_when_column_count_is_specific (line 4705) | def test_1_column_warning_suppression_in_relaxed_mode_when_column_coun...
    method test_1_column_warning_suppression_in_strict_mode_when_column_count_is_specific (line 4719) | def test_1_column_warning_suppression_in_strict_mode_when_column_count...
    method test_fluffy_mode__as_relaxed_mode (line 4733) | def test_fluffy_mode__as_relaxed_mode(self):
    method test_relaxed_mode_column_count_mismatch__was_previously_fluffy_mode_test (line 4749) | def test_relaxed_mode_column_count_mismatch__was_previously_fluffy_mod...
    method test_strict_mode_column_count_mismatch__less_columns (line 4765) | def test_strict_mode_column_count_mismatch__less_columns(self):
    method test_strict_mode_column_count_mismatch__more_columns (line 4782) | def test_strict_mode_column_count_mismatch__more_columns(self):
  class FormattingTests (line 4800) | class FormattingTests(AbstractQTestCase):
    method test_column_formatting (line 4802) | def test_column_formatting(self):
    method test_column_formatting_with_output_header (line 4815) | def test_column_formatting_with_output_header(self):
    method py3_test_successfuly_parse_universal_newlines_without_explicit_flag (line 4830) | def py3_test_successfuly_parse_universal_newlines_without_explicit_fla...
    method test_universal_newlines_parsing_flag (line 4859) | def test_universal_newlines_parsing_flag(self):
  class SqlTests (line 4896) | class SqlTests(AbstractQTestCase):
    method test_find_example (line 4898) | def test_find_example(self):
    method test_join_example (line 4913) | def test_join_example(self):
    method test_join_example_with_output_header (line 4923) | def test_join_example_with_output_header(self):
    method test_self_join1 (line 4934) | def test_self_join1(self):
    method test_self_join_reuses_table (line 4945) | def test_self_join_reuses_table(self):
    method test_self_join2 (line 4963) | def test_self_join2(self):
    method test_disable_column_type_detection (line 4984) | def test_disable_column_type_detection(self):
  class BasicModuleTests (line 5059) | class BasicModuleTests(AbstractQTestCase):
    method test_engine_isolation (line 5061) | def test_engine_isolation(self):
    method test_simple_query (line 5119) | def test_simple_query(self):
    method test_loaded_data_reuse (line 5139) | def test_loaded_data_reuse(self):
    method test_stdin_injection (line 5169) | def test_stdin_injection(self):
    method test_named_stdin_injection (line 5193) | def test_named_stdin_injection(self):
    method test_data_stream_isolation (line 5215) | def test_data_stream_isolation(self):
    method test_multiple_stdin_injection (line 5260) | def test_multiple_stdin_injection(self):
    method test_different_input_params_for_different_files (line 5306) | def test_different_input_params_for_different_files(self):
    method test_different_input_params_for_different_files_2 (line 5329) | def test_different_input_params_for_different_files_2(self):
    method test_input_params_override (line 5352) | def test_input_params_override(self):
    method test_input_params_merge (line 5384) | def test_input_params_merge(self):
    method test_table_analysis_with_syntax_error (line 5398) | def test_table_analysis_with_syntax_error(self):
    method test_execute_response (line 5408) | def test_execute_response(self):
    method test_analyze_response (line 5441) | def test_analyze_response(self):
    method test_load_data_from_string_without_previous_data_load (line 5474) | def test_load_data_from_string_without_previous_data_load(self):
    method test_load_data_from_string_with_previous_data_load (line 5509) | def test_load_data_from_string_with_previous_data_load(self):
  class BenchmarkAttemptResults (line 5545) | class BenchmarkAttemptResults(object):
    method __init__ (line 5546) | def __init__(self, attempt, lines, columns, duration,return_code):
    method __str__ (line 5553) | def __str__(self):
  class BenchmarkResults (line 5557) | class BenchmarkResults(object):
    method __init__ (line 5558) | def __init__(self, lines, columns, attempt_results, mean, stddev):
    method __str__ (line 5565) | def __str__(self):
  class BenchmarkTests (line 5570) | class BenchmarkTests(AbstractQTestCase):
    method _ensure_benchmark_data_dir_exists (line 5574) | def _ensure_benchmark_data_dir_exists(self):
    method _create_benchmark_file_if_needed (line 5580) | def _create_benchmark_file_if_needed(self):
    method _prepare_test_file (line 5593) | def _prepare_test_file(self, lines, columns):
    method _decide_result (line 5615) | def _decide_result(self,attempt_results):
    method _perform_test_performance_matrix (line 5637) | def _perform_test_performance_matrix(self,name,generate_cmd_function):
    method test_q_matrix (line 5681) | def test_q_matrix(self):
    method _get_textql_version (line 5693) | def _get_textql_version(self):
    method _get_octosql_version (line 5701) | def _get_octosql_version(self):
    method test_textql_matrix (line 5710) | def test_textql_matrix(self):
    method test_octosql_matrix (line 5717) | def test_octosql_matrix(self):
  function suite (line 5737) | def suite():

Condensed preview: 56 files, each showing path, character count, and a content snippet (the full structured content is 682K chars).

[
  {
    "path": ".github/FUNDING.yml",
    "chars": 63,
    "preview": "# These are supported funding model platforms\n\ngithub: harelba\n"
  },
  {
    "path": ".github/workflows/build-and-package.yaml",
    "chars": 15525,
    "preview": "name: BuildAndPackage\n\non:\n  push:\n    tags:\n      - \"v*\"\n    branches: master\n  pull_request:\n    branches: master\n    "
  },
  {
    "path": ".gitignore",
    "chars": 256,
    "preview": "build\nq.spec\nq.1\n*.pyc\n.vagrant\nrpm_build_area\n*.deb\nsetup.exe\nwin_output\nwin_build\npackages\n.idea/\ndist/windows/\ngenera"
  },
  {
    "path": "LICENSE",
    "chars": 35141,
    "preview": "                    GNU GENERAL PUBLIC LICENSE\n                       Version 3, 29 June 2007\n\n Copyright (C) 2007 Free "
  },
  {
    "path": "QSQL-NOTES.md",
    "chars": 12068,
    "preview": "\n## Major changes and additions in the new `3.x` version\nThis is the list of new/changed functionality in this version. "
  },
  {
    "path": "README.markdown",
    "chars": 3477,
    "preview": "[![Build and Package](https://github.com/harelba/q/workflows/BuildAndPackage/badge.svg?branch=master)](https://github.co"
  },
  {
    "path": "benchmark-config.sh",
    "chars": 47,
    "preview": "#!/bin/bash\n\nBENCHMARK_PYTHON_VERSIONS=(3.8.5)\n"
  },
  {
    "path": "bin/.qrc",
    "chars": 873,
    "preview": "#\n# q options ini file. Put either in your home folder as .qrc or in the working directory \n#   (both will be merged in "
  },
  {
    "path": "bin/__init__.py",
    "chars": 23,
    "preview": "#!/usr/bin/env python\n\n"
  },
  {
    "path": "bin/q.bat",
    "chars": 120,
    "preview": "@echo off\n\nsetlocal\nif exist \"%~dp0..\\python.exe\" ( \"%~dp0..\\python\" \"%~dp0q\" %* ) else ( python \"%~dp0q\" %* )\nendlocal\n"
  },
  {
    "path": "bin/q.py",
    "chars": 162794,
    "preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n\n#   Copyright (C) 2012-2021 Harel Ben-Attia\n#\n#   This program is free s"
  },
  {
    "path": "conftest.py",
    "chars": 70,
    "preview": "#!/usr/bin/env python\n\n# Required so pytest can find files properly\n\n\n"
  },
  {
    "path": "dist/fpm-config",
    "chars": 231,
    "preview": "-s dir\n--name q-text-as-data\n--license GPLv3\n--architecture x86_64\n--description \"q allows to perform SQL-like statement"
  },
  {
    "path": "dist/test-rpm-inside-container.sh",
    "chars": 210,
    "preview": "#!/bin/bash\nset -x\nset -e\n\nyum install -y python38 sqlite perl gcc python3-devel sqlite-devel\npip3 install -r test-requi"
  },
  {
    "path": "dist/test-using-deb.sh",
    "chars": 112,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\nsudo dpkg -i $1\nQ_EXECUTABLE=q Q_SKIP_EXECUTABLE_VALIDATION=true ./run-tests.sh -v\n\n"
  },
  {
    "path": "dist/test-using-rpm.sh",
    "chars": 170,
    "preview": "#!/bin/bash\n\nset -x\nset -e\n\nRPM_LOCATION=$1\n\ndocker run -i -v `pwd`:/q-sources -w /q-sources centos:8 /bin/bash -e -x ./"
  },
  {
    "path": "doc/AUTHORS",
    "chars": 144,
    "preview": "  Copyright (C) 2012-2014 Harel Ben-Attia (harelba@gmail.com, @harelba on twitter)\n\nHarel Ben-Attia <harelba@gmail.com> "
  },
  {
    "path": "doc/IMPLEMENTATION.markdown",
    "chars": 1121,
    "preview": "# q - Treating Text as a Database \n\n## Implementation \n\nThe current implementation is written in Python using an in-memo"
  },
  {
    "path": "doc/LICENSE",
    "chars": 35121,
    "preview": "GNU GENERAL PUBLIC LICENSE\n                       Version 3, 29 June 2007\n\n Copyright (C) 2007 Free Software Foundation,"
  },
  {
    "path": "doc/RATIONALE.markdown",
    "chars": 1376,
    "preview": "# q - Treating Text as a Database \n\n## Why aren't other Linux tools enough?\nThe standard Linux tools are amazing and I u"
  },
  {
    "path": "doc/THANKS",
    "chars": 305,
    "preview": "  Copyright (C) 2012-2014 Harel Ben-Attia (harelba@gmail.com, @harelba on twitter)\n\nJens Neu (jens@zeeroos.de) - For wri"
  },
  {
    "path": "doc/USAGE.markdown",
    "chars": 13998,
    "preview": "# q - Text as Data\n\n## SYNOPSIS\n\t`q <flags> <query>`\n\n\tExample Execution for a delimited file:\n\n\t\tq \"select * from myfil"
  },
  {
    "path": "examples/EXAMPLES.markdown",
    "chars": 7668,
    "preview": "# q - Treating Text as a Database \n\nSee below for a JOIN example.\n\n## Tutorial\nThis is a tutorial for beginners. If you'"
  },
  {
    "path": "examples/exampledatafile",
    "chars": 14760,
    "preview": "-rw-r--r--  1 root root     2064 2006-11-23 21:33 netscsid.conf\n-rw-r--r--  1 root root     1343 2007-01-09 20:39 wodim."
  },
  {
    "path": "examples/group-emails-example",
    "chars": 315,
    "preview": "root root.1@mydomain.com\nharel harel.1@mydomain.com\nroot root.2@mydomain.com\nroot root.3@mydomain.com\ndaemon daemon.1@ot"
  },
  {
    "path": "mkdocs/README.md",
    "chars": 416,
    "preview": "\n# Generate web site\n\n# mkdocs folder under project root\n$ `cd mkdocs`\n\n* create a pyenv virtual environment \n\n$ `pip in"
  },
  {
    "path": "mkdocs/docs/about.md",
    "chars": 733,
    "preview": "# About\n\n### Linkedin: [Harel Ben Attia](https://www.linkedin.com/in/harelba/)\n\n### Twitter [@harelba](https://twitter.c"
  },
  {
    "path": "mkdocs/docs/fsg9b9b1.txt",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "mkdocs/docs/google0efeb4ff0a886e81.html",
    "chars": 53,
    "preview": "google-site-verification: google0efeb4ff0a886e81.html"
  },
  {
    "path": "mkdocs/docs/index.md",
    "chars": 33719,
    "preview": "# q - Run SQL directly on CSV or TSV files\n\n[![GitHub Stars](https://img.shields.io/github/stars/harelba/q.svg?style=soc"
  },
  {
    "path": "mkdocs/docs/index_cn.md",
    "chars": 11257,
    "preview": "# q - 直接在CSV或TSV文件上运行SQL\n\n[![GitHub Stars](https://img.shields.io/github/stars/harelba/q.svg?style=social&label=GitHub S"
  },
  {
    "path": "mkdocs/docs/js/google-analytics.js",
    "chars": 2046,
    "preview": "// Monitor all download links in GA\n\nvar dlCnt = 0;\nvar tocCnt = 0;\n\nfunction GAizeDownloadLink(a) {\n        var url = a"
  },
  {
    "path": "mkdocs/docs/stylesheets/extra.css",
    "chars": 530,
    "preview": "\ndiv.md-content pre {\n  background-color: black;\n  color: #41FF00;\n}\n\n.md-typeset code pre {\n  background-color: black;\n"
  },
  {
    "path": "mkdocs/generate-web-site.sh",
    "chars": 52,
    "preview": "#!/bin/bash\n\nmkdocs build -c -s -d ./generated-site\n"
  },
  {
    "path": "mkdocs/mkdocs.yml",
    "chars": 1128,
    "preview": "site_name: q - Text as Data\nsite_url: https://harelba.github.io/q/\nrepo_url: https://github.com/harelba/q\nedit_uri: \"\"\ns"
  },
  {
    "path": "mkdocs/requirements.txt",
    "chars": 487,
    "preview": "Click==7.0\nDeprecated==1.2.7\nJinja2==2.10.3\nMarkdown==3.1.1\nMarkupSafe==1.1.1\nPyGithub==1.45\nPyJWT==1.7.1\nPyYAML==5.3\nPy"
  },
  {
    "path": "mkdocs/theme/main.html",
    "chars": 854,
    "preview": "{% extends \"base.html\" %}\n\n{% block analytics %}\n<!-- Global site tag (gtag.js) - Google Analytics -->\n{% set analytics "
  },
  {
    "path": "prepare-benchmark-env",
    "chars": 938,
    "preview": "#!/bin/bash\n\nset -e\n\neval \"$(pyenv init -)\"\neval \"$(pyenv virtualenv-init -)\"\n\nsource benchmark-config.sh\n\nif [ ! -f ./b"
  },
  {
    "path": "pyoxidizer.bzl",
    "chars": 4040,
    "preview": "# This file defines how PyOxidizer application building and packaging is\n# performed. See PyOxidizer's documentation at\n"
  },
  {
    "path": "pytest.ini",
    "chars": 48,
    "preview": "[pytest]\nmarkers =\n  benchmark: Benchmark tests\n"
  },
  {
    "path": "requirements.txt",
    "chars": 44,
    "preview": "six==1.11.0\nflake8==3.6.0\nsetuptools<45.0.0\n"
  },
  {
    "path": "run-benchmark",
    "chars": 3314,
    "preview": "#!/bin/bash\n\n# Usage: ./run-benchmark.sh <benchmark-id> <q-executable>\nset -e\n\nget_abs_filename() {\n  # $1 : relative fi"
  },
  {
    "path": "run-coverage.sh",
    "chars": 213,
    "preview": "#!/bin/bash\n\nset -e\n\nrm -vf ./htmlcov/*\n\npytest -m \"not benchmark\" --cov --cov-report html \"$@\"\n\nfunction cleanup() {\n  "
  },
  {
    "path": "run-tests.sh",
    "chars": 44,
    "preview": "#!/bin/bash\n\npytest -m 'not benchmark' \"$@\"\n"
  },
  {
    "path": "setup.py",
    "chars": 744,
    "preview": "#!/usr/bin/env python\n\nfrom setuptools import setup\nimport setuptools\n\nq_version = '3.1.6'\n\nwith open(\"README.markdown\","
  },
  {
    "path": "test/BENCHMARK.md",
    "chars": 13113,
    "preview": "\n\nNOTE: *Please don't use or publish this benchmark data yet. See below for details*\n\n# Update\nq now provides inherent a"
  },
  {
    "path": "test/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/octosql_v0.3.0.benchmark-results",
    "chars": 1939,
    "preview": "lines\tcolumns\toctosql_v0.3.0_mean\toctosql_v0.3.0_stddev\n1\t1\t0.582091641426\t0.0235290239617\n10\t1\t0.596219730377\t0.0320124"
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-2.7.18.benchmark-results",
    "chars": 2017,
    "preview": "lines\tcolumns\tq-benchmark-2.7.18_mean\tq-benchmark-2.7.18_stddev\n1\t1\t0.106449890137\t0.002010027753\n10\t1\t0.106737875938\t0."
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-3.6.4.benchmark-results",
    "chars": 2409,
    "preview": "lines\tcolumns\tq-benchmark-3.6.4_mean\tq-benchmark-3.6.4_stddev\n1\t1\t0.10342762470245362\t0.0017673875851759295\n10\t1\t0.10239"
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-3.7.9.benchmark-results",
    "chars": 2406,
    "preview": "lines\tcolumns\tq-benchmark-3.7.9_mean\tq-benchmark-3.7.9_stddev\n1\t1\t0.08099310398101807\t0.001417385651688644\n10\t1\t0.082229"
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/q-benchmark-3.8.5.benchmark-results",
    "chars": 2410,
    "preview": "lines\tcolumns\tq-benchmark-3.8.5_mean\tq-benchmark-3.8.5_stddev\n1\t1\t0.10138180255889892\t0.0017947074090971444\n10\t1\t0.10056"
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/summary.benchmark-results",
    "chars": 13174,
    "preview": "lines\tcolumns\tq-benchmark-2.7.18_mean\tq-benchmark-2.7.18_stddev\tlines\tcolumns\tq-benchmark-3.6.4_mean\tq-benchmark-3.6.4_s"
  },
  {
    "path": "test/benchmark-results/source-files-1443b7418b46594ad256abd9db4a7671cb251e6a/2020-09-17-v2.0.17/textql_2.0.3.benchmark-results",
    "chars": 1993,
    "preview": "lines\tcolumns\ttextql_2.0.3_mean\ttextql_2.0.3_stddev\n1\t1\t0.0196103572845\t0.00207355214257\n10\t1\t0.0186784029007\t0.00097081"
  },
  {
    "path": "test/test_suite.py",
    "chars": 252236,
    "preview": "#!/usr/bin/env python3\n\n#\n# test suite for q.\n# \n# Prefer end-to-end tests, running the actual q command and testing std"
  },
  {
    "path": "test-requirements.txt",
    "chars": 40,
    "preview": "pytest==6.2.2\nflake8==3.6.0\nsix==1.11.0\n"
  }
]
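
The preview entries above share one shape (`path`, `chars`, `preview`), so they can be loaded and summarized with the standard `json` module. The following is a minimal sketch, not part of the q repository: the names `SAMPLE` and `summarize` are hypothetical, and it assumes that `chars` counts characters of the file content, which holds for the two short entries copied here but is not confirmed for the rest.

```python
import json

# Two entries copied from the preview above. The raw string keeps the JSON
# escape sequences (\n) intact so that json.loads decodes them.
SAMPLE = r'''
[
  {"path": "pytest.ini", "chars": 48,
   "preview": "[pytest]\nmarkers =\n  benchmark: Benchmark tests\n"},
  {"path": "requirements.txt", "chars": 44,
   "preview": "six==1.11.0\nflake8==3.6.0\nsetuptools<45.0.0\n"}
]
'''


def summarize(entries):
    """Return (total_chars, paths whose preview is shorter than the file).

    Assumption: a preview shorter than `chars` means the snippet was truncated.
    """
    total = sum(entry["chars"] for entry in entries)
    truncated = [
        entry["path"] for entry in entries
        if len(entry["preview"]) < entry["chars"]
    ]
    return total, truncated


entries = json.loads(SAMPLE)
total, truncated = summarize(entries)
print(total, truncated)
```

Running the same function over the full list would separate the small files whose preview is complete from the large ones, such as `bin/q.py`, that are only shown as a snippet.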

About this extraction

This page contains the full source code of the harelba/q GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 56 files (643.0 KB), approximately 177.6k tokens, and a symbol index with 620 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
